From patchwork Fri Jul 26 08:25:01 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Manolis Tsamis X-Patchwork-Id: 1965207 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=vrull.eu header.i=@vrull.eu header.a=rsa-sha256 header.s=google header.b=NRxB89Ak; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WVgpC2f2Hz20FJ for ; Fri, 26 Jul 2024 18:26:43 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id AA46F3858283 for ; Fri, 26 Jul 2024 08:26:40 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-lf1-x130.google.com (mail-lf1-x130.google.com [IPv6:2a00:1450:4864:20::130]) by sourceware.org (Postfix) with ESMTPS id 2F7ED3858403 for ; Fri, 26 Jul 2024 08:25:19 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 2F7ED3858403 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=vrull.eu Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=vrull.eu ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 2F7ED3858403 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2a00:1450:4864:20::130 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1721982324; cv=none; b=pjCgbTj9D/MIlmADgS2M9nX9N0DceD1K97ZFngZafAIgQLyALFXMvwWXfRru2Q1OQNeq+RsWHgVHkK4CGgsbCmKyr6CIaSqgpWrOSgcbxEGDUEVuLodbDimAcakGJoHLD7T4cxoqXILzpKmEe+AM6erMO/R7CytFoiFLXBu1mec= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1721982324; c=relaxed/simple; bh=zAkLIblJTl29iWp6teG42u6MhE/5ivOomU3C1z+qZ4E=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=LAN0E1RSzviaXfA4NzqxaLc265jzbW9JKTTCJNcHFV2IhM2zunETsSwOWnfwl6OIUCtQCWq6gTaCeutGZySQORX8L6dX66mjvUMUmhJUbEp52UP4SzsdhsGMwuStCfnFl/4Y3r8R5ckBTdFyDYE4pcjlyN5ohNhAqU6sVWgEWVY= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-lf1-x130.google.com with SMTP id 2adb3069b0e04-52fc4388a64so1599679e87.1 for ; Fri, 26 Jul 2024 01:25:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=vrull.eu; s=google; t=1721982317; x=1722587117; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=zbGyKKhyUGgzLHKwwPa4jusrfAe5fE50ZZj01U+1cos=; b=NRxB89Ak88cYZE+BM06/FlrLcagpZXDQJmqESJC+ekYdceQygoRP5FLMUoIoEqlAMy y6bOYUfmyOHr79rJ0vw2NKrpU6R/5n7l5JXJBuwb68mfXCn3QNdSbBO/8UeVmbGmbHcA hV0ayVry5Ft/ggD5zIbsYkhEich05k7Yu9kHL4l/0qbjlCsJDSxfKFAp86aAceT7sUKE 1KfUh2fFqQD1MwyMHE5We9T8C2W2xcaf9fm/A14NrI9Za4cuTOUXNM6FXTrNPr1sUZDH /p6EPLyKbHTQYJzfBPpLn6CU0x5ZBs4EEFapVEw8PxzBheaExm4d1uLWEMjW7kP6+YaM Q6rg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1721982317; x=1722587117; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=zbGyKKhyUGgzLHKwwPa4jusrfAe5fE50ZZj01U+1cos=; b=IWlatpgs0UOc5VtiMqc6J8sGIgRrGi+VeVpa33ng+7tC3wSm/nMAV9hdKR0HY94zpl +hV/haD+whJTjYTjHyNG6uR6cEZASV0HfSmzHctSnSk/fh2Q5D+qnUadRtAysdotcw2y SM+Xz560owdPw2FGk1U/tqcQXOA4svb+XqUfkgU0EtXufGfb8lOYS9/3qcMabjrciURH ZnZ3ggdW96EIKtO1immQftT7FLAK55kDN6w98oyhu8yGdUpOUpiA14HPcI3ZANqPsP/O Nn/cB+JuuBcYADNyC5BOOOk4iCwWhEaUW1wq5y6LdxbJ/CPAMXM0SwL4zDwnVhJu0jLb T1RQ== X-Gm-Message-State: AOJu0Ywwzon17DHpz0UJQoSyMYzS87pt2DYnlsZEpVPZZcVRzrMzGUr0 gmDp81HXQhoNwgq2ibJ6f0vympnh3O/r/RxUuccjndP8cRSAjAfRcWpV4oq1H6jJ8hp1hxXvFE6 439E= X-Google-Smtp-Source: AGHT+IGVRu36sMYR0357ZoTrfhX7GfVR4PapCLCljii9u0TfRi9Rt/1rIypdQ+YaYFRMiqN9f6DPyg== X-Received: by 2002:a05:6512:3504:b0:52f:31:c2a3 with SMTP id 2adb3069b0e04-52fd6087488mr3578768e87.12.1721982316722; Fri, 26 Jul 2024 01:25:16 -0700 (PDT) Received: from helsinki-03.engr ([2a01:4f9:6b:2a47::2]) by smtp.gmail.com with ESMTPSA id 2adb3069b0e04-52fd5c354d2sm432604e87.290.2024.07.26.01.25.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 26 Jul 2024 01:25:16 -0700 (PDT) From: Manolis Tsamis To: gcc-patches@gcc.gnu.org Cc: Richard Sandiford , Richard Biener , Philipp Tomsich , Jeff Law , Robin Dapp , Jiangning Liu , Manolis Tsamis Subject: [PATCH v5 3/3] [RFC] ifcvt: Handle multiple rewired regs and refactor noce_convert_multiple_sets Date: Fri, 26 Jul 2024 10:25:01 +0200 Message-Id: <20240726082501.4086489-4-manolis.tsamis@vrull.eu> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240726082501.4086489-1-manolis.tsamis@vrull.eu> References: <20240726082501.4086489-1-manolis.tsamis@vrull.eu> MIME-Version: 1.0 X-Spam-Status: No, score=-12.0 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, KAM_SHORT, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org The existing implementation of need_cmov_or_rewire and noce_convert_multiple_sets_1 assumes that sets are either REG or SUBREG. This commit enchances them so they can handle/rewire arbitrary set statements. To do that a new helper struct noce_multiple_sets_info is introduced which is used by noce_convert_multiple_sets and its helper functions. This results in cleaner function signatures, improved efficientcy (a number of vecs and hash set/map are replaced with a single vec of struct) and simplicity. gcc/ChangeLog: * ifcvt.cc (need_cmov_or_rewire): Renamed init_noce_multiple_sets_info. (init_noce_multiple_sets_info): Initialize noce_multiple_sets_info. (noce_convert_multiple_sets_1): Use noce_multiple_sets_info and handle rewiring of multiple registers. (noce_convert_multiple_sets): Updated to use noce_multiple_sets_info. * ifcvt.h (struct noce_multiple_sets_info): Introduce new struct noce_multiple_sets_info to store info for noce_convert_multiple_sets. gcc/testsuite/ChangeLog: * gcc.target/aarch64/ifcvt_multiple_sets_rewire.c: New test. Signed-off-by: Manolis Tsamis --- (no changes since v1) gcc/ifcvt.cc | 256 ++++++++---------- gcc/ifcvt.h | 16 ++ .../aarch64/ifcvt_multiple_sets_rewire.c | 20 ++ 3 files changed, 148 insertions(+), 144 deletions(-) create mode 100644 gcc/testsuite/gcc.target/aarch64/ifcvt_multiple_sets_rewire.c diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc index 2a5e5e5b608..3e25f30b67e 100644 --- a/gcc/ifcvt.cc +++ b/gcc/ifcvt.cc @@ -98,14 +98,10 @@ static bool dead_or_predicable (basic_block, basic_block, basic_block, edge, bool); static void noce_emit_move_insn (rtx, rtx); static rtx_insn *block_has_only_trap (basic_block); -static void need_cmov_or_rewire (basic_block, hash_set *, - hash_map *); +static void init_noce_multiple_sets_info (basic_block, + auto_delete_vec &); static bool noce_convert_multiple_sets_1 (struct noce_if_info *, - hash_set *, - hash_map *, - auto_vec *, - auto_vec *, - auto_vec *, int *); + auto_delete_vec &, int *); /* Count the number of non-jump active insns in BB. */ @@ -3487,24 +3483,13 @@ noce_convert_multiple_sets (struct noce_if_info *if_info) rtx x = XEXP (cond, 0); rtx y = XEXP (cond, 1); - /* The true targets for a conditional move. */ - auto_vec targets; - /* The temporaries introduced to allow us to not consider register - overlap. */ - auto_vec temporaries; - /* The insns we've emitted. */ - auto_vec unmodified_insns; - - hash_set need_no_cmov; - hash_map rewired_src; - - need_cmov_or_rewire (then_bb, &need_no_cmov, &rewired_src); + auto_delete_vec insn_info; + init_noce_multiple_sets_info (then_bb, insn_info); int last_needs_comparison = -1; bool ok = noce_convert_multiple_sets_1 - (if_info, &need_no_cmov, &rewired_src, &targets, &temporaries, - &unmodified_insns, &last_needs_comparison); + (if_info, insn_info, &last_needs_comparison); if (!ok) return false; @@ -3519,8 +3504,7 @@ noce_convert_multiple_sets (struct noce_if_info *if_info) end_sequence (); start_sequence (); ok = noce_convert_multiple_sets_1 - (if_info, &need_no_cmov, &rewired_src, &targets, &temporaries, - &unmodified_insns, &last_needs_comparison); + (if_info, insn_info, &last_needs_comparison); /* Actually we should not fail anymore if we reached here, but better still check. */ if (!ok) @@ -3529,12 +3513,12 @@ noce_convert_multiple_sets (struct noce_if_info *if_info) /* We must have seen some sort of insn to insert, otherwise we were given an empty BB to convert, and we can't handle that. */ - gcc_assert (!unmodified_insns.is_empty ()); + gcc_assert (!insn_info.is_empty ()); /* Now fixup the assignments. */ - for (unsigned i = 0; i < targets.length (); i++) - if (targets[i] != temporaries[i]) - noce_emit_move_insn (targets[i], temporaries[i]); + for (unsigned i = 0; i < insn_info.length (); i++) + if (insn_info[i]->target != insn_info[i]->temporary) + noce_emit_move_insn (insn_info[i]->target, insn_info[i]->temporary); /* Actually emit the sequence if it isn't too expensive. */ rtx_insn *seq = get_insns (); @@ -3549,10 +3533,10 @@ noce_convert_multiple_sets (struct noce_if_info *if_info) set_used_flags (insn); /* Mark all our temporaries and targets as used. */ - for (unsigned i = 0; i < targets.length (); i++) + for (unsigned i = 0; i < insn_info.length (); i++) { - set_used_flags (temporaries[i]); - set_used_flags (targets[i]); + set_used_flags (insn_info[i]->temporary); + set_used_flags (insn_info[i]->target); } set_used_flags (cond); @@ -3571,7 +3555,7 @@ noce_convert_multiple_sets (struct noce_if_info *if_info) return false; emit_insn_before_setloc (seq, if_info->jump, - INSN_LOCATION (unmodified_insns.last ())); + INSN_LOCATION (insn_info.last ()->unmodified_insn)); /* Clean up THEN_BB and the edges in and out of it. */ remove_edge (find_edge (test_bb, join_bb)); @@ -3592,20 +3576,12 @@ noce_convert_multiple_sets (struct noce_if_info *if_info) return true; } -/* This goes through all relevant insns of IF_INFO->then_bb and tries to - create conditional moves. In case a simple move sufficis the insn - should be listed in NEED_NO_CMOV. The rewired-src cases should be - specified via REWIRED_SRC. TARGETS, TEMPORARIES and UNMODIFIED_INSNS - are specified and used in noce_convert_multiple_sets and should be passed - to this function.. */ +/* This goes through all relevant insns of IF_INFO->then_bb and tries to create + conditional moves. Information for the insns is kept in INSN_INFO. */ static bool noce_convert_multiple_sets_1 (struct noce_if_info *if_info, - hash_set *need_no_cmov, - hash_map *rewired_src, - auto_vec *targets, - auto_vec *temporaries, - auto_vec *unmodified_insns, + auto_delete_vec &insn_info, int *last_needs_comparison) { basic_block then_bb = if_info->then_bb; @@ -3624,11 +3600,6 @@ noce_convert_multiple_sets_1 (struct noce_if_info *if_info, rtx_insn *insn; int count = 0; - - targets->truncate (0); - temporaries->truncate (0); - unmodified_insns->truncate (0); - bool second_try = *last_needs_comparison != -1; FOR_BB_INSNS (then_bb, insn) @@ -3637,6 +3608,8 @@ noce_convert_multiple_sets_1 (struct noce_if_info *if_info, if (!active_insn_p (insn)) continue; + noce_multiple_sets_info *info = insn_info[count]; + rtx set = single_set (insn); gcc_checking_assert (set); @@ -3644,9 +3617,12 @@ noce_convert_multiple_sets_1 (struct noce_if_info *if_info, rtx temp; rtx new_val = SET_SRC (set); - if (int *ii = rewired_src->get (insn)) - new_val = simplify_replace_rtx (new_val, (*targets)[*ii], - (*temporaries)[*ii]); + + int i, ii; + FOR_EACH_VEC_ELT (info->rewired_src, i, ii) + new_val = simplify_replace_rtx (new_val, insn_info[ii]->target, + insn_info[ii]->temporary); + rtx old_val = target; /* As we are transforming @@ -3687,7 +3663,7 @@ noce_convert_multiple_sets_1 (struct noce_if_info *if_info, /* We have identified swap-style idioms before. A normal set will need to be a cmov while the first instruction of a swap-style idiom can be a regular move. This helps with costing. */ - bool need_cmov = !need_no_cmov->contains (insn); + bool need_cmov = info->need_cmov; /* If we had a non-canonical conditional jump (i.e. one where the fallthrough is to the "else" case) we need to reverse @@ -3814,12 +3790,13 @@ noce_convert_multiple_sets_1 (struct noce_if_info *if_info, COND may use the flags register and if INSN clobbers flags then we may be unable to emit a valid sequence (e.g. in x86 that would require saving and restoring the flags register). */ - for (rtx_insn *iter = seq; iter; iter = NEXT_INSN (iter)) - if (modified_in_p (cond, iter)) - { - end_sequence (); - return false; - } + if (!second_try) + for (rtx_insn *iter = seq; iter; iter = NEXT_INSN (iter)) + if (modified_in_p (cond, iter)) + { + end_sequence (); + return false; + } if (cc_cmp && seq == seq1) { @@ -3841,9 +3818,10 @@ noce_convert_multiple_sets_1 (struct noce_if_info *if_info, /* Bookkeeping. */ count++; - targets->safe_push (target); - temporaries->safe_push (temp_dest); - unmodified_insns->safe_push (insn); + + info->target = target; + info->temporary = temp_dest; + info->unmodified_insn = insn; } /* Even if we did not actually need the comparison, we want to make sure @@ -3851,11 +3829,84 @@ noce_convert_multiple_sets_1 (struct noce_if_info *if_info, if (*last_needs_comparison == -1) *last_needs_comparison = 0; - return true; } +/* Find local swap-style idioms in BB and mark the first insn (1) + that is only a temporary as not needing a conditional move as + it is going to be dead afterwards anyway. + + (1) int tmp = a; + a = b; + b = tmp; + + ifcvt + --> + tmp = a; + a = cond ? b : a_old; + b = cond ? tmp : b_old; + + Additionally, store the index of insns like (2) when a subsequent + SET reads from their destination. + + (2) int c = a; + int d = c; + + ifcvt + --> + + c = cond ? a : c_old; + d = cond ? d : c; // Need to use c rather than c_old here. +*/ + +static void +init_noce_multiple_sets_info (basic_block bb, + auto_delete_vec &insn_info) +{ + rtx_insn *insn; + int count = 0; + auto_vec dests; + bitmap bb_live_out = df_get_live_out (bb); + + /* Iterate over all SETs, storing the destinations in DEST. + - If we encounter a previously changed register, + rewire the read to the original source. + - If we encounter a SET that writes to a destination + that is not live after this block then the register + does not need to be moved conditionally. */ + FOR_BB_INSNS (bb, insn) + { + if (!active_insn_p (insn)) + continue; + + noce_multiple_sets_info *info = new noce_multiple_sets_info; + info->target = NULL_RTX; + info->temporary = NULL_RTX; + info->unmodified_insn = NULL; + insn_info.safe_push (info); + + rtx set = single_set (insn); + gcc_checking_assert (set); + + rtx src = SET_SRC (set); + rtx dest = SET_DEST (set); + + gcc_checking_assert (REG_P (dest)); + info->need_cmov = bitmap_bit_p (bb_live_out, REGNO (dest)); + + /* Check if the current SET's source is the same + as any previously seen destination. + This is quadratic but the number of insns in BB + is bounded by PARAM_MAX_RTL_IF_CONVERSION_INSNS. */ + for (int i = count - 1; i >= 0; --i) + if (reg_mentioned_p (dests[i], src)) + insn_info[count]->rewired_src.safe_push (i); + + dests.safe_push (dest); + count++; + } +} /* Return true iff basic block TEST_BB is suitable for conversion to a series of conditional moves. Also check that we have more than one @@ -4325,89 +4376,6 @@ check_cond_move_block (basic_block bb, return true; } -/* Find local swap-style idioms in BB and mark the first insn (1) - that is only a temporary as not needing a conditional move as - it is going to be dead afterwards anyway. - - (1) int tmp = a; - a = b; - b = tmp; - - ifcvt - --> - - tmp = a; - a = cond ? b : a_old; - b = cond ? tmp : b_old; - - Additionally, store the index of insns like (2) when a subsequent - SET reads from their destination. - - (2) int c = a; - int d = c; - - ifcvt - --> - - c = cond ? a : c_old; - d = cond ? d : c; // Need to use c rather than c_old here. -*/ - -static void -need_cmov_or_rewire (basic_block bb, - hash_set *need_no_cmov, - hash_map *rewired_src) -{ - rtx_insn *insn; - int count = 0; - auto_vec insns; - auto_vec dests; - - /* Iterate over all SETs, storing the destinations - in DEST. - - If we hit a SET that reads from a destination - that we have seen before and the corresponding register - is dead afterwards, the register does not need to be - moved conditionally. - - If we encounter a previously changed register, - rewire the read to the original source. */ - FOR_BB_INSNS (bb, insn) - { - rtx set, src, dest; - - if (!active_insn_p (insn)) - continue; - - set = single_set (insn); - if (set == NULL_RTX) - continue; - - src = SET_SRC (set); - if (SUBREG_P (src)) - src = SUBREG_REG (src); - dest = SET_DEST (set); - - /* Check if the current SET's source is the same - as any previously seen destination. - This is quadratic but the number of insns in BB - is bounded by PARAM_MAX_RTL_IF_CONVERSION_INSNS. */ - if (REG_P (src)) - for (int i = count - 1; i >= 0; --i) - if (reg_overlap_mentioned_p (src, dests[i])) - { - if (find_reg_note (insn, REG_DEAD, src) != NULL_RTX) - need_no_cmov->add (insns[i]); - else - rewired_src->put (insn, i); - } - - insns.safe_push (insn); - dests.safe_push (dest); - - count++; - } -} - /* Given a basic block BB suitable for conditional move conversion, a condition COND, and pointer maps THEN_VALS and ELSE_VALS containing the register values depending on COND, emit the insns in the block as diff --git a/gcc/ifcvt.h b/gcc/ifcvt.h index 37147f99129..204bcf6d18a 100644 --- a/gcc/ifcvt.h +++ b/gcc/ifcvt.h @@ -40,6 +40,22 @@ struct ce_if_block int pass; /* Pass number. */ }; +struct noce_multiple_sets_info +{ + /* A list of indices to instructions that we need to rewire into this + instruction when we replace them with temporary conditional moves. */ + auto_vec rewired_src; + /* The true targets for a conditional move. */ + rtx target; + /* The temporaries introduced to allow us to not consider register + overlap. */ + rtx temporary; + /* The insns we've emitted. */ + rtx_insn *unmodified_insn; + /* True if a simple move can be used instead of a conditional move. */ + bool need_cmov; +}; + /* Used by noce_process_if_block to communicate with its subroutines. The subroutines know that A and B may be evaluated freely. They diff --git a/gcc/testsuite/gcc.target/aarch64/ifcvt_multiple_sets_rewire.c b/gcc/testsuite/gcc.target/aarch64/ifcvt_multiple_sets_rewire.c new file mode 100644 index 00000000000..448425fba03 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/ifcvt_multiple_sets_rewire.c @@ -0,0 +1,20 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-rtl-ce1" } */ + +void sink2(int, int); + +void cond1(int cond, int x, int y, int z) +{ + if (x) + { + x = y + z; + y = z + x; + } + + sink2(x, y); +} + +/* { dg-final { scan-assembler-times "csel\tw0, w0, w1" 1 } } */ +/* { dg-final { scan-assembler-times "csel\tw1, w3, w2" 1 } } */ + +/* { dg-final { scan-rtl-dump-times "if-conversion succeeded through noce_convert_multiple_sets" 1 "ce1" } } */