From patchwork Fri Jul 26 08:24:59 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Manolis Tsamis X-Patchwork-Id: 1965209 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=vrull.eu header.i=@vrull.eu header.a=rsa-sha256 header.s=google header.b=KwLboLEm; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WVgr33jFGz1ybY for ; Fri, 26 Jul 2024 18:28:19 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id C79413865C32 for ; Fri, 26 Jul 2024 08:28:17 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-lf1-x12d.google.com (mail-lf1-x12d.google.com [IPv6:2a00:1450:4864:20::12d]) by sourceware.org (Postfix) with ESMTPS id 6DA8B3858C41 for ; Fri, 26 Jul 2024 08:25:16 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 6DA8B3858C41 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=vrull.eu Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=vrull.eu ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 6DA8B3858C41 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2a00:1450:4864:20::12d ARC-Seal: i=1; 
a=rsa-sha256; d=sourceware.org; s=key; t=1721982319; cv=none; b=EmioGp2BdRi8qOrdoZkthyPV1e2CMGm5VpjKTWXhQguMMho18P/MsW4lY9C+isXztogd+eVHB1uEQAfT9TUKvexXA2Y0s0j34ROe4PvnGtsUOLFFbHBbs81vydXxw6tMRNDk0GbzfWwwZhU6xBi7/6OIciTdloGfuiS1JJSXQ20= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1721982319; c=relaxed/simple; bh=Llg//GpA1RY5f8GQcZdJ03ge9JgUznxzfmWyOO65Bxg=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=OPYqJN0+waNSLfMtmxIs7fsXWOm8FSkVBEmyNQxKcfQCZih59O6fXspRfN865pT5SG0NFFYFP9acFXmd3gGuTGRk0blHbfhAvSBLwmXIA1fLqXN1eNZHpXNjX06htZe9kLypH79iTWy20ehzEFeLiKlUkvpiYEGxJl86AUh5evA= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-lf1-x12d.google.com with SMTP id 2adb3069b0e04-52f042c15e3so882331e87.0 for ; Fri, 26 Jul 2024 01:25:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=vrull.eu; s=google; t=1721982315; x=1722587115; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=x9JlXwtWhwWNPSk05eY5jqN5WkIbOo15RVyPnu3Fty4=; b=KwLboLEmD4Lzw7fz6ztbZ61wTTQDQTxIL4cAuKWxg2JT8YVbOOqHyDdS/PgbXbXiye nxA9DedlbpnjVKy4nOp55NFyFxTMUDDA4B9NYGXvRkbZwTPqqNfi3tSyf3RdLFqxXT8k HslJ892WcBMksNHpqxnHNl0s9tym7W8OUjkv7yMHtBhWk9eYn8DEZ4nAwob7C4ydqbg0 1nKCkPhgZCJKR2hUctKfBeEjSbMoPcklB929ULvm+EAeLA+fhYxCMFnn5Rm3vZA0z7Ka KY56rEsR7+kfSH1LrR8IfOZLcxf/3h+iBuKVnKp55LWwePU/Sf6DgvBC/zrRF9T5Oxw/ QN2A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1721982315; x=1722587115; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=x9JlXwtWhwWNPSk05eY5jqN5WkIbOo15RVyPnu3Fty4=; b=N7yyV3FPvuc253zTbUGK2aTZ4dG7QuRzzLN/K6opWczYph5h+Xu8NtyORRNwg0BHtu yzrH+Hk+hEOJDB518FQC8WC54rd9SDe1X82GZyWrrZejLXotQ/w74/RPtBSwN8RR38J4 
6OniM0PYbpWCX9ikl3C5R7LPg0yHBkkaahOkEVUiYAPw/58MYwQjWUQRaCYunU5HN7Ui Mb8MOa3Qfu47aP3zsHMnecds6U7yY1gg9Zbd3tfYNc942W0UHk3W8me8iFkeFbMzWFlo eWM/UY/yLtPa3WAGs60TT2thutvK0gsM3suc85MN/qVthmgwkhHdNg6arXeppHu4zySu Vu4Q== X-Gm-Message-State: AOJu0YxtSWc59fNkBj789MkX2uaH1tOBDi05yLdvw4Q1bqRCf9FekKcK g3c0Sm4TrkY4fyKuD7PEil12+M76nRXfosI/S2UlGGTbMo/Kocy8JlpoJDjbh+OAO0pnlprmLcH pvxw= X-Google-Smtp-Source: AGHT+IH1Gzi7u5bnta52BS5BybLuPrHD7sRE0tEmGyV1v7pGos8lt9xJ1P1oW8U6xfn/5PXccW62uQ== X-Received: by 2002:a05:6512:3192:b0:52f:3c:a79 with SMTP id 2adb3069b0e04-52fcf8afd05mr2307659e87.7.1721982314233; Fri, 26 Jul 2024 01:25:14 -0700 (PDT) Received: from helsinki-03.engr ([2a01:4f9:6b:2a47::2]) by smtp.gmail.com with ESMTPSA id 2adb3069b0e04-52fd5c354d2sm432604e87.290.2024.07.26.01.25.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 26 Jul 2024 01:25:14 -0700 (PDT) From: Manolis Tsamis To: gcc-patches@gcc.gnu.org Cc: Richard Sandiford , Richard Biener , Philipp Tomsich , Jeff Law , Robin Dapp , Jiangning Liu , Manolis Tsamis Subject: [PATCH v5 1/3] [RFC] ifcvt: handle sequences that clobber flags in noce_convert_multiple_sets Date: Fri, 26 Jul 2024 10:24:59 +0200 Message-Id: <20240726082501.4086489-2-manolis.tsamis@vrull.eu> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240726082501.4086489-1-manolis.tsamis@vrull.eu> References: <20240726082501.4086489-1-manolis.tsamis@vrull.eu> MIME-Version: 1.0 X-Spam-Status: No, score=-11.9 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org This 
is an extension of what was done in PR106590.

Currently, if a sequence generated in noce_convert_multiple_sets clobbers the
condition rtx (cc_cmp or rev_cc_cmp), then only seq1 is used afterwards
(sequences that emit the comparison itself). Since this applies only from the
next iteration, it assumes that the sequences generated (in particular seq2)
don't clobber the condition rtx itself before using it in the if_then_else,
which is only true in specific cases (currently only register/subregister moves
are allowed).

This patch changes this so it also tests if seq2 clobbers cc_cmp/rev_cc_cmp in
the current iteration. It also checks whether the resulting sequence clobbers
the condition attached to the jump. This makes it possible to include arithmetic
operations in noce_convert_multiple_sets.

It also makes the code that checks whether the condition is used outside of the
emitted if_then_else more robust.

gcc/ChangeLog:

	* ifcvt.cc (check_for_cc_cmp_clobbers): Use modified_in_p instead.
	(noce_convert_multiple_sets_1): Don't use seq2 if it clobbers cc_cmp.
	Punt if seq clobbers cond. Refactor the code that sets read_comparison.

Signed-off-by: Manolis Tsamis
---

(no changes since v1)

 gcc/ifcvt.cc | 128 +++++++++++++++++++++++++++++++--------------------
 1 file changed, 79 insertions(+), 49 deletions(-)

diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc
index 58ed42673e5..ff6c934c613 100644
--- a/gcc/ifcvt.cc
+++ b/gcc/ifcvt.cc
@@ -3592,20 +3592,6 @@ noce_convert_multiple_sets (struct noce_if_info *if_info) return true; } -/* Helper function for noce_convert_multiple_sets_1. If store to - DEST can affect P[0] or P[1], clear P[0]. Called via note_stores.
*/ - -static void -check_for_cc_cmp_clobbers (rtx dest, const_rtx, void *p0) -{ - rtx *p = (rtx *) p0; - if (p[0] == NULL_RTX) - return; - if (reg_overlap_mentioned_p (dest, p[0]) - || (p[1] && reg_overlap_mentioned_p (dest, p[1]))) - p[0] = NULL_RTX; -} - /* This goes through all relevant insns of IF_INFO->then_bb and tries to create conditional moves. In case a simple move sufficis the insn should be listed in NEED_NO_CMOV. The rewired-src cases should be @@ -3731,36 +3717,71 @@ noce_convert_multiple_sets_1 (struct noce_if_info *if_info, creating an additional compare for each. If successful, costing is easier and this sequence is usually preferred. */ if (cc_cmp) - seq2 = try_emit_cmove_seq (if_info, temp, cond, - new_val, old_val, need_cmov, - &cost2, &temp_dest2, cc_cmp, rev_cc_cmp); + { + seq2 = try_emit_cmove_seq (if_info, temp, cond, + new_val, old_val, need_cmov, + &cost2, &temp_dest2, cc_cmp, rev_cc_cmp); + + /* The if_then_else in SEQ2 may be affected when cc_cmp/rev_cc_cmp is + clobbered. We can't safely use the sequence in this case. */ + for (rtx_insn *iter = seq2; iter; iter = NEXT_INSN (iter)) + if (modified_in_p (cc_cmp, iter) + || (rev_cc_cmp && modified_in_p (rev_cc_cmp, iter))) + { + seq2 = NULL; + break; + } + } /* The backend might have created a sequence that uses the - condition. Check this. */ + condition as a value. Check this. */ + + /* We cannot handle anything more complex than a reg or constant. 
*/ + if (!REG_P (XEXP (cond, 0)) && !CONSTANT_P (XEXP (cond, 0))) + read_comparison = true; + + if (!REG_P (XEXP (cond, 1)) && !CONSTANT_P (XEXP (cond, 1))) + read_comparison = true; + rtx_insn *walk = seq2; - while (walk) + int if_then_else_count = 0; + while (walk && !read_comparison) { - rtx set = single_set (walk); + rtx exprs_to_check[2]; + unsigned int exprs_count = 0; - if (!set || !SET_SRC (set)) + rtx set = single_set (walk); + if (set && XEXP (set, 1) + && GET_CODE (XEXP (set, 1)) == IF_THEN_ELSE) { - walk = NEXT_INSN (walk); - continue; + /* We assume that this is the cmove created by the backend that + naturally uses the condition. */ + exprs_to_check[exprs_count++] = XEXP (XEXP (set, 1), 1); + exprs_to_check[exprs_count++] = XEXP (XEXP (set, 1), 2); + if_then_else_count++; } + else if (NONDEBUG_INSN_P (walk)) + exprs_to_check[exprs_count++] = PATTERN (walk); - rtx src = SET_SRC (set); + /* Bail if we get more than one if_then_else because the assumption + above may be incorrect. */ + if (if_then_else_count > 1) + { + read_comparison = true; + break; + } - if (XEXP (set, 1) && GET_CODE (XEXP (set, 1)) == IF_THEN_ELSE) - ; /* We assume that this is the cmove created by the backend that - naturally uses the condition. Therefore we ignore it. 
*/ - else + for (unsigned int i = 0; i < exprs_count; i++) { - if (reg_mentioned_p (XEXP (cond, 0), src) - || reg_mentioned_p (XEXP (cond, 1), src)) - { - read_comparison = true; - break; - } + subrtx_iterator::array_type array; + FOR_EACH_SUBRTX (iter, array, exprs_to_check[i], NONCONST) + if (*iter != NULL_RTX + && (reg_overlap_mentioned_p (XEXP (cond, 0), *iter) + || reg_overlap_mentioned_p (XEXP (cond, 1), *iter))) + { + read_comparison = true; + break; + } } walk = NEXT_INSN (walk); @@ -3788,22 +3809,31 @@ noce_convert_multiple_sets_1 (struct noce_if_info *if_info, return false; } - if (cc_cmp) + /* Although we use temporaries if there is register overlap of COND and + TARGET, it is possible that SEQ modifies COND anyway. For example, + COND may use the flags register and if INSN clobbers flags then + we may be unable to emit a valid sequence (e.g. in x86 that would + require saving and restoring the flags register). */ + for (rtx_insn *iter = seq; iter; iter = NEXT_INSN (iter)) + if (modified_in_p (cond, iter)) + { + end_sequence (); + return false; + } + + if (cc_cmp && seq == seq1) { - /* Check if SEQ can clobber registers mentioned in - cc_cmp and/or rev_cc_cmp. If yes, we need to use - only seq1 from that point on. */ - rtx cc_cmp_pair[2] = { cc_cmp, rev_cc_cmp }; - for (walk = seq; walk; walk = NEXT_INSN (walk)) - { - note_stores (walk, check_for_cc_cmp_clobbers, cc_cmp_pair); - if (cc_cmp_pair[0] == NULL_RTX) - { - cc_cmp = NULL_RTX; - rev_cc_cmp = NULL_RTX; - break; - } - } + /* Check if SEQ can clobber registers mentioned in cc_cmp/rev_cc_cmp. + If yes, we need to use only SEQ1 from that point on. + Only check when we use SEQ1 since we have already tested SEQ2. */ + for (rtx_insn *iter = seq; iter; iter = NEXT_INSN (iter)) + if (modified_in_p (cc_cmp, iter) + || (rev_cc_cmp && modified_in_p (rev_cc_cmp, iter))) + { + cc_cmp = NULL_RTX; + rev_cc_cmp = NULL_RTX; + break; + } } /* End the sub sequence and emit to the main sequence. 
*/

From patchwork Fri Jul 26 08:25:00 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Manolis Tsamis
X-Patchwork-Id: 1965206
From: Manolis Tsamis
To: gcc-patches@gcc.gnu.org
Cc: Richard Sandiford, Richard Biener, Philipp Tomsich, Jeff Law, Robin Dapp, Jiangning Liu, Manolis Tsamis
Subject: [PATCH v5 2/3] [RFC] ifcvt: Allow more operations in multiple set if conversion
Date: Fri, 26 Jul 2024 10:25:00 +0200
Message-Id: <20240726082501.4086489-3-manolis.tsamis@vrull.eu>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240726082501.4086489-1-manolis.tsamis@vrull.eu>
References: <20240726082501.4086489-1-manolis.tsamis@vrull.eu>

Currently the operations allowed for if conversion of a basic block with
multiple sets are few, namely REG, SUBREG and CONST_INT (as controlled by
bb_ok_for_noce_convert_multiple_sets).

This commit allows more operations (arithmetic, compare, etc.) to participate
in if conversion. The target's profitability hook and ifcvt's costing are
expected to reject sequences that are unprofitable.

This is especially useful for targets that provide a rich selection of
conditional instructions (like aarch64, which has cinc, csneg, csinv, ccmp, ...)
that are currently not used in basic blocks with more than a single set.

gcc/ChangeLog:

	* ifcvt.cc (try_emit_cmove_seq): Modify comments.
	(noce_convert_multiple_sets_1): Modify comments.
	(bb_ok_for_noce_convert_multiple_sets): Allow more operations.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/ifcvt_multiple_sets_arithm.c: New test.

Signed-off-by: Manolis Tsamis
---

Changes in v5:
 - Loop over SEQ and check modified_in_p for all instructions.
 - Fix x86-related bug when SEQ modifies COND.

 gcc/ifcvt.cc                                  | 34 +++-----
 .../aarch64/ifcvt_multiple_sets_arithm.c      | 79 +++++++++++++++++++
 2 files changed, 92 insertions(+), 21 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/ifcvt_multiple_sets_arithm.c

diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc
index ff6c934c613..2a5e5e5b608 100644
--- a/gcc/ifcvt.cc
+++ b/gcc/ifcvt.cc
@@ -3432,13 +3432,13 @@ try_emit_cmove_seq (struct noce_if_info *if_info, rtx temp, /* We have something like: if (x > y) - { i = a; j = b; k = c; } + { i = EXPR_A; j = EXPR_B; k = EXPR_C; } Make it: - tmp_i = (x > y) ? a : i; - tmp_j = (x > y) ? b : j; - tmp_k = (x > y) ? c : k; + tmp_i = (x > y) ? EXPR_A : i; + tmp_j = (x > y) ? EXPR_B : j; + tmp_k = (x > y) ?
EXPR_C : k; i = tmp_i; j = tmp_j; k = tmp_k; @@ -3857,11 +3857,10 @@ noce_convert_multiple_sets_1 (struct noce_if_info *if_info, -/* Return true iff basic block TEST_BB is comprised of only - (SET (REG) (REG)) insns suitable for conversion to a series - of conditional moves. Also check that we have more than one set - (other routines can handle a single set better than we would), and - fewer than PARAM_MAX_RTL_IF_CONVERSION_INSNS sets. While going +/* Return true iff basic block TEST_BB is suitable for conversion to a + series of conditional moves. Also check that we have more than one + set (other routines can handle a single set better than we would), + and fewer than PARAM_MAX_RTL_IF_CONVERSION_INSNS sets. While going through the insns store the sum of their potential costs in COST. */ static bool @@ -3887,20 +3886,13 @@ bb_ok_for_noce_convert_multiple_sets (basic_block test_bb, unsigned *cost) rtx dest = SET_DEST (set); rtx src = SET_SRC (set); - /* We can possibly relax this, but for now only handle REG to REG - (including subreg) moves. This avoids any issues that might come - from introducing loads/stores that might violate data-race-freedom - guarantees. */ - if (!REG_P (dest)) - return false; - - if (!((REG_P (src) || CONSTANT_P (src)) - || (GET_CODE (src) == SUBREG && REG_P (SUBREG_REG (src)) - && subreg_lowpart_p (src)))) + /* Do not handle anything involving memory loads/stores since it might + violate data-race-freedom guarantees. */ + if (!REG_P (dest) || contains_mem_rtx_p (src)) return false; - /* Destination must be appropriate for a conditional write. */ - if (!noce_operand_ok (dest)) + /* Destination and source must be appropriate. */ + if (!noce_operand_ok (dest) || !noce_operand_ok (src)) return false; /* We must be able to conditionally move in this mode. 
*/
diff --git a/gcc/testsuite/gcc.target/aarch64/ifcvt_multiple_sets_arithm.c b/gcc/testsuite/gcc.target/aarch64/ifcvt_multiple_sets_arithm.c
new file mode 100644
index 00000000000..ba7f948aba5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/ifcvt_multiple_sets_arithm.c
@@ -0,0 +1,79 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-rtl-ce1" } */
+
+void sink2(int, int);
+void sink3(int, int, int);
+
+void cond1(int cond, int x, int y)
+{
+  if (cond)
+    {
+      x = x << 4;
+      y = 1;
+    }
+
+  sink2(x, y);
+}
+
+void cond2(int cond, int x, int y)
+{
+  if (cond)
+    {
+      x++;
+      y++;
+    }
+
+  sink2(x, y);
+}
+
+void cond3(int cond, int x1, int x2, int x3)
+{
+  if (cond)
+    {
+      x1++;
+      x2++;
+      x3++;
+    }
+
+  sink3(x1, x2, x3);
+}
+
+void cond4(int cond, int x, int y)
+{
+  if (cond)
+    {
+      x += 2;
+      y += 3;
+    }
+
+  sink2(x, y);
+}
+
+void cond5(int cond, int x, int y, int r1, int r2)
+{
+  if (cond)
+    {
+      x = r1 + 2;
+      y = r2 - 34;
+    }
+
+  sink2(x, y);
+}
+
+void cond6(int cond, int x, int y)
+{
+  if (cond)
+    {
+      x = -x;
+      y = ~y;
+    }
+
+  sink2(x, y);
+}
+
+/* { dg-final { scan-assembler-times "cinc\t" 5 } } */
+/* { dg-final { scan-assembler-times "csneg\t" 1 } } */
+/* { dg-final { scan-assembler-times "csinv\t" 1 } } */
+/* { dg-final { scan-assembler "csel\t" } } */
+
+/* { dg-final { scan-rtl-dump-times "if-conversion succeeded through noce_convert_multiple_sets" 6 "ce1" } } */

From patchwork Fri Jul 26 08:25:01 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Manolis Tsamis
X-Patchwork-Id: 1965207
From: Manolis Tsamis
To: gcc-patches@gcc.gnu.org
Cc: Richard Sandiford, Richard Biener, Philipp Tomsich, Jeff Law, Robin Dapp, Jiangning Liu, Manolis Tsamis
Subject: [PATCH v5 3/3] [RFC] ifcvt: Handle multiple rewired regs and refactor noce_convert_multiple_sets
Date: Fri, 26 Jul 2024 10:25:01 +0200
Message-Id: <20240726082501.4086489-4-manolis.tsamis@vrull.eu>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240726082501.4086489-1-manolis.tsamis@vrull.eu>
References: <20240726082501.4086489-1-manolis.tsamis@vrull.eu>

The existing implementation of need_cmov_or_rewire and
noce_convert_multiple_sets_1 assumes that sets are either REG or SUBREG.
This commit enhances them so they can handle/rewire arbitrary set statements.

To do that a new helper struct noce_multiple_sets_info is introduced which is
used by noce_convert_multiple_sets and its helper functions.
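
As a rough source-level picture of the rewiring mentioned above, here is a
minimal sketch in C. It is not taken from this series or its testsuite: the
function names branchy and converted are hypothetical, and converted only
mimics by hand the shape if-conversion aims for when a later set reads a
register written by an earlier set in the same block. The actual
transformation happens on RTL and is subject to the target's costing.

```c
/* Hypothetical illustration, not part of the patch.
   Original form: the second set reads x, which the first set wrote.  */
int branchy (int cond, int x, int y)
{
  if (cond)
    {
      x = x + 1;
      y = x + y;	/* Reads the updated x.  */
    }
  return x * 31 + y;
}

/* Hand-written equivalent of the converted form.  Each set becomes a
   conditional select into a temporary; the use of x in the second set
   is "rewired" to the first set's temporary so that it still sees the
   updated value.  */
int converted (int cond, int x, int y)
{
  int tmp_x = cond ? x + 1 : x;
  int tmp_y = cond ? tmp_x + y : y;	/* x rewired to tmp_x.  */
  x = tmp_x;
  y = tmp_y;
  return x * 31 + y;
}
```

Both functions are meant to return the same value for any cond, x and y that
do not overflow; reading plain x instead of tmp_x in the second select would
give a different result whenever cond is nonzero.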
The new struct results in cleaner function signatures, improved efficiency (a
number of vecs and a hash set/map are replaced with a single vec of structs)
and simplicity.

gcc/ChangeLog:

	* ifcvt.cc (need_cmov_or_rewire): Rename to init_noce_multiple_sets_info.
	(init_noce_multiple_sets_info): Initialize noce_multiple_sets_info.
	(noce_convert_multiple_sets_1): Use noce_multiple_sets_info and handle
	rewiring of multiple registers.
	(noce_convert_multiple_sets): Update to use noce_multiple_sets_info.
	* ifcvt.h (struct noce_multiple_sets_info): Introduce new struct
	noce_multiple_sets_info to store info for noce_convert_multiple_sets.

gcc/testsuite/ChangeLog:

	* gcc.target/aarch64/ifcvt_multiple_sets_rewire.c: New test.

Signed-off-by: Manolis Tsamis
---

(no changes since v1)

 gcc/ifcvt.cc                                  | 256 ++++++++----------
 gcc/ifcvt.h                                   |  16 ++
 .../aarch64/ifcvt_multiple_sets_rewire.c      |  20 ++
 3 files changed, 148 insertions(+), 144 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/ifcvt_multiple_sets_rewire.c

diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc
index 2a5e5e5b608..3e25f30b67e 100644
--- a/gcc/ifcvt.cc
+++ b/gcc/ifcvt.cc
@@ -98,14 +98,10 @@ static bool dead_or_predicable (basic_block, basic_block,
                                basic_block, edge, bool);
 static void noce_emit_move_insn (rtx, rtx);
 static rtx_insn *block_has_only_trap (basic_block);
-static void need_cmov_or_rewire (basic_block, hash_set<rtx_insn *> *,
-                                 hash_map<rtx_insn *, int> *);
+static void init_noce_multiple_sets_info (basic_block,
+  auto_delete_vec<noce_multiple_sets_info> &);
 static bool noce_convert_multiple_sets_1 (struct noce_if_info *,
-                                          hash_set<rtx_insn *> *,
-                                          hash_map<rtx_insn *, int> *,
-                                          auto_vec<rtx> *,
-                                          auto_vec<rtx> *,
-                                          auto_vec<rtx_insn *> *, int *);
+  auto_delete_vec<noce_multiple_sets_info> &, int *);
 
 /* Count the number of non-jump active insns in BB.  */
 
@@ -3487,24 +3483,13 @@ noce_convert_multiple_sets (struct noce_if_info *if_info) rtx x = XEXP (cond, 0); rtx y = XEXP (cond, 1); - /* The true targets for a conditional move. */ - auto_vec<rtx> targets; - /* The temporaries introduced to allow us to not consider register - overlap.
*/ - auto_vec temporaries; - /* The insns we've emitted. */ - auto_vec unmodified_insns; - - hash_set need_no_cmov; - hash_map rewired_src; - - need_cmov_or_rewire (then_bb, &need_no_cmov, &rewired_src); + auto_delete_vec insn_info; + init_noce_multiple_sets_info (then_bb, insn_info); int last_needs_comparison = -1; bool ok = noce_convert_multiple_sets_1 - (if_info, &need_no_cmov, &rewired_src, &targets, &temporaries, - &unmodified_insns, &last_needs_comparison); + (if_info, insn_info, &last_needs_comparison); if (!ok) return false; @@ -3519,8 +3504,7 @@ noce_convert_multiple_sets (struct noce_if_info *if_info) end_sequence (); start_sequence (); ok = noce_convert_multiple_sets_1 - (if_info, &need_no_cmov, &rewired_src, &targets, &temporaries, - &unmodified_insns, &last_needs_comparison); + (if_info, insn_info, &last_needs_comparison); /* Actually we should not fail anymore if we reached here, but better still check. */ if (!ok) @@ -3529,12 +3513,12 @@ noce_convert_multiple_sets (struct noce_if_info *if_info) /* We must have seen some sort of insn to insert, otherwise we were given an empty BB to convert, and we can't handle that. */ - gcc_assert (!unmodified_insns.is_empty ()); + gcc_assert (!insn_info.is_empty ()); /* Now fixup the assignments. */ - for (unsigned i = 0; i < targets.length (); i++) - if (targets[i] != temporaries[i]) - noce_emit_move_insn (targets[i], temporaries[i]); + for (unsigned i = 0; i < insn_info.length (); i++) + if (insn_info[i]->target != insn_info[i]->temporary) + noce_emit_move_insn (insn_info[i]->target, insn_info[i]->temporary); /* Actually emit the sequence if it isn't too expensive. */ rtx_insn *seq = get_insns (); @@ -3549,10 +3533,10 @@ noce_convert_multiple_sets (struct noce_if_info *if_info) set_used_flags (insn); /* Mark all our temporaries and targets as used. 
*/ - for (unsigned i = 0; i < targets.length (); i++) + for (unsigned i = 0; i < insn_info.length (); i++) { - set_used_flags (temporaries[i]); - set_used_flags (targets[i]); + set_used_flags (insn_info[i]->temporary); + set_used_flags (insn_info[i]->target); } set_used_flags (cond); @@ -3571,7 +3555,7 @@ noce_convert_multiple_sets (struct noce_if_info *if_info) return false; emit_insn_before_setloc (seq, if_info->jump, - INSN_LOCATION (unmodified_insns.last ())); + INSN_LOCATION (insn_info.last ()->unmodified_insn)); /* Clean up THEN_BB and the edges in and out of it. */ remove_edge (find_edge (test_bb, join_bb)); @@ -3592,20 +3576,12 @@ noce_convert_multiple_sets (struct noce_if_info *if_info) return true; } -/* This goes through all relevant insns of IF_INFO->then_bb and tries to - create conditional moves. In case a simple move sufficis the insn - should be listed in NEED_NO_CMOV. The rewired-src cases should be - specified via REWIRED_SRC. TARGETS, TEMPORARIES and UNMODIFIED_INSNS - are specified and used in noce_convert_multiple_sets and should be passed - to this function.. */ +/* This goes through all relevant insns of IF_INFO->then_bb and tries to create + conditional moves. Information for the insns is kept in INSN_INFO. 
*/ static bool noce_convert_multiple_sets_1 (struct noce_if_info *if_info, - hash_set *need_no_cmov, - hash_map *rewired_src, - auto_vec *targets, - auto_vec *temporaries, - auto_vec *unmodified_insns, + auto_delete_vec &insn_info, int *last_needs_comparison) { basic_block then_bb = if_info->then_bb; @@ -3624,11 +3600,6 @@ noce_convert_multiple_sets_1 (struct noce_if_info *if_info, rtx_insn *insn; int count = 0; - - targets->truncate (0); - temporaries->truncate (0); - unmodified_insns->truncate (0); - bool second_try = *last_needs_comparison != -1; FOR_BB_INSNS (then_bb, insn) @@ -3637,6 +3608,8 @@ noce_convert_multiple_sets_1 (struct noce_if_info *if_info, if (!active_insn_p (insn)) continue; + noce_multiple_sets_info *info = insn_info[count]; + rtx set = single_set (insn); gcc_checking_assert (set); @@ -3644,9 +3617,12 @@ noce_convert_multiple_sets_1 (struct noce_if_info *if_info, rtx temp; rtx new_val = SET_SRC (set); - if (int *ii = rewired_src->get (insn)) - new_val = simplify_replace_rtx (new_val, (*targets)[*ii], - (*temporaries)[*ii]); + + int i, ii; + FOR_EACH_VEC_ELT (info->rewired_src, i, ii) + new_val = simplify_replace_rtx (new_val, insn_info[ii]->target, + insn_info[ii]->temporary); + rtx old_val = target; /* As we are transforming @@ -3687,7 +3663,7 @@ noce_convert_multiple_sets_1 (struct noce_if_info *if_info, /* We have identified swap-style idioms before. A normal set will need to be a cmov while the first instruction of a swap-style idiom can be a regular move. This helps with costing. */ - bool need_cmov = !need_no_cmov->contains (insn); + bool need_cmov = info->need_cmov; /* If we had a non-canonical conditional jump (i.e. one where the fallthrough is to the "else" case) we need to reverse @@ -3814,12 +3790,13 @@ noce_convert_multiple_sets_1 (struct noce_if_info *if_info, COND may use the flags register and if INSN clobbers flags then we may be unable to emit a valid sequence (e.g. 
in x86 that would require saving and restoring the flags register). */ - for (rtx_insn *iter = seq; iter; iter = NEXT_INSN (iter)) - if (modified_in_p (cond, iter)) - { - end_sequence (); - return false; - } + if (!second_try) + for (rtx_insn *iter = seq; iter; iter = NEXT_INSN (iter)) + if (modified_in_p (cond, iter)) + { + end_sequence (); + return false; + } if (cc_cmp && seq == seq1) { @@ -3841,9 +3818,10 @@ noce_convert_multiple_sets_1 (struct noce_if_info *if_info, /* Bookkeeping. */ count++; - targets->safe_push (target); - temporaries->safe_push (temp_dest); - unmodified_insns->safe_push (insn); + + info->target = target; + info->temporary = temp_dest; + info->unmodified_insn = insn; } /* Even if we did not actually need the comparison, we want to make sure @@ -3851,11 +3829,84 @@ noce_convert_multiple_sets_1 (struct noce_if_info *if_info, if (*last_needs_comparison == -1) *last_needs_comparison = 0; - return true; } +/* Find local swap-style idioms in BB and mark the first insn (1) + that is only a temporary as not needing a conditional move as + it is going to be dead afterwards anyway. + + (1) int tmp = a; + a = b; + b = tmp; + + ifcvt + --> + tmp = a; + a = cond ? b : a_old; + b = cond ? tmp : b_old; + + Additionally, store the index of insns like (2) when a subsequent + SET reads from their destination. + + (2) int c = a; + int d = c; + + ifcvt + --> + + c = cond ? a : c_old; + d = cond ? d : c; // Need to use c rather than c_old here. +*/ + +static void +init_noce_multiple_sets_info (basic_block bb, + auto_delete_vec &insn_info) +{ + rtx_insn *insn; + int count = 0; + auto_vec dests; + bitmap bb_live_out = df_get_live_out (bb); + + /* Iterate over all SETs, storing the destinations in DEST. + - If we encounter a previously changed register, + rewire the read to the original source. + - If we encounter a SET that writes to a destination + that is not live after this block then the register + does not need to be moved conditionally. 
*/ + FOR_BB_INSNS (bb, insn) + { + if (!active_insn_p (insn)) + continue; + + noce_multiple_sets_info *info = new noce_multiple_sets_info; + info->target = NULL_RTX; + info->temporary = NULL_RTX; + info->unmodified_insn = NULL; + insn_info.safe_push (info); + + rtx set = single_set (insn); + gcc_checking_assert (set); + + rtx src = SET_SRC (set); + rtx dest = SET_DEST (set); + + gcc_checking_assert (REG_P (dest)); + info->need_cmov = bitmap_bit_p (bb_live_out, REGNO (dest)); + + /* Check if the current SET's source is the same + as any previously seen destination. + This is quadratic but the number of insns in BB + is bounded by PARAM_MAX_RTL_IF_CONVERSION_INSNS. */ + for (int i = count - 1; i >= 0; --i) + if (reg_mentioned_p (dests[i], src)) + insn_info[count]->rewired_src.safe_push (i); + + dests.safe_push (dest); + count++; + } +} /* Return true iff basic block TEST_BB is suitable for conversion to a series of conditional moves. Also check that we have more than one @@ -4325,89 +4376,6 @@ check_cond_move_block (basic_block bb, return true; } -/* Find local swap-style idioms in BB and mark the first insn (1) - that is only a temporary as not needing a conditional move as - it is going to be dead afterwards anyway. - - (1) int tmp = a; - a = b; - b = tmp; - - ifcvt - --> - - tmp = a; - a = cond ? b : a_old; - b = cond ? tmp : b_old; - - Additionally, store the index of insns like (2) when a subsequent - SET reads from their destination. - - (2) int c = a; - int d = c; - - ifcvt - --> - - c = cond ? a : c_old; - d = cond ? d : c; // Need to use c rather than c_old here. -*/ - -static void -need_cmov_or_rewire (basic_block bb, - hash_set *need_no_cmov, - hash_map *rewired_src) -{ - rtx_insn *insn; - int count = 0; - auto_vec insns; - auto_vec dests; - - /* Iterate over all SETs, storing the destinations - in DEST. 
- - If we hit a SET that reads from a destination - that we have seen before and the corresponding register - is dead afterwards, the register does not need to be - moved conditionally. - - If we encounter a previously changed register, - rewire the read to the original source. */ - FOR_BB_INSNS (bb, insn) - { - rtx set, src, dest; - - if (!active_insn_p (insn)) - continue; - - set = single_set (insn); - if (set == NULL_RTX) - continue; - - src = SET_SRC (set); - if (SUBREG_P (src)) - src = SUBREG_REG (src); - dest = SET_DEST (set); - - /* Check if the current SET's source is the same - as any previously seen destination. - This is quadratic but the number of insns in BB - is bounded by PARAM_MAX_RTL_IF_CONVERSION_INSNS. */ - if (REG_P (src)) - for (int i = count - 1; i >= 0; --i) - if (reg_overlap_mentioned_p (src, dests[i])) - { - if (find_reg_note (insn, REG_DEAD, src) != NULL_RTX) - need_no_cmov->add (insns[i]); - else - rewired_src->put (insn, i); - } - - insns.safe_push (insn); - dests.safe_push (dest); - - count++; - } -} - /* Given a basic block BB suitable for conditional move conversion, a condition COND, and pointer maps THEN_VALS and ELSE_VALS containing the register values depending on COND, emit the insns in the block as diff --git a/gcc/ifcvt.h b/gcc/ifcvt.h index 37147f99129..204bcf6d18a 100644 --- a/gcc/ifcvt.h +++ b/gcc/ifcvt.h @@ -40,6 +40,22 @@ struct ce_if_block int pass; /* Pass number. */ }; +struct noce_multiple_sets_info +{ + /* A list of indices to instructions that we need to rewire into this + instruction when we replace them with temporary conditional moves. */ + auto_vec rewired_src; + /* The true targets for a conditional move. */ + rtx target; + /* The temporaries introduced to allow us to not consider register + overlap. */ + rtx temporary; + /* The insns we've emitted. */ + rtx_insn *unmodified_insn; + /* True if a simple move can be used instead of a conditional move. 
*/ + bool need_cmov; +}; + /* Used by noce_process_if_block to communicate with its subroutines. The subroutines know that A and B may be evaluated freely. They diff --git a/gcc/testsuite/gcc.target/aarch64/ifcvt_multiple_sets_rewire.c b/gcc/testsuite/gcc.target/aarch64/ifcvt_multiple_sets_rewire.c new file mode 100644 index 00000000000..448425fba03 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/ifcvt_multiple_sets_rewire.c @@ -0,0 +1,20 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-rtl-ce1" } */ + +void sink2(int, int); + +void cond1(int cond, int x, int y, int z) +{ + if (x) + { + x = y + z; + y = z + x; + } + + sink2(x, y); +} + +/* { dg-final { scan-assembler-times "csel\tw0, w0, w1" 1 } } */ +/* { dg-final { scan-assembler-times "csel\tw1, w3, w2" 1 } } */ + +/* { dg-final { scan-rtl-dump-times "if-conversion succeeded through noce_convert_multiple_sets" 1 "ce1" } } */
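
For readers who want to see the bookkeeping in isolation, here is a rough standalone model of what init_noce_multiple_sets_info computes. It is only a sketch and not GCC code: registers are plain ints, and toy_set, toy_info, reads_reg and init_toy_info are hypothetical names invented for this illustration. It assumes each insn is a single SET and that the live-out registers are given as a std::set.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical stand-in for an RTL SET: one destination register and
// the registers read by its source.  Registers are plain ints here.
struct toy_set
{
  int dest;
  std::vector<int> srcs;
};

// Toy counterpart of the analysis fields of noce_multiple_sets_info.
struct toy_info
{
  std::vector<int> rewired_src; // Indices of earlier sets this one reads.
  bool need_cmov;               // Destination is live after the block.
};

// Return true if the source of S reads register REG.
static bool
reads_reg (const toy_set &s, int reg)
{
  for (int r : s.srcs)
    if (r == reg)
      return true;
  return false;
}

// Build one toy_info per set, in block order.
static std::vector<toy_info>
init_toy_info (const std::vector<toy_set> &block,
               const std::set<int> &live_out)
{
  std::vector<toy_info> result;
  std::vector<int> dests;
  for (const toy_set &s : block)
    {
      toy_info info;
      // A destination that is dead after the block needs no cmov.
      info.need_cmov = live_out.count (s.dest) != 0;
      // Walk earlier destinations from newest to oldest and record
      // every one that this set's source mentions.
      for (int i = (int) dests.size () - 1; i >= 0; --i)
        if (reads_reg (s, dests[i]))
          info.rewired_src.push_back (i);
      result.push_back (info);
      dests.push_back (s.dest);
    }
  return result;
}
```

With a = 0, b = 1 and tmp = 3, the swap idiom `tmp = a; a = b; b = tmp` with only a and b live out gives need_cmov == false for the first set and rewired_src == {0} for the last one, which mirrors cases (1) and (2) in the function comment.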