From patchwork Fri Jul 26 08:24:59 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Manolis Tsamis X-Patchwork-Id: 1965209 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=vrull.eu header.i=@vrull.eu header.a=rsa-sha256 header.s=google header.b=KwLboLEm; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4WVgr33jFGz1ybY for ; Fri, 26 Jul 2024 18:28:19 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id C79413865C32 for ; Fri, 26 Jul 2024 08:28:17 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail-lf1-x12d.google.com (mail-lf1-x12d.google.com [IPv6:2a00:1450:4864:20::12d]) by sourceware.org (Postfix) with ESMTPS id 6DA8B3858C41 for ; Fri, 26 Jul 2024 08:25:16 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 6DA8B3858C41 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=vrull.eu Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=vrull.eu ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 6DA8B3858C41 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2a00:1450:4864:20::12d ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1721982319; cv=none; b=EmioGp2BdRi8qOrdoZkthyPV1e2CMGm5VpjKTWXhQguMMho18P/MsW4lY9C+isXztogd+eVHB1uEQAfT9TUKvexXA2Y0s0j34ROe4PvnGtsUOLFFbHBbs81vydXxw6tMRNDk0GbzfWwwZhU6xBi7/6OIciTdloGfuiS1JJSXQ20= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1721982319; c=relaxed/simple; bh=Llg//GpA1RY5f8GQcZdJ03ge9JgUznxzfmWyOO65Bxg=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=OPYqJN0+waNSLfMtmxIs7fsXWOm8FSkVBEmyNQxKcfQCZih59O6fXspRfN865pT5SG0NFFYFP9acFXmd3gGuTGRk0blHbfhAvSBLwmXIA1fLqXN1eNZHpXNjX06htZe9kLypH79iTWy20ehzEFeLiKlUkvpiYEGxJl86AUh5evA= ARC-Authentication-Results: i=1; server2.sourceware.org Received: by mail-lf1-x12d.google.com with SMTP id 2adb3069b0e04-52f042c15e3so882331e87.0 for ; Fri, 26 Jul 2024 01:25:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=vrull.eu; s=google; t=1721982315; x=1722587115; darn=gcc.gnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=x9JlXwtWhwWNPSk05eY5jqN5WkIbOo15RVyPnu3Fty4=; b=KwLboLEmD4Lzw7fz6ztbZ61wTTQDQTxIL4cAuKWxg2JT8YVbOOqHyDdS/PgbXbXiye nxA9DedlbpnjVKy4nOp55NFyFxTMUDDA4B9NYGXvRkbZwTPqqNfi3tSyf3RdLFqxXT8k HslJ892WcBMksNHpqxnHNl0s9tym7W8OUjkv7yMHtBhWk9eYn8DEZ4nAwob7C4ydqbg0 1nKCkPhgZCJKR2hUctKfBeEjSbMoPcklB929ULvm+EAeLA+fhYxCMFnn5Rm3vZA0z7Ka KY56rEsR7+kfSH1LrR8IfOZLcxf/3h+iBuKVnKp55LWwePU/Sf6DgvBC/zrRF9T5Oxw/ QN2A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1721982315; x=1722587115; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=x9JlXwtWhwWNPSk05eY5jqN5WkIbOo15RVyPnu3Fty4=; b=N7yyV3FPvuc253zTbUGK2aTZ4dG7QuRzzLN/K6opWczYph5h+Xu8NtyORRNwg0BHtu yzrH+Hk+hEOJDB518FQC8WC54rd9SDe1X82GZyWrrZejLXotQ/w74/RPtBSwN8RR38J4 6OniM0PYbpWCX9ikl3C5R7LPg0yHBkkaahOkEVUiYAPw/58MYwQjWUQRaCYunU5HN7Ui Mb8MOa3Qfu47aP3zsHMnecds6U7yY1gg9Zbd3tfYNc942W0UHk3W8me8iFkeFbMzWFlo eWM/UY/yLtPa3WAGs60TT2thutvK0gsM3suc85MN/qVthmgwkhHdNg6arXeppHu4zySu Vu4Q== X-Gm-Message-State: AOJu0YxtSWc59fNkBj789MkX2uaH1tOBDi05yLdvw4Q1bqRCf9FekKcK g3c0Sm4TrkY4fyKuD7PEil12+M76nRXfosI/S2UlGGTbMo/Kocy8JlpoJDjbh+OAO0pnlprmLcH pvxw= X-Google-Smtp-Source: AGHT+IH1Gzi7u5bnta52BS5BybLuPrHD7sRE0tEmGyV1v7pGos8lt9xJ1P1oW8U6xfn/5PXccW62uQ== X-Received: by 2002:a05:6512:3192:b0:52f:3c:a79 with SMTP id 2adb3069b0e04-52fcf8afd05mr2307659e87.7.1721982314233; Fri, 26 Jul 2024 01:25:14 -0700 (PDT) Received: from helsinki-03.engr ([2a01:4f9:6b:2a47::2]) by smtp.gmail.com with ESMTPSA id 2adb3069b0e04-52fd5c354d2sm432604e87.290.2024.07.26.01.25.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 26 Jul 2024 01:25:14 -0700 (PDT) From: Manolis Tsamis To: gcc-patches@gcc.gnu.org Cc: Richard Sandiford , Richard Biener , Philipp Tomsich , Jeff Law , Robin Dapp , Jiangning Liu , Manolis Tsamis Subject: [PATCH v5 1/3] [RFC] ifcvt: handle sequences that clobber flags in noce_convert_multiple_sets Date: Fri, 26 Jul 2024 10:24:59 +0200 Message-Id: <20240726082501.4086489-2-manolis.tsamis@vrull.eu> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240726082501.4086489-1-manolis.tsamis@vrull.eu> References: <20240726082501.4086489-1-manolis.tsamis@vrull.eu> MIME-Version: 1.0 X-Spam-Status: No, score=-11.9 required=5.0 tests=BAYES_00, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org This is an extension of what was done in PR106590. Currently if a sequence generated in noce_convert_multiple_sets clobbers the condition rtx (cc_cmp or rev_cc_cmp) then only seq1 is used afterwards (sequences that emit the comparison itself). Since this applies only from the next iteration it assumes that the sequences generated (in particular seq2) doesn't clobber the condition rtx itself before using it in the if_then_else, which is only true in specific cases (currently only register/subregister moves are allowed). This patch changes this so it also tests if seq2 clobbers cc_cmp/rev_cc_cmp in the current iteration. It also checks whether the resulting sequence clobbers the condition attached to the jump. This makes it possible to include arithmetic operations in noce_convert_multiple_sets. It also makes the code that checks whether the condition is used outside of the if_then_else emitted more robust. gcc/ChangeLog: * ifcvt.cc (check_for_cc_cmp_clobbers): Use modified_in_p instead. (noce_convert_multiple_sets_1): Don't use seq2 if it clobbers cc_cmp. Punt if seq clobbers cond. Refactor the code that sets read_comparison. Signed-off-by: Manolis Tsamis --- (no changes since v1) gcc/ifcvt.cc | 128 +++++++++++++++++++++++++++++++-------------------- 1 file changed, 79 insertions(+), 49 deletions(-) diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc index 58ed42673e5..ff6c934c613 100644 --- a/gcc/ifcvt.cc +++ b/gcc/ifcvt.cc @@ -3592,20 +3592,6 @@ noce_convert_multiple_sets (struct noce_if_info *if_info) return true; } -/* Helper function for noce_convert_multiple_sets_1. If store to - DEST can affect P[0] or P[1], clear P[0]. Called via note_stores. */ - -static void -check_for_cc_cmp_clobbers (rtx dest, const_rtx, void *p0) -{ - rtx *p = (rtx *) p0; - if (p[0] == NULL_RTX) - return; - if (reg_overlap_mentioned_p (dest, p[0]) - || (p[1] && reg_overlap_mentioned_p (dest, p[1]))) - p[0] = NULL_RTX; -} - /* This goes through all relevant insns of IF_INFO->then_bb and tries to create conditional moves. In case a simple move sufficis the insn should be listed in NEED_NO_CMOV. The rewired-src cases should be @@ -3731,36 +3717,71 @@ noce_convert_multiple_sets_1 (struct noce_if_info *if_info, creating an additional compare for each. If successful, costing is easier and this sequence is usually preferred. */ if (cc_cmp) - seq2 = try_emit_cmove_seq (if_info, temp, cond, - new_val, old_val, need_cmov, - &cost2, &temp_dest2, cc_cmp, rev_cc_cmp); + { + seq2 = try_emit_cmove_seq (if_info, temp, cond, + new_val, old_val, need_cmov, + &cost2, &temp_dest2, cc_cmp, rev_cc_cmp); + + /* The if_then_else in SEQ2 may be affected when cc_cmp/rev_cc_cmp is + clobbered. We can't safely use the sequence in this case. */ + for (rtx_insn *iter = seq2; iter; iter = NEXT_INSN (iter)) + if (modified_in_p (cc_cmp, iter) + || (rev_cc_cmp && modified_in_p (rev_cc_cmp, iter))) + { + seq2 = NULL; + break; + } + } /* The backend might have created a sequence that uses the - condition. Check this. */ + condition as a value. Check this. */ + + /* We cannot handle anything more complex than a reg or constant. */ + if (!REG_P (XEXP (cond, 0)) && !CONSTANT_P (XEXP (cond, 0))) + read_comparison = true; + + if (!REG_P (XEXP (cond, 1)) && !CONSTANT_P (XEXP (cond, 1))) + read_comparison = true; + rtx_insn *walk = seq2; - while (walk) + int if_then_else_count = 0; + while (walk && !read_comparison) { - rtx set = single_set (walk); + rtx exprs_to_check[2]; + unsigned int exprs_count = 0; - if (!set || !SET_SRC (set)) + rtx set = single_set (walk); + if (set && XEXP (set, 1) + && GET_CODE (XEXP (set, 1)) == IF_THEN_ELSE) { - walk = NEXT_INSN (walk); - continue; + /* We assume that this is the cmove created by the backend that + naturally uses the condition. */ + exprs_to_check[exprs_count++] = XEXP (XEXP (set, 1), 1); + exprs_to_check[exprs_count++] = XEXP (XEXP (set, 1), 2); + if_then_else_count++; } + else if (NONDEBUG_INSN_P (walk)) + exprs_to_check[exprs_count++] = PATTERN (walk); - rtx src = SET_SRC (set); + /* Bail if we get more than one if_then_else because the assumption + above may be incorrect. */ + if (if_then_else_count > 1) + { + read_comparison = true; + break; + } - if (XEXP (set, 1) && GET_CODE (XEXP (set, 1)) == IF_THEN_ELSE) - ; /* We assume that this is the cmove created by the backend that - naturally uses the condition. Therefore we ignore it. */ - else + for (unsigned int i = 0; i < exprs_count; i++) { - if (reg_mentioned_p (XEXP (cond, 0), src) - || reg_mentioned_p (XEXP (cond, 1), src)) - { - read_comparison = true; - break; - } + subrtx_iterator::array_type array; + FOR_EACH_SUBRTX (iter, array, exprs_to_check[i], NONCONST) + if (*iter != NULL_RTX + && (reg_overlap_mentioned_p (XEXP (cond, 0), *iter) + || reg_overlap_mentioned_p (XEXP (cond, 1), *iter))) + { + read_comparison = true; + break; + } } walk = NEXT_INSN (walk); @@ -3788,22 +3809,31 @@ noce_convert_multiple_sets_1 (struct noce_if_info *if_info, return false; } - if (cc_cmp) + /* Although we use temporaries if there is register overlap of COND and + TARGET, it is possible that SEQ modifies COND anyway. For example, + COND may use the flags register and if INSN clobbers flags then + we may be unable to emit a valid sequence (e.g. in x86 that would + require saving and restoring the flags register). */ + for (rtx_insn *iter = seq; iter; iter = NEXT_INSN (iter)) + if (modified_in_p (cond, iter)) + { + end_sequence (); + return false; + } + + if (cc_cmp && seq == seq1) { - /* Check if SEQ can clobber registers mentioned in - cc_cmp and/or rev_cc_cmp. If yes, we need to use - only seq1 from that point on. */ - rtx cc_cmp_pair[2] = { cc_cmp, rev_cc_cmp }; - for (walk = seq; walk; walk = NEXT_INSN (walk)) - { - note_stores (walk, check_for_cc_cmp_clobbers, cc_cmp_pair); - if (cc_cmp_pair[0] == NULL_RTX) - { - cc_cmp = NULL_RTX; - rev_cc_cmp = NULL_RTX; - break; - } - } + /* Check if SEQ can clobber registers mentioned in cc_cmp/rev_cc_cmp. + If yes, we need to use only SEQ1 from that point on. + Only check when we use SEQ1 since we have already tested SEQ2. */ + for (rtx_insn *iter = seq; iter; iter = NEXT_INSN (iter)) + if (modified_in_p (cc_cmp, iter) + || (rev_cc_cmp && modified_in_p (rev_cc_cmp, iter))) + { + cc_cmp = NULL_RTX; + rev_cc_cmp = NULL_RTX; + break; + } } /* End the sub sequence and emit to the main sequence. */