From patchwork Wed Sep 25 14:08:08 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Andre Vieira (lists)"
X-Patchwork-Id: 1989423
Message-ID: <58c24c28-0af8-4f0a-b884-a482c84e707e@arm.com>
Date: Wed, 25 Sep 2024 15:08:08 +0100
User-Agent: Mozilla Thunderbird
Content-Language: en-US
To: gcc-patches@gcc.gnu.org
Cc: ramanara@nvidia.com, richard.earnshaw@arm.com
From: "Andre Vieira (lists)"
Subject: [PATCH] arm: Fix missed CE optimization for armv8.1-m.main [PR 116444]
X-BeenThere: gcc-patches@gcc.gnu.org
Precedence: list
List-Id: Gcc-patches mailing list

Hi,

This patch restores optimizations for armv8.1-m.main targets that were missed
when the generation of csinc, csinv and
csneg was enabled by the patch series containing:

commit c2bb84be4a6e581bbf45891457ee632a07416982
Author: Sudi Das
Date:   Fri Sep 18 15:47:46 2020 +0100

    [PATCH 2/5][Arm] New pattern for CSINV instructions

The original patch series makes use of the "noce" machinery to transform RTL
into patterns that later match the Armv8.1-M Mainline conditional arithmetic
instructions, by making the target hook TARGET_HAVE_CONDITIONAL_EXECUTION
return FALSE for such targets before reload_completed.  However, the same
machinery was also transforming other RTL patterns, which later stopped the
"ce" pass that runs after reload_completed from optimizing conditional
execution opportunities.  This caused the regression observed in
PR target/116444.

This patch implements the target hook TARGET_NOCE_CONVERSION_PROFITABLE_P to
only allow "noce" to generate patterns that match CSINV, CSINC and CSNEG,
ensuring that the early "ce" passes do not block the later ones.

gcc/ChangeLog:

	* config/arm/arm-protos.h (arm_noce_conversion_profitable_p): New
	declaration.
	* config/arm/arm.cc (arm_is_v81m_cond_insn): New helper function used
	in ...
	(arm_noce_conversion_profitable_p): ... here.  New function to
	implement ...
	(TARGET_NOCE_CONVERSION_PROFITABLE_P): ... this target hook.  New
	define.

Regression tested on arm-none-eabi; thumb-icvt-2.c was also tested
specifically with -mcpu=cortex-m55.

OK for trunk and for a backport to gcc-14 (after a week and testing, of
course)?

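For illustration only (this is not part of the patch or of the testsuite), the
C functions below have the source shapes that correspond to the three
instructions.  The function names are hypothetical, and whether a particular
GCC build emits csinc, csinv or csneg for them depends on the target options
and on the passes described above.

```c
/* Hypothetical examples, not taken from the GCC testsuite.  Each function
   selects between B and a simple arithmetic transform of C.  */

/* cond ? b : c + 1 is the value a csinc computes.  */
int
sel_inc (int a, int b, int c)
{
  return a == 0 ? b : c + 1;
}

/* cond ? b : ~c is the value a csinv computes.  */
int
sel_inv (int a, int b, int c)
{
  return a == 0 ? b : ~c;
}

/* cond ? b : -c is the value a csneg computes.  */
int
sel_neg (int a, int b, int c)
{
  return a == 0 ? b : -c;
}
```

One way to check the effect of the patch would be to compile such functions
with -O2 -mcpu=cortex-m55 -S and inspect the generated assembly.
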
Kind Regards,
Andre Vieira

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 50cae2b513a262880e4f6ee577341d7a5d2c39c6..b694589cab4b45730b257baacfdbd9db00584cdc 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -210,6 +210,7 @@ extern bool arm_pad_reg_upward (machine_mode, tree, int);
 #endif
 extern int arm_apply_result_size (void);
 extern opt_machine_mode arm_get_mask_mode (machine_mode mode);
+extern bool arm_noce_conversion_profitable_p (rtx_insn *,struct noce_if_info *);
 
 #endif /* RTX_CODE */
 
diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
index de34e9867e67f5c4730e92a4bc6f347d83847e58..fe9b1ad5bf8560858d3996c0659d65c9887307d9 100644
--- a/gcc/config/arm/arm.cc
+++ b/gcc/config/arm/arm.cc
@@ -815,6 +815,9 @@ static const scoped_attribute_specs *const arm_attribute_table[] =
 #undef TARGET_MODES_TIEABLE_P
 #define TARGET_MODES_TIEABLE_P arm_modes_tieable_p
 
+#undef TARGET_NOCE_CONVERSION_PROFITABLE_P
+#define TARGET_NOCE_CONVERSION_PROFITABLE_P arm_noce_conversion_profitable_p
+
 #undef TARGET_CAN_CHANGE_MODE_CLASS
 #define TARGET_CAN_CHANGE_MODE_CLASS arm_can_change_mode_class
 
@@ -36074,6 +36077,90 @@ arm_get_mask_mode (machine_mode mode)
   return default_get_mask_mode (mode);
 }
 
+/* Helper function to determine whether SEQ represents a sequence of
+   instructions representing the Armv8.1-M Mainline conditional arithmetic
+   instructions: csinc, csneg and csinv.  The cinc instruction is generated
+   using a different mechanism.  */
+
+static bool
+arm_is_v81m_cond_insn (rtx_insn *seq)
+{
+  rtx_insn *curr_insn = seq;
+  rtx set;
+  /* The pattern may start with a simple set with register operands.  Skip
+     through any of those.  */
+  while (curr_insn)
+    {
+      set = single_set (curr_insn);
+      if (!set
+          || !REG_P (SET_DEST (set)))
+        return false;
+
+      if (!REG_P (SET_SRC (set)))
+        break;
+      curr_insn = NEXT_INSN (curr_insn);
+    }
+
+  if (!set)
+    return false;
+
+  /* The next instruction should be one of:
+     NEG: for csneg,
+     PLUS: for csinc,
+     NOT: for csinv.  */
+  if (GET_CODE (SET_SRC (set)) != NEG
+      && GET_CODE (SET_SRC (set)) != PLUS
+      && GET_CODE (SET_SRC (set)) != NOT)
+    return false;
+
+  curr_insn = NEXT_INSN (curr_insn);
+  if (!curr_insn)
+    return false;
+
+  /* The next instruction should be a COMPARE.  */
+  set = single_set (curr_insn);
+  if (!set
+      || !REG_P (SET_DEST (set))
+      || GET_CODE (SET_SRC (set)) != COMPARE)
+    return false;
+
+  curr_insn = NEXT_INSN (curr_insn);
+  if (!curr_insn)
+    return false;
+
+  /* And the last instruction should be an IF_THEN_ELSE.  */
+  set = single_set (curr_insn);
+  if (!set
+      || !REG_P (SET_DEST (set))
+      || GET_CODE (SET_SRC (set)) != IF_THEN_ELSE)
+    return false;
+
+  return !NEXT_INSN (curr_insn);
+}
+
+/* For Armv8.1-M Mainline we have both conditional execution through IT blocks,
+   as well as conditional arithmetic instructions controlled by
+   TARGET_COND_ARITH.  To generate the latter we rely on a special part of the
+   "ce" pass that generates code for targets that don't support conditional
+   execution of general instructions known as "noce".  These transformations
+   happen before 'reload_completed'.  However, "noce" also triggers for some
+   unwanted patterns [PR 116444] that prevent "ce" optimisations after reload.
+   To make sure we can get both we use the TARGET_NOCE_CONVERSION_PROFITABLE_P
+   hook to only allow "noce" to generate the patterns that are profitable.  */
+
+bool
+arm_noce_conversion_profitable_p (rtx_insn *seq, struct noce_if_info *if_info)
+{
+  if (!TARGET_COND_ARITH
+      || reload_completed)
+    return true;
+
+  if (arm_is_v81m_cond_insn (seq))
+    return true;
+
+  return false;
+}
+
 /* Output assembly to read the thread pointer from the appropriate TPIDR
    register into DEST.  If PRED_P also emit the %? that can be used to
    output the predication code.  */
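
The sequence check performed by arm_is_v81m_cond_insn can be modelled outside
GCC roughly as follows.  This is a simplified sketch, not GCC code: toy_code,
toy_insn and toy_is_cond_arith_seq are hypothetical names, and the model
assumes that every instruction is a single set whose destination is a
register, so only the top-level code of the source expression is inspected.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the RTL codes that the helper looks at.  */
enum toy_code
{
  TOY_REG, TOY_NEG, TOY_PLUS, TOY_NOT, TOY_COMPARE, TOY_IF_THEN_ELSE,
  TOY_OTHER
};

/* One toy instruction: a register set from an expression whose top-level
   code is SRC, linked to the next instruction of the sequence.  */
struct toy_insn
{
  enum toy_code src;
  const struct toy_insn *next;
};

/* Return true if the sequence is: zero or more register moves, then one of
   NEG, PLUS or NOT, then a COMPARE, then an IF_THEN_ELSE, and nothing else.  */
bool
toy_is_cond_arith_seq (const struct toy_insn *insn)
{
  /* Skip any leading register-to-register moves.  */
  while (insn && insn->src == TOY_REG)
    insn = insn->next;
  if (!insn)
    return false;

  /* The arithmetic step: NEG for csneg, PLUS for csinc, NOT for csinv.  */
  if (insn->src != TOY_NEG && insn->src != TOY_PLUS && insn->src != TOY_NOT)
    return false;

  /* Then the comparison.  */
  insn = insn->next;
  if (!insn || insn->src != TOY_COMPARE)
    return false;

  /* Then the conditional select, which must end the sequence.  */
  insn = insn->next;
  if (!insn || insn->src != TOY_IF_THEN_ELSE)
    return false;

  return insn->next == NULL;
}
```

In this model a sequence with any trailing instruction, or with the COMPARE
missing, is rejected, which mirrors the strict shape that the hook accepts
before reload.
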