From patchwork Wed Aug 3 12:20:17 2022
X-Patchwork-Submitter: Roger Sayle
X-Patchwork-Id: 1663304
From: "Roger Sayle" <roger@nextmovesoftware.com>
To: gcc-patches@gcc.gnu.org
Subject: [PATCH] middle-end: Allow backend to expand/split double word
 compare to 0/-1.
Date: Wed, 3 Aug 2022 13:20:17 +0100
Message-ID: <02ac01d8a733$653480e0$2f9d82a0$@nextmovesoftware.com>

This patch to the middle-end's RTL expansion reorders the code in
emit_store_flag_1 so that the backend has more control over how best
to expand/split double word equality/inequality comparisons against
zero or minus one.  With the current implementation, the middle-end
always decides to lower this idiom during RTL expansion using SUBREGs
and word mode instructions, without ever consulting the backend's
machine description.
Hence on x86_64, a TImode comparison against zero is always expanded
as:

  (parallel [
      (set (reg:DI 91)
           (ior:DI (subreg:DI (reg:TI 88) 0)
                   (subreg:DI (reg:TI 88) 8)))
      (clobber (reg:CC 17 flags))])
  (set (reg:CCZ 17 flags)
       (compare:CCZ (reg:DI 91)
                    (const_int 0 [0])))

This patch, which makes no changes to the code itself, simply reorders
the clauses in emit_store_flag_1 so that the middle-end first attempts
expansion using the target's doubleword mode cstore optab/expander,
and only if this fails, falls back to lowering to word mode operations.
On x86_64, this allows the expander to produce:

  (set (reg:CCZ 17 flags)
       (compare:CCZ (reg:TI 88)
                    (const_int 0 [0])))

which is a candidate for scalar-to-vector transformations (and
combine simplifications etc.).  On targets that don't define a cstore
pattern for doubleword integer modes, there should be no change in
behaviour.  For those that do, the current behaviour can be restored
(if desired) by restricting the expander/insn to not apply when the
comparison is EQ or NE, and operand[2] is either const0_rtx or
constm1_rtx.

This change just keeps RTL expansion more consistent (in philosophy).
For other doubleword comparisons, such as with operators LT and GT,
or with constants other than zero or -1, the wishes of the backend
are respected, and only if the optab expansion fails are the default
fall-back implementations using narrower integer mode operations
(and conditional jumps) used.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32},
with no new failures.  I'm happy to help tweak any backends that notice
a change in their generated code.  Ok for mainline?


2022-08-03  Roger Sayle  <roger@nextmovesoftware.com>

gcc/ChangeLog
        * expmed.cc (emit_store_flag_1): Move code to expand double word
        equality and inequality against zero or -1, using word operations,
        to after trying to use the backend's cstore<mode>4 optab/expander.

Thanks in advance,
Roger

diff --git a/gcc/expmed.cc b/gcc/expmed.cc
index 9b01b5a..8d7418b 100644
--- a/gcc/expmed.cc
+++ b/gcc/expmed.cc
@@ -5662,63 +5662,9 @@ emit_store_flag_1 (rtx target, enum rtx_code code, rtx op0, rtx op1,
       break;
     }
 
-  /* If we are comparing a double-word integer with zero or -1, we can
-     convert the comparison into one involving a single word.  */
-  scalar_int_mode int_mode;
-  if (is_int_mode (mode, &int_mode)
-      && GET_MODE_BITSIZE (int_mode) == BITS_PER_WORD * 2
-      && (!MEM_P (op0) || ! MEM_VOLATILE_P (op0)))
-    {
-      rtx tem;
-      if ((code == EQ || code == NE)
-          && (op1 == const0_rtx || op1 == constm1_rtx))
-        {
-          rtx op00, op01;
-
-          /* Do a logical OR or AND of the two words and compare the
-             result.  */
-          op00 = simplify_gen_subreg (word_mode, op0, int_mode, 0);
-          op01 = simplify_gen_subreg (word_mode, op0, int_mode, UNITS_PER_WORD);
-          tem = expand_binop (word_mode,
-                              op1 == const0_rtx ? ior_optab : and_optab,
-                              op00, op01, NULL_RTX, unsignedp,
-                              OPTAB_DIRECT);
-
-          if (tem != 0)
-            tem = emit_store_flag (NULL_RTX, code, tem, op1, word_mode,
-                                   unsignedp, normalizep);
-        }
-      else if ((code == LT || code == GE) && op1 == const0_rtx)
-        {
-          rtx op0h;
-
-          /* If testing the sign bit, can just test on high word.  */
-          op0h = simplify_gen_subreg (word_mode, op0, int_mode,
-                                      subreg_highpart_offset (word_mode,
-                                                              int_mode));
-          tem = emit_store_flag (NULL_RTX, code, op0h, op1, word_mode,
-                                 unsignedp, normalizep);
-        }
-      else
-        tem = NULL_RTX;
-
-      if (tem)
-        {
-          if (target_mode == VOIDmode || GET_MODE (tem) == target_mode)
-            return tem;
-          if (!target)
-            target = gen_reg_rtx (target_mode);
-
-          convert_move (target, tem,
-                        !val_signbit_known_set_p (word_mode,
-                                                  (normalizep ? normalizep
-                                                   : STORE_FLAG_VALUE)));
-          return target;
-        }
-    }
-
   /* If this is A < 0 or A >= 0, we can do this by taking the ones
      complement of A (for GE) and shifting the sign bit to the low bit.  */
+  scalar_int_mode int_mode;
   if (op1 == const0_rtx && (code == LT || code == GE)
       && is_int_mode (mode, &int_mode)
       && (normalizep || STORE_FLAG_VALUE == 1
@@ -5764,6 +5710,7 @@ emit_store_flag_1 (rtx target, enum rtx_code code, rtx op0, rtx op1,
       return op0;
     }
 
+  /* Next try expanding this via the backend's cstore<mode>4.  */
   mclass = GET_MODE_CLASS (mode);
   FOR_EACH_MODE_FROM (compare_mode, mode)
     {
@@ -5788,6 +5735,60 @@ emit_store_flag_1 (rtx target, enum rtx_code code, rtx op0, rtx op1,
         }
     }
 
+  /* If we are comparing a double-word integer with zero or -1, we can
+     convert the comparison into one involving a single word.  */
+  if (is_int_mode (mode, &int_mode)
+      && GET_MODE_BITSIZE (int_mode) == BITS_PER_WORD * 2
+      && (!MEM_P (op0) || ! MEM_VOLATILE_P (op0)))
+    {
+      rtx tem;
+      if ((code == EQ || code == NE)
+          && (op1 == const0_rtx || op1 == constm1_rtx))
+        {
+          rtx op00, op01;
+
+          /* Do a logical OR or AND of the two words and compare the
+             result.  */
+          op00 = simplify_gen_subreg (word_mode, op0, int_mode, 0);
+          op01 = simplify_gen_subreg (word_mode, op0, int_mode, UNITS_PER_WORD);
+          tem = expand_binop (word_mode,
+                              op1 == const0_rtx ? ior_optab : and_optab,
+                              op00, op01, NULL_RTX, unsignedp,
+                              OPTAB_DIRECT);
+
+          if (tem != 0)
+            tem = emit_store_flag (NULL_RTX, code, tem, op1, word_mode,
+                                   unsignedp, normalizep);
+        }
+      else if ((code == LT || code == GE) && op1 == const0_rtx)
+        {
+          rtx op0h;
+
+          /* If testing the sign bit, can just test on high word.  */
+          op0h = simplify_gen_subreg (word_mode, op0, int_mode,
+                                      subreg_highpart_offset (word_mode,
+                                                              int_mode));
+          tem = emit_store_flag (NULL_RTX, code, op0h, op1, word_mode,
+                                 unsignedp, normalizep);
+        }
+      else
+        tem = NULL_RTX;
+
+      if (tem)
+        {
+          if (target_mode == VOIDmode || GET_MODE (tem) == target_mode)
+            return tem;
+          if (!target)
+            target = gen_reg_rtx (target_mode);
+
+          convert_move (target, tem,
+                        !val_signbit_known_set_p (word_mode,
+                                                  (normalizep ? normalizep
+                                                   : STORE_FLAG_VALUE)));
+          return target;
+        }
+    }
+
   return 0;
 }
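
For readers less familiar with the word-mode fallback that this patch
moves, the identities it relies on can be sketched in plain C.  This is
only an illustration, not GCC code: the struct dword type and the
dword_* helper names are hypothetical, and a 64-bit word_mode is
assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a double-word value on a target with
   64-bit words: the two halves the SUBREG lowering operates on.  */
struct dword
{
  uint64_t lo;  /* low word  */
  uint64_t hi;  /* high word */
};

/* x == 0 holds exactly when the OR of both words is zero
   (the ior_optab case in the fallback).  */
static bool
dword_eq_zero (struct dword x)
{
  return (x.lo | x.hi) == 0;
}

/* x == -1 holds exactly when the AND of both words is all ones
   (the and_optab case in the fallback).  */
static bool
dword_eq_minus_one (struct dword x)
{
  return (x.lo & x.hi) == UINT64_MAX;
}

/* x < 0 only depends on the sign bit, which lives in the high word.  */
static bool
dword_lt_zero (struct dword x)
{
  return (x.hi >> 63) != 0;
}
```

The NE and GE cases are the logical negations of the above.  After this
patch the same lowering is still available, but only once the target's
doubleword cstore expander has declined the comparison.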