From patchwork Sun Jul 14 13:08:33 2024
X-Patchwork-Submitter: Roger Sayle
X-Patchwork-Id: 1960341
List-Id: Gcc-patches mailing list
From: "Roger Sayle"
To: "'Richard Biener'"
References: <007301dad24f$40fe7c00$c2fb7400$@nextmovesoftware.com>
Subject: [match.pd PATCH] PR tree-optimization/114661: Generalize MULT_EXPR recognition (take #2)
Date: Sun, 14 Jul 2024 14:08:33 +0100
Message-ID: <011101dad5ee$efb573f0$cf205bd0$@nextmovesoftware.com>

Hi Richard,
Many thanks for the review and recommendation to use nop_convert?.
This revised patch implements that suggestion, which required a little
experimentation/tweaking as ranger/EVRP records the ranges on the useless
type conversions rather than the multiplications.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32}
with no new failures.  Ok for mainline?

2024-07-14  Roger Sayle
	    Richard Biener

gcc/ChangeLog
	PR tree-optimization/114661
	* match.pd ((X*C1)|(X*C2) to X*(C1+C2)): Allow optional useless
	type conversions around multiplications, such as those inserted
	by this transformation.

gcc/testsuite/ChangeLog
	PR tree-optimization/114661
	* gcc.dg/pr114661.c: New test case.

Thanks again,
Roger
---
> -----Original Message-----
> From: Richard Biener
> Sent: 10 July 2024 12:34
> To: Roger Sayle
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [match.pd PATCH] PR tree-optimization/114661: Generalize
> MULT_EXPR recognition.
>
> On Wed, Jul 10, 2024 at 12:28 AM Roger Sayle
> wrote:
> >
> > This patch resolves PR tree-optimization/114661, by generalizing the
> > set of expressions that we canonicalize to multiplication.  This
> > extends the optimization(s) contributed (by me) back in July 2021.
> > https://gcc.gnu.org/pipermail/gcc-patches/2021-July/575999.html
> >
> > The existing transformation folds (X*C1)^(X<<C2) into X*C3 when
> > allowed.  A subtlety is that for non-wrapping integer types, we
> > actually fold this into (int)((unsigned)X*C3) so that we don't
> > introduce an undefined overflow that wasn't in the original.
> > Unfortunately, this transformation confuses itself, as the type-safe
> > multiplication isn't recognized when further combining bit operations.
> > Fixed here by adding transforms to turn (int)((unsigned)X*C1)^(X<<C2)
> > into (int)((unsigned)X*C3) so that match.pd and EVRP can continue to
> > construct multiplications.
> >
> > For the example given in the PR:
> >
> > unsigned mul(unsigned char c) {
> >     if (c > 3) __builtin_unreachable();
> >     return c << 18 | c << 15 |
> >            c << 12 | c << 9 |
> >            c << 6 | c << 3 | c;
> > }
> >
> > GCC on x86_64 with -O2 previously generated:
> >
> > mul:    movzbl  %dil, %edi
> >         leal    (%rdi,%rdi,8), %edx
> >         leal    0(,%rdx,8), %eax
> >         movl    %edx, %ecx
> >         sall    $15, %edx
> >         orl     %edi, %eax
> >         sall    $9, %ecx
> >         orl     %ecx, %eax
> >         orl     %edx, %eax
> >         ret
> >
> > with this patch we now generate:
> >
> > mul:    movzbl  %dil, %eax
> >         imull   $299593, %eax, %eax
> >         ret
> >
> > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> > and make -k check, both with and without --target_board=unix{-m32}
> > with no new failures.  Ok for mainline?
>
> I'm looking at the difference between the existing
>
>  (simplify
>   (op:c (mult:s@0 @1 INTEGER_CST@2)
>         (lshift:s@3 @1 INTEGER_CST@4))
>   (if (INTEGRAL_TYPE_P (type) && TYPE_OVERFLOW_WRAPS (type)
>        && tree_int_cst_sgn (@4) > 0
>        && (tree_nonzero_bits (@0) & tree_nonzero_bits (@3)) == 0)
>    (with { wide_int wone = wi::one (TYPE_PRECISION (type));
>            wide_int c = wi::add (wi::to_wide (@2),
>                                  wi::lshift (wone, wi::to_wide (@4))); }
>     (mult @1 { wide_int_to_tree (type, c); }))))
>
> and
>
> +  (simplify
> +   (op:c (convert:s@0 (mult:s@1 (convert @2) INTEGER_CST@3))
> +         (lshift:s@4 @2 INTEGER_CST@5))
> +   (if (INTEGRAL_TYPE_P (type)
> +        && INTEGRAL_TYPE_P (TREE_TYPE (@1))
> +        && TREE_TYPE (@2) == type
> +        && TYPE_UNSIGNED (TREE_TYPE (@1))
> +        && TYPE_PRECISION (type) == TYPE_PRECISION (TREE_TYPE (@1))
> +        && tree_int_cst_sgn (@5) > 0
> +        && (tree_nonzero_bits (@0) & tree_nonzero_bits (@4)) == 0)
> +    (with { tree t = TREE_TYPE (@1);
> +            wide_int wone = wi::one (TYPE_PRECISION (t));
> +            wide_int c = wi::add (wi::to_wide (@3),
> +                                  wi::lshift (wone, wi::to_wide (@5))); }
> +     (convert (mult:t (convert:t @2) { wide_int_to_tree (t, c); })))))
>
> and wonder whether wrapping of the multiplication is required for correctness,
> specifically the former seems to allow signed types with -fwrapv while the latter
> won't.  It also looks the patterns could be merged doing
>
>  (simplify
>   (op:c (nop_convert:s? (mult:s@0 (nop_convert? @1) INTEGER_CST@2)
>         (lshift:s@3 @1 INTEGER_CST@4))
>
> and by using nop_convert instead of convert simplify the condition?
>
> Richard.
>
> >
> >
> > 2024-07-09  Roger Sayle
> >
> > gcc/ChangeLog
> >         PR tree-optimization/114661
> >         * match.pd ((X*C1)|(X*C2) to X*(C1+C2)): Additionally recognize
> >         multiplications surrounded by casts to an unsigned type and back
> >         such as those generated by these transformations.
> >
> > gcc/testsuite/ChangeLog
> >         PR tree-optimization/114661
> >         * gcc.dg/pr114661.c: New test case.
> >
> >
> > Thanks in advance,
> > Roger
> > --
> >

diff --git a/gcc/match.pd b/gcc/match.pd
index 4edfa2a..a66b00a 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -4156,30 +4156,39 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
    Likewise, handle (X< 0
-       && (tree_nonzero_bits (@0) & tree_nonzero_bits (@3)) == 0)
-   (with { wide_int wone = wi::one (TYPE_PRECISION (type));
+       && (tree_nonzero_bits (@5) & tree_nonzero_bits (@3)) == 0)
+   (with { tree t = type;
+           if (!TYPE_OVERFLOW_WRAPS (t))
+             t = unsigned_type_for (t);
+           wide_int wone = wi::one (TYPE_PRECISION (type));
            wide_int c = wi::add (wi::to_wide (@2),
                                  wi::lshift (wone, wi::to_wide (@4))); }
-    (mult @1 { wide_int_to_tree (type, c); }))))
+    (convert (mult:t (convert:t @1) { wide_int_to_tree (t, c); })))))
  (simplify
-  (op:c (mult:s@0 @1 INTEGER_CST@2)
+  (op:c (nop_convert?:s@3 (mult:s@0 (nop_convert? @1) INTEGER_CST@2))
         @1)
-  (if (INTEGRAL_TYPE_P (type) && TYPE_OVERFLOW_WRAPS (type)
-       && (tree_nonzero_bits (@0) & tree_nonzero_bits (@1)) == 0)
-   (mult @1
-         { wide_int_to_tree (type,
-                             wi::add (wi::to_wide (@2), 1)); })))
+  (if (INTEGRAL_TYPE_P (type)
+       && (tree_nonzero_bits (@3) & tree_nonzero_bits (@1)) == 0)
+   (with { tree t = type;
+           if (!TYPE_OVERFLOW_WRAPS (t))
+             t = unsigned_type_for (t);
+           wide_int c = wi::add (wi::to_wide (@2), 1); }
+    (convert (mult:t (convert:t @1) { wide_int_to_tree (t, c); })))))
  (simplify
   (op (lshift:s@0 @1 INTEGER_CST@2)
       (lshift:s@3 @1 INTEGER_CST@4))
diff --git a/gcc/testsuite/gcc.dg/pr114661.c b/gcc/testsuite/gcc.dg/pr114661.c
new file mode 100644
index 0000000..e6b5c69
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr114661.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-evrp" } */
+
+unsigned mul(unsigned char c) {
+    if (c > 3) __builtin_unreachable();
+    return c << 18 | c << 15 |
+           c << 12 | c << 9 |
+           c << 6 | c << 3 | c;
+}
+/* { dg-final { scan-tree-dump-times " \\* 299593" 1 "evrp" } } */
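
For readers who want to check the arithmetic behind the expected " * 299593" in the test, here is a small standalone sketch. It is not part of the patch, and the helper names shift_or, folded and first_mismatch are made up for illustration. It assumes only what the test states: the constant is the sum of the shift weights 2^18 + 2^15 + 2^12 + 2^9 + 2^6 + 2^3 + 1, and the bitwise OR equals the sum only while the shifted copies of c share no set bits, which is what the tree_nonzero_bits conditions in the patterns guard.

```c
/* Illustration only: helper names are hypothetical, not from GCC. */

/* The shift/or chain from gcc.dg/pr114661.c, without the range hint. */
static unsigned shift_or(unsigned char c)
{
    unsigned x = c;
    return x << 18 | x << 15 | x << 12 | x << 9 | x << 6 | x << 3 | x;
}

/* The folded form: a single multiplication by the summed shift weights.
   Done in unsigned arithmetic, so overflow would wrap rather than be
   undefined (it cannot overflow for an 8-bit input anyway). */
static unsigned folded(unsigned char c)
{
    return (unsigned)c * 299593u;
}

/* Smallest 8-bit input for which the two forms disagree, or -1 if none.
   A disagreement means two shifted copies of c overlap in some bit. */
static int first_mismatch(void)
{
    for (int c = 0; c <= 255; c++)
        if (shift_or((unsigned char)c) != folded((unsigned char)c))
            return c;
    return -1;
}
```

The two forms agree on the whole range the test allows (c <= 3). The first input where they differ is 9, where c and c << 3 both have bit 3 set, so without the range information from __builtin_unreachable() the fold would not be valid for every unsigned char.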