From patchwork Tue Aug 4 12:17:36 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Roger Sayle
X-Patchwork-Id: 1340849
From: "Roger Sayle"
To: "'GCC Patches'"
Subject: [PATCH] middle-end: Recognize/canonicalize MULT_HIGHPART_EXPR and
 expand it.
Date: Tue, 4 Aug 2020 13:17:36 +0100
Message-ID: <000701d66a59$3e170d20$ba452760$@nextmovesoftware.com>

This middle-end patch teaches fold/match to recognize the idiom for a
highpart multiplication and represent it internally as a
MULT_HIGHPART_EXPR tree code.  At RTL expansion time, the compiler will
try using an appropriate instruction (sequence) provided by the backend,
but if that fails, this patch now provides a fallback by synthesizing a
suitable sequence using either a widening multiply or a multiplication
in a wider mode [matching the original tree].

The benefit of this internal canonicalization is that it allows GCC to
generate muldi3_highpart instructions even on targets that require a
libcall to perform TImode multiplications.
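For illustration, the idiom being recognized can be written in portable C
as below.  This is a minimal sketch and not part of the patch: the names
umulsi3_highpart_idiom, smulsi3_highpart_idiom, umulsi3_highpart_ref and
highpart_self_check are hypothetical, fixed-width <stdint.h> types stand in
for the int/long pairs used in the new test, and the signed variant assumes
an arithmetic right shift of negative values (implementation-defined in
ISO C, but what GCC provides).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The source-level idiom: widen both operands, multiply in the wide
   type, shift right by the narrow precision, then truncate.  */
static uint32_t umulsi3_highpart_idiom(uint32_t x, uint32_t y)
{
  return (uint32_t)(((uint64_t)x * (uint64_t)y) >> 32);
}

/* Signed form of the same idiom.  Assumes >> on a negative int64_t is
   an arithmetic shift (true for GCC).  */
static int32_t smulsi3_highpart_idiom(int32_t x, int32_t y)
{
  return (int32_t)(((int64_t)x * (int64_t)y) >> 32);
}

/* Reference result computed from 16-bit halves using only 32-bit
   arithmetic, so it does not depend on a 64-bit multiply.  */
static uint32_t umulsi3_highpart_ref(uint32_t x, uint32_t y)
{
  uint32_t xl = x & 0xffff, xh = x >> 16;
  uint32_t yl = y & 0xffff, yh = y >> 16;
  uint32_t ll = xl * yl;
  uint32_t lh = xl * yh;
  uint32_t hl = xh * yl;
  uint32_t hh = xh * yh;
  /* Carry out of bits 16..31 of the full product.  */
  uint32_t mid = (ll >> 16) + (lh & 0xffff) + (hl & 0xffff);
  return hh + (lh >> 16) + (hl >> 16) + (mid >> 16);
}

/* Compare the idiom against the reference on a few edge values.  */
static void highpart_self_check(void)
{
  static const uint32_t v[] = {
    0u, 1u, 0xffffu, 0x10000u, 0x12345678u, 0x80000000u, 0xffffffffu
  };
  const size_t n = sizeof v / sizeof v[0];
  for (size_t i = 0; i < n; i++)
    for (size_t j = 0; j < n; j++)
      assert(umulsi3_highpart_idiom(v[i], v[j])
             == umulsi3_highpart_ref(v[i], v[j]));

  assert(smulsi3_highpart_idiom(-1, 1) == -1);
  assert(smulsi3_highpart_idiom(INT32_MIN, INT32_MIN) == 0x40000000);
  assert(smulsi3_highpart_idiom(INT32_MAX, INT32_MAX) == 0x3fffffff);
}
```

Calling highpart_self_check() from a test driver exercises both forms.
The two *_idiom functions have the shape
(narrow)(((wide)x * (wide)y) >> precision) that the new match.pd pattern is
meant to canonicalize.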
Currently the RTL optimizers can recognize highpart multiplications in
combine, but this matching fails when the multiplication requires a
libcall.  Rather than attempt to do something via REG_EQUAL notes, a
clever solution is to make more use of the MULT_HIGHPART_EXPR tree code
in the tree optimizers.

This patch has been tested on x86_64-pc-linux-gnu with a "make bootstrap"
and "make -k check", and on nvptx-none with a "make" and "make -k check",
both with no new failures.  There's an additional target-specific test in
the nvptx patch to support "mul.hi.s64" and "mul.hi.u64" that I'm just
about to post, but this code is already well exercised during bootstrap
by libgcc.

Ok for mainline?

2020-08-04  Roger Sayle

gcc/ChangeLog
        * match.pd (((wide)x * (wide)y)>>C -> mult_highpart): New
        simplification/canonicalization to recognize MULT_HIGHPART_EXPR.
        * optabs.c (expand_mult_highpart_1): New function to expand
        MULT_HIGHPART_EXPR as a widening or a wide multiplication
        followed by a right shift (or a gen_highpart subreg).
        (expand_mult_highpart): Call the above function if the target
        doesn't provide a suitable optab.

gcc/testsuite/ChangeLog
        * gcc.dg/fold-mult-highpart-1.c: New test.

Thanks in advance,
Roger
---
Roger Sayle
NextMove Software
Cambridge, UK

diff --git a/gcc/testsuite/gcc.dg/fold-mult-highpart-1.c b/gcc/testsuite/gcc.dg/fold-mult-highpart-1.c
new file mode 100644
index 0000000..87daebd
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/fold-mult-highpart-1.c
@@ -0,0 +1,44 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -Wno-long-long -fdump-tree-optimized" } */
+
+/* Not all platforms support TImode integers.  */
+#if (defined(__LP64__) && !defined(__hppa__)) || defined(_WIN64)
+typedef unsigned int __attribute ((mode(TI))) uti_t;
+typedef int __attribute__ ((mode (TI))) ti_t;
+#else
+typedef unsigned long uti_t;
+typedef long ti_t;
+#endif
+
+short smulhi3_highpart(short x, short y)
+{
+  return ((int)x * (int)y) >> (8*sizeof(short));
+}
+
+int smulsi3_highpart(int x, int y)
+{
+  return ((long)x * (long)y) >> (8*sizeof(int));
+}
+
+long smuldi3_highpart(long x, long y)
+{
+  return ((ti_t)x * (ti_t)y) >> (8*sizeof(long));
+}
+
+unsigned short umulhi3_highpart(unsigned short x, unsigned short y)
+{
+  return ((unsigned int)x * (unsigned int)y) >> (8*sizeof(unsigned short));
+}
+
+unsigned int umulsi3_highpart(unsigned int x, unsigned int y)
+{
+  return ((unsigned long)x * (unsigned long)y) >> (8*sizeof(unsigned int));
+}
+
+unsigned long umuldi3_highpart(unsigned long x, unsigned long y)
+{
+  return ((uti_t)x * (uti_t)y) >> (8*sizeof(unsigned long));
+}
+
+/* { dg-final { scan-tree-dump-times " h\\* " 6 "optimized" } } */
+
diff --git a/gcc/match.pd b/gcc/match.pd
index a052c9e..15c33f2 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -6443,3 +6443,18 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
    to the number of trailing zeroes.  */
 (match (ctz_table_index @1 @2 @3)
   (rshift (mult (bit_and:c (negate @1) @1) INTEGER_CST@2) INTEGER_CST@3))
+
+/* Recognize MULT_HIGHPART_EXPR.  */
+(simplify
+ (convert (rshift (mult:s (convert@3 @0) (convert @1))
+                  (INTEGER_CST@2)))
+  (if (INTEGRAL_TYPE_P (type)
+       && INTEGRAL_TYPE_P (TREE_TYPE (@3))
+       && types_match (type, TREE_TYPE (@0))
+       && types_match (type, TREE_TYPE (@1))
+       && (TYPE_PRECISION (TREE_TYPE (@3))
+           >= 2 * TYPE_PRECISION (type))
+       && tree_fits_uhwi_p (@2)
+       && tree_to_uhwi (@2) == TYPE_PRECISION (type)
+       && TYPE_SIGN (TREE_TYPE (@3)) == TYPE_SIGN (type))
+   (mult_highpart @0 @1)))
diff --git a/gcc/optabs.c b/gcc/optabs.c
index 184827f..2416a69 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -5870,6 +5870,52 @@ expand_vec_cmp_expr (tree type, tree exp, rtx target)
   return ops[0].value;
 }
 
+/* Helper function of expand_mult_highpart.  Expand a highpart
+   multiplication using a widening or wider multiplication.  */
+
+static rtx
+expand_mult_highpart_1 (machine_mode mode, rtx op0, rtx op1, bool uns_p)
+{
+  scalar_int_mode narrow_mode;
+  rtx tem = NULL_RTX;
+  optab t;
+
+  if (!is_a <scalar_int_mode> (mode, &narrow_mode))
+    return NULL_RTX;
+
+  scalar_int_mode wide_mode = GET_MODE_WIDER_MODE (narrow_mode).require ();
+
+  /* Try a widening multiplication.  */
+  t = uns_p ? umul_widen_optab : smul_widen_optab;
+  if (convert_optab_handler (t, wide_mode, narrow_mode) != CODE_FOR_nothing)
+    tem = expand_binop (wide_mode, t, op0, op1, 0, uns_p, OPTAB_WIDEN);
+
+  /* If that fails, try a wider multiplication.  */
+  if (!tem)
+    {
+      rtx_insn *insns;
+      rtx wop0, wop1;
+      start_sequence();
+      wop0 = convert_modes (wide_mode, narrow_mode, op0, uns_p);
+      wop1 = convert_modes (wide_mode, narrow_mode, op1, uns_p);
+      tem = expand_binop (wide_mode, smul_optab, wop0, wop1, 0,
+                          uns_p, OPTAB_LIB_WIDEN);
+      insns = get_insns ();
+      end_sequence ();
+
+      if (!tem)
+        return NULL_RTX;
+
+      emit_insn (insns);
+    }
+
+  if (narrow_mode == word_mode)
+    return gen_highpart (narrow_mode, tem);
+  tem = expand_shift (RSHIFT_EXPR, wide_mode, tem,
+                      GET_MODE_BITSIZE (narrow_mode), 0, 1);
+  return convert_modes (narrow_mode, wide_mode, tem, 0);
+}
+
 /* Expand a highpart multiply.  */
 
 rtx
@@ -5887,7 +5933,8 @@ expand_mult_highpart (machine_mode mode, rtx op0, rtx op1,
   switch (method)
     {
     case 0:
-      return NULL_RTX;
+      /* We don't have an optab, try expanding this the hard way.  */
+      return expand_mult_highpart_1 (mode, op0, op1, uns_p);
     case 1:
       tab1 = uns_p ? umul_highpart_optab : smul_highpart_optab;
       return expand_binop (mode, tab1, op0, op1, target, uns_p,