From patchwork Tue Jun 29 14:50:40 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Julian Brown X-Patchwork-Id: 1498396 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=) Received: from sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 4GDnXH1DFFz9s5R for ; Wed, 30 Jun 2021 00:53:03 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id B77D53891026 for ; Tue, 29 Jun 2021 14:53:00 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from esa1.mentor.iphmx.com (esa1.mentor.iphmx.com [68.232.129.153]) by sourceware.org (Postfix) with ESMTPS id AAF6F388A82B for ; Tue, 29 Jun 2021 14:51:01 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.1 sourceware.org AAF6F388A82B Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=codesourcery.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=mentor.com IronPort-SDR: i0aBjeW0qNROQscYxQ5vGtOjoZqqUAaFwG1qgJqX/kSbdeshh8kCjK9XG6+J/TYZMOTU5ZgbED 7erQJkmvB6Rb7FD72Ft9ZGu+Rnhloc044j70w+zGpRvXo4wZOBB+FsUWXwfO16CnBOCgiBxXc0 62s4+17t3kS3os04tPQ/p1YYKLvdSgBeTEfCkWvh0Q1hWIXe8A1JXXAYWuLS5tAJ6m/dESyDBe wXWAr4EBuu7qdUiVL6TiTvt+H14a/1kHz9KkQ+9Rq2FYzwq+oG+NdM658Qs/LZDg799MYbtbpr qyw= X-IronPort-AV: E=Sophos;i="5.83,309,1616486400"; d="scan'208";a="65377582" Received: from orw-gwy-01-in.mentorg.com ([192.94.38.165]) by esa1.mentor.iphmx.com with ESMTP; 29 Jun 2021 06:51:00 -0800 IronPort-SDR: J0FeVDyZYlBzucXaU7RZeQu2A1YXNOu5fqlpLsh5Wc5GP6LfPAebJpveqVwO3AvbntFzBVv3xn /UQWLelUlu4TakKYTxJsld9gPyqFb9t8vK/cIK2aKV1h4YnLAqie5ZVyJzTHPWvKjE0q8h383w mND7tZXnmZiu2VnhgZL6W/ByrRxCXqT/LqQbkgHM7W4dGcv52yfT8G1gpqKVhPff89kXG1XNEv C83ZGtlNMx920hfA3rKS5W6qqbMu3jgKG2rg9I7zr2CsXQoUZLVTzu61NK52Gq5Je4qB59xLhN zvg= From: Julian Brown To: Subject: [PATCH 3/3] amdgcn: Add [us]mulsid3/muldi3 patterns Date: Tue, 29 Jun 2021 07:50:40 -0700 Message-ID: <33f23044bbd98b98156c5bc9c501b5bedea9be56.1624977771.git.julian@codesourcery.com> X-Mailer: git-send-email 2.29.2 In-Reply-To: References: MIME-Version: 1.0 X-Originating-IP: [137.202.0.90] X-ClientProxiedBy: SVR-IES-MBX-03.mgc.mentorg.com (139.181.222.3) To SVR-IES-MBX-04.mgc.mentorg.com (139.181.222.4) X-Spam-Status: No, score=-11.9 required=5.0 tests=BAYES_00, GIT_PATCH_0, HEADER_FROM_DIFFERENT_DOMAINS, KAM_DMARC_STATUS, SPF_HELO_PASS, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.2 X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Andrew Stubbs Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org Sender: "Gcc-patches" This patch improves 64-bit multiplication for AMD GCN: patterns for unsigned and signed 32x32->64 bit multiplication have been added, and also 64x64->64 bit multiplication is now open-coded rather than calling a library function (which may be a win for code size as well as speed: the function calling sequence isn't particularly concise for GCN). This version of the patch uses define_insn_and_split in order to keep multiply operations together during RTL optimisations up to register allocation: this appears to produce more compact code via inspection on small test cases than the previous approach using an expander. I will apply shortly. Julian 2021-06-29 Julian Brown gcc/ * config/gcn/gcn.md (mulsidi3, mulsidi3_reg, mulsidi3_imm, muldi3): Add patterns. --- gcc/config/gcn/gcn.md | 94 +++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 94 insertions(+) diff --git a/gcc/config/gcn/gcn.md b/gcc/config/gcn/gcn.md index d1d49981ebb..82f7a468bce 100644 --- a/gcc/config/gcn/gcn.md +++ b/gcc/config/gcn/gcn.md @@ -1457,6 +1457,100 @@ (set_attr "length" "4,8,8") (set_attr "gcn_version" "gcn5,gcn5,*")]) +(define_expand "mulsidi3" + [(set (match_operand:DI 0 "register_operand" "") + (mult:DI (any_extend:DI + (match_operand:SI 1 "register_operand" "")) + (any_extend:DI + (match_operand:SI 2 "nonmemory_operand" ""))))] + "" +{ + if (can_create_pseudo_p () + && !TARGET_GCN5 + && !gcn_inline_immediate_operand (operands[2], SImode)) + operands[2] = force_reg (SImode, operands[2]); + + if (REG_P (operands[2])) + emit_insn (gen_mulsidi3_reg (operands[0], operands[1], operands[2])); + else + emit_insn (gen_mulsidi3_imm (operands[0], operands[1], operands[2])); + + DONE; +}) + +(define_insn_and_split "mulsidi3_reg" + [(set (match_operand:DI 0 "register_operand" "=&Sg, &v") + (mult:DI (any_extend:DI + (match_operand:SI 1 "register_operand" "%Sg, v")) + (any_extend:DI + (match_operand:SI 2 "register_operand" "Sg,vSv"))))] + "" + "#" + "reload_completed" + [(const_int 0)] + { + rtx dstlo = gen_lowpart (SImode, operands[0]); + rtx dsthi = gen_highpart_mode (SImode, DImode, operands[0]); + emit_insn (gen_mulsi3 (dstlo, operands[1], operands[2])); + emit_insn (gen_mulsi3_highpart (dsthi, operands[1], operands[2])); + DONE; + } + [(set_attr "gcn_version" "gcn5,*")]) + +(define_insn_and_split "mulsidi3_imm" + [(set (match_operand:DI 0 "register_operand" "=&Sg,&Sg,&v") + (mult:DI (any_extend:DI + (match_operand:SI 1 "register_operand" "Sg, Sg, v")) + (match_operand:DI 2 "gcn_32bit_immediate_operand" + "A, B, A")))] + "TARGET_GCN5 || gcn_inline_immediate_operand (operands[2], SImode)" + "#" + "&& reload_completed" + [(const_int 0)] + { + rtx dstlo = gen_lowpart (SImode, operands[0]); + rtx dsthi = gen_highpart_mode (SImode, DImode, operands[0]); + emit_insn (gen_mulsi3 (dstlo, operands[1], operands[2])); + emit_insn (gen_mulsi3_highpart (dsthi, operands[1], operands[2])); + DONE; + } + [(set_attr "gcn_version" "gcn5,gcn5,*")]) + +(define_insn_and_split "muldi3" + [(set (match_operand:DI 0 "register_operand" "=&Sg,&Sg, &v,&v") + (mult:DI (match_operand:DI 1 "register_operand" "%Sg, Sg, v, v") + (match_operand:DI 2 "nonmemory_operand" "Sg, i,vSv, A"))) + (clobber (match_scratch:SI 3 "=&Sg,&Sg,&v,&v")) + (clobber (match_scratch:BI 4 "=cs, cs, X, X")) + (clobber (match_scratch:DI 5 "=X, X,cV,cV"))] + "" + "#" + "reload_completed" + [(const_int 0)] + { + rtx tmp = operands[3]; + rtx dsthi = gen_highpart_mode (SImode, DImode, operands[0]); + rtx op1lo = gcn_operand_part (DImode, operands[1], 0); + rtx op1hi = gcn_operand_part (DImode, operands[1], 1); + rtx op2lo = gcn_operand_part (DImode, operands[2], 0); + rtx op2hi = gcn_operand_part (DImode, operands[2], 1); + emit_insn (gen_umulsidi3 (operands[0], op1lo, op2lo)); + emit_insn (gen_mulsi3 (tmp, op1lo, op2hi)); + rtx add = gen_rtx_SET (dsthi, gen_rtx_PLUS (SImode, dsthi, tmp)); + rtx clob1 = gen_rtx_CLOBBER (VOIDmode, operands[4]); + rtx clob2 = gen_rtx_CLOBBER (VOIDmode, operands[5]); + add = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (3, add, clob1, clob2)); + emit_insn (add); + emit_insn (gen_mulsi3 (tmp, op1hi, op2lo)); + add = gen_rtx_SET (dsthi, gen_rtx_PLUS (SImode, dsthi, tmp)); + clob1 = gen_rtx_CLOBBER (VOIDmode, operands[4]); + clob2 = gen_rtx_CLOBBER (VOIDmode, operands[5]); + add = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (3, add, clob1, clob2)); + emit_insn (add); + DONE; + } + [(set_attr "gcn_version" "gcn5,gcn5,*,*")]) + (define_insn "mulhisi3" [(set (match_operand:SI 0 "register_operand" "=v") (mult:SI