From patchwork Wed Oct 7 11:57:08 2020
From: Andrew Stubbs
To: "gcc-patches@gcc.gnu.org"
Subject: [committed] amdgcn: Use scalar instructions for addptrdi3
Message-ID: <838c26fa-bf21-c7d0-e236-f5e55a994c81@codesourcery.com>
Date: Wed, 7 Oct 2020 12:57:08 +0100

This patch adds an extra alternative to the existing addptrdi3
pattern.  It permits safe 64-bit addition in scalar registers as well as
vector registers.  This is especially useful because the result of addptr
is typically fed to instructions that expect the base address to be in a
scalar register.  The pattern was previously vector-only simply because
vector add instructions can specify a custom register to clobber with the
carry, whereas scalar adds always write SCC.  Hopefully this will help
prevent unnecessary register moves in address calculations.

Andrew

amdgcn: Use scalar instructions for addptrdi3

Allow addptr to use SGPRs as well as VGPRs for pointers.  This ought to
prevent some unnecessary copying back and forth.

gcc/ChangeLog:

	* config/gcn/gcn.md (unspec): Add UNSPEC_ADDPTR.
	(addptrdi3): Add SGPR alternative.

diff --git a/gcc/config/gcn/gcn.md b/gcc/config/gcn/gcn.md
index 0e73fea93cf..763e77008ad 100644
--- a/gcc/config/gcn/gcn.md
+++ b/gcc/config/gcn/gcn.md
@@ -67,6 +67,7 @@ (define_c_enum "unspecv" [
     UNSPECV_ICACHE_INV])
 
 (define_c_enum "unspec" [
+  UNSPEC_ADDPTR
   UNSPEC_VECTOR
   UNSPEC_BPERMUTE
   UNSPEC_SGPRBASE
@@ -1219,29 +1220,47 @@ (define_insn "addcsi3_scalar_zero"
 
 ; "addptr" is the same as "add" except that it must not write to VCC or SCC
 ; as a side-effect.  Unfortunately GCN does not have a suitable instruction
-; for this, so we use a custom VOP3 add with CC_SAVE_REG as a temp.
-; Note that it is not safe to save/clobber/restore SCC because doing so will
-; break data-flow analysis, so this must use vector registers.
+; for this, so we use CC_SAVE_REG as a temp.
+; Note that it is not safe to save/clobber/restore as separate insns because
+; doing so will break data-flow analysis, so this must use multiple
+; instructions in one insn.
 ;
 ; The "v0" should be just "v", but somehow the "0" helps LRA not loop forever
 ; on testcase pr54713-2.c with -O0.  It's only an optimization hint anyway.
+;
+; The SGPR alternative is preferred as it is typically used with mov_sgprbase.
 
 (define_insn "addptrdi3"
-  [(set (match_operand:DI 0 "register_operand"		 "= v")
-	(plus:DI (match_operand:DI 1 "register_operand"  " v0")
-		 (match_operand:DI 2 "nonmemory_operand" "vDA")))]
+  [(set (match_operand:DI 0 "register_operand"		 "= v, Sg")
+	(unspec:DI [
+	    (plus:DI (match_operand:DI 1 "register_operand"  "^v0,Sg0")
+		     (match_operand:DI 2 "nonmemory_operand" "vDA,SgDB"))]
+	  UNSPEC_ADDPTR))]
   ""
   {
-    rtx new_operands[4] = { operands[0], operands[1], operands[2],
-			    gen_rtx_REG (DImode, CC_SAVE_REG) };
+    if (which_alternative == 0)
+      {
+	rtx new_operands[4] = { operands[0], operands[1], operands[2],
+				gen_rtx_REG (DImode, CC_SAVE_REG) };
 
-    output_asm_insn ("v_add%^_u32 %L0, %3, %L2, %L1", new_operands);
-    output_asm_insn ("v_addc%^_u32 %H0, %3, %H2, %H1, %3", new_operands);
+	output_asm_insn ("v_add%^_u32\t%L0, %3, %L2, %L1", new_operands);
+	output_asm_insn ("v_addc%^_u32\t%H0, %3, %H2, %H1, %3", new_operands);
+      }
+    else
+      {
+	rtx new_operands[4] = { operands[0], operands[1], operands[2],
+				gen_rtx_REG (BImode, CC_SAVE_REG) };
+
+	output_asm_insn ("s_mov_b32\t%3, scc", new_operands);
+	output_asm_insn ("s_add_u32\t%L0, %L1, %L2", new_operands);
+	output_asm_insn ("s_addc_u32\t%H0, %H1, %H2", new_operands);
+	output_asm_insn ("s_cmpk_lg_u32\t%3, 0", new_operands);
+      }
 
     return "";
   }
-  [(set_attr "type" "vmult")
-   (set_attr "length" "16")])
+  [(set_attr "type" "vmult,mult")
+   (set_attr "length" "16,24")])
 
 ;; }}}
 ;; {{{ ALU special cases: Minus
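
For reference, the new SGPR alternative expands to a four-instruction
sequence along these lines.  This is a hand-written sketch, not compiler
output: the register numbers are hypothetical (s22 stands in for whatever
CC_SAVE_REG resolves to, and the destination is tied to operand 1 by the
"Sg0" constraint):

    s_mov_b32     s22, scc    ; save SCC before the add clobbers it
    s_add_u32     s4, s4, s6  ; add low halves; carry-out lands in SCC
    s_addc_u32    s5, s5, s7  ; add high halves plus carry-in from SCC
    s_cmpk_lg_u32 s22, 0      ; restore SCC: sets SCC = (s22 != 0)

The restore works because SCC only ever holds 0 or 1, so comparing the
saved copy against zero reproduces the original value exactly; and because
all four instructions are emitted from a single insn, data-flow analysis
never observes SCC in a clobbered state.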
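
The pre-existing VGPR alternative needs no save/restore at all, because the
VOP3 encoding of the add can direct its carry to an arbitrary SGPR pair
instead of VCC.  Again with hypothetical registers (and noting that %^ may
expand to a "_co" suffix on some ISA versions), it emits roughly:

    v_add_u32     v0, s[22:23], v2, v0            ; carry-out to s[22:23]
    v_addc_u32    v1, s[22:23], v3, v1, s[22:23]  ; carry in/out via s[22:23]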