From patchwork Tue Apr 4 09:57:57 2023
X-Patchwork-Submitter: Andrew Stubbs
X-Patchwork-Id: 1764772
Date: Tue, 4 Apr 2023 10:57:57 +0100
From: Andrew Stubbs
Subject: [committed] amdgcn: Add 64-bit vector not
To: "gcc-patches@gcc.gnu.org"

I've committed this patch to add a missing vector operator on amdgcn.

The architecture doesn't have a 64-bit not instruction so we didn't have
an insn for it, but the vectorizer didn't like that and caused the
v64df_pow function to use 2MB of stack frame. This is a problem when you
typically have over 3000 threads and only want to allocate 32k of stack
space each!

Andrew

amdgcn: Add 64-bit vector not

gcc/ChangeLog:

	* config/gcn/gcn-valu.md (one_cmpl2): New.

diff --git a/gcc/config/gcn/gcn-valu.md b/gcc/config/gcn/gcn-valu.md
index 44d107145db..c0b43fcfb64 100644
--- a/gcc/config/gcn/gcn-valu.md
+++ b/gcc/config/gcn/gcn-valu.md
@@ -2791,6 +2791,23 @@ (define_expand "neg2"
     DONE;
   })
 
+(define_insn_and_split "one_cmpl2"
+  [(set (match_operand:V_DI 0 "register_operand" "= v")
+        (not:V_DI
+          (match_operand:V_DI 1 "gcn_alu_operand" "vSvDB")))]
+  ""
+  "#"
+  "reload_completed"
+  [(set (match_dup 3) (not: (match_dup 5)))
+   (set (match_dup 4) (not: (match_dup 6)))]
+  {
+    operands[3] = gcn_operand_part (mode, operands[0], 0);
+    operands[4] = gcn_operand_part (mode, operands[0], 1);
+    operands[5] = gcn_operand_part (mode, operands[1], 0);
+    operands[6] = gcn_operand_part (mode, operands[1], 1);
+  }
+  [(set_attr "type" "mult")])
+
 ;; }}}
 ;; {{{ FP binops - special cases
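
For illustration only, here is a rough scalar model in C of what the split
does to each 64-bit lane once reload has completed: the 64-bit not is
replaced by two 32-bit nots, one on each part of the operand. The names
not64_via_halves and not_v_di are hypothetical and do not exist in GCC, and
treating part 0 as the low half and part 1 as the high half is an
assumption of this sketch rather than something the patch states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model, not GCC code: complement one 64-bit lane using only
   32-bit NOT operations, one per half.  Which half corresponds to part 0
   and which to part 1 is an assumption of this sketch.  */
static uint64_t
not64_via_halves (uint64_t x)
{
  uint32_t lo = (uint32_t) x;         /* assumed part 0: low 32 bits */
  uint32_t hi = (uint32_t) (x >> 32); /* assumed part 1: high 32 bits */
  uint64_t result;

  lo = ~lo;
  hi = ~hi;
  result = ((uint64_t) hi << 32) | lo;

  /* The two half-width operations must agree with a full-width NOT.  */
  assert (result == ~x);
  return result;
}

/* Apply the lane-wise complement to N lanes of SRC, writing to DST.  */
static void
not_v_di (uint64_t *dst, const uint64_t *src, size_t n)
{
  for (size_t i = 0; i < n; i++)
    dst[i] = not64_via_halves (src[i]);
}
```

Because a bitwise not never carries between bits, the two halves are
independent, which is why the operation can be split this way without
needing a 64-bit instruction.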