From patchwork Thu Apr 27 16:38:30 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Stubbs
X-Patchwork-Id: 1774559
Message-ID: <32c7f0c6-1a92-5c8a-0607-5aaa1929216a@codesourcery.com>
Date: Thu, 27 Apr 2023 17:38:30 +0100
To: "gcc-patches@gcc.gnu.org"
From: Andrew Stubbs
Subject: [committed] amdgcn: Fix addsub bug

I've committed this patch to fix a couple of bugs introduced in the
recent CMul patch.

First, the fmsubadd insn was accidentally all adds and no subtracts.

Second, there were input dependencies on the undefined output register,
which caused the compiler to reserve unnecessary slots in the stack
frame.

Both issues are now fixed.

This patch is already committed to OG12. I'll backport it to GCC 13
shortly.

Andrew

amdgcn: Fix addsub bug

The vec_fmsubadd instruction actually had add twice, by mistake.

Also improve code-gen for all the complex patterns by using properly
undefined values.  Mostly this just prevents the compiler reserving space
in the stack frame.

gcc/ChangeLog:

	* config/gcn/gcn-valu.md (cmul3): Use gcn_gen_undef.
	(cml4): Likewise.
	(vec_addsub3): Likewise.
	(cadd3): Likewise.
	(vec_fmaddsub4): Likewise.
	(vec_fmsubadd4): Likewise, and use sub for the odd lanes.

diff --git a/gcc/config/gcn/gcn-valu.md b/gcc/config/gcn/gcn-valu.md
index 44c48468dd6..7290cdc2fd0 100644
--- a/gcc/config/gcn/gcn-valu.md
+++ b/gcc/config/gcn/gcn-valu.md
@@ -2323,8 +2323,9 @@ (define_expand "cmul3"
     rtx even = gen_rtx_REG (DImode, EXEC_REG);
     emit_move_insn (even, get_exec (0x5555555555555555UL));
     rtx dest = operands[0];
-    emit_insn (gen_3_exec (dest, t1, t1_perm, dest, even));
-    // a*c-b*d 0
+    emit_insn (gen_3_exec (dest, t1, t1_perm,
+                           gcn_gen_undef (mode),
+                           even)); // a*c-b*d 0
 
     rtx t2_perm = gen_reg_rtx (mode);
     emit_insn (gen_dpp_swap_pairs (t2_perm, t2)); // b*c a*d
@@ -2368,7 +2369,8 @@ (define_expand "cml4"
     rtx even = gen_rtx_REG (DImode, EXEC_REG);
     emit_move_insn (even, get_exec (0x5555555555555555UL));
     rtx dest = operands[0];
-    emit_insn (gen_sub3_exec (dest, t1, t2_perm, dest, even));
+    emit_insn (gen_sub3_exec (dest, t1, t2_perm,
+                              gcn_gen_undef (mode), even));
 
     rtx odd = gen_rtx_REG (DImode, EXEC_REG);
     emit_move_insn (odd, get_exec (0xaaaaaaaaaaaaaaaaUL));
@@ -2392,7 +2394,8 @@ (define_expand "vec_addsub3"
     rtx dest = operands[0];
     rtx x = operands[1];
     rtx y = operands[2];
-    emit_insn (gen_sub3_exec (dest, x, y, dest, even));
+    emit_insn (gen_sub3_exec (dest, x, y, gcn_gen_undef (mode),
+                              even));
     rtx odd = gen_rtx_REG (DImode, EXEC_REG);
     emit_move_insn (odd, get_exec (0xaaaaaaaaaaaaaaaaUL));
     emit_insn (gen_add3_exec (dest, x, y, dest, odd));
@@ -2419,7 +2422,9 @@ (define_expand "cadd3"
 
     rtx even = gen_rtx_REG (DImode, EXEC_REG);
     emit_move_insn (even, get_exec (0x5555555555555555UL));
-    emit_insn (gen_3_exec (dest, x, y, dest, even));
+    emit_insn (gen_3_exec (dest, x, y,
+                           gcn_gen_undef (mode),
+                           even));
     rtx odd = gen_rtx_REG (DImode, EXEC_REG);
     emit_move_insn (odd, get_exec (0xaaaaaaaaaaaaaaaaUL));
     emit_insn (gen_3_exec (dest, x, y, dest, odd));
@@ -2439,7 +2444,8 @@ (define_expand "vec_fmaddsub4"
     rtx even = gen_rtx_REG (DImode, EXEC_REG);
     emit_move_insn (even, get_exec (0x5555555555555555UL));
     rtx dest = operands[0];
-    emit_insn (gen_sub3_exec (dest, t1, operands[3], dest, even));
+    emit_insn (gen_sub3_exec (dest, t1, operands[3],
+                              gcn_gen_undef (mode), even));
     rtx odd = gen_rtx_REG (DImode, EXEC_REG);
     emit_move_insn (odd, get_exec (0xaaaaaaaaaaaaaaaaUL));
     emit_insn (gen_add3_exec (dest, t1, operands[3], dest, odd));
@@ -2459,10 +2465,11 @@ (define_expand "vec_fmsubadd4"
     rtx even = gen_rtx_REG (DImode, EXEC_REG);
     emit_move_insn (even, get_exec (0x5555555555555555UL));
     rtx dest = operands[0];
-    emit_insn (gen_add3_exec (dest, t1, operands[3], dest, even));
+    emit_insn (gen_add3_exec (dest, t1, operands[3],
+                              gcn_gen_undef (mode), even));
     rtx odd = gen_rtx_REG (DImode, EXEC_REG);
     emit_move_insn (odd, get_exec (0xaaaaaaaaaaaaaaaaUL));
-    emit_insn (gen_add3_exec (dest, t1, operands[3], dest, odd));
+    emit_insn (gen_sub3_exec (dest, t1, operands[3], dest, odd));
 
     DONE;
   })