From patchwork Tue Jan 7 14:49:25 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Andrew Stubbs
X-Patchwork-Id: 1218809
From: Andrew Stubbs
Subject: [committed, amdgcn] Disallow 'B' constraints on addc/subb
To: "gcc-patches@gcc.gnu.org"
Date: Tue, 7 Jan 2020 14:49:25 +0000

This patch fixes some failures to assemble I encountered while fixing other issues (not yet posted). These were caused by emitting 'B' immediates that the ISA manual indicates should work, but the assembler rejects.
I've confirmed that they are, in fact, not handled by the hardware; if I encode the instruction manually the GPU appears to decode the illegal instructions correctly, but then gives wrong outputs.

The solution is to remove the 'B' immediates from the addc and subc patterns. This has a knock-on effect for all the V64DImode patterns that split into the other patterns, so these have been adjusted to use a new 'Db' constraint that allows 'B' for the low-part, and only 'A' for the high-part.

As a bonus, this patch fixes four test failures:

FAIL: gcc.dg/vect/vect-alias-check-10.c execution test
FAIL: gcc.dg/vect/vect-alias-check-11.c execution test
FAIL: gcc.dg/vect/vect-alias-check-12.c execution test
FAIL: gcc.dg/vect/vect-live-slp-3.c execution test

These execution failures were more likely caused by some early-clobber issues (since they assembled fine), but the constraint review and rework has addressed that issue too.

Andrew

Disallow 'B' constraints on amdgcn addc/subb

2020-01-07  Andrew Stubbs

	gcc/
	* config/gcn/constraints.md (DA): Update description and match.
	(DB): Likewise.
	(Db): New constraint.
	* config/gcn/gcn-protos.h (gcn_inline_constant64_p): Add second
	parameter.
	* config/gcn/gcn.c (gcn_inline_constant64_p): Add 'mixed' parameter.
	Implement 'Db' mixed immediate type.
	* config/gcn/gcn-valu.md (addcv64si3): Rework constraints.
	(addcv64si3_dup): Delete.
	(subcv64si3): Rework constraints.
	(addv64di3): Rework constraints.
	(addv64di3_exec): Rework constraints.
	(subv64di3): Rework constraints.
	(addv64di3_dup): Delete.
	(addv64di3_dup_exec): Delete.
	(addv64di3_zext): Rework constraints.
	(addv64di3_zext_exec): Rework constraints.
	(addv64di3_zext_dup): Rework constraints.
	(addv64di3_zext_dup_exec): Rework constraints.
	(addv64di3_zext_dup2): Rework constraints.
	(addv64di3_zext_dup2_exec): Rework constraints.
	(addv64di3_sext_dup2): Rework constraints.
	(addv64di3_sext_dup2_exec): Rework constraints.
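
For readers who want the 'DA' / 'Db' distinction in concrete terms, here is a minimal standalone sketch of the check on a scalar 64-bit value. It is not the GCC code: the `model_*` names are hypothetical, and it assumes an 'A' constant is simply an integer in the range -16..64 (the real gcn_inline_constant_p also accepts some floating-point bit patterns, which are ignored here). 'DB' is not modelled because it has no value test.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: an 'A' (inline) constant is modelled as an integer in
   -16..64 only; floating-point inline constants are ignored.  */
static bool
model_inline_constant_p (int64_t v)
{
  return v >= -16 && v <= 64;
}

/* Sign-extend a 32-bit word without relying on implementation-defined
   narrowing conversions.  */
static int64_t
model_sext32 (uint32_t w)
{
  return w >= 0x80000000u ? (int64_t) w - 0x100000000LL : (int64_t) w;
}

/* Hypothetical scalar model of gcn_inline_constant64_p (x, mixed):
   mixed == false corresponds to 'DA', mixed == true to 'Db'.  */
static bool
model_inline_constant64_p (uint64_t x, bool mixed)
{
  int64_t lo = model_sext32 ((uint32_t) (x & 0xffffffffu));
  int64_t hi = model_sext32 ((uint32_t) (x >> 32));

  /* The high part must always be an inline constant; the low part may
     be any 32-bit literal only when MIXED is set.  */
  return (mixed || model_inline_constant_p (lo))
	 && model_inline_constant_p (hi);
}
```

Under these assumptions a value such as 0x12345678 satisfies 'Db' but not 'DA' (the low part needs a literal and the high part is zero), while 0x1234567800000001 satisfies neither, because its high part would need a 'B' literal on the addc/subb half of the split.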
diff --git a/gcc/config/gcn/constraints.md b/gcc/config/gcn/constraints.md
index f2de943ba16..dd6615b0fd7 100644
--- a/gcc/config/gcn/constraints.md
+++ b/gcc/config/gcn/constraints.md
@@ -53,12 +53,17 @@
        (match_test "gcn_constant64_p (op)")))
 
 (define_constraint "DA"
-  "Splittable inline immediate 64-bit parameter"
+  "Immediate 64-bit parameter, low and high part match 'A'"
   (and (match_code "const_int,const_double,const_vector")
-       (match_test "gcn_inline_constant64_p (op)")))
+       (match_test "gcn_inline_constant64_p (op, 0)")))
+
+(define_constraint "Db"
+  "Immediate 64-bit parameter, low part matches 'B', high part matches 'A'"
+  (and (match_code "const_int,const_double,const_vector")
+       (match_test "gcn_inline_constant64_p (op, 1)")))
 
 (define_constraint "DB"
-  "Splittable immediate 64-bit parameter"
+  "Immediate 64-bit parameter, low and high part match 'B'"
   (match_code "const_int,const_double,const_vector"))
 
 (define_constraint "U"
diff --git a/gcc/config/gcn/gcn-protos.h b/gcc/config/gcn/gcn-protos.h
index 92a6aa77e84..e4dadd37f21 100644
--- a/gcc/config/gcn/gcn-protos.h
+++ b/gcc/config/gcn/gcn-protos.h
@@ -51,7 +51,7 @@ extern int gcn_hard_regno_nregs (int regno, machine_mode mode);
 extern void gcn_hsa_declare_function_name (FILE *file, const char *name,
                                            tree decl);
 extern HOST_WIDE_INT gcn_initial_elimination_offset (int, int);
-extern bool gcn_inline_constant64_p (rtx);
+extern bool gcn_inline_constant64_p (rtx, bool);
 extern bool gcn_inline_constant_p (rtx);
 extern int gcn_inline_fp_constant_p (rtx, bool);
 extern reg_class gcn_mode_code_base_reg_class (machine_mode, addr_space_t,
diff --git a/gcc/config/gcn/gcn-valu.md b/gcc/config/gcn/gcn-valu.md
index e301a4356ec..7dd7bb96918 100644
--- a/gcc/config/gcn/gcn-valu.md
+++ b/gcc/config/gcn/gcn-valu.md
@@ -1090,20 +1090,21 @@
   [(set_attr "type" "vop2,vop3b")
    (set_attr "length" "8,8")])
 
-; This pattern does not accept SGPR because VCC read already counts as an
-; SGPR use and number of SGPR operands is limited to 1.
+; v_addc does not accept an SGPR because the VCC read already counts as an
+; SGPR use and the number of SGPR operands is limited to 1. It does not
+; accept "B" immediate constants due to a related bus conflict.
 
 (define_insn "addcv64si3"
-  [(set (match_operand:V64SI 0 "register_operand" "=v,v")
+  [(set (match_operand:V64SI 0 "register_operand" "=v, v")
         (plus:V64SI
           (plus:V64SI
             (vec_merge:V64SI
               (vec_duplicate:V64SI (const_int 1))
               (vec_duplicate:V64SI (const_int 0))
-              (match_operand:DI 3 "register_operand" " cV,Sv"))
-            (match_operand:V64SI 1 "gcn_alu_operand" "%vA,vA"))
-          (match_operand:V64SI 2 "gcn_alu_operand" " vB,vB")))
-   (set (match_operand:DI 4 "register_operand" "=cV,Sg")
+              (match_operand:DI 3 "register_operand" " cV,cVSv"))
+            (match_operand:V64SI 1 "gcn_alu_operand" "% v, vA"))
+          (match_operand:V64SI 2 "gcn_alu_operand" " vA, vA")))
+   (set (match_operand:DI 4 "register_operand" "=cV,cVSg")
         (ior:DI (ltu:DI (plus:V64SI
                           (plus:V64SI
                             (vec_merge:V64SI
@@ -1121,40 +1122,7 @@
                           (match_dup 1))
                         (match_dup 1))))]
   ""
-  "v_addc%^_u32\t%0, %4, %1, %2, %3"
-  [(set_attr "type" "vop2,vop3b")
-   (set_attr "length" "4,8")])
-
-(define_insn "addcv64si3_dup"
-  [(set (match_operand:V64SI 0 "register_operand" "=v,v")
-        (plus:V64SI
-          (plus:V64SI
-            (vec_merge:V64SI
-              (vec_duplicate:V64SI (const_int 1))
-              (vec_duplicate:V64SI (const_int 0))
-              (match_operand:DI 3 "register_operand" " cV, Sv"))
-            (match_operand:V64SI 1 "gcn_alu_operand" "%vA, vA"))
-          (vec_duplicate:V64SI
-            (match_operand:SI 2 "gcn_alu_operand" "SvB,SvB"))))
-   (set (match_operand:DI 4 "register_operand" "=cV, Sg")
-        (ior:DI (ltu:DI (plus:V64SI (plus:V64SI
-                          (vec_merge:V64SI
-                            (vec_duplicate:V64SI (const_int 1))
-                            (vec_duplicate:V64SI (const_int 0))
-                            (match_dup 3))
-                          (match_dup 1))
-                        (vec_duplicate:V64SI
-                          (match_dup 2)))
-                      (vec_duplicate:V64SI
-                        (match_dup 2)))
-                (ltu:DI (plus:V64SI (vec_merge:V64SI
-                          (vec_duplicate:V64SI (const_int 1))
-                          (vec_duplicate:V64SI (const_int 0))
-                          (match_dup 3))
-                        (match_dup 1))
-                      (match_dup 1))))]
-  ""
-  "v_addc%^_u32\t%0, %4, %1, %2, %3"
+  "v_addc%^_u32\t%0, %4, %2, %1, %3"
   [(set_attr "type" "vop2,vop3b")
    (set_attr "length" "4,8")])
 
@@ -1188,8 +1156,9 @@
   [(set_attr "type" "vop2,vop3b,vop2,vop3b")
    (set_attr "length" "8")])
 
-; This pattern does not accept SGPR because VCC read already counts
-; as a SGPR use and number of SGPR operands is limited to 1.
+; v_subb does not accept an SGPR because the VCC read already counts as an
+; SGPR use and the number of SGPR operands is limited to 1. It does not
+; accept "B" immediate constants due to a related bus conflict.
 
 (define_insn "subcv64si3"
   [(set (match_operand:V64SI 0 "register_operand" "= v, v, v, v")
@@ -1198,10 +1167,10 @@
             (vec_merge:V64SI
               (vec_duplicate:V64SI (const_int 1))
               (vec_duplicate:V64SI (const_int 0))
-              (match_operand:DI 3 "gcn_alu_operand" " cV,Sv,cV,Sv"))
-            (match_operand:V64SI 1 "gcn_alu_operand" " vA,vA,vB,vB"))
-          (match_operand:V64SI 2 "gcn_alu_operand" " vB,vB,vA,vA")))
-   (set (match_operand:DI 4 "register_operand" "=cV,Sg,cV,Sg")
+              (match_operand:DI 3 "gcn_alu_operand" " cV,cVSv,cV,cVSv"))
+            (match_operand:V64SI 1 "gcn_alu_operand" " vA, vA, v, vA"))
+          (match_operand:V64SI 2 "gcn_alu_operand" " v, vA,vA, vA")))
+   (set (match_operand:DI 4 "register_operand" "=cV,cVSg,cV,cVSg")
         (ior:DI (gtu:DI (minus:V64SI (minus:V64SI
                           (vec_merge:V64SI
                             (vec_duplicate:V64SI (const_int 1))
@@ -1223,13 +1192,13 @@
    v_subbrev%^_u32\t%0, %4, %2, %1, %3
    v_subbrev%^_u32\t%0, %4, %2, %1, %3"
   [(set_attr "type" "vop2,vop3b,vop2,vop3b")
-   (set_attr "length" "8")])
+   (set_attr "length" "4,8,4,8")])
 
 (define_insn_and_split "addv64di3"
-  [(set (match_operand:V64DI 0 "register_operand" "= &v")
+  [(set (match_operand:V64DI 0 "register_operand" "= &v, &v")
         (plus:V64DI
-          (match_operand:V64DI 1 "register_operand" "% v0")
-          (match_operand:V64DI 2 "gcn_alu_operand" "vSvB0")))
+          (match_operand:V64DI 1 "register_operand" "%vDb,vDb0")
+          (match_operand:V64DI 2 "gcn_alu_operand" "vDb0, vDb")))
    (clobber (reg:DI VCC_REG))]
   ""
   "#"
@@ -1255,13 +1224,13 @@
    (set_attr "length" "8")])
 
 (define_insn_and_split "addv64di3_exec"
-  [(set (match_operand:V64DI 0 "register_operand" "= &v")
+  [(set (match_operand:V64DI 0 "register_operand" "= &v, &v")
         (vec_merge:V64DI
           (plus:V64DI
-            (match_operand:V64DI 1 "register_operand" "% v0")
-            (match_operand:V64DI 2 "gcn_alu_operand" "vSvB0"))
-          (match_operand:V64DI 3 "gcn_register_or_unspec_operand" " U0")
-          (match_operand:DI 4 "gcn_exec_reg_operand" " e")))
+            (match_operand:V64DI 1 "register_operand" "%vDb,vDb0")
+            (match_operand:V64DI 2 "gcn_alu_operand" "vDb0, vDb"))
+          (match_operand:V64DI 3 "gcn_register_or_unspec_operand" " U0, U0")
+          (match_operand:DI 4 "gcn_exec_reg_operand" " e, e")))
    (clobber (reg:DI VCC_REG))]
   ""
   "#"
@@ -1292,10 +1261,10 @@
    (set_attr "length" "8")])
 
 (define_insn_and_split "subv64di3"
-  [(set (match_operand:V64DI 0 "register_operand" "= &v, &v, &v, &v")
+  [(set (match_operand:V64DI 0 "register_operand" "=&v, &v, &v, &v")
         (minus:V64DI
-          (match_operand:V64DI 1 "gcn_alu_operand" "vSvB,vSvB0, v, v0")
-          (match_operand:V64DI 2 "gcn_alu_operand" " v0, v,vSvB0,vSvB")))
+          (match_operand:V64DI 1 "gcn_alu_operand" "vDb,vDb0, v, v0")
+          (match_operand:V64DI 2 "gcn_alu_operand" " v0, v,vDb0,vDb")))
    (clobber (reg:DI VCC_REG))]
   ""
   "#"
@@ -1359,80 +1328,12 @@
   [(set_attr "type" "vmult")
    (set_attr "length" "8")])
 
-(define_insn_and_split "addv64di3_dup"
-  [(set (match_operand:V64DI 0 "register_operand" "= &v")
-        (plus:V64DI
-          (match_operand:V64DI 1 "register_operand" " v0")
-          (vec_duplicate:V64DI
-            (match_operand:DI 2 "gcn_alu_operand" "SvDB"))))
-   (clobber (reg:DI VCC_REG))]
-  ""
-  "#"
-  "gcn_can_split_p (V64DImode, operands[0])
-   && gcn_can_split_p (V64DImode, operands[1])
-   && gcn_can_split_p (V64DImode, operands[2])"
-  [(const_int 0)]
-  {
-    rtx vcc = gen_rtx_REG (DImode, VCC_REG);
-    emit_insn (gen_addv64si3_vcc_dup
-                (gcn_operand_part (V64DImode, operands[0], 0),
-                 gcn_operand_part (DImode, operands[2], 0),
-                 gcn_operand_part (V64DImode, operands[1], 0),
-                 vcc));
-    emit_insn (gen_addcv64si3_dup
-                (gcn_operand_part (V64DImode, operands[0], 1),
-                 gcn_operand_part (V64DImode, operands[1], 1),
-                 gcn_operand_part (DImode, operands[2], 1),
-                 vcc, vcc));
-    DONE;
-  }
-  [(set_attr "type" "vmult")
-   (set_attr "length" "8")])
-
-(define_insn_and_split "addv64di3_dup_exec"
-  [(set (match_operand:V64DI 0 "register_operand" "= &v")
-        (vec_merge:V64DI
-          (plus:V64DI
-            (match_operand:V64DI 1 "register_operand" " v0")
-            (vec_duplicate:V64DI
-              (match_operand:DI 2 "gcn_alu_operand" "SvDB")))
-          (match_operand:V64DI 3 "gcn_register_or_unspec_operand" " U0")
-          (match_operand:DI 4 "gcn_exec_reg_operand" " e")))
-   (clobber (reg:DI VCC_REG))]
-  ""
-  "#"
-  "gcn_can_split_p (V64DImode, operands[0])
-   && gcn_can_split_p (V64DImode, operands[1])
-   && gcn_can_split_p (V64DImode, operands[2])
-   && gcn_can_split_p (V64DImode, operands[3])"
-  [(const_int 0)]
-  {
-    rtx vcc = gen_rtx_REG (DImode, VCC_REG);
-    emit_insn (gen_addv64si3_vcc_dup_exec
-                (gcn_operand_part (V64DImode, operands[0], 0),
-                 gcn_operand_part (DImode, operands[2], 0),
-                 gcn_operand_part (V64DImode, operands[1], 0),
-                 vcc,
-                 gcn_operand_part (V64DImode, operands[3], 0),
-                 operands[4]));
-    emit_insn (gen_addcv64si3_dup_exec
-                (gcn_operand_part (V64DImode, operands[0], 1),
-                 gcn_operand_part (V64DImode, operands[1], 1),
-                 gcn_operand_part (DImode, operands[2], 1),
-                 vcc, vcc,
-                 gcn_operand_part (V64DImode, operands[3], 1),
-                 operands[4]));
-    DONE;
-  }
-  [(set_attr "type" "vmult")
-   (set_attr "length" "8")])
-
 (define_insn_and_split "addv64di3_zext"
-  [(set (match_operand:V64DI 0 "register_operand" "=&v,&v")
+  [(set (match_operand:V64DI 0 "register_operand" "=&v, &v, &v, &v")
         (plus:V64DI
           (zero_extend:V64DI
-            (match_operand:V64SI 1 "gcn_alu_operand" "0vA,0vB"))
-          (match_operand:V64DI 2 "gcn_alu_operand" "0vB,0vA")))
+            (match_operand:V64SI 1 "gcn_alu_operand" "0vA,0vB, vA, vB"))
+          (match_operand:V64DI 2 "gcn_alu_operand" "vDb,vDA,0vDb,0vDA")))
    (clobber (reg:DI VCC_REG))]
   ""
   "#"
@@ -1453,17 +1354,18 @@
     DONE;
   }
   [(set_attr "type" "vmult")
-   (set_attr "length" "8,8")])
+   (set_attr "length" "8")])
 
 (define_insn_and_split "addv64di3_zext_exec"
-  [(set (match_operand:V64DI 0 "register_operand" "=&v,&v")
+  [(set (match_operand:V64DI 0 "register_operand" "=&v, &v, &v, &v")
         (vec_merge:V64DI
           (plus:V64DI
             (zero_extend:V64DI
-              (match_operand:V64SI 1 "gcn_alu_operand" "0vA,0vB"))
-            (match_operand:V64DI 2 "gcn_alu_operand" "0vB,0vA"))
-          (match_operand:V64DI 3 "gcn_register_or_unspec_operand" " U0, U0")
-          (match_operand:DI 4 "gcn_exec_reg_operand" " e, e")))
+              (match_operand:V64SI 1 "gcn_alu_operand" "0vA, vA,0vB, vB"))
+            (match_operand:V64DI 2 "gcn_alu_operand" "vDb,0vDb,vDA,0vDA"))
+          (match_operand:V64DI 3 "gcn_register_or_unspec_operand"
+                                                   " U0, U0, U0, U0")
+          (match_operand:DI 4 "gcn_exec_reg_operand" " e, e, e, e")))
    (clobber (reg:DI VCC_REG))]
   ""
   "#"
@@ -1489,15 +1391,15 @@
     DONE;
   }
   [(set_attr "type" "vmult")
-   (set_attr "length" "8,8")])
+   (set_attr "length" "8")])
 
 (define_insn_and_split "addv64di3_zext_dup"
-  [(set (match_operand:V64DI 0 "register_operand" "=&v")
+  [(set (match_operand:V64DI 0 "register_operand" "= &v, &v")
         (plus:V64DI
           (zero_extend:V64DI
             (vec_duplicate:V64SI
-              (match_operand:SI 1 "gcn_alu_operand" "BSv")))
-          (match_operand:V64DI 2 "gcn_alu_operand" "vA0")))
+              (match_operand:SI 1 "gcn_alu_operand" " BSv, ASv")))
+          (match_operand:V64DI 2 "gcn_alu_operand" "vDA0,vDb0")))
    (clobber (reg:DI VCC_REG))]
   ""
   "#"
@@ -1521,15 +1423,15 @@
    (set_attr "length" "8")])
 
 (define_insn_and_split "addv64di3_zext_dup_exec"
-  [(set (match_operand:V64DI 0 "register_operand" "=&v")
+  [(set (match_operand:V64DI 0 "register_operand" "= &v, &v")
         (vec_merge:V64DI
           (plus:V64DI
             (zero_extend:V64DI
               (vec_duplicate:V64SI
-                (match_operand:SI 1 "gcn_alu_operand" "BSv")))
-            (match_operand:V64DI 2 "gcn_alu_operand" "vA0"))
-          (match_operand:V64DI 3 "gcn_register_or_unspec_operand" " U0")
-          (match_operand:DI 4 "gcn_exec_reg_operand" " e")))
+                (match_operand:SI 1 "gcn_alu_operand" " ASv, BSv")))
+            (match_operand:V64DI 2 "gcn_alu_operand" "vDb0,vDA0"))
+          (match_operand:V64DI 3 "gcn_register_or_unspec_operand" " U0, U0")
+          (match_operand:DI 4 "gcn_exec_reg_operand" " e, e")))
    (clobber (reg:DI VCC_REG))]
   ""
   "#"
@@ -1558,10 +1460,10 @@
    (set_attr "length" "8")])
 
 (define_insn_and_split "addv64di3_zext_dup2"
-  [(set (match_operand:V64DI 0 "register_operand" "= v")
+  [(set (match_operand:V64DI 0 "register_operand" "= &v")
         (plus:V64DI
           (zero_extend:V64DI (match_operand:V64SI 1 "gcn_alu_operand" " vA"))
-          (vec_duplicate:V64DI (match_operand:DI 2 "gcn_alu_operand" "BSv"))))
+          (vec_duplicate:V64DI (match_operand:DI 2 "gcn_alu_operand" "DbSv"))))
    (clobber (reg:DI VCC_REG))]
   ""
   "#"
@@ -1584,7 +1486,7 @@
    (set_attr "length" "8")])
 
 (define_insn_and_split "addv64di3_zext_dup2_exec"
-  [(set (match_operand:V64DI 0 "register_operand" "= v")
+  [(set (match_operand:V64DI 0 "register_operand" "=&v")
         (vec_merge:V64DI
           (plus:V64DI
             (zero_extend:V64DI (match_operand:V64SI 1 "gcn_alu_operand"
@@ -1621,7 +1523,7 @@
    (set_attr "length" "8")])
 
 (define_insn_and_split "addv64di3_sext_dup2"
-  [(set (match_operand:V64DI 0 "register_operand" "= v")
+  [(set (match_operand:V64DI 0 "register_operand" "=&v")
         (plus:V64DI
           (sign_extend:V64DI (match_operand:V64SI 1 "gcn_alu_operand" " vA"))
           (vec_duplicate:V64DI (match_operand:DI 2 "gcn_alu_operand" "BSv"))))
@@ -1649,7 +1551,7 @@
    (set_attr "length" "8")])
 
 (define_insn_and_split "addv64di3_sext_dup2_exec"
-  [(set (match_operand:V64DI 0 "register_operand" "= v")
+  [(set (match_operand:V64DI 0 "register_operand" "=&v")
         (vec_merge:V64DI
           (plus:V64DI
             (sign_extend:V64DI (match_operand:V64SI 1 "gcn_alu_operand"
@@ -3201,9 +3103,11 @@
   {
     rtx tmp = gen_reg_rtx (V64DImode);
     rtx v1 = gen_rtx_REG (V64SImode, VGPR_REGNO (1));
+    rtx op1vec = gen_reg_rtx (V64DImode);
 
     emit_insn (gen_mulv64di3_zext_dup2 (tmp, v1, operands[2]));
-    emit_insn (gen_addv64di3_dup (operands[0], tmp, operands[1]));
+    emit_insn (gen_vec_duplicatev64si (op1vec, operands[1]));
+    emit_insn (gen_addv64di3 (operands[0], tmp, op1vec));
     DONE;
   })
 
diff --git a/gcc/config/gcn/gcn.c b/gcc/config/gcn/gcn.c
index b361cffbb84..4056f7257b5 100644
--- a/gcc/config/gcn/gcn.c
+++ b/gcc/config/gcn/gcn.c
@@ -902,16 +902,17 @@ gcn_constant_p (rtx x)
 
 /* Return true if X is a constant representable as two inline immediate
    constants in a 64-bit instruction that is split into two 32-bit
-   instructions. */
+   instructions.
+   When MIXED is set, the low-part is permitted to use the full 32-bits. */
 
 bool
-gcn_inline_constant64_p (rtx x)
+gcn_inline_constant64_p (rtx x, bool mixed)
 {
   if (GET_CODE (x) == CONST_VECTOR)
     {
       if (!vgpr_vector_mode_p (GET_MODE (x)))
         return false;
-      if (!gcn_inline_constant64_p (CONST_VECTOR_ELT (x, 0)))
+      if (!gcn_inline_constant64_p (CONST_VECTOR_ELT (x, 0), mixed))
         return false;
       for (int i = 1; i < 64; i++)
         if (CONST_VECTOR_ELT (x, i) != CONST_VECTOR_ELT (x, 0))
@@ -925,7 +926,8 @@ gcn_inline_constant64_p (rtx x)
   rtx val_lo = gcn_operand_part (DImode, x, 0);
   rtx val_hi = gcn_operand_part (DImode, x, 1);
 
-  return gcn_inline_constant_p (val_lo) && gcn_inline_constant_p (val_hi);
+  return ((mixed || gcn_inline_constant_p (val_lo))
+          && gcn_inline_constant_p (val_hi));
 }
 
 /* Return true if X is a constant representable as an immediate constant