From patchwork Wed Jan 3 02:37:10 2024
X-Patchwork-Submitter: joshua
X-Patchwork-Id: 1881846
From: "Jun Sha (Joshua)"
To: gcc-patches@gcc.gnu.org
Cc: jim.wilson.gcc@gmail.com, palmer@dabbelt.com, andrew@sifive.com, philipp.tomsich@vrull.eu, jeffreyalaw@gmail.com, christoph.muellner@vrull.eu, juzhe.zhong@rivai.ai, "Jun Sha (Joshua)", Jin Ma, Xianmiao Qu
Subject: [PATCH v4] RISC-V: Fix register overlap issue for some xtheadvector instructions
Date: Wed, 3 Jan 2024 10:37:10 +0800
Message-Id: <20240103023710.1578-1-cooper.joshua@linux.alibaba.com>
In-Reply-To: <20231229040310.1047-1-cooper.joshua@linux.alibaba.com>
References: <20231229040310.1047-1-cooper.joshua@linux.alibaba.com>

For th.vmadc/th.vmsbc, as well as narrowing arithmetic instructions and
floating-point compare instructions, an illegal instruction exception
will be raised if the destination vector register overlaps a source
vector register group.

To handle this issue, we use the "group_overlap" and "enabled"
attributes to disable some alternatives for xtheadvector.

gcc/ChangeLog:

	* config/riscv/riscv.md (none,W21,W42,W84,W43,W86,W87,W0):
	(none,W21,W42,W84,W43,W86,W87,W0,th):
	* config/riscv/vector.md:

Co-authored-by: Jin Ma
Co-authored-by: Xianmiao Qu
Co-authored-by: Christoph Müllner
---
 gcc/config/riscv/riscv.md  |   6 +-
 gcc/config/riscv/vector.md | 314 +++++++++++++++++++++----------------
 2 files changed, 185 insertions(+), 135 deletions(-)

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 68f7203b676..d736501784d 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -504,7 +504,7 @@
 ;; Widening instructions have group-overlap constraints. Those are only
 ;; valid for certain register-group sizes. This attribute marks the
 ;; alternatives not matching the required register-group size as disabled.
-(define_attr "group_overlap" "none,W21,W42,W84,W43,W86,W87,W0" +(define_attr "group_overlap" "none,W21,W42,W84,W43,W86,W87,W0,th" (const_string "none")) (define_attr "group_overlap_valid" "no,yes" @@ -543,6 +543,10 @@ (and (eq_attr "group_overlap" "W0") (match_test "riscv_get_v_regno_alignment (GET_MODE (operands[0])) > 1")) (const_string "no") + + (and (eq_attr "group_overlap" "th") + (match_test "TARGET_XTHEADVECTOR")) + (const_string "no") ] (const_string "yes"))) diff --git a/gcc/config/riscv/vector.md b/gcc/config/riscv/vector.md index 5fa30716143..77eaba16c97 100644 --- a/gcc/config/riscv/vector.md +++ b/gcc/config/riscv/vector.md @@ -3255,7 +3255,8 @@ [(set_attr "type" "vicalu") (set_attr "mode" "") (set_attr "vl_op_idx" "4") - (set (attr "avl_type_idx") (const_int 5))]) + (set (attr "avl_type_idx") (const_int 5)) + (set_attr "group_overlap" "th,none,none")]) (define_insn "@pred_msbc" [(set (match_operand: 0 "register_operand" "=vr, vr, &vr") @@ -3274,7 +3275,8 @@ [(set_attr "type" "vicalu") (set_attr "mode" "") (set_attr "vl_op_idx" "4") - (set (attr "avl_type_idx") (const_int 5))]) + (set (attr "avl_type_idx") (const_int 5)) + (set_attr "group_overlap" "th,th,none")]) (define_insn "@pred_madc_scalar" [(set (match_operand: 0 "register_operand" "=vr, &vr") @@ -3294,7 +3296,8 @@ [(set_attr "type" "vicalu") (set_attr "mode" "") (set_attr "vl_op_idx" "4") - (set (attr "avl_type_idx") (const_int 5))]) + (set (attr "avl_type_idx") (const_int 5)) + (set_attr "group_overlap" "th,none")]) (define_insn "@pred_msbc_scalar" [(set (match_operand: 0 "register_operand" "=vr, &vr") @@ -3314,7 +3317,8 @@ [(set_attr "type" "vicalu") (set_attr "mode" "") (set_attr "vl_op_idx" "4") - (set (attr "avl_type_idx") (const_int 5))]) + (set (attr "avl_type_idx") (const_int 5)) + (set_attr "group_overlap" "th,none")]) (define_expand "@pred_madc_scalar" [(set (match_operand: 0 "register_operand") @@ -3363,7 +3367,8 @@ [(set_attr "type" "vicalu") (set_attr "mode" "") (set_attr 
"vl_op_idx" "4") - (set (attr "avl_type_idx") (const_int 5))]) + (set (attr "avl_type_idx") (const_int 5)) + (set_attr "group_overlap" "th,none")]) (define_insn "*pred_madc_extended_scalar" [(set (match_operand: 0 "register_operand" "=vr, &vr") @@ -3384,7 +3389,8 @@ [(set_attr "type" "vicalu") (set_attr "mode" "") (set_attr "vl_op_idx" "4") - (set (attr "avl_type_idx") (const_int 5))]) + (set (attr "avl_type_idx") (const_int 5)) + (set_attr "group_overlap" "th,none")]) (define_expand "@pred_msbc_scalar" [(set (match_operand: 0 "register_operand") @@ -3433,7 +3439,8 @@ [(set_attr "type" "vicalu") (set_attr "mode" "") (set_attr "vl_op_idx" "4") - (set (attr "avl_type_idx") (const_int 5))]) + (set (attr "avl_type_idx") (const_int 5)) + (set_attr "group_overlap" "th,none")]) (define_insn "*pred_msbc_extended_scalar" [(set (match_operand: 0 "register_operand" "=vr, &vr") @@ -3454,7 +3461,8 @@ [(set_attr "type" "vicalu") (set_attr "mode" "") (set_attr "vl_op_idx" "4") - (set (attr "avl_type_idx") (const_int 5))]) + (set (attr "avl_type_idx") (const_int 5)) + (set_attr "group_overlap" "th,none")]) (define_insn "@pred_madc_overflow" [(set (match_operand: 0 "register_operand" "=vr, &vr, &vr") @@ -3472,7 +3480,8 @@ [(set_attr "type" "vicalu") (set_attr "mode" "") (set_attr "vl_op_idx" "3") - (set (attr "avl_type_idx") (const_int 4))]) + (set (attr "avl_type_idx") (const_int 4)) + (set_attr "group_overlap" "th,none,none")]) (define_insn "@pred_msbc_overflow" [(set (match_operand: 0 "register_operand" "=vr, vr, &vr, &vr") @@ -3490,7 +3499,8 @@ [(set_attr "type" "vicalu") (set_attr "mode" "") (set_attr "vl_op_idx" "3") - (set (attr "avl_type_idx") (const_int 4))]) + (set (attr "avl_type_idx") (const_int 4)) + (set_attr "group_overlap" "th,th,none,none")]) (define_insn "@pred_madc_overflow_scalar" [(set (match_operand: 0 "register_operand" "=vr, &vr") @@ -3509,7 +3519,8 @@ [(set_attr "type" "vicalu") (set_attr "mode" "") (set_attr "vl_op_idx" "3") - (set (attr "avl_type_idx") 
(const_int 4))]) + (set (attr "avl_type_idx") (const_int 4)) + (set_attr "group_overlap" "th,none")]) (define_insn "@pred_msbc_overflow_scalar" [(set (match_operand: 0 "register_operand" "=vr, &vr") @@ -3528,7 +3539,8 @@ [(set_attr "type" "vicalu") (set_attr "mode" "") (set_attr "vl_op_idx" "3") - (set (attr "avl_type_idx") (const_int 4))]) + (set (attr "avl_type_idx") (const_int 4)) + (set_attr "group_overlap" "th,none")]) (define_expand "@pred_madc_overflow_scalar" [(set (match_operand: 0 "register_operand") @@ -3575,7 +3587,8 @@ [(set_attr "type" "vicalu") (set_attr "mode" "") (set_attr "vl_op_idx" "3") - (set (attr "avl_type_idx") (const_int 4))]) + (set (attr "avl_type_idx") (const_int 4)) + (set_attr "group_overlap" "th,none")]) (define_insn "*pred_madc_overflow_extended_scalar" [(set (match_operand: 0 "register_operand" "=vr, &vr") @@ -3595,7 +3608,8 @@ [(set_attr "type" "vicalu") (set_attr "mode" "") (set_attr "vl_op_idx" "3") - (set (attr "avl_type_idx") (const_int 4))]) + (set (attr "avl_type_idx") (const_int 4)) + (set_attr "group_overlap" "th,none")]) (define_expand "@pred_msbc_overflow_scalar" [(set (match_operand: 0 "register_operand") @@ -3642,7 +3656,8 @@ [(set_attr "type" "vicalu") (set_attr "mode" "") (set_attr "vl_op_idx" "3") - (set (attr "avl_type_idx") (const_int 4))]) + (set (attr "avl_type_idx") (const_int 4)) + (set_attr "group_overlap" "th,none")]) (define_insn "*pred_msbc_overflow_extended_scalar" [(set (match_operand: 0 "register_operand" "=vr, &vr") @@ -3662,7 +3677,8 @@ [(set_attr "type" "vicalu") (set_attr "mode" "") (set_attr "vl_op_idx" "3") - (set (attr "avl_type_idx") (const_int 4))]) + (set (attr "avl_type_idx") (const_int 4)) + (set_attr "group_overlap" "th,none")]) ;; ------------------------------------------------------------------------------- ;; ---- Predicated integer unary operations @@ -3982,7 +3998,8 @@ "TARGET_VECTOR" "vn.w%o4\t%0,%3,%v4%p1" [(set_attr "type" "vnshift") - (set_attr "mode" "")]) + (set_attr "mode" "") + 
(set_attr "group_overlap" "none,none,th,th,none,th,none,none,none,th,none,none")]) (define_insn "@pred_narrow__scalar" [(set (match_operand: 0 "register_operand" "=vd, vd, vr, vr, &vr, &vr") @@ -4003,7 +4020,8 @@ "TARGET_VECTOR" "vn.w%o4\t%0,%3,%4%p1" [(set_attr "type" "vnshift") - (set_attr "mode" "")]) + (set_attr "mode" "") + (set_attr "group_overlap" "none,none,th,th,none,none")]) ;; vncvt.x.x.w (define_insn "@pred_trunc" @@ -4027,7 +4045,8 @@ (set_attr "vl_op_idx" "4") (set (attr "ta") (symbol_ref "riscv_vector::get_ta(operands[5])")) (set (attr "ma") (symbol_ref "riscv_vector::get_ma(operands[6])")) - (set (attr "avl_type_idx") (const_int 7))]) + (set (attr "avl_type_idx") (const_int 7)) + (set_attr "group_overlap" "none,none,th,th,none,none")]) ;; ------------------------------------------------------------------------------- ;; ---- Predicated fixed-point operations @@ -4433,7 +4452,8 @@ "TARGET_VECTOR" "vnclip.w%o4\t%0,%3,%v4%p1" [(set_attr "type" "vnclip") - (set_attr "mode" "")]) + (set_attr "mode" "") + (set_attr "group_overlap" "th,th,th,th,th,th,none,none,th,th,none,none")]) (define_insn "@pred_narrow_clip_scalar" [(set (match_operand: 0 "register_operand" "=vd, vd, vr, vr, &vr, &vr") @@ -4455,7 +4475,8 @@ "TARGET_VECTOR" "vnclip.w%o4\t%0,%3,%4%p1" [(set_attr "type" "vnclip") - (set_attr "mode" "")]) + (set_attr "mode" "") + (set_attr "group_overlap" "th,th,th,th,none,none")]) ;; ------------------------------------------------------------------------------- ;; ---- Predicated integer comparison operations @@ -4506,23 +4527,24 @@ ;; We don't use early-clobber for LMUL <= 1 to get better codegen. 
(define_insn "*pred_cmp" - [(set (match_operand: 0 "register_operand" "=vr, vr, vr, vr") + [(set (match_operand: 0 "register_operand" "=vr, vr, vr, vr, &vr, &vr, &vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1,vmWc1,vmWc1") - (match_operand 6 "vector_length_operand" " rK, rK, rK, rK") - (match_operand 7 "const_int_operand" " i, i, i, i") - (match_operand 8 "const_int_operand" " i, i, i, i") + [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1") + (match_operand 6 "vector_length_operand" " rK, rK, rK, rK, rK, rK, rK, rK") + (match_operand 7 "const_int_operand" " i, i, i, i, i, i, i, i") + (match_operand 8 "const_int_operand" " i, i, i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (match_operator: 3 "comparison_except_ltge_operator" - [(match_operand:V_VLSI 4 "register_operand" " vr, vr, vr, vr") - (match_operand:V_VLSI 5 "vector_arith_operand" " vr, vr, vi, vi")]) - (match_operand: 2 "vector_merge_operand" " vu, 0, vu, 0")))] + [(match_operand:V_VLSI 4 "register_operand" " vr, vr, vr, vr, vr, vr, vr, vr") + (match_operand:V_VLSI 5 "vector_arith_operand" " vr, vr, vi, vi, vr, vr, vi, vi")]) + (match_operand: 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0, vu, 0")))] "TARGET_VECTOR && riscv_vector::cmp_lmul_le_one (mode)" "vms%B3.v%o5\t%0,%4,%v5%p1" [(set_attr "type" "vicmp") - (set_attr "mode" "")]) + (set_attr "mode" "") + (set_attr "group_overlap" "th,th,th,th,none,none,none,none")]) ;; We use early-clobber for source LMUL > dest LMUL. 
(define_insn "*pred_cmp_narrow" @@ -4542,7 +4564,8 @@ "TARGET_VECTOR && riscv_vector::cmp_lmul_gt_one (mode)" "vms%B3.v%o5\t%0,%4,%v5%p1" [(set_attr "type" "vicmp") - (set_attr "mode" "")]) + (set_attr "mode" "") + (set_attr "group_overlap" "none,th,th,th,th,th,th,none,none")]) (define_expand "@pred_ltge" [(set (match_operand: 0 "register_operand") @@ -4586,23 +4609,24 @@ ;; We don't use early-clobber for LMUL <= 1 to get better codegen. (define_insn "*pred_ltge" - [(set (match_operand: 0 "register_operand" "=vr, vr, vr, vr") + [(set (match_operand: 0 "register_operand" "=vr, vr, vr, vr, &vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1,vmWc1,vmWc1") - (match_operand 6 "vector_length_operand" " rK, rK, rK, rK") - (match_operand 7 "const_int_operand" " i, i, i, i") - (match_operand 8 "const_int_operand" " i, i, i, i") + [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1,vmWc1,vmWc1,vmWc1,vmWc1") + (match_operand 6 "vector_length_operand" " rK, rK, rK, rK, rK, rK") + (match_operand 7 "const_int_operand" " i, i, i, i, i, i") + (match_operand 8 "const_int_operand" " i, i, i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (match_operator: 3 "ltge_operator" - [(match_operand:V_VLSI 4 "register_operand" " vr, vr, vr, vr") - (match_operand:V_VLSI 5 "vector_neg_arith_operand" " vr, vr, vj, vj")]) - (match_operand: 2 "vector_merge_operand" " vu, 0, vu, 0")))] + [(match_operand:V_VLSI 4 "register_operand" " vr, vr, vr, vr, vr, vr") + (match_operand:V_VLSI 5 "vector_neg_arith_operand" " vr, vr, vj, vj, vr, vj")]) + (match_operand: 2 "vector_merge_operand" " vu, 0, vu, 0, vu, vu")))] "TARGET_VECTOR && riscv_vector::cmp_lmul_le_one (mode)" "vms%B3.v%o5\t%0,%4,%v5%p1" [(set_attr "type" "vicmp") - (set_attr "mode" "")]) + (set_attr "mode" "") + (set_attr "group_overlap" "th,th,th,th,none,none")]) ;; We use early-clobber for source LMUL > dest LMUL. 
(define_insn "*pred_ltge_narrow" @@ -4622,7 +4646,8 @@ "TARGET_VECTOR && riscv_vector::cmp_lmul_gt_one (mode)" "vms%B3.v%o5\t%0,%4,%v5%p1" [(set_attr "type" "vicmp") - (set_attr "mode" "")]) + (set_attr "mode" "") + (set_attr "group_overlap" "none,th,th,th,th,th,th,none,none")]) (define_expand "@pred_cmp_scalar" [(set (match_operand: 0 "register_operand") @@ -4668,24 +4693,25 @@ ;; We don't use early-clobber for LMUL <= 1 to get better codegen. (define_insn "*pred_cmp_scalar" - [(set (match_operand: 0 "register_operand" "=vr, vr") + [(set (match_operand: 0 "register_operand" "=vr, vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") - (match_operand 6 "vector_length_operand" " rK, rK") - (match_operand 7 "const_int_operand" " i, i") - (match_operand 8 "const_int_operand" " i, i") + [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1,vmWc1") + (match_operand 6 "vector_length_operand" " rK, rK, rK") + (match_operand 7 "const_int_operand" " i, i, i") + (match_operand 8 "const_int_operand" " i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (match_operator: 3 "comparison_except_eqge_operator" - [(match_operand:V_VLSI_QHS 4 "register_operand" " vr, vr") + [(match_operand:V_VLSI_QHS 4 "register_operand" " vr, vr, vr") (vec_duplicate:V_VLSI_QHS - (match_operand: 5 "register_operand" " r, r"))]) - (match_operand: 2 "vector_merge_operand" " vu, 0")))] + (match_operand: 5 "register_operand" " r, r, r"))]) + (match_operand: 2 "vector_merge_operand" " vu, 0, vu")))] "TARGET_VECTOR && riscv_vector::cmp_lmul_le_one (mode)" "vms%B3.vx\t%0,%4,%5%p1" [(set_attr "type" "vicmp") - (set_attr "mode" "")]) + (set_attr "mode" "") + (set_attr "group_overlap" "th,th,none")]) ;; We use early-clobber for source LMUL > dest LMUL. 
(define_insn "*pred_cmp_scalar_narrow" @@ -4706,7 +4732,8 @@ "TARGET_VECTOR && riscv_vector::cmp_lmul_gt_one (mode)" "vms%B3.vx\t%0,%4,%5%p1" [(set_attr "type" "vicmp") - (set_attr "mode" "")]) + (set_attr "mode" "") + (set_attr "group_overlap" "none,th,th,none,none")]) (define_expand "@pred_eqne_scalar" [(set (match_operand: 0 "register_operand") @@ -4752,24 +4779,25 @@ ;; We don't use early-clobber for LMUL <= 1 to get better codegen. (define_insn "*pred_eqne_scalar" - [(set (match_operand: 0 "register_operand" "=vr, vr") + [(set (match_operand: 0 "register_operand" "=vr, vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") - (match_operand 6 "vector_length_operand" " rK, rK") - (match_operand 7 "const_int_operand" " i, i") - (match_operand 8 "const_int_operand" " i, i") + [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1,vmWc1") + (match_operand 6 "vector_length_operand" " rK, rK, rK") + (match_operand 7 "const_int_operand" " i, i, i") + (match_operand 8 "const_int_operand" " i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (match_operator: 3 "equality_operator" [(vec_duplicate:V_VLSI_QHS - (match_operand: 5 "register_operand" " r, r")) - (match_operand:V_VLSI_QHS 4 "register_operand" " vr, vr")]) - (match_operand: 2 "vector_merge_operand" " vu, 0")))] + (match_operand: 5 "register_operand" " r, r, r")) + (match_operand:V_VLSI_QHS 4 "register_operand" " vr, vr, vr")]) + (match_operand: 2 "vector_merge_operand" " vu, 0, vu")))] "TARGET_VECTOR && riscv_vector::cmp_lmul_le_one (mode)" "vms%B3.vx\t%0,%4,%5%p1" [(set_attr "type" "vicmp") - (set_attr "mode" "")]) + (set_attr "mode" "") + (set_attr "group_overlap" "th,th,none")]) ;; We use early-clobber for source LMUL > dest LMUL. 
(define_insn "*pred_eqne_scalar_narrow" @@ -4790,7 +4818,8 @@ "TARGET_VECTOR && riscv_vector::cmp_lmul_gt_one (mode)" "vms%B3.vx\t%0,%4,%5%p1" [(set_attr "type" "vicmp") - (set_attr "mode" "")]) + (set_attr "mode" "") + (set_attr "group_overlap" "none,th,th,none,none")]) ;; Handle GET_MODE_INNER (mode) = DImode. We need to split them since ;; we need to deal with SEW = 64 in RV32 system. @@ -4917,24 +4946,25 @@ ;; We don't use early-clobber for LMUL <= 1 to get better codegen. (define_insn "*pred_cmp_scalar" - [(set (match_operand: 0 "register_operand" "=vr, vr") + [(set (match_operand: 0 "register_operand" "=vr, vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") - (match_operand 6 "vector_length_operand" " rK, rK") - (match_operand 7 "const_int_operand" " i, i") - (match_operand 8 "const_int_operand" " i, i") + [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1,vmWc1") + (match_operand 6 "vector_length_operand" " rK, rK, rK") + (match_operand 7 "const_int_operand" " i, i, i") + (match_operand 8 "const_int_operand" " i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (match_operator: 3 "comparison_except_eqge_operator" - [(match_operand:V_VLSI_D 4 "register_operand" " vr, vr") + [(match_operand:V_VLSI_D 4 "register_operand" " vr, vr, vr") (vec_duplicate:V_VLSI_D - (match_operand: 5 "register_operand" " r, r"))]) - (match_operand: 2 "vector_merge_operand" " vu, 0")))] + (match_operand: 5 "register_operand" " r, r, r"))]) + (match_operand: 2 "vector_merge_operand" " vu, 0, vu")))] "TARGET_VECTOR && riscv_vector::cmp_lmul_le_one (mode)" "vms%B3.vx\t%0,%4,%5%p1" [(set_attr "type" "vicmp") - (set_attr "mode" "")]) + (set_attr "mode" "") + (set_attr "group_overlap" "th,th,none")]) ;; We use early-clobber for source LMUL > dest LMUL. 
(define_insn "*pred_cmp_scalar_narrow" @@ -4955,28 +4985,30 @@ "TARGET_VECTOR && riscv_vector::cmp_lmul_gt_one (mode)" "vms%B3.vx\t%0,%4,%5%p1" [(set_attr "type" "vicmp") - (set_attr "mode" "")]) + (set_attr "mode" "") + (set_attr "group_overlap" "none,th,th,none,none")]) ;; We don't use early-clobber for LMUL <= 1 to get better codegen. (define_insn "*pred_eqne_scalar" - [(set (match_operand: 0 "register_operand" "=vr, vr") + [(set (match_operand: 0 "register_operand" "=vr, vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") - (match_operand 6 "vector_length_operand" " rK, rK") - (match_operand 7 "const_int_operand" " i, i") - (match_operand 8 "const_int_operand" " i, i") + [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1,vmWc1") + (match_operand 6 "vector_length_operand" " rK, rK, rK") + (match_operand 7 "const_int_operand" " i, i, i") + (match_operand 8 "const_int_operand" " i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (match_operator: 3 "equality_operator" [(vec_duplicate:V_VLSI_D - (match_operand: 5 "register_operand" " r, r")) - (match_operand:V_VLSI_D 4 "register_operand" " vr, vr")]) - (match_operand: 2 "vector_merge_operand" " vu, 0")))] + (match_operand: 5 "register_operand" " r, r, r")) + (match_operand:V_VLSI_D 4 "register_operand" " vr, vr, vr")]) + (match_operand: 2 "vector_merge_operand" " vu, 0, vu")))] "TARGET_VECTOR && riscv_vector::cmp_lmul_le_one (mode)" "vms%B3.vx\t%0,%4,%5%p1" [(set_attr "type" "vicmp") - (set_attr "mode" "")]) + (set_attr "mode" "") + (set_attr "group_overlap" "th,th,none")]) ;; We use early-clobber for source LMUL > dest LMUL. 
(define_insn "*pred_eqne_scalar_narrow" @@ -4997,7 +5029,8 @@ "TARGET_VECTOR && riscv_vector::cmp_lmul_gt_one (mode)" "vms%B3.vx\t%0,%4,%5%p1" [(set_attr "type" "vicmp") - (set_attr "mode" "")]) + (set_attr "mode" "") + (set_attr "group_overlap" "none,th,th,none,none")]) (define_insn "*pred_cmp_extended_scalar_merge_tie_mask" [(set (match_operand: 0 "register_operand" "=vm") @@ -5026,25 +5059,26 @@ ;; We don't use early-clobber for LMUL <= 1 to get better codegen. (define_insn "*pred_cmp_extended_scalar" - [(set (match_operand: 0 "register_operand" "=vr, vr") + [(set (match_operand: 0 "register_operand" "=vr, vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") - (match_operand 6 "vector_length_operand" " rK, rK") - (match_operand 7 "const_int_operand" " i, i") - (match_operand 8 "const_int_operand" " i, i") + [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1,vmWc1") + (match_operand 6 "vector_length_operand" " rK, rK, rK") + (match_operand 7 "const_int_operand" " i, i, i") + (match_operand 8 "const_int_operand" " i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (match_operator: 3 "comparison_except_eqge_operator" - [(match_operand:V_VLSI_D 4 "register_operand" " vr, vr") + [(match_operand:V_VLSI_D 4 "register_operand" " vr, vr, vr") (vec_duplicate:V_VLSI_D (sign_extend: - (match_operand: 5 "register_operand" " r, r")))]) - (match_operand: 2 "vector_merge_operand" " vu, 0")))] + (match_operand: 5 "register_operand" " r, r, r")))]) + (match_operand: 2 "vector_merge_operand" " vu, 0, vu")))] "TARGET_VECTOR && riscv_vector::cmp_lmul_le_one (mode) && !TARGET_64BIT" "vms%B3.vx\t%0,%4,%5%p1" [(set_attr "type" "vicmp") - (set_attr "mode" "")]) + (set_attr "mode" "") + (set_attr "group_overlap" "th,th,none")]) (define_insn "*pred_cmp_extended_scalar_narrow" [(set (match_operand: 0 "register_operand" "=vm, vr, vr, &vr, &vr") @@ -5065,7 +5099,8 @@ "TARGET_VECTOR && riscv_vector::cmp_lmul_gt_one (mode) && 
!TARGET_64BIT" "vms%B3.vx\t%0,%4,%5%p1" [(set_attr "type" "vicmp") - (set_attr "mode" "")]) + (set_attr "mode" "") + (set_attr "group_overlap" "none,th,th,none,none")]) (define_insn "*pred_eqne_extended_scalar_merge_tie_mask" [(set (match_operand: 0 "register_operand" "=vm") @@ -5094,25 +5129,26 @@ ;; We don't use early-clobber for LMUL <= 1 to get better codegen. (define_insn "*pred_eqne_extended_scalar" - [(set (match_operand: 0 "register_operand" "=vr, vr") + [(set (match_operand: 0 "register_operand" "=vr, vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") - (match_operand 6 "vector_length_operand" " rK, rK") - (match_operand 7 "const_int_operand" " i, i") - (match_operand 8 "const_int_operand" " i, i") + [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1,vmWc1") + (match_operand 6 "vector_length_operand" " rK, rK, rK") + (match_operand 7 "const_int_operand" " i, i, i") + (match_operand 8 "const_int_operand" " i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (match_operator: 3 "equality_operator" [(vec_duplicate:V_VLSI_D (sign_extend: - (match_operand: 5 "register_operand" " r, r"))) - (match_operand:V_VLSI_D 4 "register_operand" " vr, vr")]) - (match_operand: 2 "vector_merge_operand" " vu, 0")))] + (match_operand: 5 "register_operand" " r, r, r"))) + (match_operand:V_VLSI_D 4 "register_operand" " vr, vr, vr")]) + (match_operand: 2 "vector_merge_operand" " vu, 0, vu")))] "TARGET_VECTOR && riscv_vector::cmp_lmul_le_one (mode) && !TARGET_64BIT" "vms%B3.vx\t%0,%4,%5%p1" [(set_attr "type" "vicmp") - (set_attr "mode" "")]) + (set_attr "mode" "") + (set_attr "group_overlap" "th,th,none")]) (define_insn "*pred_eqne_extended_scalar_narrow" [(set (match_operand: 0 "register_operand" "=vm, vr, vr, &vr, &vr") @@ -5133,7 +5169,8 @@ "TARGET_VECTOR && riscv_vector::cmp_lmul_gt_one (mode) && !TARGET_64BIT" "vms%B3.vx\t%0,%4,%5%p1" [(set_attr "type" "vicmp") - (set_attr "mode" "")]) + (set_attr "mode" "") + 
(set_attr "group_overlap" "none,th,th,none,none")]) ;; GE, vmsge.vx/vmsgeu.vx ;; @@ -7322,23 +7359,24 @@ ;; We don't use early-clobber for LMUL <= 1 to get better codegen. (define_insn "*pred_cmp" - [(set (match_operand: 0 "register_operand" "=vr, vr") + [(set (match_operand: 0 "register_operand" "=vr, vr, &vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") - (match_operand 6 "vector_length_operand" " rK, rK") - (match_operand 7 "const_int_operand" " i, i") - (match_operand 8 "const_int_operand" " i, i") + [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1,vmWc1,vmWc1") + (match_operand 6 "vector_length_operand" " rK, rK, rK, rK") + (match_operand 7 "const_int_operand" " i, i, i, i") + (match_operand 8 "const_int_operand" " i, i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (match_operator: 3 "signed_order_operator" - [(match_operand:V_VLSF 4 "register_operand" " vr, vr") - (match_operand:V_VLSF 5 "register_operand" " vr, vr")]) - (match_operand: 2 "vector_merge_operand" " vu, 0")))] + [(match_operand:V_VLSF 4 "register_operand" " vr, vr, vr, vr") + (match_operand:V_VLSF 5 "register_operand" " vr, vr, vr, vr")]) + (match_operand: 2 "vector_merge_operand" " vu, 0, vu, 0")))] "TARGET_VECTOR && riscv_vector::cmp_lmul_le_one (mode)" "vmf%B3.vv\t%0,%4,%5%p1" [(set_attr "type" "vfcmp") - (set_attr "mode" "")]) + (set_attr "mode" "") + (set_attr "group_overlap" "th,th,none,none")]) (define_insn "*pred_cmp_narrow_merge_tie_mask" [(set (match_operand: 0 "register_operand" "=vm") @@ -7381,7 +7419,8 @@ "TARGET_VECTOR && riscv_vector::cmp_lmul_gt_one (mode)" "vmf%B3.vv\t%0,%4,%5%p1" [(set_attr "type" "vfcmp") - (set_attr "mode" "")]) + (set_attr "mode" "") + (set_attr "group_overlap" "none,th,th,th,th,th,th,none,none")]) (define_expand "@pred_cmp_scalar" [(set (match_operand: 0 "register_operand") @@ -7427,24 +7466,25 @@ ;; We don't use early-clobber for LMUL <= 1 to get better codegen. 
(define_insn "*pred_cmp_scalar" - [(set (match_operand: 0 "register_operand" "=vr, vr") + [(set (match_operand: 0 "register_operand" "=vr, vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") - (match_operand 6 "vector_length_operand" " rK, rK") - (match_operand 7 "const_int_operand" " i, i") - (match_operand 8 "const_int_operand" " i, i") + [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1,vmWc1") + (match_operand 6 "vector_length_operand" " rK, rK, rK") + (match_operand 7 "const_int_operand" " i, i, i") + (match_operand 8 "const_int_operand" " i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (match_operator: 3 "signed_order_operator" - [(match_operand:V_VLSF 4 "register_operand" " vr, vr") + [(match_operand:V_VLSF 4 "register_operand" " vr, vr, vr") (vec_duplicate:V_VLSF - (match_operand: 5 "register_operand" " f, f"))]) - (match_operand: 2 "vector_merge_operand" " vu, 0")))] + (match_operand: 5 "register_operand" " f, f, f"))]) + (match_operand: 2 "vector_merge_operand" " vu, 0, vu")))] "TARGET_VECTOR && riscv_vector::cmp_lmul_le_one (mode)" "vmf%B3.vf\t%0,%4,%5%p1" [(set_attr "type" "vfcmp") - (set_attr "mode" "")]) + (set_attr "mode" "") + (set_attr "group_overlap" "th,th,none")]) ;; We use early-clobber for source LMUL > dest LMUL. (define_insn "*pred_cmp_scalar_narrow" @@ -7465,7 +7505,8 @@ "TARGET_VECTOR && riscv_vector::cmp_lmul_gt_one (mode)" "vmf%B3.vf\t%0,%4,%5%p1" [(set_attr "type" "vfcmp") - (set_attr "mode" "")]) + (set_attr "mode" "") + (set_attr "group_overlap" "none,th,th,none,none")]) (define_expand "@pred_eqne_scalar" [(set (match_operand: 0 "register_operand") @@ -7511,24 +7552,25 @@ ;; We don't use early-clobber for LMUL <= 1 to get better codegen. 
(define_insn "*pred_eqne_scalar" - [(set (match_operand: 0 "register_operand" "=vr, vr") + [(set (match_operand: 0 "register_operand" "=vr, vr, &vr") (if_then_else: (unspec: - [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1") - (match_operand 6 "vector_length_operand" " rK, rK") - (match_operand 7 "const_int_operand" " i, i") - (match_operand 8 "const_int_operand" " i, i") + [(match_operand: 1 "vector_mask_operand" "vmWc1,vmWc1,vmWc1") + (match_operand 6 "vector_length_operand" " rK, rK, rK") + (match_operand 7 "const_int_operand" " i, i, i") + (match_operand 8 "const_int_operand" " i, i, i") (reg:SI VL_REGNUM) (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE) (match_operator: 3 "equality_operator" [(vec_duplicate:V_VLSF - (match_operand: 5 "register_operand" " f, f")) - (match_operand:V_VLSF 4 "register_operand" " vr, vr")]) - (match_operand: 2 "vector_merge_operand" " vu, 0")))] + (match_operand: 5 "register_operand" " f, f, f")) + (match_operand:V_VLSF 4 "register_operand" " vr, vr, vr")]) + (match_operand: 2 "vector_merge_operand" " vu, 0, vu")))] "TARGET_VECTOR && riscv_vector::cmp_lmul_le_one (mode)" "vmf%B3.vf\t%0,%4,%5%p1" [(set_attr "type" "vfcmp") - (set_attr "mode" "")]) + (set_attr "mode" "") + (set_attr "group_overlap" "th,th,none")]) ;; We use early-clobber for source LMUL > dest LMUL. 
(define_insn "*pred_eqne_scalar_narrow" @@ -7549,7 +7591,8 @@ "TARGET_VECTOR && riscv_vector::cmp_lmul_gt_one (mode)" "vmf%B3.vf\t%0,%4,%5%p1" [(set_attr "type" "vfcmp") - (set_attr "mode" "")]) + (set_attr "mode" "") + (set_attr "group_overlap" "none,th,th,none,none")]) ;; ------------------------------------------------------------------------------- ;; ---- Predicated floating-point merge @@ -7769,7 +7812,8 @@ [(set_attr "type" "vfncvtftoi") (set_attr "mode" "") (set (attr "frm_mode") - (symbol_ref "riscv_vector::get_frm_mode (operands[8])"))]) + (symbol_ref "riscv_vector::get_frm_mode (operands[8])")) + (set_attr "group_overlap" "none,none,th,th,none,none")]) (define_insn "@pred_narrow_" [(set (match_operand: 0 "register_operand" "=vd, vd, vr, vr, &vr, &vr") @@ -7811,7 +7855,8 @@ [(set_attr "type" "vfncvtitof") (set_attr "mode" "") (set (attr "frm_mode") - (symbol_ref "riscv_vector::get_frm_mode (operands[8])"))]) + (symbol_ref "riscv_vector::get_frm_mode (operands[8])")) + (set_attr "group_overlap" "none,none,th,th,none,none")]) (define_insn "@pred_trunc" [(set (match_operand: 0 "register_operand" "=vd, vd, vr, vr, &vr, &vr") @@ -7834,7 +7879,8 @@ [(set_attr "type" "vfncvtftof") (set_attr "mode" "") (set (attr "frm_mode") - (symbol_ref "riscv_vector::get_frm_mode (operands[8])"))]) + (symbol_ref "riscv_vector::get_frm_mode (operands[8])")) + (set_attr "group_overlap" "none,none,th,th,none,none")]) (define_insn "@pred_rod_trunc" [(set (match_operand: 0 "register_operand" "=vd, vd, vr, vr, &vr, &vr")