From patchwork Mon Mar 30 15:35:44 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: LIU Zhiwei X-Patchwork-Id: 1264060 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=none (p=none dis=none) header.from=c-sky.com Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 48rchV4cT1z9sPk for ; Tue, 31 Mar 2020 03:03:54 +1100 (AEDT) Received: from localhost ([::1]:51972 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jIwtD-0002bh-OP for incoming@patchwork.ozlabs.org; Mon, 30 Mar 2020 12:03:51 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]:44447) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1jIwql-0000gU-98 for qemu-devel@nongnu.org; Mon, 30 Mar 2020 12:01:21 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1jIwqi-0005HI-E0 for qemu-devel@nongnu.org; Mon, 30 Mar 2020 12:01:19 -0400 Received: from smtp2200-217.mail.aliyun.com ([121.197.200.217]:42453) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1jIwqh-00057c-H9; Mon, 30 Mar 2020 12:01:16 -0400 X-Alimail-AntiSpam: AC=CONTINUE; BC=0.07608247|-1; CH=green; DM=|CONTINUE|false|; DS=CONTINUE|ham_system_inform|0.577342-0.00014803-0.42251; FP=0|0|0|0|0|-1|-1|-1; HT=e02c03301; MF=zhiwei_liu@c-sky.com; NM=1; PH=DS; RN=11; RT=11; SR=0; TI=SMTPD_---.H7jsLO._1585584064; Received: from L-PF1D6DP4-1208.hz.ali.com(mailfrom:zhiwei_liu@c-sky.com fp:SMTPD_---.H7jsLO._1585584064) by smtp.aliyun-inc.com(10.147.41.187); Tue, 31 Mar 2020 00:01:04 +0800 From: LIU Zhiwei To: richard.henderson@linaro.org, alistair23@gmail.com, chihmin.chao@sifive.com, palmer@dabbelt.com Subject: [PATCH v7 12/61] target/riscv: vector integer add-with-carry / subtract-with-borrow instructions Date: Mon, 30 Mar 2020 23:35:44 +0800 Message-Id: <20200330153633.15298-13-zhiwei_liu@c-sky.com> X-Mailer: git-send-email 2.23.0 In-Reply-To: <20200330153633.15298-1-zhiwei_liu@c-sky.com> References: <20200330153633.15298-1-zhiwei_liu@c-sky.com> MIME-Version: 1.0 X-detected-operating-system: by eggs.gnu.org: GNU/Linux 3.x [generic] [fuzzy] X-Received-From: 121.197.200.217 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: guoren@linux.alibaba.com, qemu-riscv@nongnu.org, qemu-devel@nongnu.org, wxy194768@alibaba-inc.com, wenmeng_zhang@c-sky.com, Alistair Francis , LIU Zhiwei Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" Signed-off-by: LIU Zhiwei Reviewed-by: Alistair Francis Reviewed-by: Richard Henderson --- target/riscv/helper.h | 33 ++++++ target/riscv/insn32.decode | 11 ++ target/riscv/insn_trans/trans_rvv.inc.c | 113 +++++++++++++++++++ target/riscv/vector_helper.c | 137 ++++++++++++++++++++++++ 4 files changed, 294 insertions(+) diff --git a/target/riscv/helper.h b/target/riscv/helper.h index 608704850a..186d72a2aa 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -343,3 +343,36 @@ DEF_HELPER_6(vwadd_wx_w, void, ptr, ptr, tl, ptr, env, i32) DEF_HELPER_6(vwsub_wx_b, void, ptr, ptr, tl, ptr, env, i32) DEF_HELPER_6(vwsub_wx_h, void, ptr, ptr, tl, ptr, env, i32) DEF_HELPER_6(vwsub_wx_w, void, ptr, ptr, tl, ptr, env, i32) + +DEF_HELPER_6(vadc_vvm_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(vadc_vvm_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(vadc_vvm_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(vadc_vvm_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(vsbc_vvm_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(vsbc_vvm_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(vsbc_vvm_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(vsbc_vvm_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(vmadc_vvm_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(vmadc_vvm_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(vmadc_vvm_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(vmadc_vvm_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(vmsbc_vvm_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(vmsbc_vvm_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(vmsbc_vvm_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(vmsbc_vvm_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(vadc_vxm_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(vadc_vxm_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(vadc_vxm_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(vadc_vxm_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(vsbc_vxm_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(vsbc_vxm_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(vsbc_vxm_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(vsbc_vxm_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(vmadc_vxm_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(vmadc_vxm_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(vmadc_vxm_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(vmadc_vxm_d, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(vmsbc_vxm_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(vmsbc_vxm_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(vmsbc_vxm_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(vmsbc_vxm_d, void, ptr, ptr, tl, ptr, env, i32) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index 4bdbfd16fa..022c8ea18b 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -70,6 +70,7 @@ @r2_nfvm ... ... vm:1 ..... ..... ... ..... ....... &r2nfvm %nf %rs1 %rd @r_nfvm ... ... vm:1 ..... ..... ... ..... ....... &rnfvm %nf %rs2 %rs1 %rd @r_vm ...... vm:1 ..... ..... ... ..... ....... &rmrr %rs2 %rs1 %rd +@r_vm_1 ...... . ..... ..... ... ..... ....... &rmrr vm=1 %rs2 %rs1 %rd @r_wdvm ..... wd:1 vm:1 ..... ..... ... ..... ....... &rwdvm %rs2 %rs1 %rd @r2_zimm . zimm:11 ..... ... ..... ....... %rs1 %rd @@ -300,6 +301,16 @@ vwsubu_wv 110110 . ..... ..... 010 ..... 1010111 @r_vm vwsubu_wx 110110 . ..... ..... 110 ..... 1010111 @r_vm vwsub_wv 110111 . ..... ..... 010 ..... 1010111 @r_vm vwsub_wx 110111 . ..... ..... 110 ..... 1010111 @r_vm +vadc_vvm 010000 1 ..... ..... 000 ..... 1010111 @r_vm_1 +vadc_vxm 010000 1 ..... ..... 100 ..... 1010111 @r_vm_1 +vadc_vim 010000 1 ..... ..... 011 ..... 1010111 @r_vm_1 +vmadc_vvm 010001 1 ..... ..... 000 ..... 1010111 @r_vm_1 +vmadc_vxm 010001 1 ..... ..... 100 ..... 1010111 @r_vm_1 +vmadc_vim 010001 1 ..... ..... 011 ..... 1010111 @r_vm_1 +vsbc_vvm 010010 1 ..... ..... 000 ..... 1010111 @r_vm_1 +vsbc_vxm 010010 1 ..... ..... 100 ..... 1010111 @r_vm_1 +vmsbc_vvm 010011 1 ..... ..... 000 ..... 1010111 @r_vm_1 +vmsbc_vxm 010011 1 ..... ..... 100 ..... 1010111 @r_vm_1 vsetvli 0 ........... ..... 111 ..... 1010111 @r2_zimm vsetvl 1000000 ..... ..... 111 ..... 1010111 @r diff --git a/target/riscv/insn_trans/trans_rvv.inc.c b/target/riscv/insn_trans/trans_rvv.inc.c index 3b26d72c64..3ce70388ea 100644 --- a/target/riscv/insn_trans/trans_rvv.inc.c +++ b/target/riscv/insn_trans/trans_rvv.inc.c @@ -1249,3 +1249,116 @@ GEN_OPIWX_WIDEN_TRANS(vwaddu_wx) GEN_OPIWX_WIDEN_TRANS(vwadd_wx) GEN_OPIWX_WIDEN_TRANS(vwsubu_wx) GEN_OPIWX_WIDEN_TRANS(vwsub_wx) + +/* Vector Integer Add-with-Carry / Subtract-with-Borrow Instructions */ +/* OPIVV without GVEC IR */ +#define GEN_OPIVV_TRANS(NAME, CHECK) \ +static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ +{ \ + if (CHECK(s, a)) { \ + uint32_t data = 0; \ + static gen_helper_gvec_4_ptr * const fns[4] = { \ + gen_helper_##NAME##_b, gen_helper_##NAME##_h, \ + gen_helper_##NAME##_w, gen_helper_##NAME##_d, \ + }; \ + TCGLabel *over = gen_new_label(); \ + tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); \ + \ + data = FIELD_DP32(data, VDATA, MLEN, s->mlen); \ + data = FIELD_DP32(data, VDATA, VM, a->vm); \ + data = FIELD_DP32(data, VDATA, LMUL, s->lmul); \ + tcg_gen_gvec_4_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, 0), \ + vreg_ofs(s, a->rs1), \ + vreg_ofs(s, a->rs2), cpu_env, 0, \ + s->vlen / 8, data, fns[s->sew]); \ + gen_set_label(over); \ + return true; \ + } \ + return false; \ +} + +/* + * For vadc and vsbc, an illegal instruction exception is raised if the + * destination vector register is v0 and LMUL > 1. (Section 12.3) + */ +static bool opivv_vadc_check(DisasContext *s, arg_rmrr *a) +{ + return (vext_check_isa_ill(s) && + vext_check_reg(s, a->rd, false) && + vext_check_reg(s, a->rs2, false) && + vext_check_reg(s, a->rs1, false) && + ((a->rd != 0) || (s->lmul == 0))); +} + +GEN_OPIVV_TRANS(vadc_vvm, opivv_vadc_check) +GEN_OPIVV_TRANS(vsbc_vvm, opivv_vadc_check) + +/* + * For vmadc and vmsbc, an illegal instruction exception is raised if the + * destination vector register overlaps a source vector register group. + */ +static bool opivv_vmadc_check(DisasContext *s, arg_rmrr *a) +{ + return (vext_check_isa_ill(s) && + vext_check_reg(s, a->rs2, false) && + vext_check_reg(s, a->rs1, false) && + vext_check_overlap_group(a->rd, 1, a->rs1, 1 << s->lmul) && + vext_check_overlap_group(a->rd, 1, a->rs2, 1 << s->lmul)); +} + +GEN_OPIVV_TRANS(vmadc_vvm, opivv_vmadc_check) +GEN_OPIVV_TRANS(vmsbc_vvm, opivv_vmadc_check) + +static bool opivx_vadc_check(DisasContext *s, arg_rmrr *a) +{ + return (vext_check_isa_ill(s) && + vext_check_reg(s, a->rd, false) && + vext_check_reg(s, a->rs2, false) && + ((a->rd != 0) || (s->lmul == 0))); +} + +/* OPIVX without GVEC IR */ +#define GEN_OPIVX_TRANS(NAME, CHECK) \ +static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ +{ \ + if (CHECK(s, a)) { \ + static gen_helper_opivx * const fns[4] = { \ + gen_helper_##NAME##_b, gen_helper_##NAME##_h, \ + gen_helper_##NAME##_w, gen_helper_##NAME##_d, \ + }; \ + \ + return opivx_trans(a->rd, a->rs1, a->rs2, a->vm, fns[s->sew], s);\ + } \ + return false; \ +} + +GEN_OPIVX_TRANS(vadc_vxm, opivx_vadc_check) +GEN_OPIVX_TRANS(vsbc_vxm, opivx_vadc_check) + +static bool opivx_vmadc_check(DisasContext *s, arg_rmrr *a) +{ + return (vext_check_isa_ill(s) && + vext_check_reg(s, a->rs2, false) && + vext_check_overlap_group(a->rd, 1, a->rs2, 1 << s->lmul)); +} + +GEN_OPIVX_TRANS(vmadc_vxm, opivx_vmadc_check) +GEN_OPIVX_TRANS(vmsbc_vxm, opivx_vmadc_check) + +/* OPIVI without GVEC IR */ +#define GEN_OPIVI_TRANS(NAME, ZX, OPIVX, CHECK) \ +static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ +{ \ + if (CHECK(s, a)) { \ + static gen_helper_opivx * const fns[4] = { \ + gen_helper_##OPIVX##_b, gen_helper_##OPIVX##_h, \ + gen_helper_##OPIVX##_w, gen_helper_##OPIVX##_d, \ + }; \ + return opivi_trans(a->rd, a->rs1, a->rs2, a->vm, \ + fns[s->sew], s, ZX); \ + } \ + return false; \ +} + +GEN_OPIVI_TRANS(vadc_vim, 0, vadc_vxm, opivx_vadc_check) +GEN_OPIVI_TRANS(vmadc_vim, 0, vmadc_vxm, opivx_vmadc_check) diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index f2520332c8..6a43b53c8c 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -187,6 +187,14 @@ static void clearq(void *vd, uint32_t idx, uint32_t cnt, uint32_t tot) vext_clear(cur, cnt, tot); } +static inline void vext_set_elem_mask(void *v0, int mlen, int index, + uint8_t value) +{ + int idx = (index * mlen) / 64; + int pos = (index * mlen) % 64; + uint64_t old = ((uint64_t *)v0)[idx]; + ((uint64_t *)v0)[idx] = deposit64(old, pos, mlen, value); +} static inline int vext_elem_mask(void *v0, int mlen, int index) { @@ -1127,3 +1135,132 @@ GEN_VEXT_VX(vwadd_wx_w, 4, 8, clearq) GEN_VEXT_VX(vwsub_wx_b, 1, 2, clearh) GEN_VEXT_VX(vwsub_wx_h, 2, 4, clearl) GEN_VEXT_VX(vwsub_wx_w, 4, 8, clearq) + +/* Vector Integer Add-with-Carry / Subtract-with-Borrow Instructions */ +#define DO_VADC(N, M, C) (N + M + C) +#define DO_VSBC(N, M, C) (N - M - C) + +#define GEN_VEXT_VADC_VVM(NAME, ETYPE, H, DO_OP, CLEAR_FN) \ +void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2, \ + CPURISCVState *env, uint32_t desc) \ +{ \ + uint32_t mlen = vext_mlen(desc); \ + uint32_t vl = env->vl; \ + uint32_t esz = sizeof(ETYPE); \ + uint32_t vlmax = vext_maxsz(desc) / esz; \ + uint32_t i; \ + \ + for (i = 0; i < vl; i++) { \ + ETYPE s1 = *((ETYPE *)vs1 + H(i)); \ + ETYPE s2 = *((ETYPE *)vs2 + H(i)); \ + uint8_t carry = vext_elem_mask(v0, mlen, i); \ + \ + *((ETYPE *)vd + H(i)) = DO_OP(s2, s1, carry); \ + } \ + CLEAR_FN(vd, vl, vl * esz, vlmax * esz); \ +} + +GEN_VEXT_VADC_VVM(vadc_vvm_b, uint8_t, H1, DO_VADC, clearb) +GEN_VEXT_VADC_VVM(vadc_vvm_h, uint16_t, H2, DO_VADC, clearh) +GEN_VEXT_VADC_VVM(vadc_vvm_w, uint32_t, H4, DO_VADC, clearl) +GEN_VEXT_VADC_VVM(vadc_vvm_d, uint64_t, H8, DO_VADC, clearq) + +GEN_VEXT_VADC_VVM(vsbc_vvm_b, uint8_t, H1, DO_VSBC, clearb) +GEN_VEXT_VADC_VVM(vsbc_vvm_h, uint16_t, H2, DO_VSBC, clearh) +GEN_VEXT_VADC_VVM(vsbc_vvm_w, uint32_t, H4, DO_VSBC, clearl) +GEN_VEXT_VADC_VVM(vsbc_vvm_d, uint64_t, H8, DO_VSBC, clearq) + +#define GEN_VEXT_VADC_VXM(NAME, ETYPE, H, DO_OP, CLEAR_FN) \ +void HELPER(NAME)(void *vd, void *v0, target_ulong s1, void *vs2, \ + CPURISCVState *env, uint32_t desc) \ +{ \ + uint32_t mlen = vext_mlen(desc); \ + uint32_t vl = env->vl; \ + uint32_t esz = sizeof(ETYPE); \ + uint32_t vlmax = vext_maxsz(desc) / esz; \ + uint32_t i; \ + \ + for (i = 0; i < vl; i++) { \ + ETYPE s2 = *((ETYPE *)vs2 + H(i)); \ + uint8_t carry = vext_elem_mask(v0, mlen, i); \ + \ + *((ETYPE *)vd + H(i)) = DO_OP(s2, (ETYPE)(target_long)s1, carry);\ + } \ + CLEAR_FN(vd, vl, vl * esz, vlmax * esz); \ +} + +GEN_VEXT_VADC_VXM(vadc_vxm_b, uint8_t, H1, DO_VADC, clearb) +GEN_VEXT_VADC_VXM(vadc_vxm_h, uint16_t, H2, DO_VADC, clearh) +GEN_VEXT_VADC_VXM(vadc_vxm_w, uint32_t, H4, DO_VADC, clearl) +GEN_VEXT_VADC_VXM(vadc_vxm_d, uint64_t, H8, DO_VADC, clearq) + +GEN_VEXT_VADC_VXM(vsbc_vxm_b, uint8_t, H1, DO_VSBC, clearb) +GEN_VEXT_VADC_VXM(vsbc_vxm_h, uint16_t, H2, DO_VSBC, clearh) +GEN_VEXT_VADC_VXM(vsbc_vxm_w, uint32_t, H4, DO_VSBC, clearl) +GEN_VEXT_VADC_VXM(vsbc_vxm_d, uint64_t, H8, DO_VSBC, clearq) + +#define DO_MADC(N, M, C) (C ? (__typeof(N))(N + M + 1) <= N : \ + (__typeof(N))(N + M) < N) +#define DO_MSBC(N, M, C) (C ? N <= M : N < M) + +#define GEN_VEXT_VMADC_VVM(NAME, ETYPE, H, DO_OP) \ +void HELPER(NAME)(void *vd, void *v0, void *vs1, void *vs2, \ + CPURISCVState *env, uint32_t desc) \ +{ \ + uint32_t mlen = vext_mlen(desc); \ + uint32_t vl = env->vl; \ + uint32_t vlmax = vext_maxsz(desc) / sizeof(ETYPE); \ + uint32_t i; \ + \ + for (i = 0; i < vl; i++) { \ + ETYPE s1 = *((ETYPE *)vs1 + H(i)); \ + ETYPE s2 = *((ETYPE *)vs2 + H(i)); \ + uint8_t carry = vext_elem_mask(v0, mlen, i); \ + \ + vext_set_elem_mask(vd, mlen, i, DO_OP(s2, s1, carry));\ + } \ + for (; i < vlmax; i++) { \ + vext_set_elem_mask(vd, mlen, i, 0); \ + } \ +} + +GEN_VEXT_VMADC_VVM(vmadc_vvm_b, uint8_t, H1, DO_MADC) +GEN_VEXT_VMADC_VVM(vmadc_vvm_h, uint16_t, H2, DO_MADC) +GEN_VEXT_VMADC_VVM(vmadc_vvm_w, uint32_t, H4, DO_MADC) +GEN_VEXT_VMADC_VVM(vmadc_vvm_d, uint64_t, H8, DO_MADC) + +GEN_VEXT_VMADC_VVM(vmsbc_vvm_b, uint8_t, H1, DO_MSBC) +GEN_VEXT_VMADC_VVM(vmsbc_vvm_h, uint16_t, H2, DO_MSBC) +GEN_VEXT_VMADC_VVM(vmsbc_vvm_w, uint32_t, H4, DO_MSBC) +GEN_VEXT_VMADC_VVM(vmsbc_vvm_d, uint64_t, H8, DO_MSBC) + +#define GEN_VEXT_VMADC_VXM(NAME, ETYPE, H, DO_OP) \ +void HELPER(NAME)(void *vd, void *v0, target_ulong s1, \ + void *vs2, CPURISCVState *env, uint32_t desc) \ +{ \ + uint32_t mlen = vext_mlen(desc); \ + uint32_t vl = env->vl; \ + uint32_t vlmax = vext_maxsz(desc) / sizeof(ETYPE); \ + uint32_t i; \ + \ + for (i = 0; i < vl; i++) { \ + ETYPE s2 = *((ETYPE *)vs2 + H(i)); \ + uint8_t carry = vext_elem_mask(v0, mlen, i); \ + \ + vext_set_elem_mask(vd, mlen, i, \ + DO_OP(s2, (ETYPE)(target_long)s1, carry)); \ + } \ + for (; i < vlmax; i++) { \ + vext_set_elem_mask(vd, mlen, i, 0); \ + } \ +} + +GEN_VEXT_VMADC_VXM(vmadc_vxm_b, uint8_t, H1, DO_MADC) +GEN_VEXT_VMADC_VXM(vmadc_vxm_h, uint16_t, H2, DO_MADC) +GEN_VEXT_VMADC_VXM(vmadc_vxm_w, uint32_t, H4, DO_MADC) +GEN_VEXT_VMADC_VXM(vmadc_vxm_d, uint64_t, H8, DO_MADC) + +GEN_VEXT_VMADC_VXM(vmsbc_vxm_b, uint8_t, H1, DO_MSBC) +GEN_VEXT_VMADC_VXM(vmsbc_vxm_h, uint16_t, H2, DO_MSBC) +GEN_VEXT_VMADC_VXM(vmsbc_vxm_w, uint32_t, H4, DO_MSBC) +GEN_VEXT_VMADC_VXM(vmsbc_vxm_d, uint64_t, H8, DO_MSBC)