From patchwork Thu Jun 22 16:16:17 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Max Chou X-Patchwork-Id: 1798512 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=sifive.com header.i=@sifive.com header.a=rsa-sha256 header.s=google header.b=cwqUMnpB; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4Qn5Bs1n0dz20Xt for ; Fri, 23 Jun 2023 02:18:13 +1000 (AEST) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qCN05-0008Sr-UX; Thu, 22 Jun 2023 12:17:43 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qCMzs-0008PK-MM for qemu-devel@nongnu.org; Thu, 22 Jun 2023 12:17:26 -0400 Received: from mail-pl1-x636.google.com ([2607:f8b0:4864:20::636]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qCMzp-0004Xp-Me for qemu-devel@nongnu.org; Thu, 22 Jun 2023 12:17:24 -0400 Received: by mail-pl1-x636.google.com with SMTP id d9443c01a7336-1b5018cb4dcso37601955ad.2 for ; Thu, 22 Jun 2023 09:17:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sifive.com; s=google; t=1687450640; x=1690042640; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=l0DNTnBlEqS2Uqah4jWwbqY34NECnm5YNPEA9Xgvdsw=; b=cwqUMnpBF0k31w/kyBQ+ihhHy5CyPy+VVJFx0M/MWH6P4ER6tXkyNwLy7BtfpO43oZ /zsuFAe1CUzUIqLStpnMBULdnccqGIRv/powPvTeFGjwLfWEs+RgNCiRzbfOj9RGMp3V kAJy4SSWUTi8EBDAm+hE8Y0le1FsiC3VZrqNVXC1qmEn3uWn/9lVmF37s8c9W1jJpHF1 Ow0D1mhMhgb7/xhrHAef0fUIoYMPS6eM++7DQ0KV7eVwf4NOXM1u4tt2BnUEgVPBnREr 24j3yHWhooIwWkEZ3WKNKUpN5cVRP0JU+NlqIogeFBqNT2EuJ/GuE/GhE7TdBxWgTVX/ Nrmw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687450640; x=1690042640; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=l0DNTnBlEqS2Uqah4jWwbqY34NECnm5YNPEA9Xgvdsw=; b=Qm2DXeK5DpFJHpB9QqVLJ3ttWPLMSIgPM9GFnoYYUXDrQ33URl+6wX/QsaFP/DYNND P18DKAXXAs4uMISLxGH7XfRknLkad8tO3GQasAeRwCOz3U99CPspbKJT2Yu+uoV+3VOc nRmEUJ9qpNJfIhr71CiaxZon3aQbgOUYlrwcqnsSvc3uctQS6EZU1E4CuowKwkOYqh/e 8jhTFdWZRIBoMbuST3Th2Lt7tXWgJA+KbzCmZc1HCqrmcmZCj8xW9g85pOcXrkzv2lyC v4cj1PBlLoRw4O5j3KxcFVo/qRtEJrLpakRhYmros3kktCDHyxChhuW2Uvn/YkEdxRti YqVg== X-Gm-Message-State: AC+VfDxZgTlrkZFPBXy5802pT972RNORfg4UVmEr6yawmVXUozfZdWr5 0uVi0jMkcb6ink7rdzoVT0pGRbybY2rrcXOhiSx8ZlJUTX2/91mOKbPhHu9fu/YL1r5ECfGkSSn TWitF55Q6JYL6RFYJfsO4alg2KyYxZOu25KX9u/9hTMnTJBkyGHhmnp5wvovO9tOVEgcFwEQZO4 uA+kE= X-Google-Smtp-Source: ACHHUZ75iFr4S7uSv2/xVMrm/N7Hgn5DuXveTMuAY6mKsGJIV9UFDf9qPLa5aZt+aDBAhFjE0g2nIQ== X-Received: by 2002:a17:902:8206:b0:1ac:8062:4f31 with SMTP id x6-20020a170902820600b001ac80624f31mr7813095pln.37.1687450639474; Thu, 22 Jun 2023 09:17:19 -0700 (PDT) Received: from MaxdeMBP.localdomain (125-228-20-175.hinet-ip.hinet.net. [125.228.20.175]) by smtp.gmail.com with ESMTPSA id c1-20020a170902d48100b001b3f039f8a8sm5609683plg.61.2023.06.22.09.17.17 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 09:17:19 -0700 (PDT) From: Max Chou To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Cc: dbarboza@ventanamicro.com, Kiran Ostrolenk , Weiwei Li , Max Chou , Palmer Dabbelt , Alistair Francis , Bin Meng , Liu Zhiwei Subject: [PATCH v4 01/17] target/riscv: Refactor some of the generic vector functionality Date: Fri, 23 Jun 2023 00:16:17 +0800 Message-Id: <20230622161646.32005-2-max.chou@sifive.com> X-Mailer: git-send-email 2.39.2 (Apple Git-143) In-Reply-To: <20230622161646.32005-1-max.chou@sifive.com> References: <20230622161646.32005-1-max.chou@sifive.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::636; envelope-from=max.chou@sifive.com; helo=mail-pl1-x636.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org From: Kiran Ostrolenk Take some functions/macros out of `vector_helper` and put them in a new module called `vector_internals`. This ensures they can be used by both vector and vector-crypto helpers (latter implemented in proceeding commits). Signed-off-by: Kiran Ostrolenk Reviewed-by: Weiwei Li Signed-off-by: Max Chou --- target/riscv/meson.build | 1 + target/riscv/vector_helper.c | 201 +------------------------------- target/riscv/vector_internals.c | 81 +++++++++++++ target/riscv/vector_internals.h | 182 +++++++++++++++++++++++++++++ 4 files changed, 265 insertions(+), 200 deletions(-) create mode 100644 target/riscv/vector_internals.c create mode 100644 target/riscv/vector_internals.h diff --git a/target/riscv/meson.build b/target/riscv/meson.build index 7f56c5f88d..c3801ee5e0 100644 --- a/target/riscv/meson.build +++ b/target/riscv/meson.build @@ -16,6 +16,7 @@ riscv_ss.add(files( 'gdbstub.c', 'op_helper.c', 'vector_helper.c', + 'vector_internals.c', 'bitmanip_helper.c', 'translate.c', 'm128_helper.c', diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index 1e06e7447c..57be83400d 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -26,6 +26,7 @@ #include "fpu/softfloat.h" #include "tcg/tcg-gvec-desc.h" #include "internals.h" +#include "vector_internals.h" #include target_ulong HELPER(vsetvl)(CPURISCVState *env, target_ulong s1, @@ -72,68 +73,6 @@ target_ulong HELPER(vsetvl)(CPURISCVState *env, target_ulong s1, return vl; } -/* - * Note that vector data is stored in host-endian 64-bit chunks, - * so addressing units smaller than that needs a host-endian fixup. - */ -#if HOST_BIG_ENDIAN -#define H1(x) ((x) ^ 7) -#define H1_2(x) ((x) ^ 6) -#define H1_4(x) ((x) ^ 4) -#define H2(x) ((x) ^ 3) -#define H4(x) ((x) ^ 1) -#define H8(x) ((x)) -#else -#define H1(x) (x) -#define H1_2(x) (x) -#define H1_4(x) (x) -#define H2(x) (x) -#define H4(x) (x) -#define H8(x) (x) -#endif - -static inline uint32_t vext_nf(uint32_t desc) -{ - return FIELD_EX32(simd_data(desc), VDATA, NF); -} - -static inline uint32_t vext_vm(uint32_t desc) -{ - return FIELD_EX32(simd_data(desc), VDATA, VM); -} - -/* - * Encode LMUL to lmul as following: - * LMUL vlmul lmul - * 1 000 0 - * 2 001 1 - * 4 010 2 - * 8 011 3 - * - 100 - - * 1/8 101 -3 - * 1/4 110 -2 - * 1/2 111 -1 - */ -static inline int32_t vext_lmul(uint32_t desc) -{ - return sextract32(FIELD_EX32(simd_data(desc), VDATA, LMUL), 0, 3); -} - -static inline uint32_t vext_vta(uint32_t desc) -{ - return FIELD_EX32(simd_data(desc), VDATA, VTA); -} - -static inline uint32_t vext_vma(uint32_t desc) -{ - return FIELD_EX32(simd_data(desc), VDATA, VMA); -} - -static inline uint32_t vext_vta_all_1s(uint32_t desc) -{ - return FIELD_EX32(simd_data(desc), VDATA, VTA_ALL_1S); -} - /* * Get the maximum number of elements can be operated. * @@ -152,21 +91,6 @@ static inline uint32_t vext_max_elems(uint32_t desc, uint32_t log2_esz) return scale < 0 ? vlenb >> -scale : vlenb << scale; } -/* - * Get number of total elements, including prestart, body and tail elements. - * Note that when LMUL < 1, the tail includes the elements past VLMAX that - * are held in the same vector register. - */ -static inline uint32_t vext_get_total_elems(CPURISCVState *env, uint32_t desc, - uint32_t esz) -{ - uint32_t vlenb = simd_maxsz(desc); - uint32_t sew = 1 << FIELD_EX64(env->vtype, VTYPE, VSEW); - int8_t emul = ctzl(esz) - ctzl(sew) + vext_lmul(desc) < 0 ? 0 : - ctzl(esz) - ctzl(sew) + vext_lmul(desc); - return (vlenb << emul) / esz; -} - static inline target_ulong adjust_addr(CPURISCVState *env, target_ulong addr) { return (addr & ~env->cur_pmmask) | env->cur_pmbase; @@ -199,20 +123,6 @@ static void probe_pages(CPURISCVState *env, target_ulong addr, } } -/* set agnostic elements to 1s */ -static void vext_set_elems_1s(void *base, uint32_t is_agnostic, uint32_t cnt, - uint32_t tot) -{ - if (is_agnostic == 0) { - /* policy undisturbed */ - return; - } - if (tot - cnt == 0) { - return; - } - memset(base + cnt, -1, tot - cnt); -} - static inline void vext_set_elem_mask(void *v0, int index, uint8_t value) { @@ -222,18 +132,6 @@ static inline void vext_set_elem_mask(void *v0, int index, ((uint64_t *)v0)[idx] = deposit64(old, pos, 1, value); } -/* - * Earlier designs (pre-0.9) had a varying number of bits - * per mask value (MLEN). In the 0.9 design, MLEN=1. - * (Section 4.5) - */ -static inline int vext_elem_mask(void *v0, int index) -{ - int idx = index / 64; - int pos = index % 64; - return (((uint64_t *)v0)[idx] >> pos) & 1; -} - /* elements operations for load and store */ typedef void vext_ldst_elem_fn(CPURISCVState *env, target_ulong addr, uint32_t idx, void *vd, uintptr_t retaddr); @@ -728,18 +626,11 @@ GEN_VEXT_ST_WHOLE(vs8r_v, int8_t, ste_b) * Vector Integer Arithmetic Instructions */ -/* expand macro args before macro */ -#define RVVCALL(macro, ...) macro(__VA_ARGS__) - /* (TD, T1, T2, TX1, TX2) */ #define OP_SSS_B int8_t, int8_t, int8_t, int8_t, int8_t #define OP_SSS_H int16_t, int16_t, int16_t, int16_t, int16_t #define OP_SSS_W int32_t, int32_t, int32_t, int32_t, int32_t #define OP_SSS_D int64_t, int64_t, int64_t, int64_t, int64_t -#define OP_UUU_B uint8_t, uint8_t, uint8_t, uint8_t, uint8_t -#define OP_UUU_H uint16_t, uint16_t, uint16_t, uint16_t, uint16_t -#define OP_UUU_W uint32_t, uint32_t, uint32_t, uint32_t, uint32_t -#define OP_UUU_D uint64_t, uint64_t, uint64_t, uint64_t, uint64_t #define OP_SUS_B int8_t, uint8_t, int8_t, uint8_t, int8_t #define OP_SUS_H int16_t, uint16_t, int16_t, uint16_t, int16_t #define OP_SUS_W int32_t, uint32_t, int32_t, uint32_t, int32_t @@ -763,16 +654,6 @@ GEN_VEXT_ST_WHOLE(vs8r_v, int8_t, ste_b) #define NOP_UUU_H uint16_t, uint16_t, uint32_t, uint16_t, uint32_t #define NOP_UUU_W uint32_t, uint32_t, uint64_t, uint32_t, uint64_t -/* operation of two vector elements */ -typedef void opivv2_fn(void *vd, void *vs1, void *vs2, int i); - -#define OPIVV2(NAME, TD, T1, T2, TX1, TX2, HD, HS1, HS2, OP) \ -static void do_##NAME(void *vd, void *vs1, void *vs2, int i) \ -{ \ - TX1 s1 = *((T1 *)vs1 + HS1(i)); \ - TX2 s2 = *((T2 *)vs2 + HS2(i)); \ - *((TD *)vd + HD(i)) = OP(s2, s1); \ -} #define DO_SUB(N, M) (N - M) #define DO_RSUB(N, M) (M - N) @@ -785,40 +666,6 @@ RVVCALL(OPIVV2, vsub_vv_h, OP_SSS_H, H2, H2, H2, DO_SUB) RVVCALL(OPIVV2, vsub_vv_w, OP_SSS_W, H4, H4, H4, DO_SUB) RVVCALL(OPIVV2, vsub_vv_d, OP_SSS_D, H8, H8, H8, DO_SUB) -static void do_vext_vv(void *vd, void *v0, void *vs1, void *vs2, - CPURISCVState *env, uint32_t desc, - opivv2_fn *fn, uint32_t esz) -{ - uint32_t vm = vext_vm(desc); - uint32_t vl = env->vl; - uint32_t total_elems = vext_get_total_elems(env, desc, esz); - uint32_t vta = vext_vta(desc); - uint32_t vma = vext_vma(desc); - uint32_t i; - - for (i = env->vstart; i < vl; i++) { - if (!vm && !vext_elem_mask(v0, i)) { - /* set masked-off elements to 1s */ - vext_set_elems_1s(vd, vma, i * esz, (i + 1) * esz); - continue; - } - fn(vd, vs1, vs2, i); - } - env->vstart = 0; - /* set tail elements to 1s */ - vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz); -} - -/* generate the helpers for OPIVV */ -#define GEN_VEXT_VV(NAME, ESZ) \ -void HELPER(NAME)(void *vd, void *v0, void *vs1, \ - void *vs2, CPURISCVState *env, \ - uint32_t desc) \ -{ \ - do_vext_vv(vd, v0, vs1, vs2, env, desc, \ - do_##NAME, ESZ); \ -} - GEN_VEXT_VV(vadd_vv_b, 1) GEN_VEXT_VV(vadd_vv_h, 2) GEN_VEXT_VV(vadd_vv_w, 4) @@ -828,18 +675,6 @@ GEN_VEXT_VV(vsub_vv_h, 2) GEN_VEXT_VV(vsub_vv_w, 4) GEN_VEXT_VV(vsub_vv_d, 8) -typedef void opivx2_fn(void *vd, target_long s1, void *vs2, int i); - -/* - * (T1)s1 gives the real operator type. - * (TX1)(T1)s1 expands the operator type of widen or narrow operations. - */ -#define OPIVX2(NAME, TD, T1, T2, TX1, TX2, HD, HS2, OP) \ -static void do_##NAME(void *vd, target_long s1, void *vs2, int i) \ -{ \ - TX2 s2 = *((T2 *)vs2 + HS2(i)); \ - *((TD *)vd + HD(i)) = OP(s2, (TX1)(T1)s1); \ -} RVVCALL(OPIVX2, vadd_vx_b, OP_SSS_B, H1, H1, DO_ADD) RVVCALL(OPIVX2, vadd_vx_h, OP_SSS_H, H2, H2, DO_ADD) @@ -854,40 +689,6 @@ RVVCALL(OPIVX2, vrsub_vx_h, OP_SSS_H, H2, H2, DO_RSUB) RVVCALL(OPIVX2, vrsub_vx_w, OP_SSS_W, H4, H4, DO_RSUB) RVVCALL(OPIVX2, vrsub_vx_d, OP_SSS_D, H8, H8, DO_RSUB) -static void do_vext_vx(void *vd, void *v0, target_long s1, void *vs2, - CPURISCVState *env, uint32_t desc, - opivx2_fn fn, uint32_t esz) -{ - uint32_t vm = vext_vm(desc); - uint32_t vl = env->vl; - uint32_t total_elems = vext_get_total_elems(env, desc, esz); - uint32_t vta = vext_vta(desc); - uint32_t vma = vext_vma(desc); - uint32_t i; - - for (i = env->vstart; i < vl; i++) { - if (!vm && !vext_elem_mask(v0, i)) { - /* set masked-off elements to 1s */ - vext_set_elems_1s(vd, vma, i * esz, (i + 1) * esz); - continue; - } - fn(vd, s1, vs2, i); - } - env->vstart = 0; - /* set tail elements to 1s */ - vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz); -} - -/* generate the helpers for OPIVX */ -#define GEN_VEXT_VX(NAME, ESZ) \ -void HELPER(NAME)(void *vd, void *v0, target_ulong s1, \ - void *vs2, CPURISCVState *env, \ - uint32_t desc) \ -{ \ - do_vext_vx(vd, v0, s1, vs2, env, desc, \ - do_##NAME, ESZ); \ -} - GEN_VEXT_VX(vadd_vx_b, 1) GEN_VEXT_VX(vadd_vx_h, 2) GEN_VEXT_VX(vadd_vx_w, 4) diff --git a/target/riscv/vector_internals.c b/target/riscv/vector_internals.c new file mode 100644 index 0000000000..9cf5c17cde --- /dev/null +++ b/target/riscv/vector_internals.c @@ -0,0 +1,81 @@ +/* + * RISC-V Vector Extension Internals + * + * Copyright (c) 2020 T-Head Semiconductor Co., Ltd. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2 or later, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program. If not, see . + */ + +#include "vector_internals.h" + +/* set agnostic elements to 1s */ +void vext_set_elems_1s(void *base, uint32_t is_agnostic, uint32_t cnt, + uint32_t tot) +{ + if (is_agnostic == 0) { + /* policy undisturbed */ + return; + } + if (tot - cnt == 0) { + return ; + } + memset(base + cnt, -1, tot - cnt); +} + +void do_vext_vv(void *vd, void *v0, void *vs1, void *vs2, + CPURISCVState *env, uint32_t desc, + opivv2_fn *fn, uint32_t esz) +{ + uint32_t vm = vext_vm(desc); + uint32_t vl = env->vl; + uint32_t total_elems = vext_get_total_elems(env, desc, esz); + uint32_t vta = vext_vta(desc); + uint32_t vma = vext_vma(desc); + uint32_t i; + + for (i = env->vstart; i < vl; i++) { + if (!vm && !vext_elem_mask(v0, i)) { + /* set masked-off elements to 1s */ + vext_set_elems_1s(vd, vma, i * esz, (i + 1) * esz); + continue; + } + fn(vd, vs1, vs2, i); + } + env->vstart = 0; + /* set tail elements to 1s */ + vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz); +} + +void do_vext_vx(void *vd, void *v0, target_long s1, void *vs2, + CPURISCVState *env, uint32_t desc, + opivx2_fn fn, uint32_t esz) +{ + uint32_t vm = vext_vm(desc); + uint32_t vl = env->vl; + uint32_t total_elems = vext_get_total_elems(env, desc, esz); + uint32_t vta = vext_vta(desc); + uint32_t vma = vext_vma(desc); + uint32_t i; + + for (i = env->vstart; i < vl; i++) { + if (!vm && !vext_elem_mask(v0, i)) { + /* set masked-off elements to 1s */ + vext_set_elems_1s(vd, vma, i * esz, (i + 1) * esz); + continue; + } + fn(vd, s1, vs2, i); + } + env->vstart = 0; + /* set tail elements to 1s */ + vext_set_elems_1s(vd, vta, vl * esz, total_elems * esz); +} diff --git a/target/riscv/vector_internals.h b/target/riscv/vector_internals.h new file mode 100644 index 0000000000..749d138beb --- /dev/null +++ b/target/riscv/vector_internals.h @@ -0,0 +1,182 @@ +/* + * RISC-V Vector Extension Internals + * + * Copyright (c) 2020 T-Head Semiconductor Co., Ltd. All rights reserved. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2 or later, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program. If not, see . + */ + +#ifndef TARGET_RISCV_VECTOR_INTERNALS_H +#define TARGET_RISCV_VECTOR_INTERNALS_H + +#include "qemu/osdep.h" +#include "qemu/bitops.h" +#include "cpu.h" +#include "tcg/tcg-gvec-desc.h" +#include "internals.h" + +static inline uint32_t vext_nf(uint32_t desc) +{ + return FIELD_EX32(simd_data(desc), VDATA, NF); +} + +/* + * Note that vector data is stored in host-endian 64-bit chunks, + * so addressing units smaller than that needs a host-endian fixup. + */ +#if HOST_BIG_ENDIAN +#define H1(x) ((x) ^ 7) +#define H1_2(x) ((x) ^ 6) +#define H1_4(x) ((x) ^ 4) +#define H2(x) ((x) ^ 3) +#define H4(x) ((x) ^ 1) +#define H8(x) ((x)) +#else +#define H1(x) (x) +#define H1_2(x) (x) +#define H1_4(x) (x) +#define H2(x) (x) +#define H4(x) (x) +#define H8(x) (x) +#endif + +/* + * Encode LMUL to lmul as following: + * LMUL vlmul lmul + * 1 000 0 + * 2 001 1 + * 4 010 2 + * 8 011 3 + * - 100 - + * 1/8 101 -3 + * 1/4 110 -2 + * 1/2 111 -1 + */ +static inline int32_t vext_lmul(uint32_t desc) +{ + return sextract32(FIELD_EX32(simd_data(desc), VDATA, LMUL), 0, 3); +} + +static inline uint32_t vext_vm(uint32_t desc) +{ + return FIELD_EX32(simd_data(desc), VDATA, VM); +} + +static inline uint32_t vext_vma(uint32_t desc) +{ + return FIELD_EX32(simd_data(desc), VDATA, VMA); +} + +static inline uint32_t vext_vta(uint32_t desc) +{ + return FIELD_EX32(simd_data(desc), VDATA, VTA); +} + +static inline uint32_t vext_vta_all_1s(uint32_t desc) +{ + return FIELD_EX32(simd_data(desc), VDATA, VTA_ALL_1S); +} + +/* + * Earlier designs (pre-0.9) had a varying number of bits + * per mask value (MLEN). In the 0.9 design, MLEN=1. + * (Section 4.5) + */ +static inline int vext_elem_mask(void *v0, int index) +{ + int idx = index / 64; + int pos = index % 64; + return (((uint64_t *)v0)[idx] >> pos) & 1; +} + +/* + * Get number of total elements, including prestart, body and tail elements. + * Note that when LMUL < 1, the tail includes the elements past VLMAX that + * are held in the same vector register. + */ +static inline uint32_t vext_get_total_elems(CPURISCVState *env, uint32_t desc, + uint32_t esz) +{ + uint32_t vlenb = simd_maxsz(desc); + uint32_t sew = 1 << FIELD_EX64(env->vtype, VTYPE, VSEW); + int8_t emul = ctzl(esz) - ctzl(sew) + vext_lmul(desc) < 0 ? 0 : + ctzl(esz) - ctzl(sew) + vext_lmul(desc); + return (vlenb << emul) / esz; +} + +/* set agnostic elements to 1s */ +void vext_set_elems_1s(void *base, uint32_t is_agnostic, uint32_t cnt, + uint32_t tot); + +/* expand macro args before macro */ +#define RVVCALL(macro, ...) macro(__VA_ARGS__) + +/* (TD, T1, T2, TX1, TX2) */ +#define OP_UUU_B uint8_t, uint8_t, uint8_t, uint8_t, uint8_t +#define OP_UUU_H uint16_t, uint16_t, uint16_t, uint16_t, uint16_t +#define OP_UUU_W uint32_t, uint32_t, uint32_t, uint32_t, uint32_t +#define OP_UUU_D uint64_t, uint64_t, uint64_t, uint64_t, uint64_t + +/* operation of two vector elements */ +typedef void opivv2_fn(void *vd, void *vs1, void *vs2, int i); + +#define OPIVV2(NAME, TD, T1, T2, TX1, TX2, HD, HS1, HS2, OP) \ +static void do_##NAME(void *vd, void *vs1, void *vs2, int i) \ +{ \ + TX1 s1 = *((T1 *)vs1 + HS1(i)); \ + TX2 s2 = *((T2 *)vs2 + HS2(i)); \ + *((TD *)vd + HD(i)) = OP(s2, s1); \ +} + +void do_vext_vv(void *vd, void *v0, void *vs1, void *vs2, + CPURISCVState *env, uint32_t desc, + opivv2_fn *fn, uint32_t esz); + +/* generate the helpers for OPIVV */ +#define GEN_VEXT_VV(NAME, ESZ) \ +void HELPER(NAME)(void *vd, void *v0, void *vs1, \ + void *vs2, CPURISCVState *env, \ + uint32_t desc) \ +{ \ + do_vext_vv(vd, v0, vs1, vs2, env, desc, \ + do_##NAME, ESZ); \ +} + +typedef void opivx2_fn(void *vd, target_long s1, void *vs2, int i); + +/* + * (T1)s1 gives the real operator type. + * (TX1)(T1)s1 expands the operator type of widen or narrow operations. + */ +#define OPIVX2(NAME, TD, T1, T2, TX1, TX2, HD, HS2, OP) \ +static void do_##NAME(void *vd, target_long s1, void *vs2, int i) \ +{ \ + TX2 s2 = *((T2 *)vs2 + HS2(i)); \ + *((TD *)vd + HD(i)) = OP(s2, (TX1)(T1)s1); \ +} + +void do_vext_vx(void *vd, void *v0, target_long s1, void *vs2, + CPURISCVState *env, uint32_t desc, + opivx2_fn fn, uint32_t esz); + +/* generate the helpers for OPIVX */ +#define GEN_VEXT_VX(NAME, ESZ) \ +void HELPER(NAME)(void *vd, void *v0, target_ulong s1, \ + void *vs2, CPURISCVState *env, \ + uint32_t desc) \ +{ \ + do_vext_vx(vd, v0, s1, vs2, env, desc, \ + do_##NAME, ESZ); \ +} + +#endif /* TARGET_RISCV_VECTOR_INTERNALS_H */ From patchwork Thu Jun 22 16:16:18 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Max Chou X-Patchwork-Id: 1798511 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=sifive.com header.i=@sifive.com header.a=rsa-sha256 header.s=google header.b=ho0RM8S4; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4Qn5Bp3Bzlz20Xt for ; Fri, 23 Jun 2023 02:18:10 +1000 (AEST) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qCN0K-0000Wf-SC; Thu, 22 Jun 2023 12:17:52 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qCN06-00004t-G5 for qemu-devel@nongnu.org; Thu, 22 Jun 2023 12:17:43 -0400 Received: from mail-pl1-x629.google.com ([2607:f8b0:4864:20::629]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qCN02-0004ZT-3m for qemu-devel@nongnu.org; Thu, 22 Jun 2023 12:17:35 -0400 Received: by mail-pl1-x629.google.com with SMTP id d9443c01a7336-1b51414b080so44007825ad.0 for ; Thu, 22 Jun 2023 09:17:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sifive.com; s=google; t=1687450652; x=1690042652; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=XwUJQhiAl4EVmU4h1QeBvtoql9shb+1W6JLFKowVkAg=; b=ho0RM8S4g7bflmJ/FkGI/OYU46VrUN1F6KU3b6AQPeWqYf83ivemaN6NLvJLHqwEKU 7VEs1XWmbFt/3fgVhcwMzWJWD8ALYZkxFGTwG2vQeFdfAKNYhmZVbmfPmNw+xexWSRQg FPq+/k/9rSSik8xonZky+PRxyPSQuKZz4/4LmtQdHw0WDOgXvJTPvvk4subIrSe9O4CP us2+IaBq2tUZJRzXAKDw4Y2QrHIPGsTHBqIUGfVfgflatZk1yzgUpBowgIG4hkzmWKgO c8UXKCNGTqKwXdK6O93tv8RC6QDHlMPyPFnFZZ4mXtJKirEc5FEgWvCku06+LLoBMwty YEqg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687450652; x=1690042652; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=XwUJQhiAl4EVmU4h1QeBvtoql9shb+1W6JLFKowVkAg=; b=P0dtrY+xtUG7TDQqIccaKDg7G6PVqKf1ZjmeBqH+193aeHTBUqJI7wpcTcBBMB5BbG 5XAdeajvGjD2wn5OWTbXmRfQ2w/wpw1Jxnl58mSt1N60ATECuDoG2GKUPG3Pyp9kniZK poI9CVSvdnKMHoMi6JrY3Sqqyj1IZ7QZOM6SjSsq1kBm9OuRluf4Pr/uWXgj3QJjVQtb bvKfw0aEnJIb+U4ZrZXHNjl7vBqs8oRitT4WC0WkALrGLmWt/Sg3KZFrrpBeLSGKwCA6 YKDL7K/ByIS/RmYyxfXlduFPGRjKOFFJ3sIcjrhS1mdK6LINlbQ2VG+v/5e4eaUlRHOh HXXQ== X-Gm-Message-State: AC+VfDwBWBIeR3Yedp3DSz7QtSyVjpIRpcSmfiMcI+6Ncx6wIzwc9+u/ peM+sXhBQK3pJ0UypAy+krhakz6cWa95eJZNlLg05SzvkgCt+BBTsssRjijj9adtjjk79HNi1HM K/EGz3KMu6jFIdnAbAOLjjdlmvEPbifMVCXPBnQB7LX/HdzgMVLR3Rs31wgxuhUWXHd+t29T4FY TevyI= X-Google-Smtp-Source: ACHHUZ5epcb+IBazsZhYKHjpy0GAW52CjI7my+bsG3tY17Ys/P5MeoMobDCJTiobjIMcwL0VRVCoKQ== X-Received: by 2002:a17:903:2810:b0:1b3:bbe3:25a8 with SMTP id kp16-20020a170903281000b001b3bbe325a8mr13699625plb.55.1687450652349; Thu, 22 Jun 2023 09:17:32 -0700 (PDT) Received: from MaxdeMBP.localdomain (125-228-20-175.hinet-ip.hinet.net. [125.228.20.175]) by smtp.gmail.com with ESMTPSA id c1-20020a170902d48100b001b3f039f8a8sm5609683plg.61.2023.06.22.09.17.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 09:17:32 -0700 (PDT) From: Max Chou To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Cc: dbarboza@ventanamicro.com, Kiran Ostrolenk , Richard Henderson , Alistair Francis , Weiwei Li , Max Chou , Palmer Dabbelt , Bin Meng , Liu Zhiwei , Junqiang Wang Subject: [PATCH v4 02/17] target/riscv: Refactor vector-vector translation macro Date: Fri, 23 Jun 2023 00:16:18 +0800 Message-Id: <20230622161646.32005-3-max.chou@sifive.com> X-Mailer: git-send-email 2.39.2 (Apple Git-143) In-Reply-To: <20230622161646.32005-1-max.chou@sifive.com> References: <20230622161646.32005-1-max.chou@sifive.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::629; envelope-from=max.chou@sifive.com; helo=mail-pl1-x629.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org From: Kiran Ostrolenk Refactor the non SEW-specific stuff out of `GEN_OPIVV_TRANS` into function `opivv_trans` (similar to `opivi_trans`). `opivv_trans` will be used in proceeding vector-crypto commits. Signed-off-by: Kiran Ostrolenk Reviewed-by: Richard Henderson Reviewed-by: Alistair Francis Reviewed-by: Weiwei Li Signed-off-by: Max Chou --- target/riscv/insn_trans/trans_rvv.c.inc | 62 +++++++++++++------------ 1 file changed, 32 insertions(+), 30 deletions(-) diff --git a/target/riscv/insn_trans/trans_rvv.c.inc b/target/riscv/insn_trans/trans_rvv.c.inc index c2f7527f53..4a8e62a8be 100644 --- a/target/riscv/insn_trans/trans_rvv.c.inc +++ b/target/riscv/insn_trans/trans_rvv.c.inc @@ -1643,38 +1643,40 @@ GEN_OPIWX_WIDEN_TRANS(vwadd_wx) GEN_OPIWX_WIDEN_TRANS(vwsubu_wx) GEN_OPIWX_WIDEN_TRANS(vwsub_wx) +static bool opivv_trans(uint32_t vd, uint32_t vs1, uint32_t vs2, uint32_t vm, + gen_helper_gvec_4_ptr *fn, DisasContext *s) +{ + uint32_t data = 0; + TCGLabel *over = gen_new_label(); + tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); + tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); + + data = FIELD_DP32(data, VDATA, VM, vm); + data = FIELD_DP32(data, VDATA, LMUL, s->lmul); + data = FIELD_DP32(data, VDATA, VTA, s->vta); + data = FIELD_DP32(data, VDATA, VTA_ALL_1S, s->cfg_vta_all_1s); + data = FIELD_DP32(data, VDATA, VMA, s->vma); + tcg_gen_gvec_4_ptr(vreg_ofs(s, vd), vreg_ofs(s, 0), vreg_ofs(s, vs1), + vreg_ofs(s, vs2), cpu_env, s->cfg_ptr->vlen / 8, + s->cfg_ptr->vlen / 8, data, fn); + mark_vs_dirty(s); + gen_set_label(over); + return true; +} + /* Vector Integer Add-with-Carry / Subtract-with-Borrow Instructions */ /* OPIVV without GVEC IR */ -#define GEN_OPIVV_TRANS(NAME, CHECK) \ -static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ -{ \ - if (CHECK(s, a)) { \ - uint32_t data = 0; \ - static gen_helper_gvec_4_ptr * const fns[4] = { \ - gen_helper_##NAME##_b, gen_helper_##NAME##_h, \ - gen_helper_##NAME##_w, gen_helper_##NAME##_d, \ - }; \ - TCGLabel *over = gen_new_label(); \ - tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); \ - tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); \ - \ - data = FIELD_DP32(data, VDATA, VM, a->vm); \ - data = FIELD_DP32(data, VDATA, LMUL, s->lmul); \ - data = FIELD_DP32(data, VDATA, VTA, s->vta); \ - data = \ - FIELD_DP32(data, VDATA, VTA_ALL_1S, s->cfg_vta_all_1s);\ - data = FIELD_DP32(data, VDATA, VMA, s->vma); \ - tcg_gen_gvec_4_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, 0), \ - vreg_ofs(s, a->rs1), \ - vreg_ofs(s, a->rs2), cpu_env, \ - s->cfg_ptr->vlen / 8, \ - s->cfg_ptr->vlen / 8, data, \ - fns[s->sew]); \ - mark_vs_dirty(s); \ - gen_set_label(over); \ - return true; \ - } \ - return false; \ +#define GEN_OPIVV_TRANS(NAME, CHECK) \ +static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ +{ \ + if (CHECK(s, a)) { \ + static gen_helper_gvec_4_ptr * const fns[4] = { \ + gen_helper_##NAME##_b, gen_helper_##NAME##_h, \ + gen_helper_##NAME##_w, gen_helper_##NAME##_d, \ + }; \ + return opivv_trans(a->rd, a->rs1, a->rs2, a->vm, fns[s->sew], s);\ + } \ + return false; \ } /* From patchwork Thu Jun 22 16:16:19 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Max Chou X-Patchwork-Id: 1798521 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=sifive.com header.i=@sifive.com header.a=rsa-sha256 header.s=google header.b=bjyE9GwD; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4Qn5Fz5yl4z20XS for ; Fri, 23 Jun 2023 02:20:55 +1000 (AEST) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qCN0M-0000ev-2Q; Thu, 22 Jun 2023 12:17:54 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qCN0D-0000AC-UY for qemu-devel@nongnu.org; Thu, 22 Jun 2023 12:17:47 -0400 Received: from mail-pg1-x52d.google.com ([2607:f8b0:4864:20::52d]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qCN0A-0004b2-Bs for qemu-devel@nongnu.org; Thu, 22 Jun 2023 12:17:44 -0400 Received: by mail-pg1-x52d.google.com with SMTP id 41be03b00d2f7-543d32eed7cso3061971a12.2 for ; Thu, 22 Jun 2023 09:17:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sifive.com; s=google; t=1687450661; x=1690042661; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=k1oe993sHDscc+RkuSZFFvxcP0Omzd/xt0ncTYzs7R4=; b=bjyE9GwD5TYMoE5PNiFiKYG+v/+KSOLxyFBmwvJ00CuGK41MpECRS8C8h+95unaR9C 1Tow1f46eErt6SxUrrGFr10bVG4QDh05+J4DQGytp/4xJGq7+BhTNuPkj5Zo44N772Y6 0rHUFvAxzoHSNR8vqRKbOmK/uHktvStm/OlarAfZmwNNvG4TCEeA9LeyqPCXXZ6o64r7 j5SH45R/LcH9FvS34qwbaYV4ZnT7QfDiqsRAT28nRvbuAewhI8hKPksAGViNpNkK2ppR jcnB8g5lCl1t4ZuavJZ89HKkljXmerEIFbiGtBF2exIbmLYQdb/gtKV362ypdnX/R3xl PYQA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687450661; x=1690042661; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=k1oe993sHDscc+RkuSZFFvxcP0Omzd/xt0ncTYzs7R4=; b=NpqvkRiXRrpebHJACyvytGxlXrLiFb1ApAmetrlbqySojQEk3U0e/6rL3nH/56YXKR rms5FQFtF3sFmgb6+mmfzoc4xvGU01d1/Q7o5Hv7Pdws1Bjm+NyxMTGVcJZ1VheI7gYP eSP1frSi9k/HqZG5qL+nsJAVAz50IIGSybGN1YzOPKDBosWgUfxaCkxxt7aywq7MaYup yxjgNvNYYcDn3Bi8TgBhVgZA56il1sF82yKVV7WocGtDRNNfr3KanaEln5utrZOeHKvp ag70MRacV62AEh5FJ7lQJSKhm63bCMGG6PABYF35vN9ekPh3v/VDvZVmW5t2WeGyv6vo wq9Q== X-Gm-Message-State: AC+VfDzopcY1fFegN4oEymqefmyIcMKRACh70G+4HLo4UJvJQsZ89Ng6 zErOLZ0SFqTjXg5rR5WBi5MaF/T1ECZhSDugyC5fHSNx64a2HTtbAk2QJ0NF22Z9obDfw6BC7PY CgJ6jD24F1kyxpU2B/pjHJB/qyCwjBF0LikKXJTgdD7fjsRiD9ZqRgMFHT8HAnujyg3N6uE6Fxh SCP7c= X-Google-Smtp-Source: ACHHUZ5cfLk4VNkNsLQNeQ4oSv/k2lL0mct8C9cEVipzZ+ZCKnWPYAoXgIWjk3VQepymKdadHY3zLg== X-Received: by 2002:a17:902:744a:b0:1b5:25bd:df2b with SMTP id e10-20020a170902744a00b001b525bddf2bmr16653991plt.14.1687450660544; Thu, 22 Jun 2023 09:17:40 -0700 (PDT) Received: from MaxdeMBP.localdomain (125-228-20-175.hinet-ip.hinet.net. [125.228.20.175]) by smtp.gmail.com with ESMTPSA id c1-20020a170902d48100b001b3f039f8a8sm5609683plg.61.2023.06.22.09.17.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 09:17:40 -0700 (PDT) From: Max Chou To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Cc: dbarboza@ventanamicro.com, Nazar Kazakov , Weiwei Li , Max Chou , Palmer Dabbelt , Alistair Francis , Bin Meng , Liu Zhiwei , Richard Henderson , Frank Chang Subject: [PATCH v4 03/17] target/riscv: Remove redundant "cpu_vl == 0" checks Date: Fri, 23 Jun 2023 00:16:19 +0800 Message-Id: <20230622161646.32005-4-max.chou@sifive.com> X-Mailer: git-send-email 2.39.2 (Apple Git-143) In-Reply-To: <20230622161646.32005-1-max.chou@sifive.com> References: <20230622161646.32005-1-max.chou@sifive.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::52d; envelope-from=max.chou@sifive.com; helo=mail-pg1-x52d.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org From: Nazar Kazakov Remove the redundant "vl == 0" check which is already included within the vstart >= vl check, when vl == 0. Signed-off-by: Nazar Kazakov Reviewed-by: Weiwei Li Signed-off-by: Max Chou --- target/riscv/insn_trans/trans_rvv.c.inc | 31 +------------------------ 1 file changed, 1 insertion(+), 30 deletions(-) diff --git a/target/riscv/insn_trans/trans_rvv.c.inc b/target/riscv/insn_trans/trans_rvv.c.inc index 4a8e62a8be..7e194aae34 100644 --- a/target/riscv/insn_trans/trans_rvv.c.inc +++ b/target/riscv/insn_trans/trans_rvv.c.inc @@ -617,7 +617,6 @@ static bool ldst_us_trans(uint32_t vd, uint32_t rs1, uint32_t data, TCGv_i32 desc; TCGLabel *over = gen_new_label(); - tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); dest = tcg_temp_new_ptr(); @@ -786,7 +785,6 @@ static bool ldst_stride_trans(uint32_t vd, uint32_t rs1, uint32_t rs2, TCGv_i32 desc; TCGLabel *over = gen_new_label(); - tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); dest = tcg_temp_new_ptr(); @@ -893,7 +891,6 @@ static bool ldst_index_trans(uint32_t vd, uint32_t rs1, uint32_t vs2, TCGv_i32 desc; TCGLabel *over = gen_new_label(); - tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); dest = tcg_temp_new_ptr(); @@ -1034,7 +1031,6 @@ static bool ldff_trans(uint32_t vd, uint32_t rs1, uint32_t data, TCGv_i32 desc; TCGLabel *over = gen_new_label(); - tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); dest = tcg_temp_new_ptr(); @@ -1191,7 +1187,6 @@ do_opivv_gvec(DisasContext *s, arg_rmrr *a, GVecGen3Fn *gvec_fn, return false; } - tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); if (a->vm && s->vl_eq_vlmax && !(s->vta && s->lmul < 0)) { @@ -1241,7 +1236,6 @@ static bool opivx_trans(uint32_t vd, uint32_t rs1, uint32_t vs2, uint32_t vm, uint32_t data = 0; TCGLabel *over = gen_new_label(); - tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); dest = tcg_temp_new_ptr(); @@ -1405,7 +1399,6 @@ static bool opivi_trans(uint32_t vd, uint32_t imm, uint32_t vs2, uint32_t vm, uint32_t data = 0; TCGLabel *over = gen_new_label(); - tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); dest = tcg_temp_new_ptr(); @@ -1492,7 +1485,6 @@ static bool do_opivv_widen(DisasContext *s, arg_rmrr *a, if (checkfn(s, a)) { uint32_t data = 0; TCGLabel *over = gen_new_label(); - tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); data = FIELD_DP32(data, VDATA, VM, a->vm); @@ -1575,7 +1567,6 @@ static bool do_opiwv_widen(DisasContext *s, arg_rmrr *a, if (opiwv_widen_check(s, a)) { uint32_t data = 0; TCGLabel *over = gen_new_label(); - tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); data = FIELD_DP32(data, VDATA, VM, a->vm); @@ -1648,7 +1639,6 @@ static bool opivv_trans(uint32_t vd, uint32_t vs1, uint32_t vs2, uint32_t vm, { uint32_t data = 0; TCGLabel *over = gen_new_label(); - tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); data = FIELD_DP32(data, VDATA, VM, vm); @@ -1842,7 +1832,6 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ gen_helper_##NAME##_w, \ }; \ TCGLabel *over = gen_new_label(); \ - tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); \ tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); \ \ data = FIELD_DP32(data, VDATA, VM, a->vm); \ @@ -2054,7 +2043,6 @@ static bool trans_vmv_v_v(DisasContext *s, arg_vmv_v_v *a) gen_helper_vmv_v_v_w, gen_helper_vmv_v_v_d, }; TCGLabel *over = gen_new_label(); - tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); tcg_gen_gvec_2_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, a->rs1), @@ -2078,7 +2066,6 @@ static bool trans_vmv_v_x(DisasContext *s, arg_vmv_v_x *a) vext_check_ss(s, a->rd, 0, 1)) { TCGv s1; TCGLabel *over = gen_new_label(); - tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); s1 = get_gpr(s, a->rs1, EXT_SIGN); @@ -2140,7 +2127,6 @@ static bool trans_vmv_v_i(DisasContext *s, arg_vmv_v_i *a) gen_helper_vmv_v_x_w, gen_helper_vmv_v_x_d, }; TCGLabel *over = gen_new_label(); - tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); s1 = tcg_constant_i64(simm); @@ -2288,7 +2274,6 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ }; \ TCGLabel *over = gen_new_label(); \ gen_set_rm(s, RISCV_FRM_DYN); \ - tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); \ tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); \ \ data = FIELD_DP32(data, VDATA, VM, a->vm); \ @@ -2323,7 +2308,6 @@ static bool opfvf_trans(uint32_t vd, uint32_t rs1, uint32_t vs2, TCGv_i64 t1; TCGLabel *over = gen_new_label(); - tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); dest = tcg_temp_new_ptr(); @@ -2408,7 +2392,6 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ }; \ TCGLabel *over = gen_new_label(); \ gen_set_rm(s, RISCV_FRM_DYN); \ - tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); \ tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over);\ \ data = FIELD_DP32(data, VDATA, VM, a->vm); \ @@ -2483,7 +2466,6 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ }; \ TCGLabel *over = gen_new_label(); \ gen_set_rm(s, RISCV_FRM_DYN); \ - tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); \ tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); \ \ data = FIELD_DP32(data, VDATA, VM, a->vm); \ @@ -2601,7 +2583,6 @@ static bool do_opfv(DisasContext *s, arg_rmr *a, uint32_t data = 0; TCGLabel *over = gen_new_label(); gen_set_rm_chkfrm(s, rm); - tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); data = FIELD_DP32(data, VDATA, VM, a->vm); @@ -2713,7 +2694,6 @@ static bool trans_vfmv_v_f(DisasContext *s, arg_vfmv_v_f *a) gen_helper_vmv_v_x_d, }; TCGLabel *over = gen_new_label(); - tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); t1 = tcg_temp_new_i64(); @@ -2792,7 +2772,6 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a) \ }; \ TCGLabel *over = gen_new_label(); \ gen_set_rm_chkfrm(s, FRM); \ - tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); \ tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); \ \ data = FIELD_DP32(data, VDATA, VM, a->vm); \ @@ -2844,7 +2823,6 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a) \ }; \ TCGLabel *over = gen_new_label(); \ gen_set_rm(s, RISCV_FRM_DYN); \ - tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); \ tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); \ \ data = FIELD_DP32(data, VDATA, VM, a->vm); \ @@ -2912,7 +2890,6 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a) \ }; \ TCGLabel *over = gen_new_label(); \ gen_set_rm_chkfrm(s, FRM); \ - tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); \ tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); \ \ data = FIELD_DP32(data, VDATA, VM, a->vm); \ @@ -2962,7 +2939,6 @@ static bool trans_##NAME(DisasContext *s, arg_rmr *a) \ }; \ TCGLabel *over = gen_new_label(); \ gen_set_rm_chkfrm(s, FRM); \ - tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); \ tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); \ \ data = FIELD_DP32(data, VDATA, VM, a->vm); \ @@ -3053,7 +3029,6 @@ static bool trans_##NAME(DisasContext *s, arg_r *a) \ uint32_t data = 0; \ gen_helper_gvec_4_ptr *fn = gen_helper_##NAME; \ TCGLabel *over = gen_new_label(); \ - tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); \ tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); \ \ data = FIELD_DP32(data, VDATA, LMUL, s->lmul); \ @@ -3224,7 +3199,6 @@ static bool trans_vid_v(DisasContext *s, arg_vid_v *a) require_vm(a->vm, a->rd)) { uint32_t data = 0; TCGLabel *over = gen_new_label(); - tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); data = FIELD_DP32(data, VDATA, VM, a->vm); @@ -3411,7 +3385,6 @@ static bool trans_vmv_s_x(DisasContext *s, arg_vmv_s_x *a) TCGv s1; TCGLabel *over = gen_new_label(); - tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); t1 = tcg_temp_new_i64(); @@ -3468,8 +3441,7 @@ static bool trans_vfmv_s_f(DisasContext *s, arg_vfmv_s_f *a) TCGv_i64 t1; TCGLabel *over = gen_new_label(); - /* if vl == 0 or vstart >= vl, skip vector register write back */ - tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); + /* if vstart >= vl, skip vector register write back */ tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); /* NaN-box f[rs1] */ @@ -3720,7 +3692,6 @@ static bool int_ext_op(DisasContext *s, arg_rmr *a, uint8_t seq) uint32_t data = 0; gen_helper_gvec_3_ptr *fn; TCGLabel *over = gen_new_label(); - tcg_gen_brcondi_tl(TCG_COND_EQ, cpu_vl, 0, over); tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); static gen_helper_gvec_3_ptr * const fns[6][4] = { From patchwork Thu Jun 22 16:16:20 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Max Chou X-Patchwork-Id: 1798513 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=sifive.com header.i=@sifive.com header.a=rsa-sha256 header.s=google header.b=Fu2X9kpX; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4Qn5Bv5bY9z20Xt for ; Fri, 23 Jun 2023 02:18:15 +1000 (AEST) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qCN0O-0000pF-DR; Thu, 22 Jun 2023 12:17:56 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qCN0M-0000gI-JW for qemu-devel@nongnu.org; Thu, 22 Jun 2023 12:17:54 -0400 Received: from mail-pl1-x62d.google.com ([2607:f8b0:4864:20::62d]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qCN0J-0004c9-Dd for qemu-devel@nongnu.org; Thu, 22 Jun 2023 12:17:53 -0400 Received: by mail-pl1-x62d.google.com with SMTP id d9443c01a7336-1b512309c86so6258455ad.1 for ; Thu, 22 Jun 2023 09:17:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sifive.com; s=google; t=1687450669; x=1690042669; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=UaJj7542saYHdSGkOE6YmX0ti8sHnk++53G+aNBcFNo=; b=Fu2X9kpX1oGU3TqSpnxH4ve79P13ejyIgNXp1REJBF/lg63xnG1bpy6gu73NILFgoX g4HI8F/y97Xbx35R0AkIKMJ199JHgD2BvOA9ojKEUdJ+hv26x9chQmlsxNCbQRXpZug9 ZxKceeO0X5ZozsHSyBtsTf6RnLACvh7oa8e9455WIX+tv0nRBJReyrpexFMxThyhfVIn JKtzu8V3D2YjXm879w9PSwXEbFZ6NR6MJr9CcVSBx+j/EQ7xMznEkN8Xzo3jrzyGlzqr 4B4nKjco76K5v2DJOEg7uIvaaiAcM3WdQDrY+wYyl/EHyJUUWz3zOsY91kifBV5jHRWj g08A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687450669; x=1690042669; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=UaJj7542saYHdSGkOE6YmX0ti8sHnk++53G+aNBcFNo=; b=OGPHmk+KU7WYnhouT93taVFiclaE8vPwV8/0fNFTwjDruk10QcSPXjD5lo/ZSeOhML aYxwKeHYXyyuGDaMisjMEFw4uKIT9zUdmFGadrVK5wn822gX1HW9wwI+VzLTye0qPoPq 9Sgs7BPonIW9/KMN9rsD0nh9y1YP6c6VcfRze+PJxV7syLPPFYQSWmvM3xsLkEr+fUu9 69fQEV93MLCXdTooKqMZ3M81lBg1zHmwrWVIQXbP4LjKngmLW1zgCbvZZQ2Tiz0iA96H 68Pt47tLH/KQroQQ003axHb0lAUB+2FkkwaTwr9zf0iDGUQvRXn5oRIftbjOH8Qewq6k OKAA== X-Gm-Message-State: AC+VfDyDQ2pn+M8zjHgyvlR36nw3+yDg9RlwXewqGozfHxyZyPT8TyX7 /IY96q0nfsmkAi+3dYn+7XAqNMSJu2V6TdlY5+1riEe+rN4QgBOS4CH66lc9IK7E9qz7jdTBhM7 99FM36N9XxIT6x/HqvB/KnW38Rgrw7o1FWI3roH/Aoojh2LK50Vk8itiDKfuF9loKiJ+eyBExlU hS0WQ= X-Google-Smtp-Source: ACHHUZ5Mi6UHZQD+CWQngzZrk23X54bQ+SLVYfsfkg78UefmXwPXIqOX05PvgAVQFeTyIltxG+AyKg== X-Received: by 2002:a17:903:182:b0:1b5:5a5f:368a with SMTP id z2-20020a170903018200b001b55a5f368amr18863380plg.27.1687450669412; Thu, 22 Jun 2023 09:17:49 -0700 (PDT) Received: from MaxdeMBP.localdomain (125-228-20-175.hinet-ip.hinet.net. [125.228.20.175]) by smtp.gmail.com with ESMTPSA id c1-20020a170902d48100b001b3f039f8a8sm5609683plg.61.2023.06.22.09.17.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 09:17:49 -0700 (PDT) From: Max Chou To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Cc: dbarboza@ventanamicro.com, Lawrence Hunter , Nazar Kazakov , Max Chou , Palmer Dabbelt , Alistair Francis , Bin Meng , Weiwei Li , Liu Zhiwei , Kiran Ostrolenk , William Salmon Subject: [PATCH v4 04/17] target/riscv: Add Zvbc ISA extension support Date: Fri, 23 Jun 2023 00:16:20 +0800 Message-Id: <20230622161646.32005-5-max.chou@sifive.com> X-Mailer: git-send-email 2.39.2 (Apple Git-143) In-Reply-To: <20230622161646.32005-1-max.chou@sifive.com> References: <20230622161646.32005-1-max.chou@sifive.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::62d; envelope-from=max.chou@sifive.com; helo=mail-pl1-x62d.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org From: Lawrence Hunter This commit adds support for the Zvbc vector-crypto extension, which consists of the following instructions: * vclmulh.[vx,vv] * vclmul.[vx,vv] Translation functions are defined in `target/riscv/insn_trans/trans_rvvk.c.inc` and helpers are defined in `target/riscv/vcrypto_helper.c`. Co-authored-by: Nazar Kazakov Signed-off-by: Nazar Kazakov Signed-off-by: Lawrence Hunter Signed-off-by: Max Chou --- target/riscv/cpu.c | 6 +++ target/riscv/cpu_cfg.h | 1 + target/riscv/helper.h | 6 +++ target/riscv/insn32.decode | 6 +++ target/riscv/insn_trans/trans_rvvk.c.inc | 62 ++++++++++++++++++++++++ target/riscv/meson.build | 3 +- target/riscv/translate.c | 1 + target/riscv/vcrypto_helper.c | 59 ++++++++++++++++++++++ 8 files changed, 143 insertions(+), 1 deletion(-) create mode 100644 target/riscv/insn_trans/trans_rvvk.c.inc create mode 100644 target/riscv/vcrypto_helper.c diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c index fb8458bf74..53b0fcade6 100644 --- a/target/riscv/cpu.c +++ b/target/riscv/cpu.c @@ -111,6 +111,7 @@ static const struct isa_ext_data isa_edata_arr[] = { ISA_EXT_DATA_ENTRY(zksed, PRIV_VERSION_1_12_0, ext_zksed), ISA_EXT_DATA_ENTRY(zksh, PRIV_VERSION_1_12_0, ext_zksh), ISA_EXT_DATA_ENTRY(zkt, PRIV_VERSION_1_12_0, ext_zkt), + ISA_EXT_DATA_ENTRY(zvbc, PRIV_VERSION_1_12_0, ext_zvbc), ISA_EXT_DATA_ENTRY(zve32f, PRIV_VERSION_1_10_0, ext_zve32f), ISA_EXT_DATA_ENTRY(zve64f, PRIV_VERSION_1_10_0, ext_zve64f), ISA_EXT_DATA_ENTRY(zve64d, PRIV_VERSION_1_10_0, ext_zve64d), @@ -1188,6 +1189,11 @@ void riscv_cpu_validate_set_extensions(RISCVCPU *cpu, Error **errp) return; } + if (cpu->cfg.ext_zvbc && !cpu->cfg.ext_zve64f) { + error_setg(errp, "Zvbc extension requires V or Zve64{f,d} extensions"); + return; + } + if (cpu->cfg.ext_zk) { cpu->cfg.ext_zkn = true; cpu->cfg.ext_zkr = true; diff --git a/target/riscv/cpu_cfg.h b/target/riscv/cpu_cfg.h index 6b7e736bc2..5ca19298a7 100644 --- a/target/riscv/cpu_cfg.h +++ b/target/riscv/cpu_cfg.h @@ -83,6 +83,7 @@ struct RISCVCPUConfig { bool ext_zve32f; bool ext_zve64f; bool ext_zve64d; + bool ext_zvbc; bool ext_zmmul; bool ext_zvfh; bool ext_zvfhmin; diff --git a/target/riscv/helper.h b/target/riscv/helper.h index 98e97810fd..be0f0f1058 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1153,3 +1153,9 @@ DEF_HELPER_FLAGS_3(sm4ks, TCG_CALL_NO_RWG_SE, tl, tl, tl, tl) /* Zce helper */ DEF_HELPER_FLAGS_2(cm_jalt, TCG_CALL_NO_WG, tl, env, i32) + +/* Vector crypto functions */ +DEF_HELPER_6(vclmul_vv, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(vclmul_vx, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(vclmulh_vv, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(vclmulh_vx, void, ptr, ptr, tl, ptr, env, i32) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index 73d5d1b045..52cd92e262 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -908,3 +908,9 @@ sm4ks .. 11010 ..... ..... 000 ..... 0110011 @k_aes # *** RV32 Zicond Standard Extension *** czero_eqz 0000111 ..... ..... 101 ..... 0110011 @r czero_nez 0000111 ..... ..... 111 ..... 0110011 @r + +# *** Zvbc vector crypto extension *** +vclmul_vv 001100 . ..... ..... 010 ..... 1010111 @r_vm +vclmul_vx 001100 . ..... ..... 110 ..... 1010111 @r_vm +vclmulh_vv 001101 . ..... ..... 010 ..... 1010111 @r_vm +vclmulh_vx 001101 . ..... ..... 110 ..... 1010111 @r_vm diff --git a/target/riscv/insn_trans/trans_rvvk.c.inc b/target/riscv/insn_trans/trans_rvvk.c.inc new file mode 100644 index 0000000000..552b08a2fd --- /dev/null +++ b/target/riscv/insn_trans/trans_rvvk.c.inc @@ -0,0 +1,62 @@ +/* + * RISC-V translation routines for the vector crypto extension. + * + * Copyright (C) 2023 SiFive, Inc. + * Written by Codethink Ltd and SiFive. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2 or later, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program. If not, see . + */ + +/* + * Zvbc + */ + +#define GEN_VV_MASKED_TRANS(NAME, CHECK) \ + static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ + { \ + if (CHECK(s, a)) { \ + return opivv_trans(a->rd, a->rs1, a->rs2, a->vm, \ + gen_helper_##NAME, s); \ + } \ + return false; \ + } + +static bool vclmul_vv_check(DisasContext *s, arg_rmrr *a) +{ + return opivv_check(s, a) && + s->cfg_ptr->ext_zvbc == true && + s->sew == MO_64; +} + +GEN_VV_MASKED_TRANS(vclmul_vv, vclmul_vv_check) +GEN_VV_MASKED_TRANS(vclmulh_vv, vclmul_vv_check) + +#define GEN_VX_MASKED_TRANS(NAME, CHECK) \ + static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ + { \ + if (CHECK(s, a)) { \ + return opivx_trans(a->rd, a->rs1, a->rs2, a->vm, \ + gen_helper_##NAME, s); \ + } \ + return false; \ + } + +static bool vclmul_vx_check(DisasContext *s, arg_rmrr *a) +{ + return opivx_check(s, a) && + s->cfg_ptr->ext_zvbc == true && + s->sew == MO_64; +} + +GEN_VX_MASKED_TRANS(vclmul_vx, vclmul_vx_check) +GEN_VX_MASKED_TRANS(vclmulh_vx, vclmul_vx_check) diff --git a/target/riscv/meson.build b/target/riscv/meson.build index c3801ee5e0..660078bda1 100644 --- a/target/riscv/meson.build +++ b/target/riscv/meson.build @@ -21,7 +21,8 @@ riscv_ss.add(files( 'translate.c', 'm128_helper.c', 'crypto_helper.c', - 'zce_helper.c' + 'zce_helper.c', + 'vcrypto_helper.c' )) riscv_ss.add(when: 'CONFIG_KVM', if_true: files('kvm.c'), if_false: files('kvm-stub.c')) diff --git a/target/riscv/translate.c b/target/riscv/translate.c index 0a5ab89c43..cc41290b4c 100644 --- a/target/riscv/translate.c +++ b/target/riscv/translate.c @@ -1083,6 +1083,7 @@ static uint32_t opcode_at(DisasContextBase *dcbase, target_ulong pc) #include "insn_trans/trans_rvzicbo.c.inc" #include "insn_trans/trans_rvzfh.c.inc" #include "insn_trans/trans_rvk.c.inc" +#include "insn_trans/trans_rvvk.c.inc" #include "insn_trans/trans_privileged.c.inc" #include "insn_trans/trans_svinval.c.inc" #include "decode-xthead.c.inc" diff --git a/target/riscv/vcrypto_helper.c b/target/riscv/vcrypto_helper.c new file mode 100644 index 0000000000..8b7c63d499 --- /dev/null +++ b/target/riscv/vcrypto_helper.c @@ -0,0 +1,59 @@ +/* + * RISC-V Vector Crypto Extension Helpers for QEMU. + * + * Copyright (C) 2023 SiFive, Inc. + * Written by Codethink Ltd and SiFive. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2 or later, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program. If not, see . + */ + +#include "qemu/osdep.h" +#include "qemu/host-utils.h" +#include "qemu/bitops.h" +#include "cpu.h" +#include "exec/memop.h" +#include "exec/exec-all.h" +#include "exec/helper-proto.h" +#include "internals.h" +#include "vector_internals.h" + +static uint64_t clmul64(uint64_t y, uint64_t x) +{ + uint64_t result = 0; + for (int j = 63; j >= 0; j--) { + if ((y >> j) & 1) { + result ^= (x << j); + } + } + return result; +} + +static uint64_t clmulh64(uint64_t y, uint64_t x) +{ + uint64_t result = 0; + for (int j = 63; j >= 1; j--) { + if ((y >> j) & 1) { + result ^= (x >> (64 - j)); + } + } + return result; +} + +RVVCALL(OPIVV2, vclmul_vv, OP_UUU_D, H8, H8, H8, clmul64) +GEN_VEXT_VV(vclmul_vv, 8) +RVVCALL(OPIVX2, vclmul_vx, OP_UUU_D, H8, H8, clmul64) +GEN_VEXT_VX(vclmul_vx, 8) +RVVCALL(OPIVV2, vclmulh_vv, OP_UUU_D, H8, H8, H8, clmulh64) +GEN_VEXT_VV(vclmulh_vv, 8) +RVVCALL(OPIVX2, vclmulh_vx, OP_UUU_D, H8, H8, clmulh64) +GEN_VEXT_VX(vclmulh_vx, 8) From patchwork Thu Jun 22 16:16:21 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Max Chou X-Patchwork-Id: 1798516 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=sifive.com header.i=@sifive.com header.a=rsa-sha256 header.s=google header.b=Fiy1e11w; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4Qn5D456n8z20Xt for ; Fri, 23 Jun 2023 02:19:16 +1000 (AEST) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qCN0f-0001JK-G5; Thu, 22 Jun 2023 12:18:15 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qCN0V-0000zC-Ht for qemu-devel@nongnu.org; Thu, 22 Jun 2023 12:18:05 -0400 Received: from mail-pl1-x630.google.com ([2607:f8b0:4864:20::630]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qCN0N-0004cy-Qx for qemu-devel@nongnu.org; Thu, 22 Jun 2023 12:17:58 -0400 Received: by mail-pl1-x630.google.com with SMTP id d9443c01a7336-1b55643507dso41618585ad.0 for ; Thu, 22 Jun 2023 09:17:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sifive.com; s=google; t=1687450674; x=1690042674; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=+YCNBWpu8sC/m0RCUWCmKe148MHe8RXhkW+2UVN+8xY=; b=Fiy1e11wTmo18TjWXAGY0nS+u10rFsdb+hqCabSULrtMkCi22hwrZbLCphyOMpm5a8 0ayDokXztCQVJHRiIpaxcCvGUt0802eO59wZspbKix98ZcXP3bd64NwWVxaiCx7yhYQk p3SPaaIMyL35Do/g+Z9wv5Ch7afBmoklASdzHmxDEz+zQCV8jfe6y3nvbTis+YD8UszR PImgNqbnztOYUlCtC6Uphlo/hd8va8IE6bPuwG6txNpuzuHp49hDJrPqyVRFeI5tJv10 CFX9jUrj9PnowAJCNli2m6x+JP4JacvWnm45tLzdfQlCyL2LGJGkWc4WcWKw4X4ffuIR Illg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687450674; x=1690042674; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=+YCNBWpu8sC/m0RCUWCmKe148MHe8RXhkW+2UVN+8xY=; b=RlLZicEa7VLqzi+FoqJbiZJRAyIJDzAj39DDLOAcNw8IBrmg7Skxg9riO852cD7zXw t3lg6UbOAbyGRbJlTk9i/sdQo659aNKwTtoC851/EBnKbranWNImqXmT19aGTABiHuLp VU485k2zkYRRJrT1YyWWeMUAMmiAyVdUx0DtN9uZxlzaMAZmpEfS5aKeCB9oblCUaU5B G0jssW7HEhn59R+MymanX0lspCMFikiwrWSFDmVD/mKIzKOsSHnuSK+z29oJsNCrYjS6 ImrfYJFBTV9wEvsk/qWDQNjA5shuK2ZPuftvprsMTaPKVmcQ2FxCUuFd3i/LsZRTgbfZ 4Bdg== X-Gm-Message-State: AC+VfDwvVKdY9Vn4cliBjEcKQ32vc+QBsFVZDvnqya3/vN75zBPA3ozG JB1TJoVfZ8/CRffvFOoCCyRT89eCfqInFk+SeYYABfovhq/rHcPNNfseEZv6XmdfczDaYkxswVR l3JnXAnwOkXH63JSX02HK0Et0vOpmSmz00WkXcSarErrqiPoqf/KxYYEV0OX0gkUfGrFfoy/2kT i7h80= X-Google-Smtp-Source: ACHHUZ6vH5+uDCgqu+yHXtBqR2WPsyCXN6X/GE/jcOm45+ebE3wXqSZpq8odcgtt0bJypRCIVC1e/w== X-Received: by 2002:a17:902:7fce:b0:1b5:2ae4:9c7d with SMTP id t14-20020a1709027fce00b001b52ae49c7dmr17214059plb.38.1687450674280; Thu, 22 Jun 2023 09:17:54 -0700 (PDT) Received: from MaxdeMBP.localdomain (125-228-20-175.hinet-ip.hinet.net. [125.228.20.175]) by smtp.gmail.com with ESMTPSA id c1-20020a170902d48100b001b3f039f8a8sm5609683plg.61.2023.06.22.09.17.52 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 09:17:54 -0700 (PDT) From: Max Chou To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Cc: dbarboza@ventanamicro.com, Nazar Kazakov , Richard Henderson , Weiwei Li , Max Chou , Palmer Dabbelt , Alistair Francis , Bin Meng , Liu Zhiwei , Junqiang Wang Subject: [PATCH v4 05/17] target/riscv: Move vector translation checks Date: Fri, 23 Jun 2023 00:16:21 +0800 Message-Id: <20230622161646.32005-6-max.chou@sifive.com> X-Mailer: git-send-email 2.39.2 (Apple Git-143) In-Reply-To: <20230622161646.32005-1-max.chou@sifive.com> References: <20230622161646.32005-1-max.chou@sifive.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::630; envelope-from=max.chou@sifive.com; helo=mail-pl1-x630.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org From: Nazar Kazakov Move the checks out of `do_opiv{v,x,i}_gvec{,_shift}` functions and into the corresponding macros. This enables the functions to be reused in proceeding commits without check duplication. Signed-off-by: Nazar Kazakov Reviewed-by: Richard Henderson Reviewed-by: Weiwei Li Signed-off-by: Max Chou --- target/riscv/insn_trans/trans_rvv.c.inc | 28 +++++++++++-------------- 1 file changed, 12 insertions(+), 16 deletions(-) diff --git a/target/riscv/insn_trans/trans_rvv.c.inc b/target/riscv/insn_trans/trans_rvv.c.inc index 7e194aae34..5dfd524c7d 100644 --- a/target/riscv/insn_trans/trans_rvv.c.inc +++ b/target/riscv/insn_trans/trans_rvv.c.inc @@ -1183,9 +1183,6 @@ do_opivv_gvec(DisasContext *s, arg_rmrr *a, GVecGen3Fn *gvec_fn, gen_helper_gvec_4_ptr *fn) { TCGLabel *over = gen_new_label(); - if (!opivv_check(s, a)) { - return false; - } tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); @@ -1218,6 +1215,9 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ gen_helper_##NAME##_b, gen_helper_##NAME##_h, \ gen_helper_##NAME##_w, gen_helper_##NAME##_d, \ }; \ + if (!opivv_check(s, a)) { \ + return false; \ + } \ return do_opivv_gvec(s, a, tcg_gen_gvec_##SUF, fns[s->sew]); \ } @@ -1276,10 +1276,6 @@ static inline bool do_opivx_gvec(DisasContext *s, arg_rmrr *a, GVecGen2sFn *gvec_fn, gen_helper_opivx *fn) { - if (!opivx_check(s, a)) { - return false; - } - if (a->vm && s->vl_eq_vlmax && !(s->vta && s->lmul < 0)) { TCGv_i64 src1 = tcg_temp_new_i64(); @@ -1301,6 +1297,9 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ gen_helper_##NAME##_b, gen_helper_##NAME##_h, \ gen_helper_##NAME##_w, gen_helper_##NAME##_d, \ }; \ + if (!opivx_check(s, a)) { \ + return false; \ + } \ return do_opivx_gvec(s, a, tcg_gen_gvec_##SUF, fns[s->sew]); \ } @@ -1432,10 +1431,6 @@ static inline bool do_opivi_gvec(DisasContext *s, arg_rmrr *a, GVecGen2iFn *gvec_fn, gen_helper_opivx *fn, imm_mode_t imm_mode) { - if (!opivx_check(s, a)) { - return false; - } - if (a->vm && s->vl_eq_vlmax && !(s->vta && s->lmul < 0)) { gvec_fn(s->sew, vreg_ofs(s, a->rd), vreg_ofs(s, a->rs2), extract_imm(s, a->rs1, imm_mode), MAXSZ(s), MAXSZ(s)); @@ -1453,6 +1448,9 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ gen_helper_##OPIVX##_b, gen_helper_##OPIVX##_h, \ gen_helper_##OPIVX##_w, gen_helper_##OPIVX##_d, \ }; \ + if (!opivx_check(s, a)) { \ + return false; \ + } \ return do_opivi_gvec(s, a, tcg_gen_gvec_##SUF, \ fns[s->sew], IMM_MODE); \ } @@ -1775,10 +1773,6 @@ static inline bool do_opivx_gvec_shift(DisasContext *s, arg_rmrr *a, GVecGen2sFn32 *gvec_fn, gen_helper_opivx *fn) { - if (!opivx_check(s, a)) { - return false; - } - if (a->vm && s->vl_eq_vlmax && !(s->vta && s->lmul < 0)) { TCGv_i32 src1 = tcg_temp_new_i32(); @@ -1800,7 +1794,9 @@ static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ gen_helper_##NAME##_b, gen_helper_##NAME##_h, \ gen_helper_##NAME##_w, gen_helper_##NAME##_d, \ }; \ - \ + if (!opivx_check(s, a)) { \ + return false; \ + } \ return do_opivx_gvec_shift(s, a, tcg_gen_gvec_##SUF, fns[s->sew]); \ } From patchwork Thu Jun 22 16:16:22 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Max Chou X-Patchwork-Id: 1798518 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=sifive.com header.i=@sifive.com header.a=rsa-sha256 header.s=google header.b=fGTQgw0Y; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4Qn5Dp1QVxz20Xt for ; Fri, 23 Jun 2023 02:19:54 +1000 (AEST) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qCN0j-0001cX-1n; Thu, 22 Jun 2023 12:18:17 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qCN0a-00017T-0z for qemu-devel@nongnu.org; Thu, 22 Jun 2023 12:18:08 -0400 Received: from mail-pl1-x636.google.com ([2607:f8b0:4864:20::636]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qCN0V-0004gF-8s for qemu-devel@nongnu.org; Thu, 22 Jun 2023 12:18:06 -0400 Received: by mail-pl1-x636.google.com with SMTP id d9443c01a7336-1b52bf6e659so37075395ad.3 for ; Thu, 22 Jun 2023 09:18:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sifive.com; s=google; t=1687450679; x=1690042679; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=wQ8vK9Wpe3fGLT8kIELnJ5eJhUvDM0umq3P84O6dY/s=; b=fGTQgw0YFiw6HULT/pAHIKIrVVYQV4zLS8qY7ADZRQLPWvXpNdNCEPyL2P0jmC3X0k POkfmTz+nrW1LvVwS5w0frfmLsp2hZIm3jw3rMvGFYA0BdksLhCk+Dggp/xAnS4DCVrB 4qa5ieniifjzMa+wlHzFNVsyM/BYrZOeFqCA4XM53ILm2ySFySqCwdlA3X9Mbs73SHko 1nzJKBJiN9QFZo7j/NDGbLy/BSgs99/fIJFqP6rPP+Ol8ZxkZ7BAf1flAn+Ieva8pB0B YSc4mgUBrypQulKJVNSg30MZbGzHbVCYpcVuDFD7B/7hi4AVX2A/wlAOt8ay9vChOvSv c2Eg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687450679; x=1690042679; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=wQ8vK9Wpe3fGLT8kIELnJ5eJhUvDM0umq3P84O6dY/s=; b=l9mUNvVRauD52IQnv/rTG9YG5k3n19Ynl6BXq29Oqj6hQsK9l9n/vfNgHDnIH7iw14 JWza5LdBjHHT5C9sBKac305yqAXjtGeplZBduOS3afOWaEMDSJUYovA0t/UMIlfBafaT kIiI2KPewx3NpqJcnzSO+NGKNcLSz20ZwPU1hbp2wEk0DociEGGiVqgD0IHXMtG+bOQB cogc/8BEjS1L6HpvaIGORkuIx+6lqe3xiS8ahTq0oad6xUzt+tqy14Jd2VR+ms7a8bm6 mDi/4qMmU9KvnL4bVf1W5h9pybgPnyZHTSUCqC/TY1nMBtSrdNcAEIhF6tCN4o59Vypb qiSw== X-Gm-Message-State: AC+VfDwTzBdKElCnqodUiZt5BLl+H266NY6QHmNr2t8uvdSS/xhU/2xk XCs3ltrKTHiiROGkLXdYfNKwfqe3jWnMZh+QZhQqTuyqTdUPOzz8FQ9HbsbsTxawXNr9C7D91Mj xk8VLE37ht7Md4H3p/Usm1VCf+8W4U19DP2CIiUyFoGKmST4Wlc92mJDImfTVXB0ZjoD3yxlVDD yZlFA= X-Google-Smtp-Source: ACHHUZ4F6sYw5CT84hONjiJnvvjml2Lg2OVjBz6pDGV5Ihw9/gB+OPB5RTy8veHj0bbvyU9tnvFfHg== X-Received: by 2002:a17:902:f7ce:b0:1b1:9b59:fc68 with SMTP id h14-20020a170902f7ce00b001b19b59fc68mr12440551plw.13.1687450678893; Thu, 22 Jun 2023 09:17:58 -0700 (PDT) Received: from MaxdeMBP.localdomain (125-228-20-175.hinet-ip.hinet.net. [125.228.20.175]) by smtp.gmail.com with ESMTPSA id c1-20020a170902d48100b001b3f039f8a8sm5609683plg.61.2023.06.22.09.17.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 09:17:58 -0700 (PDT) From: Max Chou To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Cc: dbarboza@ventanamicro.com, Dickon Hood , Richard Henderson , Weiwei Li , Max Chou , Palmer Dabbelt , Alistair Francis , Bin Meng , Liu Zhiwei , Junqiang Wang Subject: [PATCH v4 06/17] target/riscv: Refactor translation of vector-widening instruction Date: Fri, 23 Jun 2023 00:16:22 +0800 Message-Id: <20230622161646.32005-7-max.chou@sifive.com> X-Mailer: git-send-email 2.39.2 (Apple Git-143) In-Reply-To: <20230622161646.32005-1-max.chou@sifive.com> References: <20230622161646.32005-1-max.chou@sifive.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::636; envelope-from=max.chou@sifive.com; helo=mail-pl1-x636.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org From: Dickon Hood Zvbb (implemented in later commit) has a widening instruction, which requires an extra check on the enabled extensions. Refactor GEN_OPIVX_WIDEN_TRANS() to take a check function to avoid reimplementing it. Signed-off-by: Dickon Hood Reviewed-by: Richard Henderson Reviewed-by: Weiwei Li Signed-off-by: Max Chou --- target/riscv/insn_trans/trans_rvv.c.inc | 52 +++++++++++-------------- 1 file changed, 23 insertions(+), 29 deletions(-) diff --git a/target/riscv/insn_trans/trans_rvv.c.inc b/target/riscv/insn_trans/trans_rvv.c.inc index 5dfd524c7d..a556250553 100644 --- a/target/riscv/insn_trans/trans_rvv.c.inc +++ b/target/riscv/insn_trans/trans_rvv.c.inc @@ -1526,30 +1526,24 @@ static bool opivx_widen_check(DisasContext *s, arg_rmrr *a) vext_check_ds(s, a->rd, a->rs2, a->vm); } -static bool do_opivx_widen(DisasContext *s, arg_rmrr *a, - gen_helper_opivx *fn) -{ - if (opivx_widen_check(s, a)) { - return opivx_trans(a->rd, a->rs1, a->rs2, a->vm, fn, s); - } - return false; -} - -#define GEN_OPIVX_WIDEN_TRANS(NAME) \ -static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ -{ \ - static gen_helper_opivx * const fns[3] = { \ - gen_helper_##NAME##_b, \ - gen_helper_##NAME##_h, \ - gen_helper_##NAME##_w \ - }; \ - return do_opivx_widen(s, a, fns[s->sew]); \ +#define GEN_OPIVX_WIDEN_TRANS(NAME, CHECK) \ +static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ +{ \ + if (CHECK(s, a)) { \ + static gen_helper_opivx * const fns[3] = { \ + gen_helper_##NAME##_b, \ + gen_helper_##NAME##_h, \ + gen_helper_##NAME##_w \ + }; \ + return opivx_trans(a->rd, a->rs1, a->rs2, a->vm, fns[s->sew], s); \ + } \ + return false; \ } -GEN_OPIVX_WIDEN_TRANS(vwaddu_vx) -GEN_OPIVX_WIDEN_TRANS(vwadd_vx) -GEN_OPIVX_WIDEN_TRANS(vwsubu_vx) -GEN_OPIVX_WIDEN_TRANS(vwsub_vx) +GEN_OPIVX_WIDEN_TRANS(vwaddu_vx, opivx_widen_check) +GEN_OPIVX_WIDEN_TRANS(vwadd_vx, opivx_widen_check) +GEN_OPIVX_WIDEN_TRANS(vwsubu_vx, opivx_widen_check) +GEN_OPIVX_WIDEN_TRANS(vwsub_vx, opivx_widen_check) /* WIDEN OPIVV with WIDEN */ static bool opiwv_widen_check(DisasContext *s, arg_rmrr *a) @@ -1997,9 +1991,9 @@ GEN_OPIVX_TRANS(vrem_vx, opivx_check) GEN_OPIVV_WIDEN_TRANS(vwmul_vv, opivv_widen_check) GEN_OPIVV_WIDEN_TRANS(vwmulu_vv, opivv_widen_check) GEN_OPIVV_WIDEN_TRANS(vwmulsu_vv, opivv_widen_check) -GEN_OPIVX_WIDEN_TRANS(vwmul_vx) -GEN_OPIVX_WIDEN_TRANS(vwmulu_vx) -GEN_OPIVX_WIDEN_TRANS(vwmulsu_vx) +GEN_OPIVX_WIDEN_TRANS(vwmul_vx, opivx_widen_check) +GEN_OPIVX_WIDEN_TRANS(vwmulu_vx, opivx_widen_check) +GEN_OPIVX_WIDEN_TRANS(vwmulsu_vx, opivx_widen_check) /* Vector Single-Width Integer Multiply-Add Instructions */ GEN_OPIVV_TRANS(vmacc_vv, opivv_check) @@ -2015,10 +2009,10 @@ GEN_OPIVX_TRANS(vnmsub_vx, opivx_check) GEN_OPIVV_WIDEN_TRANS(vwmaccu_vv, opivv_widen_check) GEN_OPIVV_WIDEN_TRANS(vwmacc_vv, opivv_widen_check) GEN_OPIVV_WIDEN_TRANS(vwmaccsu_vv, opivv_widen_check) -GEN_OPIVX_WIDEN_TRANS(vwmaccu_vx) -GEN_OPIVX_WIDEN_TRANS(vwmacc_vx) -GEN_OPIVX_WIDEN_TRANS(vwmaccsu_vx) -GEN_OPIVX_WIDEN_TRANS(vwmaccus_vx) +GEN_OPIVX_WIDEN_TRANS(vwmaccu_vx, opivx_widen_check) +GEN_OPIVX_WIDEN_TRANS(vwmacc_vx, opivx_widen_check) +GEN_OPIVX_WIDEN_TRANS(vwmaccsu_vx, opivx_widen_check) +GEN_OPIVX_WIDEN_TRANS(vwmaccus_vx, opivx_widen_check) /* Vector Integer Merge and Move Instructions */ static bool trans_vmv_v_v(DisasContext *s, arg_vmv_v_v *a) From patchwork Thu Jun 22 16:16:23 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Max Chou X-Patchwork-Id: 1798515 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=sifive.com header.i=@sifive.com header.a=rsa-sha256 header.s=google header.b=a2mmw6y7; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4Qn5Cj1pvyz20Xt for ; Fri, 23 Jun 2023 02:18:57 +1000 (AEST) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qCN0k-0001kp-2D; Thu, 22 Jun 2023 12:18:18 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qCN0b-0001Dg-BC for qemu-devel@nongnu.org; Thu, 22 Jun 2023 12:18:10 -0400 Received: from mail-pl1-x629.google.com ([2607:f8b0:4864:20::629]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qCN0X-0004wY-5L for qemu-devel@nongnu.org; Thu, 22 Jun 2023 12:18:08 -0400 Received: by mail-pl1-x629.google.com with SMTP id d9443c01a7336-1b4fef08cfdso40207025ad.1 for ; Thu, 22 Jun 2023 09:18:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sifive.com; s=google; t=1687450683; x=1690042683; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=c56IXV6W+MLlXCMoQYT1qmjNlC+QQv7pCUiLZhC8k2s=; b=a2mmw6y75POb4CJORSMbuyXikC7Mw9CIiHuQOhJ0NbYhNXwe/up6UzrfZObynr3nuM Eu+egOLQ1iA3ed3kCktj+RZIXLJnl0zaUD/VSIBRZkgw0pPZ6XNH/ySlDCSIpboBzQ44 vVHTI7Vpf44oSQLYeZAhwmm85Bk4E7A8s1TEu1cJN8fuI4dO8dKv64EdJlmM5WNW9+UE ELn3rVzhpNZG0Yea1gguoKv9KExE8Yar9qUC6fkuup9k68lvCvTISNoXsxzC7SvfDL8h EBjbNnKrriWAZQSw22uJ2fBnUXlcmsrwJuqr+mCX/VwSIip5+EXHL2Iwoa+Cpr+1xQDZ y7AA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687450683; x=1690042683; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=c56IXV6W+MLlXCMoQYT1qmjNlC+QQv7pCUiLZhC8k2s=; b=Bl4hDv+wpmkFO+GGodKd0b3ZIVKQMBRfTRtJrgazpvMCZbNx9JlL2EG774m5A6e27U MTlfqJAcZVoTGeVtyby4BpZJmHketcTllCvZ4hLoStuPCo2o4BR3n6qLbimH0qhnd5CE pz41rmdY8jLwaA62Ps2J5PeuDGZIbhC1XaHYYkP4A3sROfAV4929XfrjtN9IXitpwI5K O/c1oSRd1D5gelzV9LQPViXfYfTpfIejDtUVnGBWUSxRIdhpCsyor68eEBjP4LUVBSwA rjYomXdaYnbHREGC94Dgi9VOb+9taW2QCwXACz1ktnh/JTV1bFQKjYV2l05mBK/iGGcW d4aQ== X-Gm-Message-State: AC+VfDzTSJgEtl9UIKoBXdgWgyLduWTrVDFrWSDS3VxEB6mKMQrK/Jab w1/Tpz2/QtT7WYoqrMJcEfyicsuLY9wGl3XjNfFUM1PEnzg+iySm0V1Qw95dUwiG1exisG05qbL JdqevQ/2fhBRgFCv1IE6dWu6EhQbCZ56lggOJL+v6RRgVoPB6bDmZU8wq9uHOSwXzTGJmeD9jG2 fuFVw= X-Google-Smtp-Source: ACHHUZ4OrFuIkix5vF/JRQVe4HCLvCIzhGxFj9QqSfytKcQPTNZovnAXsokPX6A5II2ItkU08ZkDxA== X-Received: by 2002:a17:902:ce87:b0:1b5:6312:4c5b with SMTP id f7-20020a170902ce8700b001b563124c5bmr11122096plg.63.1687450683156; Thu, 22 Jun 2023 09:18:03 -0700 (PDT) Received: from MaxdeMBP.localdomain (125-228-20-175.hinet-ip.hinet.net. [125.228.20.175]) by smtp.gmail.com with ESMTPSA id c1-20020a170902d48100b001b3f039f8a8sm5609683plg.61.2023.06.22.09.18.01 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 09:18:02 -0700 (PDT) From: Max Chou To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Cc: dbarboza@ventanamicro.com, Kiran Ostrolenk , Weiwei Li , Max Chou , Palmer Dabbelt , Alistair Francis , Bin Meng , Liu Zhiwei Subject: [PATCH v4 07/17] target/riscv: Refactor some of the generic vector functionality Date: Fri, 23 Jun 2023 00:16:23 +0800 Message-Id: <20230622161646.32005-8-max.chou@sifive.com> X-Mailer: git-send-email 2.39.2 (Apple Git-143) In-Reply-To: <20230622161646.32005-1-max.chou@sifive.com> References: <20230622161646.32005-1-max.chou@sifive.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::629; envelope-from=max.chou@sifive.com; helo=mail-pl1-x629.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org From: Kiran Ostrolenk Move some macros out of `vector_helper` and into `vector_internals`. This ensures they can be used by both vector and vector-crypto helpers (latter implemented in proceeding commits). Signed-off-by: Kiran Ostrolenk Reviewed-by: Weiwei Li Signed-off-by: Max Chou --- target/riscv/vector_helper.c | 42 ------------------------------ target/riscv/vector_internals.h | 46 +++++++++++++++++++++++++++++++++ 2 files changed, 46 insertions(+), 42 deletions(-) diff --git a/target/riscv/vector_helper.c b/target/riscv/vector_helper.c index 57be83400d..124ed22f95 100644 --- a/target/riscv/vector_helper.c +++ b/target/riscv/vector_helper.c @@ -635,9 +635,6 @@ GEN_VEXT_ST_WHOLE(vs8r_v, int8_t, ste_b) #define OP_SUS_H int16_t, uint16_t, int16_t, uint16_t, int16_t #define OP_SUS_W int32_t, uint32_t, int32_t, uint32_t, int32_t #define OP_SUS_D int64_t, uint64_t, int64_t, uint64_t, int64_t -#define WOP_UUU_B uint16_t, uint8_t, uint8_t, uint16_t, uint16_t -#define WOP_UUU_H uint32_t, uint16_t, uint16_t, uint32_t, uint32_t -#define WOP_UUU_W uint64_t, uint32_t, uint32_t, uint64_t, uint64_t #define WOP_SSS_B int16_t, int8_t, int8_t, int16_t, int16_t #define WOP_SSS_H int32_t, int16_t, int16_t, int32_t, int32_t #define WOP_SSS_W int64_t, int32_t, int32_t, int64_t, int64_t @@ -3426,11 +3423,6 @@ GEN_VEXT_VF(vfwnmsac_vf_h, 4) GEN_VEXT_VF(vfwnmsac_vf_w, 8) /* Vector Floating-Point Square-Root Instruction */ -/* (TD, T2, TX2) */ -#define OP_UU_H uint16_t, uint16_t, uint16_t -#define OP_UU_W uint32_t, uint32_t, uint32_t -#define OP_UU_D uint64_t, uint64_t, uint64_t - #define OPFVV1(NAME, TD, T2, TX2, HD, HS2, OP) \ static void do_##NAME(void *vd, void *vs2, int i, \ CPURISCVState *env) \ @@ -4127,40 +4119,6 @@ GEN_VEXT_CMP_VF(vmfge_vf_w, uint32_t, H4, vmfge32) GEN_VEXT_CMP_VF(vmfge_vf_d, uint64_t, H8, vmfge64) /* Vector Floating-Point Classify Instruction */ -#define OPIVV1(NAME, TD, T2, TX2, HD, HS2, OP) \ -static void do_##NAME(void *vd, void *vs2, int i) \ -{ \ - TX2 s2 = *((T2 *)vs2 + HS2(i)); \ - *((TD *)vd + HD(i)) = OP(s2); \ -} - -#define GEN_VEXT_V(NAME, ESZ) \ -void HELPER(NAME)(void *vd, void *v0, void *vs2, \ - CPURISCVState *env, uint32_t desc) \ -{ \ - uint32_t vm = vext_vm(desc); \ - uint32_t vl = env->vl; \ - uint32_t total_elems = \ - vext_get_total_elems(env, desc, ESZ); \ - uint32_t vta = vext_vta(desc); \ - uint32_t vma = vext_vma(desc); \ - uint32_t i; \ - \ - for (i = env->vstart; i < vl; i++) { \ - if (!vm && !vext_elem_mask(v0, i)) { \ - /* set masked-off elements to 1s */ \ - vext_set_elems_1s(vd, vma, i * ESZ, \ - (i + 1) * ESZ); \ - continue; \ - } \ - do_##NAME(vd, vs2, i); \ - } \ - env->vstart = 0; \ - /* set tail elements to 1s */ \ - vext_set_elems_1s(vd, vta, vl * ESZ, \ - total_elems * ESZ); \ -} - target_ulong fclass_h(uint64_t frs1) { float16 f = frs1; diff --git a/target/riscv/vector_internals.h b/target/riscv/vector_internals.h index 749d138beb..8133111e5f 100644 --- a/target/riscv/vector_internals.h +++ b/target/riscv/vector_internals.h @@ -121,12 +121,52 @@ void vext_set_elems_1s(void *base, uint32_t is_agnostic, uint32_t cnt, /* expand macro args before macro */ #define RVVCALL(macro, ...) macro(__VA_ARGS__) +/* (TD, T2, TX2) */ +#define OP_UU_B uint8_t, uint8_t, uint8_t +#define OP_UU_H uint16_t, uint16_t, uint16_t +#define OP_UU_W uint32_t, uint32_t, uint32_t +#define OP_UU_D uint64_t, uint64_t, uint64_t + /* (TD, T1, T2, TX1, TX2) */ #define OP_UUU_B uint8_t, uint8_t, uint8_t, uint8_t, uint8_t #define OP_UUU_H uint16_t, uint16_t, uint16_t, uint16_t, uint16_t #define OP_UUU_W uint32_t, uint32_t, uint32_t, uint32_t, uint32_t #define OP_UUU_D uint64_t, uint64_t, uint64_t, uint64_t, uint64_t +#define OPIVV1(NAME, TD, T2, TX2, HD, HS2, OP) \ +static void do_##NAME(void *vd, void *vs2, int i) \ +{ \ + TX2 s2 = *((T2 *)vs2 + HS2(i)); \ + *((TD *)vd + HD(i)) = OP(s2); \ +} + +#define GEN_VEXT_V(NAME, ESZ) \ +void HELPER(NAME)(void *vd, void *v0, void *vs2, \ + CPURISCVState *env, uint32_t desc) \ +{ \ + uint32_t vm = vext_vm(desc); \ + uint32_t vl = env->vl; \ + uint32_t total_elems = \ + vext_get_total_elems(env, desc, ESZ); \ + uint32_t vta = vext_vta(desc); \ + uint32_t vma = vext_vma(desc); \ + uint32_t i; \ + \ + for (i = env->vstart; i < vl; i++) { \ + if (!vm && !vext_elem_mask(v0, i)) { \ + /* set masked-off elements to 1s */ \ + vext_set_elems_1s(vd, vma, i * ESZ, \ + (i + 1) * ESZ); \ + continue; \ + } \ + do_##NAME(vd, vs2, i); \ + } \ + env->vstart = 0; \ + /* set tail elements to 1s */ \ + vext_set_elems_1s(vd, vta, vl * ESZ, \ + total_elems * ESZ); \ +} + /* operation of two vector elements */ typedef void opivv2_fn(void *vd, void *vs1, void *vs2, int i); @@ -179,4 +219,10 @@ void HELPER(NAME)(void *vd, void *v0, target_ulong s1, \ do_##NAME, ESZ); \ } +/* Three of the widening shortening macros: */ +/* (TD, T1, T2, TX1, TX2) */ +#define WOP_UUU_B uint16_t, uint8_t, uint8_t, uint16_t, uint16_t +#define WOP_UUU_H uint32_t, uint16_t, uint16_t, uint32_t, uint32_t +#define WOP_UUU_W uint64_t, uint32_t, uint32_t, uint64_t, uint64_t + #endif /* TARGET_RISCV_VECTOR_INTERNALS_H */ From patchwork Thu Jun 22 16:16:24 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Max Chou X-Patchwork-Id: 1798514 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=sifive.com header.i=@sifive.com header.a=rsa-sha256 header.s=google header.b=kBdrNbtQ; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4Qn5CT3LVfz20Xt for ; Fri, 23 Jun 2023 02:18:45 +1000 (AEST) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qCN0r-0001yJ-PM; Thu, 22 Jun 2023 12:18:25 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qCN0d-0001J6-9y for qemu-devel@nongnu.org; Thu, 22 Jun 2023 12:18:13 -0400 Received: from mail-pl1-x636.google.com ([2607:f8b0:4864:20::636]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qCN0a-00058N-W8 for qemu-devel@nongnu.org; Thu, 22 Jun 2023 12:18:10 -0400 Received: by mail-pl1-x636.google.com with SMTP id d9443c01a7336-1b52d14df27so37248655ad.0 for ; Thu, 22 Jun 2023 09:18:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sifive.com; s=google; t=1687450687; x=1690042687; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=X0kiv9pO98oPtKQ5Bfc+3IlYT+Xx00DtpI/UiyKMWmE=; b=kBdrNbtQey4VgUkVJHxObm0E90sxOzlc3bhYK9BdAte7BaSwtgjeQ6DC2C6oiyBWqk MI8cK247/fI1X4Si5mLmY0P8/HO2SqIJ15a+KuCSLRpnKByQv+4IIwA1F5cwe/i8lbyJ ekE8vNKlxzMW87g/C9HTOec/KXiqWoC/+r0gA6AZhquzu3MF3Y8lk+VMav8ThYOXjK6b faMURH2DMrmJ0y0ZEQD1auTn8YSxsql2vEi0B0tSJ4rVPqQ/ZkEcqcbam9bnbtmmsYUY 2BJFNkXo4bbgoBFZY3xwlB1nN+17GadrJBieIMr5ILPXPH2mAvohhS0r2Mg/PyMLWsFs r98g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687450687; x=1690042687; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=X0kiv9pO98oPtKQ5Bfc+3IlYT+Xx00DtpI/UiyKMWmE=; b=E29mDPYCRXDGmAXqHfNdbmpchwlE+Xx+yk6TbT9x6N/Ejpz3yRR+RiS6+r3dlE+EJG 2n8HN8zLNmY8KPDwkO97GfWrqgOOtGy+5tdkAPxOSZcUD2k2jZrEFT0Qz8a7qySal56L h0FA/CxsqqC4cH2Ox7TdR1jOOmAMtIZNup+oT+E2PACnRW+Y5fq7aKIXNBiItUouEp0E f+A/WJOKn6Ma7yEpDGI82spqJcCQ3NPXbzZQ/rMUjV9SnXfytgPbT79bYJGH5C5QHyJX We0eEiKf6rk4tYEeJf+K9IlfZUJgHf3EhwTaB6eL0vgSw5ZdE1rJWy3X923y9bKWr0cT abVQ== X-Gm-Message-State: AC+VfDxbTck+sVRZEO4t5sl96gkq3FZyQy33w0LyZAvYjL5jok+jXW77 5gkq5mZhuxr7CJp2i363f5rHpslPAGw+98x42zOVYvWwxKt1gqvj9skp6qckg+0ej0m+mMAianM 2yW9IO0fWvy16Ix2LcOII/pMy7LW+VtZZLD55dQ7FcCHeYSGP7sWxfqtX5L/1btg4iWeRS5Yzok E+ X-Google-Smtp-Source: ACHHUZ4HVqTN5aQTIOFwWrFM7i5OeEE9Htu9OE0Tdq4Dox3zocumDupXzJmvQj6tDYFusCU3rJqfow== X-Received: by 2002:a17:902:7487:b0:1b0:307c:e6fe with SMTP id h7-20020a170902748700b001b0307ce6femr13075964pll.10.1687450687346; Thu, 22 Jun 2023 09:18:07 -0700 (PDT) Received: from MaxdeMBP.localdomain (125-228-20-175.hinet-ip.hinet.net. [125.228.20.175]) by smtp.gmail.com with ESMTPSA id c1-20020a170902d48100b001b3f039f8a8sm5609683plg.61.2023.06.22.09.18.06 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 09:18:07 -0700 (PDT) From: Max Chou To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Cc: dbarboza@ventanamicro.com, Max Chou , Richard Henderson Subject: [PATCH v4 08/17] tcg: Fix temporary variable in tcg_gen_gvec_andcs Date: Fri, 23 Jun 2023 00:16:24 +0800 Message-Id: <20230622161646.32005-9-max.chou@sifive.com> X-Mailer: git-send-email 2.39.2 (Apple Git-143) In-Reply-To: <20230622161646.32005-1-max.chou@sifive.com> References: <20230622161646.32005-1-max.chou@sifive.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::636; envelope-from=max.chou@sifive.com; helo=mail-pl1-x636.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org The 5th parameter of tcg_gen_gvec_2s should be replaces by the temporary tmp variable in the tcg_gen_gvec_andcs function. Signed-off-by: Max Chou Reviewed-by: Daniel Henrique Barboza --- tcg/tcg-op-gvec.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tcg/tcg-op-gvec.c b/tcg/tcg-op-gvec.c index 95a588d6d2..a062239804 100644 --- a/tcg/tcg-op-gvec.c +++ b/tcg/tcg-op-gvec.c @@ -2774,7 +2774,7 @@ void tcg_gen_gvec_andcs(unsigned vece, uint32_t dofs, uint32_t aofs, TCGv_i64 tmp = tcg_temp_ebb_new_i64(); tcg_gen_dup_i64(vece, tmp, c); - tcg_gen_gvec_2s(dofs, aofs, oprsz, maxsz, c, &g); + tcg_gen_gvec_2s(dofs, aofs, oprsz, maxsz, tmp, &g); tcg_temp_free_i64(tmp); } From patchwork Thu Jun 22 16:16:25 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Max Chou X-Patchwork-Id: 1798519 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=sifive.com header.i=@sifive.com header.a=rsa-sha256 header.s=google header.b=EIu9TIXM; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4Qn5FX0Kh1z20XS for ; Fri, 23 Jun 2023 02:20:32 +1000 (AEST) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qCN14-0002Mm-HD; Thu, 22 Jun 2023 12:18:38 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qCN0m-0001vv-LP for qemu-devel@nongnu.org; Thu, 22 Jun 2023 12:18:24 -0400 Received: from mail-pl1-x62c.google.com ([2607:f8b0:4864:20::62c]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qCN0j-0005La-93 for qemu-devel@nongnu.org; Thu, 22 Jun 2023 12:18:20 -0400 Received: by mail-pl1-x62c.google.com with SMTP id d9443c01a7336-1b5465a79cdso34999005ad.3 for ; Thu, 22 Jun 2023 09:18:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sifive.com; s=google; t=1687450695; x=1690042695; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=ZUZgdI+0iY2TbPK45PGIRPTLhUqSFaLojC1CV2JSMxg=; b=EIu9TIXM3SnOvS+LtVWEb+GsnVwn76dG20vpp/KtZsh1a6eBPYAFj7ZWJuQsR7INyE Hi6YNBsM3D59r3BltV0YuEmNiBkGFnxqBI4D3rAGoqDAeGgGvcNAEjosnpA8Zlzw8puL PeDtFPKXRXQ+IF8qKww8wTy96cfFjyonCm7czILO76V3Shdu9dOyDSQiZVS9XXqz75AG Gql/yxS1NX+QwY1hdE0v4o0dIQCubG1r4KWrMkkTe1mClVmsmyfVHIOODx5z10u1PD3E FxW2rZiI/Nobxq3DBwy3a1Ze16iNnC0Womt5Teit1SVHgS3WLYnIMmPMSo80ZYIQnUzy K3Fg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687450695; x=1690042695; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=ZUZgdI+0iY2TbPK45PGIRPTLhUqSFaLojC1CV2JSMxg=; b=dAUHUok7qxBs6i8O55P4C6UYpdnNixg6L+izWUVyUJ3uRRw5SR5rXdRIZf6bJF2aNj q99NYwna8qrPGii1Ol3tw7k/amstBA3t4P//LV9YBNS1Cgj8FM2U9Ee1Z6k9ZBSKj02u hVcWC0XGIrpgmvoex3XfLOq6/3AzIe6B5c3EoQOl+x7HTU9jcVjkKXCz62pZDNQefVnF 2r1tmL3B6/NwGo1z/5JG+4PYp1GX5bwT8hJIfxzqCrTlahV0EWCm5vTadYV82K2KijNa Ig8ZG/R0tkxOw1T7w7/o9GBZ+uhBYbV6ttDhf2ZMwbKldRXhqnOluROO7PfefwhxBapl gNXA== X-Gm-Message-State: AC+VfDwMu7qF1a7CQQVtjkmlsVLO4McbPYePz4xA+AuCmYo/ERz3vq8n l+HrtMkfSI4QwogMrCjXg3BySBY5rEbXlCJ5//S2S3t05MeGXe10UcRgHtfbfwzjE/mp3Lhn56x CtD0EK5gOIHXrmz9tWzed77wig+Czl/71ulfOdBJxwQa7x9wlVDfgd4tLDy5FqLtdPpLN9ad64f BLmjs= X-Google-Smtp-Source: ACHHUZ7wSsbWDk3qgw9v4y1cIMOWKW7+cIclOc8dQ5G2FPl04b8O480FDrQFPUfQqjB5BHUz5EuErg== X-Received: by 2002:a17:902:d38d:b0:1b1:9218:6bf3 with SMTP id e13-20020a170902d38d00b001b192186bf3mr14395851pld.37.1687450695141; Thu, 22 Jun 2023 09:18:15 -0700 (PDT) Received: from MaxdeMBP.localdomain (125-228-20-175.hinet-ip.hinet.net. [125.228.20.175]) by smtp.gmail.com with ESMTPSA id c1-20020a170902d48100b001b3f039f8a8sm5609683plg.61.2023.06.22.09.18.12 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 09:18:14 -0700 (PDT) From: Max Chou To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Cc: dbarboza@ventanamicro.com, Dickon Hood , Nazar Kazakov , William Salmon , Kiran Ostrolenk , Max Chou , Palmer Dabbelt , Alistair Francis , Bin Meng , Weiwei Li , Liu Zhiwei , Lawrence Hunter Subject: [PATCH v4 09/17] target/riscv: Add Zvbb ISA extension support Date: Fri, 23 Jun 2023 00:16:25 +0800 Message-Id: <20230622161646.32005-10-max.chou@sifive.com> X-Mailer: git-send-email 2.39.2 (Apple Git-143) In-Reply-To: <20230622161646.32005-1-max.chou@sifive.com> References: <20230622161646.32005-1-max.chou@sifive.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::62c; envelope-from=max.chou@sifive.com; helo=mail-pl1-x62c.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org From: Dickon Hood This commit adds support for the Zvbb vector-crypto extension, which consists of the following instructions: * vrol.[vv,vx] * vror.[vv,vx,vi] * vbrev8.v * vrev8.v * vandn.[vv,vx] * vbrev.v * vclz.v * vctz.v * vcpop.v * vwsll.[vv,vx,vi] Translation functions are defined in `target/riscv/insn_trans/trans_rvvk.c.inc` and helpers are defined in `target/riscv/vcrypto_helper.c`. Co-authored-by: Nazar Kazakov Co-authored-by: William Salmon Co-authored-by: Kiran Ostrolenk [max.chou@sifive.com: Fix imm mode of vror.vi] Signed-off-by: Nazar Kazakov Signed-off-by: William Salmon Signed-off-by: Kiran Ostrolenk Signed-off-by: Dickon Hood Signed-off-by: Max Chou Reviewed-by: Daniel Henrique Barboza --- target/riscv/cpu.c | 11 ++ target/riscv/cpu_cfg.h | 1 + target/riscv/helper.h | 62 +++++++++ target/riscv/insn32.decode | 20 +++ target/riscv/insn_trans/trans_rvvk.c.inc | 164 +++++++++++++++++++++++ target/riscv/vcrypto_helper.c | 138 +++++++++++++++++++ 6 files changed, 396 insertions(+) diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c index 53b0fcade6..4ee5219dbc 100644 --- a/target/riscv/cpu.c +++ b/target/riscv/cpu.c @@ -111,6 +111,7 @@ static const struct isa_ext_data isa_edata_arr[] = { ISA_EXT_DATA_ENTRY(zksed, PRIV_VERSION_1_12_0, ext_zksed), ISA_EXT_DATA_ENTRY(zksh, PRIV_VERSION_1_12_0, ext_zksh), ISA_EXT_DATA_ENTRY(zkt, PRIV_VERSION_1_12_0, ext_zkt), + ISA_EXT_DATA_ENTRY(zvbb, PRIV_VERSION_1_12_0, ext_zvbb), ISA_EXT_DATA_ENTRY(zvbc, PRIV_VERSION_1_12_0, ext_zvbc), ISA_EXT_DATA_ENTRY(zve32f, PRIV_VERSION_1_10_0, ext_zve32f), ISA_EXT_DATA_ENTRY(zve64f, PRIV_VERSION_1_10_0, ext_zve64f), @@ -1189,6 +1190,16 @@ void riscv_cpu_validate_set_extensions(RISCVCPU *cpu, Error **errp) return; } + /* + * In principle Zve*x would also suffice here, were they supported + * in qemu + */ + if (cpu->cfg.ext_zvbb && !cpu->cfg.ext_zve32f) { + error_setg(errp, + "Vector crypto extensions require V or Zve* extensions"); + return; + } + if (cpu->cfg.ext_zvbc && !cpu->cfg.ext_zve64f) { error_setg(errp, "Zvbc extension requires V or Zve64{f,d} extensions"); return; diff --git a/target/riscv/cpu_cfg.h b/target/riscv/cpu_cfg.h index 5ca19298a7..0904dc3ae5 100644 --- a/target/riscv/cpu_cfg.h +++ b/target/riscv/cpu_cfg.h @@ -83,6 +83,7 @@ struct RISCVCPUConfig { bool ext_zve32f; bool ext_zve64f; bool ext_zve64d; + bool ext_zvbb; bool ext_zvbc; bool ext_zmmul; bool ext_zvfh; diff --git a/target/riscv/helper.h b/target/riscv/helper.h index be0f0f1058..fbb0ceca81 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1159,3 +1159,65 @@ DEF_HELPER_6(vclmul_vv, void, ptr, ptr, ptr, ptr, env, i32) DEF_HELPER_6(vclmul_vx, void, ptr, ptr, tl, ptr, env, i32) DEF_HELPER_6(vclmulh_vv, void, ptr, ptr, ptr, ptr, env, i32) DEF_HELPER_6(vclmulh_vx, void, ptr, ptr, tl, ptr, env, i32) + +DEF_HELPER_6(vror_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(vror_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(vror_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(vror_vv_d, void, ptr, ptr, ptr, ptr, env, i32) + +DEF_HELPER_6(vror_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(vror_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(vror_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(vror_vx_d, void, ptr, ptr, tl, ptr, env, i32) + +DEF_HELPER_6(vrol_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(vrol_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(vrol_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(vrol_vv_d, void, ptr, ptr, ptr, ptr, env, i32) + +DEF_HELPER_6(vrol_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(vrol_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(vrol_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(vrol_vx_d, void, ptr, ptr, tl, ptr, env, i32) + +DEF_HELPER_5(vrev8_v_b, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(vrev8_v_h, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(vrev8_v_w, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(vrev8_v_d, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(vbrev8_v_b, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(vbrev8_v_h, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(vbrev8_v_w, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(vbrev8_v_d, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(vbrev_v_b, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(vbrev_v_h, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(vbrev_v_w, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(vbrev_v_d, void, ptr, ptr, ptr, env, i32) + +DEF_HELPER_5(vclz_v_b, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(vclz_v_h, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(vclz_v_w, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(vclz_v_d, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(vctz_v_b, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(vctz_v_h, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(vctz_v_w, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(vctz_v_d, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(vcpop_v_b, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(vcpop_v_h, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(vcpop_v_w, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(vcpop_v_d, void, ptr, ptr, ptr, env, i32) + +DEF_HELPER_6(vwsll_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(vwsll_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(vwsll_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(vwsll_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(vwsll_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(vwsll_vx_w, void, ptr, ptr, tl, ptr, env, i32) + +DEF_HELPER_6(vandn_vv_b, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(vandn_vv_h, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(vandn_vv_w, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(vandn_vv_d, void, ptr, ptr, ptr, ptr, env, i32) +DEF_HELPER_6(vandn_vx_b, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(vandn_vx_h, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(vandn_vx_w, void, ptr, ptr, tl, ptr, env, i32) +DEF_HELPER_6(vandn_vx_d, void, ptr, ptr, tl, ptr, env, i32) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index 52cd92e262..aa6d3185a2 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -37,6 +37,7 @@ %imm_u 12:s20 !function=ex_shift_12 %imm_bs 30:2 !function=ex_shift_3 %imm_rnum 20:4 +%imm_z6 26:1 15:5 # Argument sets: &empty @@ -82,6 +83,7 @@ @r_vm ...... vm:1 ..... ..... ... ..... ....... &rmrr %rs2 %rs1 %rd @r_vm_1 ...... . ..... ..... ... ..... ....... &rmrr vm=1 %rs2 %rs1 %rd @r_vm_0 ...... . ..... ..... ... ..... ....... &rmrr vm=0 %rs2 %rs1 %rd +@r2_zimm6 ..... . vm:1 ..... ..... ... ..... ....... &rmrr %rs2 rs1=%imm_z6 %rd @r2_zimm11 . zimm:11 ..... ... ..... ....... %rs1 %rd @r2_zimm10 .. zimm:10 ..... ... ..... ....... %rs1 %rd @r2_s ....... ..... ..... ... ..... ....... %rs2 %rs1 @@ -914,3 +916,21 @@ vclmul_vv 001100 . ..... ..... 010 ..... 1010111 @r_vm vclmul_vx 001100 . ..... ..... 110 ..... 1010111 @r_vm vclmulh_vv 001101 . ..... ..... 010 ..... 1010111 @r_vm vclmulh_vx 001101 . ..... ..... 110 ..... 1010111 @r_vm + +# *** Zvbb vector crypto extension *** +vrol_vv 010101 . ..... ..... 000 ..... 1010111 @r_vm +vrol_vx 010101 . ..... ..... 100 ..... 1010111 @r_vm +vror_vv 010100 . ..... ..... 000 ..... 1010111 @r_vm +vror_vx 010100 . ..... ..... 100 ..... 1010111 @r_vm +vror_vi 01010. . ..... ..... 011 ..... 1010111 @r2_zimm6 +vbrev8_v 010010 . ..... 01000 010 ..... 1010111 @r2_vm +vrev8_v 010010 . ..... 01001 010 ..... 1010111 @r2_vm +vandn_vv 000001 . ..... ..... 000 ..... 1010111 @r_vm +vandn_vx 000001 . ..... ..... 100 ..... 1010111 @r_vm +vbrev_v 010010 . ..... 01010 010 ..... 1010111 @r2_vm +vclz_v 010010 . ..... 01100 010 ..... 1010111 @r2_vm +vctz_v 010010 . ..... 01101 010 ..... 1010111 @r2_vm +vcpop_v 010010 . ..... 01110 010 ..... 1010111 @r2_vm +vwsll_vv 110101 . ..... ..... 000 ..... 1010111 @r_vm +vwsll_vx 110101 . ..... ..... 100 ..... 1010111 @r_vm +vwsll_vi 110101 . ..... ..... 011 ..... 1010111 @r_vm diff --git a/target/riscv/insn_trans/trans_rvvk.c.inc b/target/riscv/insn_trans/trans_rvvk.c.inc index 552b08a2fd..0e4b337613 100644 --- a/target/riscv/insn_trans/trans_rvvk.c.inc +++ b/target/riscv/insn_trans/trans_rvvk.c.inc @@ -60,3 +60,167 @@ static bool vclmul_vx_check(DisasContext *s, arg_rmrr *a) GEN_VX_MASKED_TRANS(vclmul_vx, vclmul_vx_check) GEN_VX_MASKED_TRANS(vclmulh_vx, vclmul_vx_check) + +/* + * Zvbb + */ + +#define GEN_OPIVI_GVEC_TRANS_CHECK(NAME, IMM_MODE, OPIVX, SUF, CHECK) \ + static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ + { \ + if (CHECK(s, a)) { \ + static gen_helper_opivx *const fns[4] = { \ + gen_helper_##OPIVX##_b, \ + gen_helper_##OPIVX##_h, \ + gen_helper_##OPIVX##_w, \ + gen_helper_##OPIVX##_d, \ + }; \ + return do_opivi_gvec(s, a, tcg_gen_gvec_##SUF, fns[s->sew], \ + IMM_MODE); \ + } \ + return false; \ + } + +#define GEN_OPIVV_GVEC_TRANS_CHECK(NAME, SUF, CHECK) \ + static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ + { \ + if (CHECK(s, a)) { \ + static gen_helper_gvec_4_ptr *const fns[4] = { \ + gen_helper_##NAME##_b, \ + gen_helper_##NAME##_h, \ + gen_helper_##NAME##_w, \ + gen_helper_##NAME##_d, \ + }; \ + return do_opivv_gvec(s, a, tcg_gen_gvec_##SUF, fns[s->sew]); \ + } \ + return false; \ + } + +#define GEN_OPIVX_GVEC_SHIFT_TRANS_CHECK(NAME, SUF, CHECK) \ + static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ + { \ + if (CHECK(s, a)) { \ + static gen_helper_opivx *const fns[4] = { \ + gen_helper_##NAME##_b, \ + gen_helper_##NAME##_h, \ + gen_helper_##NAME##_w, \ + gen_helper_##NAME##_d, \ + }; \ + return do_opivx_gvec_shift(s, a, tcg_gen_gvec_##SUF, \ + fns[s->sew]); \ + } \ + return false; \ + } + +static bool zvbb_vv_check(DisasContext *s, arg_rmrr *a) +{ + return opivv_check(s, a) && s->cfg_ptr->ext_zvbb == true; +} + +static bool zvbb_vx_check(DisasContext *s, arg_rmrr *a) +{ + return opivx_check(s, a) && s->cfg_ptr->ext_zvbb == true; +} + +/* vrol.v[vx] */ +GEN_OPIVV_GVEC_TRANS_CHECK(vrol_vv, rotlv, zvbb_vv_check) +GEN_OPIVX_GVEC_SHIFT_TRANS_CHECK(vrol_vx, rotls, zvbb_vx_check) + +/* vror.v[vxi] */ +GEN_OPIVV_GVEC_TRANS_CHECK(vror_vv, rotrv, zvbb_vv_check) +GEN_OPIVX_GVEC_SHIFT_TRANS_CHECK(vror_vx, rotrs, zvbb_vx_check) +GEN_OPIVI_GVEC_TRANS_CHECK(vror_vi, IMM_TRUNC_SEW, vror_vx, rotri, zvbb_vx_check) + +#define GEN_OPIVX_GVEC_TRANS_CHECK(NAME, SUF, CHECK) \ + static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ + { \ + if (CHECK(s, a)) { \ + static gen_helper_opivx *const fns[4] = { \ + gen_helper_##NAME##_b, \ + gen_helper_##NAME##_h, \ + gen_helper_##NAME##_w, \ + gen_helper_##NAME##_d, \ + }; \ + return do_opivx_gvec(s, a, tcg_gen_gvec_##SUF, fns[s->sew]); \ + } \ + return false; \ + } + +/* vandn.v[vx] */ +GEN_OPIVV_GVEC_TRANS_CHECK(vandn_vv, andc, zvbb_vv_check) +GEN_OPIVX_GVEC_TRANS_CHECK(vandn_vx, andcs, zvbb_vx_check) + +#define GEN_OPIV_TRANS(NAME, CHECK) \ + static bool trans_##NAME(DisasContext *s, arg_rmr *a) \ + { \ + if (CHECK(s, a)) { \ + uint32_t data = 0; \ + static gen_helper_gvec_3_ptr *const fns[4] = { \ + gen_helper_##NAME##_b, \ + gen_helper_##NAME##_h, \ + gen_helper_##NAME##_w, \ + gen_helper_##NAME##_d, \ + }; \ + TCGLabel *over = gen_new_label(); \ + tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); \ + \ + data = FIELD_DP32(data, VDATA, VM, a->vm); \ + data = FIELD_DP32(data, VDATA, LMUL, s->lmul); \ + data = FIELD_DP32(data, VDATA, VTA, s->vta); \ + data = FIELD_DP32(data, VDATA, VTA_ALL_1S, s->cfg_vta_all_1s); \ + data = FIELD_DP32(data, VDATA, VMA, s->vma); \ + tcg_gen_gvec_3_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, 0), \ + vreg_ofs(s, a->rs2), cpu_env, \ + s->cfg_ptr->vlen / 8, s->cfg_ptr->vlen / 8, \ + data, fns[s->sew]); \ + mark_vs_dirty(s); \ + gen_set_label(over); \ + return true; \ + } \ + return false; \ + } + +static bool zvbb_opiv_check(DisasContext *s, arg_rmr *a) +{ + return s->cfg_ptr->ext_zvbb == true && + require_rvv(s) && + vext_check_isa_ill(s) && + vext_check_ss(s, a->rd, a->rs2, a->vm); +} + +GEN_OPIV_TRANS(vbrev8_v, zvbb_opiv_check) +GEN_OPIV_TRANS(vrev8_v, zvbb_opiv_check) +GEN_OPIV_TRANS(vbrev_v, zvbb_opiv_check) +GEN_OPIV_TRANS(vclz_v, zvbb_opiv_check) +GEN_OPIV_TRANS(vctz_v, zvbb_opiv_check) +GEN_OPIV_TRANS(vcpop_v, zvbb_opiv_check) + +static bool vwsll_vv_check(DisasContext *s, arg_rmrr *a) +{ + return s->cfg_ptr->ext_zvbb && opivv_widen_check(s, a); +} + +static bool vwsll_vx_check(DisasContext *s, arg_rmrr *a) +{ + return s->cfg_ptr->ext_zvbb && opivx_widen_check(s, a); +} + +/* OPIVI without GVEC IR */ +#define GEN_OPIVI_WIDEN_TRANS(NAME, IMM_MODE, OPIVX, CHECK) \ + static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ + { \ + if (CHECK(s, a)) { \ + static gen_helper_opivx *const fns[3] = { \ + gen_helper_##OPIVX##_b, \ + gen_helper_##OPIVX##_h, \ + gen_helper_##OPIVX##_w, \ + }; \ + return opivi_trans(a->rd, a->rs1, a->rs2, a->vm, fns[s->sew], s, \ + IMM_MODE); \ + } \ + return false; \ + } + +GEN_OPIVV_WIDEN_TRANS(vwsll_vv, vwsll_vv_check) +GEN_OPIVX_WIDEN_TRANS(vwsll_vx, vwsll_vx_check) +GEN_OPIVI_WIDEN_TRANS(vwsll_vi, IMM_ZX, vwsll_vx, vwsll_vx_check) diff --git a/target/riscv/vcrypto_helper.c b/target/riscv/vcrypto_helper.c index 8b7c63d499..11239b59d6 100644 --- a/target/riscv/vcrypto_helper.c +++ b/target/riscv/vcrypto_helper.c @@ -20,6 +20,7 @@ #include "qemu/osdep.h" #include "qemu/host-utils.h" #include "qemu/bitops.h" +#include "qemu/bswap.h" #include "cpu.h" #include "exec/memop.h" #include "exec/exec-all.h" @@ -57,3 +58,140 @@ RVVCALL(OPIVV2, vclmulh_vv, OP_UUU_D, H8, H8, H8, clmulh64) GEN_VEXT_VV(vclmulh_vv, 8) RVVCALL(OPIVX2, vclmulh_vx, OP_UUU_D, H8, H8, clmulh64) GEN_VEXT_VX(vclmulh_vx, 8) + +RVVCALL(OPIVV2, vror_vv_b, OP_UUU_B, H1, H1, H1, ror8) +RVVCALL(OPIVV2, vror_vv_h, OP_UUU_H, H2, H2, H2, ror16) +RVVCALL(OPIVV2, vror_vv_w, OP_UUU_W, H4, H4, H4, ror32) +RVVCALL(OPIVV2, vror_vv_d, OP_UUU_D, H8, H8, H8, ror64) +GEN_VEXT_VV(vror_vv_b, 1) +GEN_VEXT_VV(vror_vv_h, 2) +GEN_VEXT_VV(vror_vv_w, 4) +GEN_VEXT_VV(vror_vv_d, 8) + +RVVCALL(OPIVX2, vror_vx_b, OP_UUU_B, H1, H1, ror8) +RVVCALL(OPIVX2, vror_vx_h, OP_UUU_H, H2, H2, ror16) +RVVCALL(OPIVX2, vror_vx_w, OP_UUU_W, H4, H4, ror32) +RVVCALL(OPIVX2, vror_vx_d, OP_UUU_D, H8, H8, ror64) +GEN_VEXT_VX(vror_vx_b, 1) +GEN_VEXT_VX(vror_vx_h, 2) +GEN_VEXT_VX(vror_vx_w, 4) +GEN_VEXT_VX(vror_vx_d, 8) + +RVVCALL(OPIVV2, vrol_vv_b, OP_UUU_B, H1, H1, H1, rol8) +RVVCALL(OPIVV2, vrol_vv_h, OP_UUU_H, H2, H2, H2, rol16) +RVVCALL(OPIVV2, vrol_vv_w, OP_UUU_W, H4, H4, H4, rol32) +RVVCALL(OPIVV2, vrol_vv_d, OP_UUU_D, H8, H8, H8, rol64) +GEN_VEXT_VV(vrol_vv_b, 1) +GEN_VEXT_VV(vrol_vv_h, 2) +GEN_VEXT_VV(vrol_vv_w, 4) +GEN_VEXT_VV(vrol_vv_d, 8) + +RVVCALL(OPIVX2, vrol_vx_b, OP_UUU_B, H1, H1, rol8) +RVVCALL(OPIVX2, vrol_vx_h, OP_UUU_H, H2, H2, rol16) +RVVCALL(OPIVX2, vrol_vx_w, OP_UUU_W, H4, H4, rol32) +RVVCALL(OPIVX2, vrol_vx_d, OP_UUU_D, H8, H8, rol64) +GEN_VEXT_VX(vrol_vx_b, 1) +GEN_VEXT_VX(vrol_vx_h, 2) +GEN_VEXT_VX(vrol_vx_w, 4) +GEN_VEXT_VX(vrol_vx_d, 8) + +static uint64_t brev8(uint64_t val) +{ + val = ((val & 0x5555555555555555ull) << 1) | + ((val & 0xAAAAAAAAAAAAAAAAull) >> 1); + val = ((val & 0x3333333333333333ull) << 2) | + ((val & 0xCCCCCCCCCCCCCCCCull) >> 2); + val = ((val & 0x0F0F0F0F0F0F0F0Full) << 4) | + ((val & 0xF0F0F0F0F0F0F0F0ull) >> 4); + + return val; +} + +RVVCALL(OPIVV1, vbrev8_v_b, OP_UU_B, H1, H1, brev8) +RVVCALL(OPIVV1, vbrev8_v_h, OP_UU_H, H2, H2, brev8) +RVVCALL(OPIVV1, vbrev8_v_w, OP_UU_W, H4, H4, brev8) +RVVCALL(OPIVV1, vbrev8_v_d, OP_UU_D, H8, H8, brev8) +GEN_VEXT_V(vbrev8_v_b, 1) +GEN_VEXT_V(vbrev8_v_h, 2) +GEN_VEXT_V(vbrev8_v_w, 4) +GEN_VEXT_V(vbrev8_v_d, 8) + +#define DO_IDENTITY(a) (a) +RVVCALL(OPIVV1, vrev8_v_b, OP_UU_B, H1, H1, DO_IDENTITY) +RVVCALL(OPIVV1, vrev8_v_h, OP_UU_H, H2, H2, bswap16) +RVVCALL(OPIVV1, vrev8_v_w, OP_UU_W, H4, H4, bswap32) +RVVCALL(OPIVV1, vrev8_v_d, OP_UU_D, H8, H8, bswap64) +GEN_VEXT_V(vrev8_v_b, 1) +GEN_VEXT_V(vrev8_v_h, 2) +GEN_VEXT_V(vrev8_v_w, 4) +GEN_VEXT_V(vrev8_v_d, 8) + +#define DO_ANDN(a, b) ((a) & ~(b)) +RVVCALL(OPIVV2, vandn_vv_b, OP_UUU_B, H1, H1, H1, DO_ANDN) +RVVCALL(OPIVV2, vandn_vv_h, OP_UUU_H, H2, H2, H2, DO_ANDN) +RVVCALL(OPIVV2, vandn_vv_w, OP_UUU_W, H4, H4, H4, DO_ANDN) +RVVCALL(OPIVV2, vandn_vv_d, OP_UUU_D, H8, H8, H8, DO_ANDN) +GEN_VEXT_VV(vandn_vv_b, 1) +GEN_VEXT_VV(vandn_vv_h, 2) +GEN_VEXT_VV(vandn_vv_w, 4) +GEN_VEXT_VV(vandn_vv_d, 8) + +RVVCALL(OPIVX2, vandn_vx_b, OP_UUU_B, H1, H1, DO_ANDN) +RVVCALL(OPIVX2, vandn_vx_h, OP_UUU_H, H2, H2, DO_ANDN) +RVVCALL(OPIVX2, vandn_vx_w, OP_UUU_W, H4, H4, DO_ANDN) +RVVCALL(OPIVX2, vandn_vx_d, OP_UUU_D, H8, H8, DO_ANDN) +GEN_VEXT_VX(vandn_vx_b, 1) +GEN_VEXT_VX(vandn_vx_h, 2) +GEN_VEXT_VX(vandn_vx_w, 4) +GEN_VEXT_VX(vandn_vx_d, 8) + +RVVCALL(OPIVV1, vbrev_v_b, OP_UU_B, H1, H1, revbit8) +RVVCALL(OPIVV1, vbrev_v_h, OP_UU_H, H2, H2, revbit16) +RVVCALL(OPIVV1, vbrev_v_w, OP_UU_W, H4, H4, revbit32) +RVVCALL(OPIVV1, vbrev_v_d, OP_UU_D, H8, H8, revbit64) +GEN_VEXT_V(vbrev_v_b, 1) +GEN_VEXT_V(vbrev_v_h, 2) +GEN_VEXT_V(vbrev_v_w, 4) +GEN_VEXT_V(vbrev_v_d, 8) + +RVVCALL(OPIVV1, vclz_v_b, OP_UU_B, H1, H1, clz8) +RVVCALL(OPIVV1, vclz_v_h, OP_UU_H, H2, H2, clz16) +RVVCALL(OPIVV1, vclz_v_w, OP_UU_W, H4, H4, clz32) +RVVCALL(OPIVV1, vclz_v_d, OP_UU_D, H8, H8, clz64) +GEN_VEXT_V(vclz_v_b, 1) +GEN_VEXT_V(vclz_v_h, 2) +GEN_VEXT_V(vclz_v_w, 4) +GEN_VEXT_V(vclz_v_d, 8) + +RVVCALL(OPIVV1, vctz_v_b, OP_UU_B, H1, H1, ctz8) +RVVCALL(OPIVV1, vctz_v_h, OP_UU_H, H2, H2, ctz16) +RVVCALL(OPIVV1, vctz_v_w, OP_UU_W, H4, H4, ctz32) +RVVCALL(OPIVV1, vctz_v_d, OP_UU_D, H8, H8, ctz64) +GEN_VEXT_V(vctz_v_b, 1) +GEN_VEXT_V(vctz_v_h, 2) +GEN_VEXT_V(vctz_v_w, 4) +GEN_VEXT_V(vctz_v_d, 8) + +RVVCALL(OPIVV1, vcpop_v_b, OP_UU_B, H1, H1, ctpop8) +RVVCALL(OPIVV1, vcpop_v_h, OP_UU_H, H2, H2, ctpop16) +RVVCALL(OPIVV1, vcpop_v_w, OP_UU_W, H4, H4, ctpop32) +RVVCALL(OPIVV1, vcpop_v_d, OP_UU_D, H8, H8, ctpop64) +GEN_VEXT_V(vcpop_v_b, 1) +GEN_VEXT_V(vcpop_v_h, 2) +GEN_VEXT_V(vcpop_v_w, 4) +GEN_VEXT_V(vcpop_v_d, 8) + +#define DO_SLL(N, M) (N << (M & (sizeof(N) * 8 - 1))) +RVVCALL(OPIVV2, vwsll_vv_b, WOP_UUU_B, H2, H1, H1, DO_SLL) +RVVCALL(OPIVV2, vwsll_vv_h, WOP_UUU_H, H4, H2, H2, DO_SLL) +RVVCALL(OPIVV2, vwsll_vv_w, WOP_UUU_W, H8, H4, H4, DO_SLL) +GEN_VEXT_VV(vwsll_vv_b, 2) +GEN_VEXT_VV(vwsll_vv_h, 4) +GEN_VEXT_VV(vwsll_vv_w, 8) + +RVVCALL(OPIVX2, vwsll_vx_b, WOP_UUU_B, H2, H1, DO_SLL) +RVVCALL(OPIVX2, vwsll_vx_h, WOP_UUU_H, H4, H2, DO_SLL) +RVVCALL(OPIVX2, vwsll_vx_w, WOP_UUU_W, H8, H4, DO_SLL) +GEN_VEXT_VX(vwsll_vx_b, 2) +GEN_VEXT_VX(vwsll_vx_h, 4) +GEN_VEXT_VX(vwsll_vx_w, 8) From patchwork Thu Jun 22 16:16:26 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Max Chou X-Patchwork-Id: 1798523 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=sifive.com header.i=@sifive.com header.a=rsa-sha256 header.s=google header.b=IM/Bm+aj; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4Qn5GG2FNSz20XS for ; Fri, 23 Jun 2023 02:21:10 +1000 (AEST) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qCN16-0002UO-Ek; Thu, 22 Jun 2023 12:18:40 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qCN0s-00023n-7G for qemu-devel@nongnu.org; Thu, 22 Jun 2023 12:18:26 -0400 Received: from mail-pl1-x635.google.com ([2607:f8b0:4864:20::635]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qCN0p-0005Uu-4F for qemu-devel@nongnu.org; Thu, 22 Jun 2023 12:18:25 -0400 Received: by mail-pl1-x635.google.com with SMTP id d9443c01a7336-1b51780c1b3so56155085ad.1 for ; Thu, 22 Jun 2023 09:18:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sifive.com; s=google; t=1687450701; x=1690042701; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=tFSpT3acxB9e1xoyFDgZvk5uODO5fn6k4l229WAMRi4=; b=IM/Bm+ajq0VapHhgtQAmDrwGWMD32Yav2VtDCooiYnEHjwU1WMYNJ/elceMdvoMaEl ND8eCkmX/jNKXvfcdnFhMHVCQjHveE7a3SGTIa1/oGDxI4u0dCmimT04eRrJT7M1x6H3 eAKcHmmc9HRSOfaGfbs7T5l2/XlpA/B2loHi1Jpl9c9sQ++IM88unb6Mt+R7XcKNe+B7 BaBv4isWGhCoJnzcFHPjKs2buTx5phTzLDiiBn1SfP09cBALVK0hMFMhWj1f2LCwIj0k owLx8FtTk2cPK8gGHB7SGjOuwoRj7PbpcHccg0hVsk4HpOd8eDsLvIyhUAMwo8O0FRoT 0gnQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687450701; x=1690042701; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=tFSpT3acxB9e1xoyFDgZvk5uODO5fn6k4l229WAMRi4=; b=SvNJfjtl6/Kd48Lm9aX9MkB2OSgwwu/zDiUw72dKhyDITZBOw0yPxNAIrdQ7u9E0dX RZl+Sy88lfqfGOgNAaLQYEbQ7t3AU9fSor3b7xYVdKpu4ysEubGACSkOY0hG24XEoKJo ON5DaYKxm1sDdqQ8JC1+VCy45aMveWf/+PmgAbIWZSP6pudUgYd3NKsu0NtYdWekFdPu LE6M6KcONZ2up73hPwTH/oLAa9Ww3xel2UZzNBPy+LnjjcNb1L3EsBA1MoKXgNb4gHRL XLmiEW2zybmicrWBWOuBvAsph3+CKEQyASqidZWRaQAVyArLq/feuOJ/KWJGG4LuIJdH D4+Q== X-Gm-Message-State: AC+VfDxIQjYW3wODnWXT+sqP+HD3HNV8b/nyM6KjiXZGz43FaOxAzwEv GLsJSi219FjcOC9r5Nry3tffrF7ZJFzfpwqAYklcXS8jPcTCBn2V7JcQWw5PAqWmpKQ+cibJIVm WzetRk895dm7V5ysQta3EV5ofvJ6yJ9+voUWQHT+hpeSbQVLlEbzrLv+5SaUgJDWfNLrWamRNP+ PAWBQ= X-Google-Smtp-Source: ACHHUZ5JaJaicXI1MAKQFGsedJiFaiw8noDOiHmZ7wAsBJl+7Mo/dOLbuifm8VJTdwiJCUVo/0RP4g== X-Received: by 2002:a17:902:c255:b0:1a9:21bc:65f8 with SMTP id 21-20020a170902c25500b001a921bc65f8mr17320668plg.11.1687450701055; Thu, 22 Jun 2023 09:18:21 -0700 (PDT) Received: from MaxdeMBP.localdomain (125-228-20-175.hinet-ip.hinet.net. [125.228.20.175]) by smtp.gmail.com with ESMTPSA id c1-20020a170902d48100b001b3f039f8a8sm5609683plg.61.2023.06.22.09.18.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 09:18:20 -0700 (PDT) From: Max Chou To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Cc: dbarboza@ventanamicro.com, Nazar Kazakov , Lawrence Hunter , William Salmon , Max Chou , Palmer Dabbelt , Alistair Francis , Bin Meng , Weiwei Li , Liu Zhiwei , Kiran Ostrolenk Subject: [PATCH v4 10/17] target/riscv: Add Zvkned ISA extension support Date: Fri, 23 Jun 2023 00:16:26 +0800 Message-Id: <20230622161646.32005-11-max.chou@sifive.com> X-Mailer: git-send-email 2.39.2 (Apple Git-143) In-Reply-To: <20230622161646.32005-1-max.chou@sifive.com> References: <20230622161646.32005-1-max.chou@sifive.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::635; envelope-from=max.chou@sifive.com; helo=mail-pl1-x635.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org From: Nazar Kazakov This commit adds support for the Zvkned vector-crypto extension, which consists of the following instructions: * vaesef.[vv,vs] * vaesdf.[vv,vs] * vaesdm.[vv,vs] * vaesz.vs * vaesem.[vv,vs] * vaeskf1.vi * vaeskf2.vi Translation functions are defined in `target/riscv/insn_trans/trans_rvvk.c.inc` and helpers are defined in `target/riscv/vcrypto_helper.c`. Co-authored-by: Lawrence Hunter Co-authored-by: William Salmon [max.chou@sifive.com: Replaced vstart checking by TCG op] Signed-off-by: Lawrence Hunter Signed-off-by: William Salmon Signed-off-by: Nazar Kazakov Signed-off-by: Max Chou Reviewed-by: Daniel Henrique Barboza --- target/riscv/cpu.c | 3 +- target/riscv/cpu_cfg.h | 1 + target/riscv/helper.h | 13 + target/riscv/insn32.decode | 14 ++ target/riscv/insn_trans/trans_rvvk.c.inc | 177 +++++++++++++ target/riscv/op_helper.c | 6 + target/riscv/vcrypto_helper.c | 308 +++++++++++++++++++++++ 7 files changed, 521 insertions(+), 1 deletion(-) diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c index 4ee5219dbc..b6c755ba13 100644 --- a/target/riscv/cpu.c +++ b/target/riscv/cpu.c @@ -118,6 +118,7 @@ static const struct isa_ext_data isa_edata_arr[] = { ISA_EXT_DATA_ENTRY(zve64d, PRIV_VERSION_1_10_0, ext_zve64d), ISA_EXT_DATA_ENTRY(zvfh, PRIV_VERSION_1_12_0, ext_zvfh), ISA_EXT_DATA_ENTRY(zvfhmin, PRIV_VERSION_1_12_0, ext_zvfhmin), + ISA_EXT_DATA_ENTRY(zvkned, PRIV_VERSION_1_12_0, ext_zvkned), ISA_EXT_DATA_ENTRY(zhinx, PRIV_VERSION_1_12_0, ext_zhinx), ISA_EXT_DATA_ENTRY(zhinxmin, PRIV_VERSION_1_12_0, ext_zhinxmin), ISA_EXT_DATA_ENTRY(smaia, PRIV_VERSION_1_12_0, ext_smaia), @@ -1194,7 +1195,7 @@ void riscv_cpu_validate_set_extensions(RISCVCPU *cpu, Error **errp) * In principle Zve*x would also suffice here, were they supported * in qemu */ - if (cpu->cfg.ext_zvbb && !cpu->cfg.ext_zve32f) { + if ((cpu->cfg.ext_zvbb || cpu->cfg.ext_zvkned) && !cpu->cfg.ext_zve32f) { error_setg(errp, "Vector crypto extensions require V or Zve* extensions"); return; diff --git a/target/riscv/cpu_cfg.h b/target/riscv/cpu_cfg.h index 0904dc3ae5..4636d4c84d 100644 --- a/target/riscv/cpu_cfg.h +++ b/target/riscv/cpu_cfg.h @@ -85,6 +85,7 @@ struct RISCVCPUConfig { bool ext_zve64d; bool ext_zvbb; bool ext_zvbc; + bool ext_zvkned; bool ext_zmmul; bool ext_zvfh; bool ext_zvfhmin; diff --git a/target/riscv/helper.h b/target/riscv/helper.h index fbb0ceca81..738f20d3ca 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1,5 +1,6 @@ /* Exceptions */ DEF_HELPER_2(raise_exception, noreturn, env, i32) +DEF_HELPER_2(restore_cpu_and_raise_exception, noreturn, env, i32) /* Floating Point - rounding mode */ DEF_HELPER_FLAGS_2(set_rounding_mode, TCG_CALL_NO_WG, void, env, i32) @@ -1221,3 +1222,15 @@ DEF_HELPER_6(vandn_vx_b, void, ptr, ptr, tl, ptr, env, i32) DEF_HELPER_6(vandn_vx_h, void, ptr, ptr, tl, ptr, env, i32) DEF_HELPER_6(vandn_vx_w, void, ptr, ptr, tl, ptr, env, i32) DEF_HELPER_6(vandn_vx_d, void, ptr, ptr, tl, ptr, env, i32) + +DEF_HELPER_4(vaesef_vv, void, ptr, ptr, env, i32) +DEF_HELPER_4(vaesef_vs, void, ptr, ptr, env, i32) +DEF_HELPER_4(vaesdf_vv, void, ptr, ptr, env, i32) +DEF_HELPER_4(vaesdf_vs, void, ptr, ptr, env, i32) +DEF_HELPER_4(vaesem_vv, void, ptr, ptr, env, i32) +DEF_HELPER_4(vaesem_vs, void, ptr, ptr, env, i32) +DEF_HELPER_4(vaesdm_vv, void, ptr, ptr, env, i32) +DEF_HELPER_4(vaesdm_vs, void, ptr, ptr, env, i32) +DEF_HELPER_4(vaesz_vs, void, ptr, ptr, env, i32) +DEF_HELPER_5(vaeskf1_vi, void, ptr, ptr, i32, env, i32) +DEF_HELPER_5(vaeskf2_vi, void, ptr, ptr, i32, env, i32) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index aa6d3185a2..7e0295d493 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -75,6 +75,7 @@ @r_rm ....... ..... ..... ... ..... ....... %rs2 %rs1 %rm %rd @r2_rm ....... ..... ..... ... ..... ....... %rs1 %rm %rd @r2 ....... ..... ..... ... ..... ....... &r2 %rs1 %rd +@r2_vm_1 ...... . ..... ..... ... ..... ....... &rmr vm=1 %rs2 %rd @r2_nfvm ... ... vm:1 ..... ..... ... ..... ....... &r2nfvm %nf %rs1 %rd @r2_vm ...... vm:1 ..... ..... ... ..... ....... &rmr %rs2 %rd @r1_vm ...... vm:1 ..... ..... ... ..... ....... %rd @@ -934,3 +935,16 @@ vcpop_v 010010 . ..... 01110 010 ..... 1010111 @r2_vm vwsll_vv 110101 . ..... ..... 000 ..... 1010111 @r_vm vwsll_vx 110101 . ..... ..... 100 ..... 1010111 @r_vm vwsll_vi 110101 . ..... ..... 011 ..... 1010111 @r_vm + +# *** Zvkned vector crypto extension *** +vaesef_vv 101000 1 ..... 00011 010 ..... 1110111 @r2_vm_1 +vaesef_vs 101001 1 ..... 00011 010 ..... 1110111 @r2_vm_1 +vaesdf_vv 101000 1 ..... 00001 010 ..... 1110111 @r2_vm_1 +vaesdf_vs 101001 1 ..... 00001 010 ..... 1110111 @r2_vm_1 +vaesem_vv 101000 1 ..... 00010 010 ..... 1110111 @r2_vm_1 +vaesem_vs 101001 1 ..... 00010 010 ..... 1110111 @r2_vm_1 +vaesdm_vv 101000 1 ..... 00000 010 ..... 1110111 @r2_vm_1 +vaesdm_vs 101001 1 ..... 00000 010 ..... 1110111 @r2_vm_1 +vaesz_vs 101001 1 ..... 00111 010 ..... 1110111 @r2_vm_1 +vaeskf1_vi 100010 1 ..... ..... 010 ..... 1110111 @r_vm_1 +vaeskf2_vi 101010 1 ..... ..... 010 ..... 1110111 @r_vm_1 diff --git a/target/riscv/insn_trans/trans_rvvk.c.inc b/target/riscv/insn_trans/trans_rvvk.c.inc index 0e4b337613..c618f76e7e 100644 --- a/target/riscv/insn_trans/trans_rvvk.c.inc +++ b/target/riscv/insn_trans/trans_rvvk.c.inc @@ -224,3 +224,180 @@ static bool vwsll_vx_check(DisasContext *s, arg_rmrr *a) GEN_OPIVV_WIDEN_TRANS(vwsll_vv, vwsll_vv_check) GEN_OPIVX_WIDEN_TRANS(vwsll_vx, vwsll_vx_check) GEN_OPIVI_WIDEN_TRANS(vwsll_vi, IMM_ZX, vwsll_vx, vwsll_vx_check) + +/* + * Zvkned + */ + +#define ZVKNED_EGS 4 + +#define GEN_V_UNMASKED_TRANS(NAME, CHECK, EGS) \ + static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \ + { \ + if (CHECK(s, a)) { \ + TCGv_ptr rd_v, rs2_v; \ + TCGv_i32 desc; \ + uint32_t data = 0; \ + TCGLabel *over = gen_new_label(); \ + TCGLabel *vl_ok = gen_new_label(); \ + TCGLabel *vstart_ok = gen_new_label(); \ + TCGv_i32 tmp = tcg_temp_new_i32(); \ + \ + /* save opcode for unwinding in case we throw an exception */ \ + decode_save_opc(s); \ + \ + /* check (vl % EGS == 0) assuming it's power of 2 */ \ + tcg_gen_trunc_tl_i32(tmp, cpu_vl); \ + tcg_gen_andi_i32(tmp, tmp, EGS - 1); \ + tcg_gen_brcondi_i32(TCG_COND_EQ, tmp, 0, vl_ok); \ + gen_helper_restore_cpu_and_raise_exception( \ + cpu_env, tcg_constant_i32(RISCV_EXCP_ILLEGAL_INST)); \ + gen_set_label(vl_ok); \ + \ + /* check (vstart % EGS == 0) assuming it's power of 2 */ \ + tcg_gen_trunc_tl_i32(tmp, cpu_vstart); \ + tcg_gen_andi_i32(tmp, tmp, EGS - 1); \ + tcg_gen_brcondi_i32(TCG_COND_EQ, tmp, 0, vstart_ok); \ + gen_helper_restore_cpu_and_raise_exception( \ + cpu_env, tcg_constant_i32(RISCV_EXCP_ILLEGAL_INST)); \ + gen_set_label(vstart_ok); \ + \ + tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); \ + data = FIELD_DP32(data, VDATA, VM, a->vm); \ + data = FIELD_DP32(data, VDATA, LMUL, s->lmul); \ + data = FIELD_DP32(data, VDATA, VTA, s->vta); \ + data = FIELD_DP32(data, VDATA, VTA_ALL_1S, s->cfg_vta_all_1s); \ + data = FIELD_DP32(data, VDATA, VMA, s->vma); \ + rd_v = tcg_temp_new_ptr(); \ + rs2_v = tcg_temp_new_ptr(); \ + desc = tcg_constant_i32( \ + simd_desc(s->cfg_ptr->vlen / 8, s->cfg_ptr->vlen / 8, data)); \ + tcg_gen_addi_ptr(rd_v, cpu_env, vreg_ofs(s, a->rd)); \ + tcg_gen_addi_ptr(rs2_v, cpu_env, vreg_ofs(s, a->rs2)); \ + gen_helper_##NAME(rd_v, rs2_v, cpu_env, desc); \ + mark_vs_dirty(s); \ + gen_set_label(over); \ + return true; \ + } \ + return false; \ + } + +static bool vaes_check_vv(DisasContext *s, arg_rmr *a) +{ + int egw_bytes = ZVKNED_EGS << s->sew; + return s->cfg_ptr->ext_zvkned == true && + require_rvv(s) && + vext_check_isa_ill(s) && + MAXSZ(s) >= egw_bytes && + require_align(a->rd, s->lmul) && + require_align(a->rs2, s->lmul) && + s->sew == MO_32; +} + +static bool vaes_check_overlap(DisasContext *s, int vd, int vs2) +{ + int8_t op_size = s->lmul <= 0 ? 1 : 1 << s->lmul; + return !is_overlapped(vd, op_size, vs2, 1); +} + +static bool vaes_check_vs(DisasContext *s, arg_rmr *a) +{ + int egw_bytes = ZVKNED_EGS << s->sew; + return vaes_check_overlap(s, a->rd, a->rs2) && + MAXSZ(s) >= egw_bytes && + s->cfg_ptr->ext_zvkned == true && + require_rvv(s) && + vext_check_isa_ill(s) && + require_align(a->rd, s->lmul) && + s->sew == MO_32; +} + +GEN_V_UNMASKED_TRANS(vaesef_vv, vaes_check_vv, ZVKNED_EGS) +GEN_V_UNMASKED_TRANS(vaesef_vs, vaes_check_vs, ZVKNED_EGS) +GEN_V_UNMASKED_TRANS(vaesdf_vv, vaes_check_vv, ZVKNED_EGS) +GEN_V_UNMASKED_TRANS(vaesdf_vs, vaes_check_vs, ZVKNED_EGS) +GEN_V_UNMASKED_TRANS(vaesdm_vv, vaes_check_vv, ZVKNED_EGS) +GEN_V_UNMASKED_TRANS(vaesdm_vs, vaes_check_vs, ZVKNED_EGS) +GEN_V_UNMASKED_TRANS(vaesz_vs, vaes_check_vs, ZVKNED_EGS) +GEN_V_UNMASKED_TRANS(vaesem_vv, vaes_check_vv, ZVKNED_EGS) +GEN_V_UNMASKED_TRANS(vaesem_vs, vaes_check_vs, ZVKNED_EGS) + +#define GEN_VI_UNMASKED_TRANS(NAME, CHECK, EGS) \ + static bool trans_##NAME(DisasContext *s, arg_##NAME *a) \ + { \ + if (CHECK(s, a)) { \ + TCGv_ptr rd_v, rs2_v; \ + TCGv_i32 uimm_v, desc; \ + uint32_t data = 0; \ + TCGLabel *over = gen_new_label(); \ + TCGLabel *vl_ok = gen_new_label(); \ + TCGLabel *vstart_ok = gen_new_label(); \ + TCGv_i32 tmp = tcg_temp_new_i32(); \ + \ + /* save opcode for unwinding in case we throw an exception */ \ + decode_save_opc(s); \ + \ + /* check (vl % EGS == 0) assuming it's power of 2 */ \ + tcg_gen_trunc_tl_i32(tmp, cpu_vl); \ + tcg_gen_andi_i32(tmp, tmp, EGS - 1); \ + tcg_gen_brcondi_i32(TCG_COND_EQ, tmp, 0, vl_ok); \ + gen_helper_restore_cpu_and_raise_exception( \ + cpu_env, tcg_constant_i32(RISCV_EXCP_ILLEGAL_INST)); \ + gen_set_label(vl_ok); \ + \ + /* check (vstart % EGS == 0) assuming it's power of 2 */ \ + tcg_gen_trunc_tl_i32(tmp, cpu_vstart); \ + tcg_gen_andi_i32(tmp, tmp, EGS - 1); \ + tcg_gen_brcondi_i32(TCG_COND_EQ, tmp, 0, vstart_ok); \ + gen_helper_restore_cpu_and_raise_exception( \ + cpu_env, tcg_constant_i32(RISCV_EXCP_ILLEGAL_INST)); \ + gen_set_label(vstart_ok); \ + \ + tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); \ + data = FIELD_DP32(data, VDATA, VM, a->vm); \ + data = FIELD_DP32(data, VDATA, LMUL, s->lmul); \ + data = FIELD_DP32(data, VDATA, VTA, s->vta); \ + data = FIELD_DP32(data, VDATA, VTA_ALL_1S, s->cfg_vta_all_1s); \ + data = FIELD_DP32(data, VDATA, VMA, s->vma); \ + \ + rd_v = tcg_temp_new_ptr(); \ + rs2_v = tcg_temp_new_ptr(); \ + uimm_v = tcg_constant_i32(a->rs1); \ + desc = tcg_constant_i32( \ + simd_desc(s->cfg_ptr->vlen / 8, s->cfg_ptr->vlen / 8, data)); \ + tcg_gen_addi_ptr(rd_v, cpu_env, vreg_ofs(s, a->rd)); \ + tcg_gen_addi_ptr(rs2_v, cpu_env, vreg_ofs(s, a->rs2)); \ + gen_helper_##NAME(rd_v, rs2_v, uimm_v, cpu_env, desc); \ + mark_vs_dirty(s); \ + gen_set_label(over); \ + return true; \ + } \ + return false; \ + } + +static bool vaeskf1_check(DisasContext *s, arg_vaeskf1_vi *a) +{ + int egw_bytes = ZVKNED_EGS << s->sew; + return s->cfg_ptr->ext_zvkned == true && + require_rvv(s) && + vext_check_isa_ill(s) && + MAXSZ(s) >= egw_bytes && + s->sew == MO_32 && + require_align(a->rd, s->lmul) && + require_align(a->rs2, s->lmul); +} + +static bool vaeskf2_check(DisasContext *s, arg_vaeskf2_vi *a) +{ + int egw_bytes = ZVKNED_EGS << s->sew; + return s->cfg_ptr->ext_zvkned == true && + require_rvv(s) && + vext_check_isa_ill(s) && + MAXSZ(s) >= egw_bytes && + s->sew == MO_32 && + require_align(a->rd, s->lmul) && + require_align(a->rs2, s->lmul); +} + +GEN_VI_UNMASKED_TRANS(vaeskf1_vi, vaeskf1_check, ZVKNED_EGS) +GEN_VI_UNMASKED_TRANS(vaeskf2_vi, vaeskf2_check, ZVKNED_EGS) diff --git a/target/riscv/op_helper.c b/target/riscv/op_helper.c index 9cdb9cdd06..a24bc3b1a3 100644 --- a/target/riscv/op_helper.c +++ b/target/riscv/op_helper.c @@ -39,6 +39,12 @@ void helper_raise_exception(CPURISCVState *env, uint32_t exception) riscv_raise_exception(env, exception, 0); } +void helper_restore_cpu_and_raise_exception(CPURISCVState *env, + uint32_t exception) +{ + riscv_raise_exception(env, exception, GETPC()); +} + target_ulong helper_csrr(CPURISCVState *env, int csr) { /* diff --git a/target/riscv/vcrypto_helper.c b/target/riscv/vcrypto_helper.c index 11239b59d6..0988eb74c8 100644 --- a/target/riscv/vcrypto_helper.c +++ b/target/riscv/vcrypto_helper.c @@ -22,6 +22,7 @@ #include "qemu/bitops.h" #include "qemu/bswap.h" #include "cpu.h" +#include "crypto/aes.h" #include "exec/memop.h" #include "exec/exec-all.h" #include "exec/helper-proto.h" @@ -195,3 +196,310 @@ RVVCALL(OPIVX2, vwsll_vx_w, WOP_UUU_W, H8, H4, DO_SLL) GEN_VEXT_VX(vwsll_vx_b, 2) GEN_VEXT_VX(vwsll_vx_h, 4) GEN_VEXT_VX(vwsll_vx_w, 8) + +static inline void aes_sub_bytes(uint8_t round_state[4][4]) +{ + for (int j = 0; j < 16; j++) { + round_state[j / 4][j % 4] = AES_sbox[round_state[j / 4][j % 4]]; + } +} + +static inline void aes_shift_bytes(uint8_t round_state[4][4]) +{ + uint8_t temp; + temp = round_state[0][1]; + round_state[0][1] = round_state[1][1]; + round_state[1][1] = round_state[2][1]; + round_state[2][1] = round_state[3][1]; + round_state[3][1] = temp; + temp = round_state[0][2]; + round_state[0][2] = round_state[2][2]; + round_state[2][2] = temp; + temp = round_state[1][2]; + round_state[1][2] = round_state[3][2]; + round_state[3][2] = temp; + temp = round_state[0][3]; + round_state[0][3] = round_state[3][3]; + round_state[3][3] = round_state[2][3]; + round_state[2][3] = round_state[1][3]; + round_state[1][3] = temp; +} + +static inline void xor_round_key(uint8_t round_state[4][4], uint8_t *round_key) +{ + for (int j = 0; j < 16; j++) { + round_state[j / 4][j % 4] = round_state[j / 4][j % 4] ^ (round_key)[j]; + } +} + +static inline void aes_inv_sub_bytes(uint8_t round_state[4][4]) +{ + for (int j = 0; j < 16; j++) { + round_state[j / 4][j % 4] = AES_isbox[round_state[j / 4][j % 4]]; + } +} + +static inline void aes_inv_shift_bytes(uint8_t round_state[4][4]) +{ + uint8_t temp; + temp = round_state[3][1]; + round_state[3][1] = round_state[2][1]; + round_state[2][1] = round_state[1][1]; + round_state[1][1] = round_state[0][1]; + round_state[0][1] = temp; + temp = round_state[0][2]; + round_state[0][2] = round_state[2][2]; + round_state[2][2] = temp; + temp = round_state[1][2]; + round_state[1][2] = round_state[3][2]; + round_state[3][2] = temp; + temp = round_state[0][3]; + round_state[0][3] = round_state[1][3]; + round_state[1][3] = round_state[2][3]; + round_state[2][3] = round_state[3][3]; + round_state[3][3] = temp; +} + +static inline uint8_t xtime(uint8_t x) +{ + return (x << 1) ^ (((x >> 7) & 1) * 0x1b); +} + +static inline uint8_t multiply(uint8_t x, uint8_t y) +{ + return (((y & 1) * x) ^ ((y >> 1 & 1) * xtime(x)) ^ + ((y >> 2 & 1) * xtime(xtime(x))) ^ + ((y >> 3 & 1) * xtime(xtime(xtime(x)))) ^ + ((y >> 4 & 1) * xtime(xtime(xtime(xtime(x)))))); +} + +static inline void aes_inv_mix_cols(uint8_t round_state[4][4]) +{ + uint8_t a, b, c, d; + for (int j = 0; j < 4; ++j) { + a = round_state[j][0]; + b = round_state[j][1]; + c = round_state[j][2]; + d = round_state[j][3]; + round_state[j][0] = multiply(a, 0x0e) ^ multiply(b, 0x0b) ^ + multiply(c, 0x0d) ^ multiply(d, 0x09); + round_state[j][1] = multiply(a, 0x09) ^ multiply(b, 0x0e) ^ + multiply(c, 0x0b) ^ multiply(d, 0x0d); + round_state[j][2] = multiply(a, 0x0d) ^ multiply(b, 0x09) ^ + multiply(c, 0x0e) ^ multiply(d, 0x0b); + round_state[j][3] = multiply(a, 0x0b) ^ multiply(b, 0x0d) ^ + multiply(c, 0x09) ^ multiply(d, 0x0e); + } +} + +static inline void aes_mix_cols(uint8_t round_state[4][4]) +{ + uint8_t a, b; + for (int j = 0; j < 4; ++j) { + a = round_state[j][0]; + b = round_state[j][0] ^ round_state[j][1] ^ round_state[j][2] ^ + round_state[j][3]; + round_state[j][0] ^= xtime(round_state[j][0] ^ round_state[j][1]) ^ b; + round_state[j][1] ^= xtime(round_state[j][1] ^ round_state[j][2]) ^ b; + round_state[j][2] ^= xtime(round_state[j][2] ^ round_state[j][3]) ^ b; + round_state[j][3] ^= xtime(round_state[j][3] ^ a) ^ b; + } +} + +#define GEN_ZVKNED_HELPER_VV(NAME, ...) \ + void HELPER(NAME)(void *vd_vptr, void *vs2_vptr, CPURISCVState *env, \ + uint32_t desc) \ + { \ + uint64_t *vd = vd_vptr; \ + uint64_t *vs2 = vs2_vptr; \ + uint32_t vl = env->vl; \ + uint32_t total_elems = vext_get_total_elems(env, desc, 4); \ + uint32_t vta = vext_vta(desc); \ + \ + for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) { \ + uint64_t round_key[2] = { \ + cpu_to_le64(vs2[i * 2 + 0]), \ + cpu_to_le64(vs2[i * 2 + 1]), \ + }; \ + uint8_t round_state[4][4]; \ + cpu_to_le64s(vd + i * 2 + 0); \ + cpu_to_le64s(vd + i * 2 + 1); \ + for (int j = 0; j < 16; j++) { \ + round_state[j / 4][j % 4] = ((uint8_t *)(vd + i * 2))[j]; \ + } \ + __VA_ARGS__; \ + for (int j = 0; j < 16; j++) { \ + ((uint8_t *)(vd + i * 2))[j] = round_state[j / 4][j % 4]; \ + } \ + le64_to_cpus(vd + i * 2 + 0); \ + le64_to_cpus(vd + i * 2 + 1); \ + } \ + env->vstart = 0; \ + /* set tail elements to 1s */ \ + vext_set_elems_1s(vd, vta, vl * 4, total_elems * 4); \ + } + +#define GEN_ZVKNED_HELPER_VS(NAME, ...) \ + void HELPER(NAME)(void *vd_vptr, void *vs2_vptr, CPURISCVState *env, \ + uint32_t desc) \ + { \ + uint64_t *vd = vd_vptr; \ + uint64_t *vs2 = vs2_vptr; \ + uint32_t vl = env->vl; \ + uint32_t total_elems = vext_get_total_elems(env, desc, 4); \ + uint32_t vta = vext_vta(desc); \ + \ + for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) { \ + uint64_t round_key[2] = { \ + cpu_to_le64(vs2[0]), \ + cpu_to_le64(vs2[1]), \ + }; \ + uint8_t round_state[4][4]; \ + cpu_to_le64s(vd + i * 2 + 0); \ + cpu_to_le64s(vd + i * 2 + 1); \ + for (int j = 0; j < 16; j++) { \ + round_state[j / 4][j % 4] = ((uint8_t *)(vd + i * 2))[j]; \ + } \ + __VA_ARGS__; \ + for (int j = 0; j < 16; j++) { \ + ((uint8_t *)(vd + i * 2))[j] = round_state[j / 4][j % 4]; \ + } \ + le64_to_cpus(vd + i * 2 + 0); \ + le64_to_cpus(vd + i * 2 + 1); \ + } \ + env->vstart = 0; \ + /* set tail elements to 1s */ \ + vext_set_elems_1s(vd, vta, vl * 4, total_elems * 4); \ + } + +GEN_ZVKNED_HELPER_VV(vaesef_vv, aes_sub_bytes(round_state); + aes_shift_bytes(round_state); + xor_round_key(round_state, (uint8_t *)round_key);) +GEN_ZVKNED_HELPER_VS(vaesef_vs, aes_sub_bytes(round_state); + aes_shift_bytes(round_state); + xor_round_key(round_state, (uint8_t *)round_key);) +GEN_ZVKNED_HELPER_VV(vaesdf_vv, aes_inv_shift_bytes(round_state); + aes_inv_sub_bytes(round_state); + xor_round_key(round_state, (uint8_t *)round_key);) +GEN_ZVKNED_HELPER_VS(vaesdf_vs, aes_inv_shift_bytes(round_state); + aes_inv_sub_bytes(round_state); + xor_round_key(round_state, (uint8_t *)round_key);) +GEN_ZVKNED_HELPER_VV(vaesem_vv, aes_shift_bytes(round_state); + aes_sub_bytes(round_state); aes_mix_cols(round_state); + xor_round_key(round_state, (uint8_t *)round_key);) +GEN_ZVKNED_HELPER_VS(vaesem_vs, aes_shift_bytes(round_state); + aes_sub_bytes(round_state); aes_mix_cols(round_state); + xor_round_key(round_state, (uint8_t *)round_key);) +GEN_ZVKNED_HELPER_VV(vaesdm_vv, aes_inv_shift_bytes(round_state); + aes_inv_sub_bytes(round_state); + xor_round_key(round_state, (uint8_t *)round_key); + aes_inv_mix_cols(round_state);) +GEN_ZVKNED_HELPER_VS(vaesdm_vs, aes_inv_shift_bytes(round_state); + aes_inv_sub_bytes(round_state); + xor_round_key(round_state, (uint8_t *)round_key); + aes_inv_mix_cols(round_state);) +GEN_ZVKNED_HELPER_VS(vaesz_vs, + xor_round_key(round_state, (uint8_t *)round_key);) + +void HELPER(vaeskf1_vi)(void *vd_vptr, void *vs2_vptr, uint32_t uimm, + CPURISCVState *env, uint32_t desc) +{ + uint32_t *vd = vd_vptr; + uint32_t *vs2 = vs2_vptr; + uint32_t vl = env->vl; + uint32_t total_elems = vext_get_total_elems(env, desc, 4); + uint32_t vta = vext_vta(desc); + + uimm &= 0b1111; + if (uimm > 10 || uimm == 0) { + uimm ^= 0b1000; + } + + for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) { + uint32_t rk[8]; + static const uint32_t rcon[] = { + 0x01000000, 0x02000000, 0x04000000, 0x08000000, 0x10000000, + 0x20000000, 0x40000000, 0x80000000, 0x1B000000, 0x36000000, + }; + + rk[0] = bswap32(vs2[i * 4 + H4(0)]); + rk[1] = bswap32(vs2[i * 4 + H4(1)]); + rk[2] = bswap32(vs2[i * 4 + H4(2)]); + rk[3] = bswap32(vs2[i * 4 + H4(3)]); + + rk[4] = rk[0] ^ (AES_Te4[(rk[3] >> 16) & 0xff] & 0xff000000) ^ + (AES_Te4[(rk[3] >> 8) & 0xff] & 0x00ff0000) ^ + (AES_Te4[(rk[3] >> 0) & 0xff] & 0x0000ff00) ^ + (AES_Te4[(rk[3] >> 24) & 0xff] & 0x000000ff) ^ rcon[uimm - 1]; + rk[5] = rk[1] ^ rk[4]; + rk[6] = rk[2] ^ rk[5]; + rk[7] = rk[3] ^ rk[6]; + + vd[i * 4 + H4(0)] = bswap32(rk[4]); + vd[i * 4 + H4(1)] = bswap32(rk[5]); + vd[i * 4 + H4(2)] = bswap32(rk[6]); + vd[i * 4 + H4(3)] = bswap32(rk[7]); + } + env->vstart = 0; + /* set tail elements to 1s */ + vext_set_elems_1s(vd, vta, vl * 4, total_elems * 4); +} + +void HELPER(vaeskf2_vi)(void *vd_vptr, void *vs2_vptr, uint32_t uimm, + CPURISCVState *env, uint32_t desc) +{ + uint32_t *vd = vd_vptr; + uint32_t *vs2 = vs2_vptr; + uint32_t vl = env->vl; + uint32_t total_elems = vext_get_total_elems(env, desc, 4); + uint32_t vta = vext_vta(desc); + + uimm &= 0b1111; + if (uimm > 14 || uimm < 2) { + uimm ^= 0b1000; + } + + for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) { + uint32_t rk[12]; + static const uint32_t rcon[] = { + 0x01000000, 0x02000000, 0x04000000, 0x08000000, 0x10000000, + 0x20000000, 0x40000000, 0x80000000, 0x1B000000, 0x36000000, + }; + + rk[0] = bswap32(vd[i * 4 + H4(0)]); + rk[1] = bswap32(vd[i * 4 + H4(1)]); + rk[2] = bswap32(vd[i * 4 + H4(2)]); + rk[3] = bswap32(vd[i * 4 + H4(3)]); + rk[4] = bswap32(vs2[i * 4 + H4(0)]); + rk[5] = bswap32(vs2[i * 4 + H4(1)]); + rk[6] = bswap32(vs2[i * 4 + H4(2)]); + rk[7] = bswap32(vs2[i * 4 + H4(3)]); + + if (uimm % 2 == 0) { + rk[8] = rk[0] ^ (AES_Te4[(rk[7] >> 16) & 0xff] & 0xff000000) ^ + (AES_Te4[(rk[7] >> 8) & 0xff] & 0x00ff0000) ^ + (AES_Te4[(rk[7] >> 0) & 0xff] & 0x0000ff00) ^ + (AES_Te4[(rk[7] >> 24) & 0xff] & 0x000000ff) ^ + rcon[(uimm - 1) / 2]; + rk[9] = rk[1] ^ rk[8]; + rk[10] = rk[2] ^ rk[9]; + rk[11] = rk[3] ^ rk[10]; + } else { + rk[8] = rk[0] ^ (AES_Te4[(rk[7] >> 24) & 0xff] & 0xff000000) ^ + (AES_Te4[(rk[7] >> 16) & 0xff] & 0x00ff0000) ^ + (AES_Te4[(rk[7] >> 8) & 0xff] & 0x0000ff00) ^ + (AES_Te4[(rk[7] >> 0) & 0xff] & 0x000000ff); + rk[9] = rk[1] ^ rk[8]; + rk[10] = rk[2] ^ rk[9]; + rk[11] = rk[3] ^ rk[10]; + } + + vd[i * 4 + H4(0)] = bswap32(rk[8]); + vd[i * 4 + H4(1)] = bswap32(rk[9]); + vd[i * 4 + H4(2)] = bswap32(rk[10]); + vd[i * 4 + H4(3)] = bswap32(rk[11]); + } + env->vstart = 0; + /* set tail elements to 1s */ + vext_set_elems_1s(vd, vta, vl * 4, total_elems * 4); +} From patchwork Thu Jun 22 16:16:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Max Chou X-Patchwork-Id: 1798522 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=sifive.com header.i=@sifive.com header.a=rsa-sha256 header.s=google header.b=IExoCznk; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4Qn5G86vmGz20XS for ; Fri, 23 Jun 2023 02:21:04 +1000 (AEST) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qCN17-0002Vs-2g; Thu, 22 Jun 2023 12:18:41 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qCN11-0002Md-EX for qemu-devel@nongnu.org; Thu, 22 Jun 2023 12:18:36 -0400 Received: from mail-pl1-x635.google.com ([2607:f8b0:4864:20::635]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qCN0w-0005c8-3m for qemu-devel@nongnu.org; Thu, 22 Jun 2023 12:18:32 -0400 Received: by mail-pl1-x635.google.com with SMTP id d9443c01a7336-1b6824141b4so6003065ad.1 for ; Thu, 22 Jun 2023 09:18:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sifive.com; s=google; t=1687450708; x=1690042708; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=/TvPXNpXsW/Z1/sP4Ug9xQf1391E6wfenaD6fhdpwWI=; b=IExoCznk5zIGnrUO9vZva7f4IbDsl30nmWnuc/CwEj/Mv5a814PXPqd0YwHf0Vrpe7 XQhPe+b99iLz2tQet1kyFCOV1talVodRVoOFu4KfseaTEgjhN/Ey829G3Brlg+jxZBv5 Zy14FTySV1qOWiuLPPDJ4a8BGuhQ1XjxuTixPKWrETng6jFw/oJ6QRYFxuCfyu2ZZs/g qnCCopyfjXeONXIOKeUEJpIXgrGbCqrzv/08v07OZnMJmENNoy+023FWGLuqFP6YyLSd jLpcY7pm1g8mPvhn+u00roIcUOwmTy3tO3Swbgo6MW/lkDvjrA60mZfLlFi2S/nhqt8a BvbA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687450708; x=1690042708; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=/TvPXNpXsW/Z1/sP4Ug9xQf1391E6wfenaD6fhdpwWI=; b=kPvnvSc9o7yRl0SEuX/893Y7GMTesIB1alsYrd+bSkYRCSCoGbM50qgLB7BQWDj6T6 vWobEORf5FJyPJ6S7IV/BT642p4Z9LHjH0gsF/oAtzMS/ip3FCfYNsE+9uwnZQbbMRt9 tvlnc6O01W8JixPR6C9Ff0Vd75IlcddXI4bN2VFGngtcZ/i8v40yk5PLMEmw1+knkcOU pRKamCLaO5eTn9xFLidLVNiIpZwMnjRKrJuqnf1RrZ1xj2iqRUfXQ+pB94OkgYfYe4Ch MVQQoZKdwjZmTuvceQDVSQJS+O8AGqY+z1ygRcTD/0kyhhLrXnYpIgEny0it6MhHadgR ccEQ== X-Gm-Message-State: AC+VfDyeZXC0Q39mHK2oRQcziFGGMSXwCMDRvLS4+bhdEPdK1KgRSM/7 xQedVKv8qRYBHF36Pn0rUzQweBpMYRyofmUPZmmDG63xWLlDx4B+7lrMLABKeeXaNqtZ2E1uInT udZQruVE1tw3uL4ouJ0kRAYfo0H5b/5iojbIfCfY2oK9NtCGjnNA8zTApMKh5NMrz1NNJFKW7Uv g8 X-Google-Smtp-Source: ACHHUZ44CZenWi7Q1uHV4xcvgoQEBJ58l+NQ8ZzyHLR+9wApGuui8NqELUDIs/5pkc0MwbzRpbc7ug== X-Received: by 2002:a17:902:ea0a:b0:1b5:2ca9:f71b with SMTP id s10-20020a170902ea0a00b001b52ca9f71bmr26977066plg.11.1687450707429; Thu, 22 Jun 2023 09:18:27 -0700 (PDT) Received: from MaxdeMBP.localdomain (125-228-20-175.hinet-ip.hinet.net. [125.228.20.175]) by smtp.gmail.com with ESMTPSA id c1-20020a170902d48100b001b3f039f8a8sm5609683plg.61.2023.06.22.09.18.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 09:18:27 -0700 (PDT) From: Max Chou To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Cc: dbarboza@ventanamicro.com, Kiran Ostrolenk , Nazar Kazakov , Lawrence Hunter , Max Chou , Palmer Dabbelt , Alistair Francis , Bin Meng , Weiwei Li , Liu Zhiwei , William Salmon Subject: [PATCH v4 11/17] target/riscv: Add Zvknh ISA extension support Date: Fri, 23 Jun 2023 00:16:27 +0800 Message-Id: <20230622161646.32005-12-max.chou@sifive.com> X-Mailer: git-send-email 2.39.2 (Apple Git-143) In-Reply-To: <20230622161646.32005-1-max.chou@sifive.com> References: <20230622161646.32005-1-max.chou@sifive.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::635; envelope-from=max.chou@sifive.com; helo=mail-pl1-x635.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org From: Kiran Ostrolenk This commit adds support for the Zvknh vector-crypto extension, which consists of the following instructions: * vsha2ms.vv * vsha2c[hl].vv Translation functions are defined in `target/riscv/insn_trans/trans_rvvk.c.inc` and helpers are defined in `target/riscv/vcrypto_helper.c`. Co-authored-by: Nazar Kazakov Co-authored-by: Lawrence Hunter [max.chou@sifive.com: Replaced vstart checking by TCG op] Signed-off-by: Nazar Kazakov Signed-off-by: Lawrence Hunter Signed-off-by: Kiran Ostrolenk Signed-off-by: Max Chou Reviewed-by: Daniel Henrique Barboza --- target/riscv/cpu.c | 11 +- target/riscv/cpu_cfg.h | 2 + target/riscv/helper.h | 4 + target/riscv/insn32.decode | 5 + target/riscv/insn_trans/trans_rvvk.c.inc | 78 +++++++++ target/riscv/vcrypto_helper.c | 214 +++++++++++++++++++++++ 6 files changed, 311 insertions(+), 3 deletions(-) diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c index b6c755ba13..6bba8ba8c9 100644 --- a/target/riscv/cpu.c +++ b/target/riscv/cpu.c @@ -119,6 +119,8 @@ static const struct isa_ext_data isa_edata_arr[] = { ISA_EXT_DATA_ENTRY(zvfh, PRIV_VERSION_1_12_0, ext_zvfh), ISA_EXT_DATA_ENTRY(zvfhmin, PRIV_VERSION_1_12_0, ext_zvfhmin), ISA_EXT_DATA_ENTRY(zvkned, PRIV_VERSION_1_12_0, ext_zvkned), + ISA_EXT_DATA_ENTRY(zvknha, PRIV_VERSION_1_12_0, ext_zvknha), + ISA_EXT_DATA_ENTRY(zvknhb, PRIV_VERSION_1_12_0, ext_zvknhb), ISA_EXT_DATA_ENTRY(zhinx, PRIV_VERSION_1_12_0, ext_zhinx), ISA_EXT_DATA_ENTRY(zhinxmin, PRIV_VERSION_1_12_0, ext_zhinxmin), ISA_EXT_DATA_ENTRY(smaia, PRIV_VERSION_1_12_0, ext_smaia), @@ -1195,14 +1197,17 @@ void riscv_cpu_validate_set_extensions(RISCVCPU *cpu, Error **errp) * In principle Zve*x would also suffice here, were they supported * in qemu */ - if ((cpu->cfg.ext_zvbb || cpu->cfg.ext_zvkned) && !cpu->cfg.ext_zve32f) { + if ((cpu->cfg.ext_zvbb || cpu->cfg.ext_zvkned || cpu->cfg.ext_zvknha) && + !cpu->cfg.ext_zve32f) { error_setg(errp, "Vector crypto extensions require V or Zve* extensions"); return; } - if (cpu->cfg.ext_zvbc && !cpu->cfg.ext_zve64f) { - error_setg(errp, "Zvbc extension requires V or Zve64{f,d} extensions"); + if ((cpu->cfg.ext_zvbc || cpu->cfg.ext_zvknhb) && !cpu->cfg.ext_zve64f) { + error_setg( + errp, + "Zvbc and Zvknhb extensions require V or Zve64{f,d} extensions"); return; } diff --git a/target/riscv/cpu_cfg.h b/target/riscv/cpu_cfg.h index 4636d4c84d..41cce87ffc 100644 --- a/target/riscv/cpu_cfg.h +++ b/target/riscv/cpu_cfg.h @@ -86,6 +86,8 @@ struct RISCVCPUConfig { bool ext_zvbb; bool ext_zvbc; bool ext_zvkned; + bool ext_zvknha; + bool ext_zvknhb; bool ext_zmmul; bool ext_zvfh; bool ext_zvfhmin; diff --git a/target/riscv/helper.h b/target/riscv/helper.h index 738f20d3ca..19f5a8a28d 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1234,3 +1234,7 @@ DEF_HELPER_4(vaesdm_vs, void, ptr, ptr, env, i32) DEF_HELPER_4(vaesz_vs, void, ptr, ptr, env, i32) DEF_HELPER_5(vaeskf1_vi, void, ptr, ptr, i32, env, i32) DEF_HELPER_5(vaeskf2_vi, void, ptr, ptr, i32, env, i32) + +DEF_HELPER_5(vsha2ms_vv, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(vsha2ch_vv, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(vsha2cl_vv, void, ptr, ptr, ptr, env, i32) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index 7e0295d493..d2cfb2729c 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -948,3 +948,8 @@ vaesdm_vs 101001 1 ..... 00000 010 ..... 1110111 @r2_vm_1 vaesz_vs 101001 1 ..... 00111 010 ..... 1110111 @r2_vm_1 vaeskf1_vi 100010 1 ..... ..... 010 ..... 1110111 @r_vm_1 vaeskf2_vi 101010 1 ..... ..... 010 ..... 1110111 @r_vm_1 + +# *** Zvknh vector crypto extension *** +vsha2ms_vv 101101 1 ..... ..... 010 ..... 1110111 @r_vm_1 +vsha2ch_vv 101110 1 ..... ..... 010 ..... 1110111 @r_vm_1 +vsha2cl_vv 101111 1 ..... ..... 010 ..... 1110111 @r_vm_1 diff --git a/target/riscv/insn_trans/trans_rvvk.c.inc b/target/riscv/insn_trans/trans_rvvk.c.inc index c618f76e7e..528a0d3b32 100644 --- a/target/riscv/insn_trans/trans_rvvk.c.inc +++ b/target/riscv/insn_trans/trans_rvvk.c.inc @@ -401,3 +401,81 @@ static bool vaeskf2_check(DisasContext *s, arg_vaeskf2_vi *a) GEN_VI_UNMASKED_TRANS(vaeskf1_vi, vaeskf1_check, ZVKNED_EGS) GEN_VI_UNMASKED_TRANS(vaeskf2_vi, vaeskf2_check, ZVKNED_EGS) + +/* + * Zvknh + */ + +#define ZVKNH_EGS 4 + +#define GEN_VV_UNMASKED_TRANS(NAME, CHECK, EGS) \ + static bool trans_##NAME(DisasContext *s, arg_rmrr *a) \ + { \ + if (CHECK(s, a)) { \ + uint32_t data = 0; \ + TCGLabel *over = gen_new_label(); \ + TCGLabel *vl_ok = gen_new_label(); \ + TCGLabel *vstart_ok = gen_new_label(); \ + TCGv_i32 tmp = tcg_temp_new_i32(); \ + \ + /* save opcode for unwinding in case we throw an exception */ \ + decode_save_opc(s); \ + \ + /* check (vl % EGS == 0) assuming it's power of 2 */ \ + tcg_gen_trunc_tl_i32(tmp, cpu_vl); \ + tcg_gen_andi_i32(tmp, tmp, EGS - 1); \ + tcg_gen_brcondi_i32(TCG_COND_EQ, tmp, 0, vl_ok); \ + gen_helper_restore_cpu_and_raise_exception( \ + cpu_env, tcg_constant_i32(RISCV_EXCP_ILLEGAL_INST)); \ + gen_set_label(vl_ok); \ + \ + /* check (vstart % EGS == 0) assuming it's power of 2 */ \ + tcg_gen_trunc_tl_i32(tmp, cpu_vstart); \ + tcg_gen_andi_i32(tmp, tmp, EGS - 1); \ + tcg_gen_brcondi_i32(TCG_COND_EQ, tmp, 0, vstart_ok); \ + gen_helper_restore_cpu_and_raise_exception( \ + cpu_env, tcg_constant_i32(RISCV_EXCP_ILLEGAL_INST)); \ + gen_set_label(vstart_ok); \ + \ + tcg_gen_brcond_tl(TCG_COND_GEU, cpu_vstart, cpu_vl, over); \ + \ + data = FIELD_DP32(data, VDATA, VM, a->vm); \ + data = FIELD_DP32(data, VDATA, LMUL, s->lmul); \ + data = FIELD_DP32(data, VDATA, VTA, s->vta); \ + data = FIELD_DP32(data, VDATA, VTA_ALL_1S, s->cfg_vta_all_1s); \ + data = FIELD_DP32(data, VDATA, VMA, s->vma); \ + \ + tcg_gen_gvec_3_ptr(vreg_ofs(s, a->rd), vreg_ofs(s, a->rs1), \ + vreg_ofs(s, a->rs2), cpu_env, \ + s->cfg_ptr->vlen / 8, s->cfg_ptr->vlen / 8, \ + data, gen_helper_##NAME); \ + \ + mark_vs_dirty(s); \ + gen_set_label(over); \ + return true; \ + } \ + return false; \ + } + +static bool vsha_check_sew(DisasContext *s) +{ + return (s->cfg_ptr->ext_zvknha == true && s->sew == MO_32) || + (s->cfg_ptr->ext_zvknhb == true && + (s->sew == MO_32 || s->sew == MO_64)); +} + +static bool vsha_check(DisasContext *s, arg_rmrr *a) +{ + int egw_bytes = ZVKNH_EGS << s->sew; + int mult = 1 << MAX(s->lmul, 0); + return opivv_check(s, a) && + vsha_check_sew(s) && + MAXSZ(s) >= egw_bytes && + !is_overlapped(a->rd, mult, a->rs1, mult) && + !is_overlapped(a->rd, mult, a->rs2, mult) && + s->lmul >= 0; +} + +GEN_VV_UNMASKED_TRANS(vsha2ms_vv, vsha_check, ZVKNH_EGS) +GEN_VV_UNMASKED_TRANS(vsha2cl_vv, vsha_check, ZVKNH_EGS) +GEN_VV_UNMASKED_TRANS(vsha2ch_vv, vsha_check, ZVKNH_EGS) diff --git a/target/riscv/vcrypto_helper.c b/target/riscv/vcrypto_helper.c index 0988eb74c8..ca09062c6c 100644 --- a/target/riscv/vcrypto_helper.c +++ b/target/riscv/vcrypto_helper.c @@ -503,3 +503,217 @@ void HELPER(vaeskf2_vi)(void *vd_vptr, void *vs2_vptr, uint32_t uimm, /* set tail elements to 1s */ vext_set_elems_1s(vd, vta, vl * 4, total_elems * 4); } + +static inline uint32_t sig0_sha256(uint32_t x) +{ + return ror32(x, 7) ^ ror32(x, 18) ^ (x >> 3); +} + +static inline uint32_t sig1_sha256(uint32_t x) +{ + return ror32(x, 17) ^ ror32(x, 19) ^ (x >> 10); +} + +static inline uint64_t sig0_sha512(uint64_t x) +{ + return ror64(x, 1) ^ ror64(x, 8) ^ (x >> 7); +} + +static inline uint64_t sig1_sha512(uint64_t x) +{ + return ror64(x, 19) ^ ror64(x, 61) ^ (x >> 6); +} + +static inline void vsha2ms_e32(uint32_t *vd, uint32_t *vs1, uint32_t *vs2) +{ + uint32_t res[4]; + res[0] = sig1_sha256(vs1[H4(2)]) + vs2[H4(1)] + sig0_sha256(vd[H4(1)]) + + vd[H4(0)]; + res[1] = sig1_sha256(vs1[H4(3)]) + vs2[H4(2)] + sig0_sha256(vd[H4(2)]) + + vd[H4(1)]; + res[2] = + sig1_sha256(res[0]) + vs2[H4(3)] + sig0_sha256(vd[H4(3)]) + vd[H4(2)]; + res[3] = + sig1_sha256(res[1]) + vs1[H4(0)] + sig0_sha256(vs2[H4(0)]) + vd[H4(3)]; + vd[H4(3)] = res[3]; + vd[H4(2)] = res[2]; + vd[H4(1)] = res[1]; + vd[H4(0)] = res[0]; +} + +static inline void vsha2ms_e64(uint64_t *vd, uint64_t *vs1, uint64_t *vs2) +{ + uint64_t res[4]; + res[0] = sig1_sha512(vs1[2]) + vs2[1] + sig0_sha512(vd[1]) + vd[0]; + res[1] = sig1_sha512(vs1[3]) + vs2[2] + sig0_sha512(vd[2]) + vd[1]; + res[2] = sig1_sha512(res[0]) + vs2[3] + sig0_sha512(vd[3]) + vd[2]; + res[3] = sig1_sha512(res[1]) + vs1[0] + sig0_sha512(vs2[0]) + vd[3]; + vd[3] = res[3]; + vd[2] = res[2]; + vd[1] = res[1]; + vd[0] = res[0]; +} + +void HELPER(vsha2ms_vv)(void *vd, void *vs1, void *vs2, CPURISCVState *env, + uint32_t desc) +{ + uint32_t sew = FIELD_EX64(env->vtype, VTYPE, VSEW); + uint32_t esz = sew == MO_32 ? 4 : 8; + uint32_t total_elems; + uint32_t vta = vext_vta(desc); + + for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) { + if (sew == MO_32) { + vsha2ms_e32(((uint32_t *)vd) + i * 4, ((uint32_t *)vs1) + i * 4, + ((uint32_t *)vs2) + i * 4); + } else { + /* If not 32 then SEW should be 64 */ + vsha2ms_e64(((uint64_t *)vd) + i * 4, ((uint64_t *)vs1) + i * 4, + ((uint64_t *)vs2) + i * 4); + } + } + /* set tail elements to 1s */ + total_elems = vext_get_total_elems(env, desc, esz); + vext_set_elems_1s(vd, vta, env->vl * esz, total_elems * esz); + env->vstart = 0; +} + +static inline uint64_t sum0_64(uint64_t x) +{ + return ror64(x, 28) ^ ror64(x, 34) ^ ror64(x, 39); +} + +static inline uint32_t sum0_32(uint32_t x) +{ + return ror32(x, 2) ^ ror32(x, 13) ^ ror32(x, 22); +} + +static inline uint64_t sum1_64(uint64_t x) +{ + return ror64(x, 14) ^ ror64(x, 18) ^ ror64(x, 41); +} + +static inline uint32_t sum1_32(uint32_t x) +{ + return ror32(x, 6) ^ ror32(x, 11) ^ ror32(x, 25); +} + +#define ch(x, y, z) ((x & y) ^ ((~x) & z)) + +#define maj(x, y, z) ((x & y) ^ (x & z) ^ (y & z)) + +static void vsha2c_64(uint64_t *vs2, uint64_t *vd, uint64_t *vs1) +{ + uint64_t a = vs2[3], b = vs2[2], e = vs2[1], f = vs2[0]; + uint64_t c = vd[3], d = vd[2], g = vd[1], h = vd[0]; + uint64_t W0 = vs1[0], W1 = vs1[1]; + uint64_t T1 = h + sum1_64(e) + ch(e, f, g) + W0; + uint64_t T2 = sum0_64(a) + maj(a, b, c); + + h = g; + g = f; + f = e; + e = d + T1; + d = c; + c = b; + b = a; + a = T1 + T2; + + T1 = h + sum1_64(e) + ch(e, f, g) + W1; + T2 = sum0_64(a) + maj(a, b, c); + h = g; + g = f; + f = e; + e = d + T1; + d = c; + c = b; + b = a; + a = T1 + T2; + + vd[0] = f; + vd[1] = e; + vd[2] = b; + vd[3] = a; +} + +static void vsha2c_32(uint32_t *vs2, uint32_t *vd, uint32_t *vs1) +{ + uint32_t a = vs2[H4(3)], b = vs2[H4(2)], e = vs2[H4(1)], f = vs2[H4(0)]; + uint32_t c = vd[H4(3)], d = vd[H4(2)], g = vd[H4(1)], h = vd[H4(0)]; + uint32_t W0 = vs1[H4(0)], W1 = vs1[H4(1)]; + uint32_t T1 = h + sum1_32(e) + ch(e, f, g) + W0; + uint32_t T2 = sum0_32(a) + maj(a, b, c); + + h = g; + g = f; + f = e; + e = d + T1; + d = c; + c = b; + b = a; + a = T1 + T2; + + T1 = h + sum1_32(e) + ch(e, f, g) + W1; + T2 = sum0_32(a) + maj(a, b, c); + h = g; + g = f; + f = e; + e = d + T1; + d = c; + c = b; + b = a; + a = T1 + T2; + + vd[H4(0)] = f; + vd[H4(1)] = e; + vd[H4(2)] = b; + vd[H4(3)] = a; +} + +void HELPER(vsha2ch_vv)(void *vd, void *vs1, void *vs2, CPURISCVState *env, + uint32_t desc) +{ + uint32_t sew = FIELD_EX64(env->vtype, VTYPE, VSEW); + uint32_t esz = sew == MO_64 ? 8 : 4; + uint32_t total_elems; + uint32_t vta = vext_vta(desc); + + for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) { + if (sew == MO_64) { + vsha2c_64(((uint64_t *)vs2) + 4 * i, ((uint64_t *)vd) + 4 * i, + ((uint64_t *)vs1) + 4 * i + 2); + } else { + vsha2c_32(((uint32_t *)vs2) + 4 * i, ((uint32_t *)vd) + 4 * i, + ((uint32_t *)vs1) + 4 * i + 2); + } + } + + /* set tail elements to 1s */ + total_elems = vext_get_total_elems(env, desc, esz); + vext_set_elems_1s(vd, vta, env->vl * esz, total_elems * esz); + env->vstart = 0; +} + +void HELPER(vsha2cl_vv)(void *vd, void *vs1, void *vs2, CPURISCVState *env, + uint32_t desc) +{ + uint32_t sew = FIELD_EX64(env->vtype, VTYPE, VSEW); + uint32_t esz = sew == MO_64 ? 8 : 4; + uint32_t total_elems; + uint32_t vta = vext_vta(desc); + + for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) { + if (sew == MO_64) { + vsha2c_64(((uint64_t *)vs2) + 4 * i, ((uint64_t *)vd) + 4 * i, + (((uint64_t *)vs1) + 4 * i)); + } else { + vsha2c_32(((uint32_t *)vs2) + 4 * i, ((uint32_t *)vd) + 4 * i, + (((uint32_t *)vs1) + 4 * i)); + } + } + + /* set tail elements to 1s */ + total_elems = vext_get_total_elems(env, desc, esz); + vext_set_elems_1s(vd, vta, env->vl * esz, total_elems * esz); + env->vstart = 0; +} From patchwork Thu Jun 22 16:16:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Max Chou X-Patchwork-Id: 1798520 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=sifive.com header.i=@sifive.com header.a=rsa-sha256 header.s=google header.b=kuecoTZ7; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4Qn5Ft6hBtz20XS for ; Fri, 23 Jun 2023 02:20:50 +1000 (AEST) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qCN1F-0003Qw-66; Thu, 22 Jun 2023 12:18:49 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qCN14-0002Pv-92 for qemu-devel@nongnu.org; Thu, 22 Jun 2023 12:18:38 -0400 Received: from mail-pl1-x630.google.com ([2607:f8b0:4864:20::630]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qCN11-0005oB-7Z for qemu-devel@nongnu.org; Thu, 22 Jun 2023 12:18:37 -0400 Received: by mail-pl1-x630.google.com with SMTP id d9443c01a7336-1b54f5aac48so34316045ad.2 for ; Thu, 22 Jun 2023 09:18:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sifive.com; s=google; t=1687450713; x=1690042713; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=++iGz8W47Fl8VDZS9ZjlhnPiF8AUo0crt86BWGKHZn8=; b=kuecoTZ7lcRTVBT5sZ4It/KXlPX1ntdkk35NNaPJIImHuc8lwDl46K0mO5Iuv6Vzqc +gUaA8/6Ss1OrkXjsxfY2O5Bw1kI+mrpGtn25CGp4INK6QdcrZ/ZCKtQdCTvIHOGpCpM R1QVNlbyEAM0Ff+/xB6fsPO4Mi1zcsoX7hRnWhaONGat5NjaaOBpVgYAtZGon+vrpJ7G 4ZQu2hem+jUYZGS9cyildlN5MGQo+sujUDPk16MQnKX5o6AglbbDld1kzTRNakQNSStO bSgAq/95Zc+B8aY3WPoVlYryRmpkHIq4XsaeTJHgfi2JhRrL9vuLN6BIG0MmbASFWhsb HPNw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687450713; x=1690042713; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=++iGz8W47Fl8VDZS9ZjlhnPiF8AUo0crt86BWGKHZn8=; b=jwwyHZ/9ENN9tbaJjsbISeLx3NjJTYsTNwuhbzrComEZeXDLXae3FkJ9OPvfFByVUk cTjXWLKpEKzwsMeCUr2TEkwslOI2htteSWve1sYDvNCjMD8MjHLYvC9vvtBFeF05Ummk hPc3WBVcQiaLhykEgUn5yvGy+4uE+j7S1cxldZsGOJfPoLEdXWzFZygyCkOchIEOCMtn gbwbm/FVM1OWevWkmr9fJ8eASdM8SpU/EApdY9tOAghEiMlYugy7+IpUerJUs7vpPExJ +ChK5GB5h14QiPY62HIxSNTgEpivCkECJfUOnrnjgFRyBoOqXmX297pbVQ1TVmjZSL44 oA9A== X-Gm-Message-State: AC+VfDx14vjOzUbcMM+kB+dha2RqXbTmi9M0FjNCLC/GffnXoD/2ci5g xS/4fH58HpFsqHa6pzuB0ApTxyiTD4BFRZQkQoXeTQIMDrSn39EpTphBKE5+Pphc152r+DMX8YW coeaQZohmpOSap10GDOPInc8ncjdIb08yAIvnK5oNW6m7EgZAgaZjd8l4gszmp7jJ96ucz3tQ/m Pm X-Google-Smtp-Source: ACHHUZ7IzHgIDvv1lDGa4adUt67EIS+2ir57y4MWcWkeUz4Tc/2M7XpbtWJiVaipzPw4fEERZLoiHA== X-Received: by 2002:a17:902:e74e:b0:1b3:8865:aaae with SMTP id p14-20020a170902e74e00b001b38865aaaemr19278703plf.53.1687450712938; Thu, 22 Jun 2023 09:18:32 -0700 (PDT) Received: from MaxdeMBP.localdomain (125-228-20-175.hinet-ip.hinet.net. [125.228.20.175]) by smtp.gmail.com with ESMTPSA id c1-20020a170902d48100b001b3f039f8a8sm5609683plg.61.2023.06.22.09.18.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 09:18:32 -0700 (PDT) From: Max Chou To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Cc: dbarboza@ventanamicro.com, Lawrence Hunter , Kiran Ostrolenk , Max Chou , Palmer Dabbelt , Alistair Francis , Bin Meng , Weiwei Li , Liu Zhiwei , Nazar Kazakov , William Salmon Subject: [PATCH v4 12/17] target/riscv: Add Zvksh ISA extension support Date: Fri, 23 Jun 2023 00:16:28 +0800 Message-Id: <20230622161646.32005-13-max.chou@sifive.com> X-Mailer: git-send-email 2.39.2 (Apple Git-143) In-Reply-To: <20230622161646.32005-1-max.chou@sifive.com> References: <20230622161646.32005-1-max.chou@sifive.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::630; envelope-from=max.chou@sifive.com; helo=mail-pl1-x630.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org From: Lawrence Hunter This commit adds support for the Zvksh vector-crypto extension, which consists of the following instructions: * vsm3me.vv * vsm3c.vi Translation functions are defined in `target/riscv/insn_trans/trans_rvvk.c.inc` and helpers are defined in `target/riscv/vcrypto_helper.c`. Co-authored-by: Kiran Ostrolenk [max.chou@sifive.com: Replaced vstart checking by TCG op] Signed-off-by: Kiran Ostrolenk Signed-off-by: Lawrence Hunter Signed-off-by: Max Chou Reviewed-by: Daniel Henrique Barboza --- target/riscv/cpu.c | 5 +- target/riscv/cpu_cfg.h | 1 + target/riscv/helper.h | 3 + target/riscv/insn32.decode | 4 + target/riscv/insn_trans/trans_rvvk.c.inc | 31 ++++++ target/riscv/vcrypto_helper.c | 134 +++++++++++++++++++++++ 6 files changed, 176 insertions(+), 2 deletions(-) diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c index 6bba8ba8c9..c9a9ff80cd 100644 --- a/target/riscv/cpu.c +++ b/target/riscv/cpu.c @@ -121,6 +121,7 @@ static const struct isa_ext_data isa_edata_arr[] = { ISA_EXT_DATA_ENTRY(zvkned, PRIV_VERSION_1_12_0, ext_zvkned), ISA_EXT_DATA_ENTRY(zvknha, PRIV_VERSION_1_12_0, ext_zvknha), ISA_EXT_DATA_ENTRY(zvknhb, PRIV_VERSION_1_12_0, ext_zvknhb), + ISA_EXT_DATA_ENTRY(zvksh, PRIV_VERSION_1_12_0, ext_zvksh), ISA_EXT_DATA_ENTRY(zhinx, PRIV_VERSION_1_12_0, ext_zhinx), ISA_EXT_DATA_ENTRY(zhinxmin, PRIV_VERSION_1_12_0, ext_zhinxmin), ISA_EXT_DATA_ENTRY(smaia, PRIV_VERSION_1_12_0, ext_smaia), @@ -1197,8 +1198,8 @@ void riscv_cpu_validate_set_extensions(RISCVCPU *cpu, Error **errp) * In principle Zve*x would also suffice here, were they supported * in qemu */ - if ((cpu->cfg.ext_zvbb || cpu->cfg.ext_zvkned || cpu->cfg.ext_zvknha) && - !cpu->cfg.ext_zve32f) { + if ((cpu->cfg.ext_zvbb || cpu->cfg.ext_zvkned || cpu->cfg.ext_zvknha || + cpu->cfg.ext_zvksh) && !cpu->cfg.ext_zve32f) { error_setg(errp, "Vector crypto extensions require V or Zve* extensions"); return; diff --git a/target/riscv/cpu_cfg.h b/target/riscv/cpu_cfg.h index 41cce87ffc..f859d9e2f5 100644 --- a/target/riscv/cpu_cfg.h +++ b/target/riscv/cpu_cfg.h @@ -88,6 +88,7 @@ struct RISCVCPUConfig { bool ext_zvkned; bool ext_zvknha; bool ext_zvknhb; + bool ext_zvksh; bool ext_zmmul; bool ext_zvfh; bool ext_zvfhmin; diff --git a/target/riscv/helper.h b/target/riscv/helper.h index 19f5a8a28d..9220af18e6 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1238,3 +1238,6 @@ DEF_HELPER_5(vaeskf2_vi, void, ptr, ptr, i32, env, i32) DEF_HELPER_5(vsha2ms_vv, void, ptr, ptr, ptr, env, i32) DEF_HELPER_5(vsha2ch_vv, void, ptr, ptr, ptr, env, i32) DEF_HELPER_5(vsha2cl_vv, void, ptr, ptr, ptr, env, i32) + +DEF_HELPER_5(vsm3me_vv, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_5(vsm3c_vi, void, ptr, ptr, i32, env, i32) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index d2cfb2729c..5ca83e8462 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -953,3 +953,7 @@ vaeskf2_vi 101010 1 ..... ..... 010 ..... 1110111 @r_vm_1 vsha2ms_vv 101101 1 ..... ..... 010 ..... 1110111 @r_vm_1 vsha2ch_vv 101110 1 ..... ..... 010 ..... 1110111 @r_vm_1 vsha2cl_vv 101111 1 ..... ..... 010 ..... 1110111 @r_vm_1 + +# *** Zvksh vector crypto extension *** +vsm3me_vv 100000 1 ..... ..... 010 ..... 1110111 @r_vm_1 +vsm3c_vi 101011 1 ..... ..... 010 ..... 1110111 @r_vm_1 diff --git a/target/riscv/insn_trans/trans_rvvk.c.inc b/target/riscv/insn_trans/trans_rvvk.c.inc index 528a0d3b32..af1fb74c38 100644 --- a/target/riscv/insn_trans/trans_rvvk.c.inc +++ b/target/riscv/insn_trans/trans_rvvk.c.inc @@ -479,3 +479,34 @@ static bool vsha_check(DisasContext *s, arg_rmrr *a) GEN_VV_UNMASKED_TRANS(vsha2ms_vv, vsha_check, ZVKNH_EGS) GEN_VV_UNMASKED_TRANS(vsha2cl_vv, vsha_check, ZVKNH_EGS) GEN_VV_UNMASKED_TRANS(vsha2ch_vv, vsha_check, ZVKNH_EGS) + +/* + * Zvksh + */ + +#define ZVKSH_EGS 8 + +static inline bool vsm3_check(DisasContext *s, arg_rmrr *a) +{ + int egw_bytes = ZVKSH_EGS << s->sew; + int mult = 1 << MAX(s->lmul, 0); + return s->cfg_ptr->ext_zvksh == true && + require_rvv(s) && + vext_check_isa_ill(s) && + !is_overlapped(a->rd, mult, a->rs2, mult) && + MAXSZ(s) >= egw_bytes && + s->sew == MO_32; +} + +static inline bool vsm3me_check(DisasContext *s, arg_rmrr *a) +{ + return vsm3_check(s, a) && vext_check_sss(s, a->rd, a->rs1, a->rs2, a->vm); +} + +static inline bool vsm3c_check(DisasContext *s, arg_rmrr *a) +{ + return vsm3_check(s, a) && vext_check_ss(s, a->rd, a->rs2, a->vm); +} + +GEN_VV_UNMASKED_TRANS(vsm3me_vv, vsm3me_check, ZVKSH_EGS) +GEN_VI_UNMASKED_TRANS(vsm3c_vi, vsm3c_check, ZVKSH_EGS) diff --git a/target/riscv/vcrypto_helper.c b/target/riscv/vcrypto_helper.c index ca09062c6c..06c8f4adc7 100644 --- a/target/riscv/vcrypto_helper.c +++ b/target/riscv/vcrypto_helper.c @@ -717,3 +717,137 @@ void HELPER(vsha2cl_vv)(void *vd, void *vs1, void *vs2, CPURISCVState *env, vext_set_elems_1s(vd, vta, env->vl * esz, total_elems * esz); env->vstart = 0; } + +static inline uint32_t p1(uint32_t x) +{ + return x ^ rol32(x, 15) ^ rol32(x, 23); +} + +static inline uint32_t zvksh_w(uint32_t m16, uint32_t m9, uint32_t m3, + uint32_t m13, uint32_t m6) +{ + return p1(m16 ^ m9 ^ rol32(m3, 15)) ^ rol32(m13, 7) ^ m6; +} + +void HELPER(vsm3me_vv)(void *vd_vptr, void *vs1_vptr, void *vs2_vptr, + CPURISCVState *env, uint32_t desc) +{ + uint32_t esz = memop_size(FIELD_EX64(env->vtype, VTYPE, VSEW)); + uint32_t total_elems = vext_get_total_elems(env, desc, esz); + uint32_t vta = vext_vta(desc); + uint32_t *vd = vd_vptr; + uint32_t *vs1 = vs1_vptr; + uint32_t *vs2 = vs2_vptr; + + for (int i = env->vstart / 8; i < env->vl / 8; i++) { + uint32_t w[24]; + for (int j = 0; j < 8; j++) { + w[j] = bswap32(vs1[H4((i * 8) + j)]); + w[j + 8] = bswap32(vs2[H4((i * 8) + j)]); + } + for (int j = 0; j < 8; j++) { + w[j + 16] = + zvksh_w(w[j], w[j + 7], w[j + 13], w[j + 3], w[j + 10]); + } + for (int j = 0; j < 8; j++) { + vd[(i * 8) + j] = bswap32(w[H4(j + 16)]); + } + } + vext_set_elems_1s(vd_vptr, vta, env->vl * esz, total_elems * esz); + env->vstart = 0; +} + +static inline uint32_t ff1(uint32_t x, uint32_t y, uint32_t z) +{ + return x ^ y ^ z; +} + +static inline uint32_t ff2(uint32_t x, uint32_t y, uint32_t z) +{ + return (x & y) | (x & z) | (y & z); +} + +static inline uint32_t ff_j(uint32_t x, uint32_t y, uint32_t z, uint32_t j) +{ + return (j <= 15) ? ff1(x, y, z) : ff2(x, y, z); +} + +static inline uint32_t gg1(uint32_t x, uint32_t y, uint32_t z) +{ + return x ^ y ^ z; +} + +static inline uint32_t gg2(uint32_t x, uint32_t y, uint32_t z) +{ + return (x & y) | (~x & z); +} + +static inline uint32_t gg_j(uint32_t x, uint32_t y, uint32_t z, uint32_t j) +{ + return (j <= 15) ? gg1(x, y, z) : gg2(x, y, z); +} + +static inline uint32_t t_j(uint32_t j) +{ + return (j <= 15) ? 0x79cc4519 : 0x7a879d8a; +} + +static inline uint32_t p_0(uint32_t x) +{ + return x ^ rol32(x, 9) ^ rol32(x, 17); +} + +static void sm3c(uint32_t *vd, uint32_t *vs1, uint32_t *vs2, uint32_t uimm) +{ + uint32_t x0, x1; + uint32_t j; + uint32_t ss1, ss2, tt1, tt2; + x0 = vs2[0] ^ vs2[4]; + x1 = vs2[1] ^ vs2[5]; + j = 2 * uimm; + ss1 = rol32(rol32(vs1[0], 12) + vs1[4] + rol32(t_j(j), j % 32), 7); + ss2 = ss1 ^ rol32(vs1[0], 12); + tt1 = ff_j(vs1[0], vs1[1], vs1[2], j) + vs1[3] + ss2 + x0; + tt2 = gg_j(vs1[4], vs1[5], vs1[6], j) + vs1[7] + ss1 + vs2[0]; + vs1[3] = vs1[2]; + vd[3] = rol32(vs1[1], 9); + vs1[1] = vs1[0]; + vd[1] = tt1; + vs1[7] = vs1[6]; + vd[7] = rol32(vs1[5], 19); + vs1[5] = vs1[4]; + vd[5] = p_0(tt2); + j = 2 * uimm + 1; + ss1 = rol32(rol32(vd[1], 12) + vd[5] + rol32(t_j(j), j % 32), 7); + ss2 = ss1 ^ rol32(vd[1], 12); + tt1 = ff_j(vd[1], vs1[1], vd[3], j) + vs1[3] + ss2 + x1; + tt2 = gg_j(vd[5], vs1[5], vd[7], j) + vs1[7] + ss1 + vs2[1]; + vd[2] = rol32(vs1[1], 9); + vd[0] = tt1; + vd[6] = rol32(vs1[5], 19); + vd[4] = p_0(tt2); +} + +void HELPER(vsm3c_vi)(void *vd_vptr, void *vs2_vptr, uint32_t uimm, + CPURISCVState *env, uint32_t desc) +{ + uint32_t esz = memop_size(FIELD_EX64(env->vtype, VTYPE, VSEW)); + uint32_t total_elems = vext_get_total_elems(env, desc, esz); + uint32_t vta = vext_vta(desc); + uint32_t *vd = vd_vptr; + uint32_t *vs2 = vs2_vptr; + uint32_t v1[8], v2[8], v3[8]; + + for (int i = env->vstart / 8; i < env->vl / 8; i++) { + for (int k = 0; k < 8; k++) { + v2[k] = bswap32(vd[H4(i * 8 + k)]); + v3[k] = bswap32(vs2[H4(i * 8 + k)]); + } + sm3c(v1, v2, v3, uimm); + for (int k = 0; k < 8; k++) { + vd[i * 8 + k] = bswap32(v1[H4(k)]); + } + } + vext_set_elems_1s(vd_vptr, vta, env->vl * esz, total_elems * esz); + env->vstart = 0; +} From patchwork Thu Jun 22 16:16:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Max Chou X-Patchwork-Id: 1798524 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=sifive.com header.i=@sifive.com header.a=rsa-sha256 header.s=google header.b=k2NcTbi3; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4Qn5GP156vz20XS for ; Fri, 23 Jun 2023 02:21:17 +1000 (AEST) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qCN1I-0003i8-VF; Thu, 22 Jun 2023 12:18:53 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qCN18-0002os-JR for qemu-devel@nongnu.org; Thu, 22 Jun 2023 12:18:42 -0400 Received: from mail-pl1-x636.google.com ([2607:f8b0:4864:20::636]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qCN15-0005px-M0 for qemu-devel@nongnu.org; Thu, 22 Jun 2023 12:18:41 -0400 Received: by mail-pl1-x636.google.com with SMTP id d9443c01a7336-1b55643507dso41625085ad.0 for ; Thu, 22 Jun 2023 09:18:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sifive.com; s=google; t=1687450718; x=1690042718; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=HlQQfJpt5YP/8KFD0EcLOqkCH1ejvEd65Ipu6QGIQug=; b=k2NcTbi3yI/+XTDWKlajGhb64ywr9u6rsprBLhGF5MERwZJ0EXYA+lV0k7SAR5krUj RNit3xrmFYm/MqW2ofKO6t8Xq4qXtebfYUdUZZtSYN+P5k8fOo+TrPEejNikhXaA404l RRuGJJGyoyvE1tzOqre5in9shz99GIYKljv6pKjvmqhfbykNlzIci4j0ALJjo7a23qAL 4VhelBURG4JNG54GRNb1Jxk7a+GxWH8KOoTnXD1rJAOCnMkUvAgm4kGEUW91jFT06gjQ BRG1SAMI0cPwPuSMOLcKja6/vEmdQYHtVY4bH7JxnAGDNfRc56myVZis+QTZX0yVBmjQ JtBQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687450718; x=1690042718; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=HlQQfJpt5YP/8KFD0EcLOqkCH1ejvEd65Ipu6QGIQug=; b=EWGMg/lZCjaJXOOTsYkbarBJeKSWZIs6P4OpN7DQdGsNdZDO8A7HyydHQbbeSlUtwS YBKASJhYDxKp7ZDg0jZXSrCykoBykGW5yGNeWwQMWAgo4C1zksjZa1eUODRXtd1mGO+m 78+fBDV4sklfgrdb6Ad0MR9wgZA1f+qrrxoAHsnCE2jfVRskZyY077GGG7zlJrX03HZ3 FHTSpNuYnZz/t2pbbS6hvFV8Osbro84EgA8Im5si2BHnlyxb0P9J+v09vG3GB5K9RluB /AR1jxIVcoGNSj/bB+Ro5PbW6gcQMlUV+1FV1x29TlVJsJEQiGne8gK1Wo0da9UcNXgb W7TA== X-Gm-Message-State: AC+VfDxu5iu6Qc1XMiPrX1UDiMpi0hfA4sgmkjnFlnqPc6mu7VV2DCSp L9Zt61nGOyVL0CHFELF6sjO2g9h9T9cbvi0cf/ik340qwuacFLDdSucsgk6qPYajFbWCuW1HYEe Lr3NkLkGW3JqgGZ5DxHXWtVAGU6Fn0ypJkZ0itWyyXEiotsh42+amQzswhA9zHOBYcR/zBYeu+i m3 X-Google-Smtp-Source: ACHHUZ6gBaCSWr6lVlrxYdzQgzR+8PuDt9jhsLBHGjdmeQYONJSOXzTJcqAZh5W0iFcqJslnBhAVEA== X-Received: by 2002:a17:902:e810:b0:1ab:11c8:777a with SMTP id u16-20020a170902e81000b001ab11c8777amr24437115plg.13.1687450718178; Thu, 22 Jun 2023 09:18:38 -0700 (PDT) Received: from MaxdeMBP.localdomain (125-228-20-175.hinet-ip.hinet.net. [125.228.20.175]) by smtp.gmail.com with ESMTPSA id c1-20020a170902d48100b001b3f039f8a8sm5609683plg.61.2023.06.22.09.18.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 09:18:37 -0700 (PDT) From: Max Chou To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Cc: dbarboza@ventanamicro.com, Nazar Kazakov , Lawrence Hunter , Max Chou , Palmer Dabbelt , Alistair Francis , Bin Meng , Weiwei Li , Liu Zhiwei , Kiran Ostrolenk , William Salmon Subject: [PATCH v4 13/17] target/riscv: Add Zvkg ISA extension support Date: Fri, 23 Jun 2023 00:16:29 +0800 Message-Id: <20230622161646.32005-14-max.chou@sifive.com> X-Mailer: git-send-email 2.39.2 (Apple Git-143) In-Reply-To: <20230622161646.32005-1-max.chou@sifive.com> References: <20230622161646.32005-1-max.chou@sifive.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::636; envelope-from=max.chou@sifive.com; helo=mail-pl1-x636.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org From: Nazar Kazakov This commit adds support for the Zvkg vector-crypto extension, which consists of the following instructions: * vgmul.vv * vghsh.vv Translation functions are defined in `target/riscv/insn_trans/trans_rvvk.c.inc` and helpers are defined in `target/riscv/vcrypto_helper.c`. Co-authored-by: Lawrence Hunter [max.chou@sifive.com: Replaced vstart checking by TCG op] Signed-off-by: Lawrence Hunter Signed-off-by: Nazar Kazakov Signed-off-by: Max Chou Reviewed-by: Daniel Henrique Barboza --- target/riscv/cpu.c | 5 +- target/riscv/cpu_cfg.h | 1 + target/riscv/helper.h | 3 + target/riscv/insn32.decode | 4 ++ target/riscv/insn_trans/trans_rvvk.c.inc | 30 ++++++++++ target/riscv/vcrypto_helper.c | 72 ++++++++++++++++++++++++ 6 files changed, 113 insertions(+), 2 deletions(-) diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c index c9a9ff80cd..8e60a122d4 100644 --- a/target/riscv/cpu.c +++ b/target/riscv/cpu.c @@ -118,6 +118,7 @@ static const struct isa_ext_data isa_edata_arr[] = { ISA_EXT_DATA_ENTRY(zve64d, PRIV_VERSION_1_10_0, ext_zve64d), ISA_EXT_DATA_ENTRY(zvfh, PRIV_VERSION_1_12_0, ext_zvfh), ISA_EXT_DATA_ENTRY(zvfhmin, PRIV_VERSION_1_12_0, ext_zvfhmin), + ISA_EXT_DATA_ENTRY(zvkg, PRIV_VERSION_1_12_0, ext_zvkg), ISA_EXT_DATA_ENTRY(zvkned, PRIV_VERSION_1_12_0, ext_zvkned), ISA_EXT_DATA_ENTRY(zvknha, PRIV_VERSION_1_12_0, ext_zvknha), ISA_EXT_DATA_ENTRY(zvknhb, PRIV_VERSION_1_12_0, ext_zvknhb), @@ -1198,8 +1199,8 @@ void riscv_cpu_validate_set_extensions(RISCVCPU *cpu, Error **errp) * In principle Zve*x would also suffice here, were they supported * in qemu */ - if ((cpu->cfg.ext_zvbb || cpu->cfg.ext_zvkned || cpu->cfg.ext_zvknha || - cpu->cfg.ext_zvksh) && !cpu->cfg.ext_zve32f) { + if ((cpu->cfg.ext_zvbb || cpu->cfg.ext_zvkg || cpu->cfg.ext_zvkned || + cpu->cfg.ext_zvknha || cpu->cfg.ext_zvksh) && !cpu->cfg.ext_zve32f) { error_setg(errp, "Vector crypto extensions require V or Zve* extensions"); return; diff --git a/target/riscv/cpu_cfg.h b/target/riscv/cpu_cfg.h index f859d9e2f5..b125b0b33f 100644 --- a/target/riscv/cpu_cfg.h +++ b/target/riscv/cpu_cfg.h @@ -85,6 +85,7 @@ struct RISCVCPUConfig { bool ext_zve64d; bool ext_zvbb; bool ext_zvbc; + bool ext_zvkg; bool ext_zvkned; bool ext_zvknha; bool ext_zvknhb; diff --git a/target/riscv/helper.h b/target/riscv/helper.h index 9220af18e6..a4fe1ff5ca 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1241,3 +1241,6 @@ DEF_HELPER_5(vsha2cl_vv, void, ptr, ptr, ptr, env, i32) DEF_HELPER_5(vsm3me_vv, void, ptr, ptr, ptr, env, i32) DEF_HELPER_5(vsm3c_vi, void, ptr, ptr, i32, env, i32) + +DEF_HELPER_5(vghsh_vv, void, ptr, ptr, ptr, env, i32) +DEF_HELPER_4(vgmul_vv, void, ptr, ptr, env, i32) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index 5ca83e8462..b10497afd3 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -957,3 +957,7 @@ vsha2cl_vv 101111 1 ..... ..... 010 ..... 1110111 @r_vm_1 # *** Zvksh vector crypto extension *** vsm3me_vv 100000 1 ..... ..... 010 ..... 1110111 @r_vm_1 vsm3c_vi 101011 1 ..... ..... 010 ..... 1110111 @r_vm_1 + +# *** Zvkg vector crypto extension *** +vghsh_vv 101100 1 ..... ..... 010 ..... 1110111 @r_vm_1 +vgmul_vv 101000 1 ..... 10001 010 ..... 1110111 @r2_vm_1 diff --git a/target/riscv/insn_trans/trans_rvvk.c.inc b/target/riscv/insn_trans/trans_rvvk.c.inc index af1fb74c38..e5ccb26c45 100644 --- a/target/riscv/insn_trans/trans_rvvk.c.inc +++ b/target/riscv/insn_trans/trans_rvvk.c.inc @@ -510,3 +510,33 @@ static inline bool vsm3c_check(DisasContext *s, arg_rmrr *a) GEN_VV_UNMASKED_TRANS(vsm3me_vv, vsm3me_check, ZVKSH_EGS) GEN_VI_UNMASKED_TRANS(vsm3c_vi, vsm3c_check, ZVKSH_EGS) + +/* + * Zvkg + */ + +#define ZVKG_EGS 4 + +static bool vgmul_check(DisasContext *s, arg_rmr *a) +{ + int egw_bytes = ZVKG_EGS << s->sew; + return s->cfg_ptr->ext_zvkg == true && + vext_check_isa_ill(s) && + require_rvv(s) && + MAXSZ(s) >= egw_bytes && + vext_check_ss(s, a->rd, a->rs2, a->vm) && + s->sew == MO_32; +} + +GEN_V_UNMASKED_TRANS(vgmul_vv, vgmul_check, ZVKG_EGS) + +static bool vghsh_check(DisasContext *s, arg_rmrr *a) +{ + int egw_bytes = ZVKG_EGS << s->sew; + return s->cfg_ptr->ext_zvkg == true && + opivv_check(s, a) && + MAXSZ(s) >= egw_bytes && + s->sew == MO_32; +} + +GEN_VV_UNMASKED_TRANS(vghsh_vv, vghsh_check, ZVKG_EGS) diff --git a/target/riscv/vcrypto_helper.c b/target/riscv/vcrypto_helper.c index 06c8f4adc7..04e6374211 100644 --- a/target/riscv/vcrypto_helper.c +++ b/target/riscv/vcrypto_helper.c @@ -851,3 +851,75 @@ void HELPER(vsm3c_vi)(void *vd_vptr, void *vs2_vptr, uint32_t uimm, vext_set_elems_1s(vd_vptr, vta, env->vl * esz, total_elems * esz); env->vstart = 0; } + +void HELPER(vghsh_vv)(void *vd_vptr, void *vs1_vptr, void *vs2_vptr, + CPURISCVState *env, uint32_t desc) +{ + uint64_t *vd = vd_vptr; + uint64_t *vs1 = vs1_vptr; + uint64_t *vs2 = vs2_vptr; + uint32_t vta = vext_vta(desc); + uint32_t total_elems = vext_get_total_elems(env, desc, 4); + + for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) { + uint64_t Y[2] = {vd[i * 2 + 0], vd[i * 2 + 1]}; + uint64_t H[2] = {brev8(vs2[i * 2 + 0]), brev8(vs2[i * 2 + 1])}; + uint64_t X[2] = {vs1[i * 2 + 0], vs1[i * 2 + 1]}; + uint64_t Z[2] = {0, 0}; + + uint64_t S[2] = {brev8(Y[0] ^ X[0]), brev8(Y[1] ^ X[1])}; + + for (uint j = 0; j < 128; j++) { + if ((S[j / 64] >> (j % 64)) & 1) { + Z[0] ^= H[0]; + Z[1] ^= H[1]; + } + bool reduce = ((H[1] >> 63) & 1); + H[1] = H[1] << 1 | H[0] >> 63; + H[0] = H[0] << 1; + if (reduce) { + H[0] ^= 0x87; + } + } + + vd[i * 2 + 0] = brev8(Z[0]); + vd[i * 2 + 1] = brev8(Z[1]); + } + /* set tail elements to 1s */ + vext_set_elems_1s(vd, vta, env->vl * 4, total_elems * 4); + env->vstart = 0; +} + +void HELPER(vgmul_vv)(void *vd_vptr, void *vs2_vptr, CPURISCVState *env, + uint32_t desc) +{ + uint64_t *vd = vd_vptr; + uint64_t *vs2 = vs2_vptr; + uint32_t vta = vext_vta(desc); + uint32_t total_elems = vext_get_total_elems(env, desc, 4); + + for (uint32_t i = env->vstart / 4; i < env->vl / 4; i++) { + uint64_t Y[2] = {brev8(vd[i * 2 + 0]), brev8(vd[i * 2 + 1])}; + uint64_t H[2] = {brev8(vs2[i * 2 + 0]), brev8(vs2[i * 2 + 1])}; + uint64_t Z[2] = {0, 0}; + + for (uint j = 0; j < 128; j++) { + if ((Y[j / 64] >> (j % 64)) & 1) { + Z[0] ^= H[0]; + Z[1] ^= H[1]; + } + bool reduce = ((H[1] >> 63) & 1); + H[1] = H[1] << 1 | H[0] >> 63; + H[0] = H[0] << 1; + if (reduce) { + H[0] ^= 0x87; + } + } + + vd[i * 2 + 0] = brev8(Z[0]); + vd[i * 2 + 1] = brev8(Z[1]); + } + /* set tail elements to 1s */ + vext_set_elems_1s(vd, vta, env->vl * 4, total_elems * 4); + env->vstart = 0; +} From patchwork Thu Jun 22 16:16:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Max Chou X-Patchwork-Id: 1798525 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=sifive.com header.i=@sifive.com header.a=rsa-sha256 header.s=google header.b=LF5Q0GNC; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4Qn5GV73Pgz20XS for ; Fri, 23 Jun 2023 02:21:22 +1000 (AEST) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qCN1I-0003ef-Hu; Thu, 22 Jun 2023 12:18:52 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qCN1C-0003Mv-Sr for qemu-devel@nongnu.org; Thu, 22 Jun 2023 12:18:48 -0400 Received: from mail-pl1-x630.google.com ([2607:f8b0:4864:20::630]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qCN1A-00069A-5J for qemu-devel@nongnu.org; Thu, 22 Jun 2023 12:18:46 -0400 Received: by mail-pl1-x630.google.com with SMTP id d9443c01a7336-1b52bf6e669so57494635ad.2 for ; Thu, 22 Jun 2023 09:18:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sifive.com; s=google; t=1687450722; x=1690042722; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=5EdDN+KecjH+W/bjTNWjdWHVlQ3BFHGeOB5Edh+tsFE=; b=LF5Q0GNCup3H5Tn+kQTWafzXlsKG7fnn3ryqxswS/mLmURxCqN7ehXyyd/WSQlMm3r wg4M3SqL+hs+gb//Kk8uNRw2xWO4g9ugmQ8Yp9ZcrQMWhwUdP2rIHjlDSj/3jWuG0URg oVdHYPU356CnArsZrIjQc9gC2U10tyMUgerJUq46ZLO1kqg6f/MxE1V3tUw2sFZC0r7y 2ResoKbEWErcUXjof2Dz/ZMGq0z9yNTRZEZOxWsQkBzpv03IihM5e0L/ikc27nw3Mfrp bfOYQmPfGpq6V1fcgTmDXHlXShuIBvVfV4DLvn605TitHnoGsDPxdKvoUnSBaz2tKuCJ BqCQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687450722; x=1690042722; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=5EdDN+KecjH+W/bjTNWjdWHVlQ3BFHGeOB5Edh+tsFE=; b=DmRjkjhRlSskR9Y0GYvXvkX1hATMAJax4pq20F4wMLZfT8BH77+NYcI4GljaCEC0WX Ygnq18P3mTE6bd1E9eo41Yi2HWdh/ALKxpHvKmh/JoOKAutyQJUVXdNl/nHdCbeNw2MK XbCeyKPrV6NCu6n/007/zDaZSRsxFSGMIhbDREtMhJqOnIXgbQhTWbNh9jCecBacbiKf YCrA8ahF63U2bHnzUs7HR8wi+qsii4neFu7urVHf6rmNKjYyGz6exo14Z0X4axrFZz0G X+LEH0BhHpgMuaNQEwQT6TYvctFFWdXc3H8TYEFt/3i22q1hg7TR4544qgBHYoIyC7jQ wYHw== X-Gm-Message-State: AC+VfDyeaaB3NHZOyUKaaPhhpVXjXcgpu7VVS2BVUuT1iYfD7pfubzvN t9ghQeqXnq7Dl9V1o+dLXEudmiMADEZPAJEVJwIF/sokimHX4BfEICZ4DV8J5gPV1o1X3DfXvbm oX35jzyZltLdcseNVRUV8b3QZGDxKmRZfEKNm/YqeFMgh7K7pOtgTCWdPY39KvSQNWSp6zx8H9t Bz X-Google-Smtp-Source: ACHHUZ6NI02HJvPmARhD7t6i382U2SzsrO7z8whSg+pQb5tyWeqk/dhqHLqP6I2DGStWLCHIaqYJ4w== X-Received: by 2002:a17:902:d38d:b0:1b1:9f7d:2aec with SMTP id e13-20020a170902d38d00b001b19f7d2aecmr17580654pld.68.1687450722579; Thu, 22 Jun 2023 09:18:42 -0700 (PDT) Received: from MaxdeMBP.localdomain (125-228-20-175.hinet-ip.hinet.net. [125.228.20.175]) by smtp.gmail.com with ESMTPSA id c1-20020a170902d48100b001b3f039f8a8sm5609683plg.61.2023.06.22.09.18.40 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 09:18:42 -0700 (PDT) From: Max Chou To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Cc: dbarboza@ventanamicro.com, Max Chou , Frank Chang , Richard Henderson , =?utf-8?q?Daniel_P=2E_Ber?= =?utf-8?q?rang=C3=A9?= , Peter Maydell , qemu-arm@nongnu.org Subject: [PATCH v4 14/17] crypto: Create sm4_subword Date: Fri, 23 Jun 2023 00:16:30 +0800 Message-Id: <20230622161646.32005-15-max.chou@sifive.com> X-Mailer: git-send-email 2.39.2 (Apple Git-143) In-Reply-To: <20230622161646.32005-1-max.chou@sifive.com> References: <20230622161646.32005-1-max.chou@sifive.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::630; envelope-from=max.chou@sifive.com; helo=mail-pl1-x630.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Allows sharing of sm4_subword between different targets. Signed-off-by: Max Chou Reviewed-by: Frank Chang Reviewed-by: Richard Henderson Signed-off-by: Max Chou --- include/crypto/sm4.h | 8 ++++++++ target/arm/tcg/crypto_helper.c | 10 ++-------- 2 files changed, 10 insertions(+), 8 deletions(-) diff --git a/include/crypto/sm4.h b/include/crypto/sm4.h index 9bd3ebc62e..de8245d8a7 100644 --- a/include/crypto/sm4.h +++ b/include/crypto/sm4.h @@ -3,4 +3,12 @@ extern const uint8_t sm4_sbox[256]; +static inline uint32_t sm4_subword(uint32_t word) +{ + return sm4_sbox[word & 0xff] | + sm4_sbox[(word >> 8) & 0xff] << 8 | + sm4_sbox[(word >> 16) & 0xff] << 16 | + sm4_sbox[(word >> 24) & 0xff] << 24; +} + #endif diff --git a/target/arm/tcg/crypto_helper.c b/target/arm/tcg/crypto_helper.c index d28690321f..58e6c4f779 100644 --- a/target/arm/tcg/crypto_helper.c +++ b/target/arm/tcg/crypto_helper.c @@ -707,10 +707,7 @@ static void do_crypto_sm4e(uint64_t *rd, uint64_t *rn, uint64_t *rm) CR_ST_WORD(d, (i + 3) % 4) ^ CR_ST_WORD(n, i); - t = sm4_sbox[t & 0xff] | - sm4_sbox[(t >> 8) & 0xff] << 8 | - sm4_sbox[(t >> 16) & 0xff] << 16 | - sm4_sbox[(t >> 24) & 0xff] << 24; + t = sm4_subword(t); CR_ST_WORD(d, i) ^= t ^ rol32(t, 2) ^ rol32(t, 10) ^ rol32(t, 18) ^ rol32(t, 24); @@ -744,10 +741,7 @@ static void do_crypto_sm4ekey(uint64_t *rd, uint64_t *rn, uint64_t *rm) CR_ST_WORD(d, (i + 3) % 4) ^ CR_ST_WORD(m, i); - t = sm4_sbox[t & 0xff] | - sm4_sbox[(t >> 8) & 0xff] << 8 | - sm4_sbox[(t >> 16) & 0xff] << 16 | - sm4_sbox[(t >> 24) & 0xff] << 24; + t = sm4_subword(t); CR_ST_WORD(d, i) ^= t ^ rol32(t, 13) ^ rol32(t, 23); } From patchwork Thu Jun 22 16:16:31 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Max Chou X-Patchwork-Id: 1798526 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=sifive.com header.i=@sifive.com header.a=rsa-sha256 header.s=google header.b=SqYk/jO4; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4Qn5Gg2QF6z20XS for ; Fri, 23 Jun 2023 02:21:31 +1000 (AEST) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qCN1I-0003hb-SD; Thu, 22 Jun 2023 12:18:52 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qCN1F-0003Tw-Tk for qemu-devel@nongnu.org; Thu, 22 Jun 2023 12:18:50 -0400 Received: from mail-pl1-x62c.google.com ([2607:f8b0:4864:20::62c]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qCN1E-0006An-9Y for qemu-devel@nongnu.org; Thu, 22 Jun 2023 12:18:49 -0400 Received: by mail-pl1-x62c.google.com with SMTP id d9443c01a7336-1b51488ad67so39638655ad.3 for ; Thu, 22 Jun 2023 09:18:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sifive.com; s=google; t=1687450726; x=1690042726; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=eM7fMUS/bQ7fR32WgG7Yf2+xFexOx9ylKR4feRvm9IU=; b=SqYk/jO4BmTu1pVDOl/xPtyhH4VTMjarNCNOIK064NIuKDXGt/Boxaqp5gnjU4RK53 BCsSu9qd31XZhs6VxfwdzPYtmQdUdh1pTeSU1u7cJJaS55O58nVKfDuxTF7DYO4Nm5Ri 3PfVkXRYU0EPTReMa+JRYoFuYV737SN0x16La7u9yGQFXV6z+MCqnoaydL63O4L+IlmU BNQazqPPhWKaI6z9YKwC8Rg6i/jVs9Sb+L0kSIfRlNGmaIoF6b2QuWB69RpsH5be1Qva 4qYYbtJZWnnvatLWfpZ1uWyIgrrSC7gaFMJIS0tc6MNn2NLKf74TBYeaZLQ/493oRigk D9bA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687450726; x=1690042726; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=eM7fMUS/bQ7fR32WgG7Yf2+xFexOx9ylKR4feRvm9IU=; b=Giyzp3BgN6uUdFR8f5wc8UWxxuNodMRhXJdVbvQUW6byMaHYfj+93t1QNHDRlQs5dl NGDUXPw06PaUF0ZaAjqWXQ+VIFZgqQpMCHsVJi/aSGGIA92SoeOP56KiIfbMVqB5ZRwq fJtmKRRqfHn+mWQHaCGoInGX0iTvd3p9xw4Y8Z/Zj0cKlAH/qmi6XWPuqASdhwCwTdwp LWeV4bV+QBklRWZwoxf3/oPufxrzimZRqjFHak3MRFXEyssPTfK+IhyZ4vNeGEBQhM+0 7TXFP127+ywSjvkC68rn4rQFiHa8OJ31QKRb0mGb+oI2WIReFyaEnnAx6+wh+EcUno9s 1s2g== X-Gm-Message-State: AC+VfDyhQ4V1BRNTm5b9Ow6ISPakrtkjRhMYcxPEfmwCM8xT2+HSF0y0 iO1mSS/qB0/D8aVFLoyLP/0ofHQ61472mxQCcWeKgeGfqTmqxQBnDZm3Fu0sMt6cX0LKEovwgaV Ai6lbkkjbKsUAtsADYU6yRBilh9Pr7Vzy1zV1087PBEan21Y8405htXFBKkON1dFfE/fefAwv35 Nj X-Google-Smtp-Source: ACHHUZ5UVoHWlDclwROU1+WCw2kmw/khG6980sONrxh2/ysbl3ro9h5UEi9FO7GCIBHhWVK8QCsh6g== X-Received: by 2002:a17:902:6bc1:b0:1b1:9d3c:66c9 with SMTP id m1-20020a1709026bc100b001b19d3c66c9mr14441827plt.5.1687450726564; Thu, 22 Jun 2023 09:18:46 -0700 (PDT) Received: from MaxdeMBP.localdomain (125-228-20-175.hinet-ip.hinet.net. [125.228.20.175]) by smtp.gmail.com with ESMTPSA id c1-20020a170902d48100b001b3f039f8a8sm5609683plg.61.2023.06.22.09.18.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 09:18:46 -0700 (PDT) From: Max Chou To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Cc: dbarboza@ventanamicro.com, Max Chou , Frank Chang , =?utf-8?q?Daniel_P=2E_Berrang=C3=A9?= Subject: [PATCH v4 15/17] crypto: Add SM4 constant parameter CK Date: Fri, 23 Jun 2023 00:16:31 +0800 Message-Id: <20230622161646.32005-16-max.chou@sifive.com> X-Mailer: git-send-email 2.39.2 (Apple Git-143) In-Reply-To: <20230622161646.32005-1-max.chou@sifive.com> References: <20230622161646.32005-1-max.chou@sifive.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::62c; envelope-from=max.chou@sifive.com; helo=mail-pl1-x62c.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Adds sm4_ck constant for use in sm4 cryptography across different targets. Signed-off-by: Max Chou Reviewed-by: Frank Chang Signed-off-by: Max Chou --- crypto/sm4.c | 10 ++++++++++ include/crypto/sm4.h | 1 + 2 files changed, 11 insertions(+) diff --git a/crypto/sm4.c b/crypto/sm4.c index 9f0cd452c7..2987306cf7 100644 --- a/crypto/sm4.c +++ b/crypto/sm4.c @@ -47,3 +47,13 @@ uint8_t const sm4_sbox[] = { 0x79, 0xee, 0x5f, 0x3e, 0xd7, 0xcb, 0x39, 0x48, }; +uint32_t const sm4_ck[] = { + 0x00070e15, 0x1c232a31, 0x383f464d, 0x545b6269, + 0x70777e85, 0x8c939aa1, 0xa8afb6bd, 0xc4cbd2d9, + 0xe0e7eef5, 0xfc030a11, 0x181f262d, 0x343b4249, + 0x50575e65, 0x6c737a81, 0x888f969d, 0xa4abb2b9, + 0xc0c7ced5, 0xdce3eaf1, 0xf8ff060d, 0x141b2229, + 0x30373e45, 0x4c535a61, 0x686f767d, 0x848b9299, + 0xa0a7aeb5, 0xbcc3cad1, 0xd8dfe6ed, 0xf4fb0209, + 0x10171e25, 0x2c333a41, 0x484f565d, 0x646b7279 +}; diff --git a/include/crypto/sm4.h b/include/crypto/sm4.h index de8245d8a7..382b26d922 100644 --- a/include/crypto/sm4.h +++ b/include/crypto/sm4.h @@ -2,6 +2,7 @@ #define QEMU_SM4_H extern const uint8_t sm4_sbox[256]; +extern const uint32_t sm4_ck[32]; static inline uint32_t sm4_subword(uint32_t word) { From patchwork Thu Jun 22 16:16:32 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Max Chou X-Patchwork-Id: 1798517 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=sifive.com header.i=@sifive.com header.a=rsa-sha256 header.s=google header.b=Rill2gDI; dkim-atps=neutral Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-ECDSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4Qn5DL0GqHz20Xt for ; Fri, 23 Jun 2023 02:19:30 +1000 (AEST) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qCN1V-0004rg-2A; Thu, 22 Jun 2023 12:19:06 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qCN1Q-0004gF-I4 for qemu-devel@nongnu.org; Thu, 22 Jun 2023 12:19:00 -0400 Received: from mail-pl1-x635.google.com ([2607:f8b0:4864:20::635]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qCN1K-0006Oe-Jo for qemu-devel@nongnu.org; Thu, 22 Jun 2023 12:18:58 -0400 Received: by mail-pl1-x635.google.com with SMTP id d9443c01a7336-1b6824141b4so6006655ad.1 for ; Thu, 22 Jun 2023 09:18:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sifive.com; s=google; t=1687450733; x=1690042733; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=V9rXS2uVsmJClLL/Ilu4TbLjU15vLGUVBtBv7T+C6BA=; b=Rill2gDIhoHbaKvPCa31OmmmMuyJt096Z/1GVUonFM+am+iVsDYosUVcmTzI9E1IT/ AjzIRAn/s4yEzQzZmFDszZG1lWnQUIiwoOfAFfmy5MLb0YMHL7hSL2jPwDJstWOXMffh erVLz2YgxH5bHAjl42VOp4CZo23GEgb6wWRPr7ZjuS9maNwMoPFBpyS8gTd+enjJg2sD w12BYgeilrAzAQH+spfqth/X/FHYe3sOloIklJNM4tCinY65ILWdmEVrCz8N+6a/CLDR LloeS2wJtz0LBa5zWJEw7D68bDTszCePUhOUGO7DrBjSeSm527lNfYhz/6yAolQnZ+Mx gkkQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1687450733; x=1690042733; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=V9rXS2uVsmJClLL/Ilu4TbLjU15vLGUVBtBv7T+C6BA=; b=geTqbFVvKd5ZMzbsOxxMSMBWHKoJ5auNJeXTR2DwmqYtxZbgd9+IarAHXPGxhI5OrU BWTDSKxXBA3skSB1u92n4swE/4H1UI32RFNSz3/If2KP9ffKcedVGm+K9YHo105OGB8w WK34Py8oa3ikAyJxyGYvULkiEHMOo9xExurzJb9WDR4BrTLYnkzXAGuAulOZb8u0KKg5 7kKKomAA49RC8OQqNlQNqkJdanYbL0hn7xVdG03ogB9EH09pRhkyXA6nBH3zdB3YEoVA +vfbYhR56k6dK9R6VrD3rQFzfHHSlHgC91oZYPbCvuAxD2HzN+qIOb8I/pifz2U2+g/z 1P+A== X-Gm-Message-State: AC+VfDzUVDeTVDMWkfv2bbe16HfgA4WlIZZANKjPh1S9UrPZDcT5y22y Kku1tIANi2CVuJoRDbYofZS7931cLbHZeOhhmuDE2iMWRWEih7s1xbZ0p34xApkgY0tINHDJ6G8 hS6PqFHAFqF69zN1GEcFdifed+xxQ7uze1Amp8yfLwl2Rtj7JGw1WF3BsR42DaXIiJjkJMzn4/9 ru X-Google-Smtp-Source: ACHHUZ5hJziijIhjvBHvl0Lah+xvUsmdU5+Qn3CgC9oE6mn9v5RSnjuS6J9kEODi7e8VQGjxXnRGLg== X-Received: by 2002:a17:903:228b:b0:1b6:9fe9:a450 with SMTP id b11-20020a170903228b00b001b69fe9a450mr6784132plh.32.1687450732882; Thu, 22 Jun 2023 09:18:52 -0700 (PDT) Received: from MaxdeMBP.localdomain (125-228-20-175.hinet-ip.hinet.net. [125.228.20.175]) by smtp.gmail.com with ESMTPSA id c1-20020a170902d48100b001b3f039f8a8sm5609683plg.61.2023.06.22.09.18.50 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 22 Jun 2023 09:18:52 -0700 (PDT) From: Max Chou To: qemu-devel@nongnu.org, qemu-riscv@nongnu.org Cc: dbarboza@ventanamicro.com, Max Chou , Frank Chang , Palmer Dabbelt , Alistair Francis , Bin Meng , Weiwei Li , Liu Zhiwei , Nazar Kazakov , Lawrence Hunter , Kiran Ostrolenk , William Salmon Subject: [PATCH v4 16/17] target/riscv: Add Zvksed ISA extension support Date: Fri, 23 Jun 2023 00:16:32 +0800 Message-Id: <20230622161646.32005-17-max.chou@sifive.com> X-Mailer: git-send-email 2.39.2 (Apple Git-143) In-Reply-To: <20230622161646.32005-1-max.chou@sifive.com> References: <20230622161646.32005-1-max.chou@sifive.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::635; envelope-from=max.chou@sifive.com; helo=mail-pl1-x635.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org This commit adds support for the Zvksed vector-crypto extension, which consists of the following instructions: * vsm4k.vi * vsm4r.[vv,vs] Translation functions are defined in `target/riscv/insn_trans/trans_rvvk.c.inc` and helpers are defined in `target/riscv/vcrypto_helper.c`. Signed-off-by: Max Chou Reviewed-by: Frank Chang [lawrence.hunter@codethink.co.uk: Moved SM4 functions from crypto_helper.c to vcrypto_helper.c] [nazar.kazakov@codethink.co.uk: Added alignment checks, refactored code to use macros, and minor style changes] Signed-off-by: Max Chou --- target/riscv/cpu.c | 4 +- target/riscv/cpu_cfg.h | 1 + target/riscv/helper.h | 4 + target/riscv/insn32.decode | 5 + target/riscv/insn_trans/trans_rvvk.c.inc | 43 ++++++++ target/riscv/vcrypto_helper.c | 127 +++++++++++++++++++++++ 6 files changed, 183 insertions(+), 1 deletion(-) diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c index 8e60a122d4..c1956dc29b 100644 --- a/target/riscv/cpu.c +++ b/target/riscv/cpu.c @@ -122,6 +122,7 @@ static const struct isa_ext_data isa_edata_arr[] = { ISA_EXT_DATA_ENTRY(zvkned, PRIV_VERSION_1_12_0, ext_zvkned), ISA_EXT_DATA_ENTRY(zvknha, PRIV_VERSION_1_12_0, ext_zvknha), ISA_EXT_DATA_ENTRY(zvknhb, PRIV_VERSION_1_12_0, ext_zvknhb), + ISA_EXT_DATA_ENTRY(zvksed, PRIV_VERSION_1_12_0, ext_zvksed), ISA_EXT_DATA_ENTRY(zvksh, PRIV_VERSION_1_12_0, ext_zvksh), ISA_EXT_DATA_ENTRY(zhinx, PRIV_VERSION_1_12_0, ext_zhinx), ISA_EXT_DATA_ENTRY(zhinxmin, PRIV_VERSION_1_12_0, ext_zhinxmin), @@ -1200,7 +1201,8 @@ void riscv_cpu_validate_set_extensions(RISCVCPU *cpu, Error **errp) * in qemu */ if ((cpu->cfg.ext_zvbb || cpu->cfg.ext_zvkg || cpu->cfg.ext_zvkned || - cpu->cfg.ext_zvknha || cpu->cfg.ext_zvksh) && !cpu->cfg.ext_zve32f) { + cpu->cfg.ext_zvknha || cpu->cfg.ext_zvksed || cpu->cfg.ext_zvksh) && + !cpu->cfg.ext_zve32f) { error_setg(errp, "Vector crypto extensions require V or Zve* extensions"); return; diff --git a/target/riscv/cpu_cfg.h b/target/riscv/cpu_cfg.h index b125b0b33f..a3ece218bf 100644 --- a/target/riscv/cpu_cfg.h +++ b/target/riscv/cpu_cfg.h @@ -89,6 +89,7 @@ struct RISCVCPUConfig { bool ext_zvkned; bool ext_zvknha; bool ext_zvknhb; + bool ext_zvksed; bool ext_zvksh; bool ext_zmmul; bool ext_zvfh; diff --git a/target/riscv/helper.h b/target/riscv/helper.h index a4fe1ff5ca..16e6726666 100644 --- a/target/riscv/helper.h +++ b/target/riscv/helper.h @@ -1244,3 +1244,7 @@ DEF_HELPER_5(vsm3c_vi, void, ptr, ptr, i32, env, i32) DEF_HELPER_5(vghsh_vv, void, ptr, ptr, ptr, env, i32) DEF_HELPER_4(vgmul_vv, void, ptr, ptr, env, i32) + +DEF_HELPER_5(vsm4k_vi, void, ptr, ptr, i32, env, i32) +DEF_HELPER_4(vsm4r_vv, void, ptr, ptr, env, i32) +DEF_HELPER_4(vsm4r_vs, void, ptr, ptr, env, i32) diff --git a/target/riscv/insn32.decode b/target/riscv/insn32.decode index b10497afd3..dab38e23e3 100644 --- a/target/riscv/insn32.decode +++ b/target/riscv/insn32.decode @@ -961,3 +961,8 @@ vsm3c_vi 101011 1 ..... ..... 010 ..... 1110111 @r_vm_1 # *** Zvkg vector crypto extension *** vghsh_vv 101100 1 ..... ..... 010 ..... 1110111 @r_vm_1 vgmul_vv 101000 1 ..... 10001 010 ..... 1110111 @r2_vm_1 + +# *** Zvksed vector crypto extension *** +vsm4k_vi 100001 1 ..... ..... 010 ..... 1110111 @r_vm_1 +vsm4r_vv 101000 1 ..... 10000 010 ..... 1110111 @r2_vm_1 +vsm4r_vs 101001 1 ..... 10000 010 ..... 1110111 @r2_vm_1 diff --git a/target/riscv/insn_trans/trans_rvvk.c.inc b/target/riscv/insn_trans/trans_rvvk.c.inc index e5ccb26c45..1de9dcb6eb 100644 --- a/target/riscv/insn_trans/trans_rvvk.c.inc +++ b/target/riscv/insn_trans/trans_rvvk.c.inc @@ -540,3 +540,46 @@ static bool vghsh_check(DisasContext *s, arg_rmrr *a) } GEN_VV_UNMASKED_TRANS(vghsh_vv, vghsh_check, ZVKG_EGS) + +/* + * Zvksed + */ + +#define ZVKSED_EGS 4 + +static bool zvksed_check(DisasContext *s) +{ + int egw_bytes = ZVKSED_EGS << s->sew; + return s->cfg_ptr->ext_zvksed == true && + require_rvv(s) && + vext_check_isa_ill(s) && + MAXSZ(s) >= egw_bytes && + s->sew == MO_32; +} + +static bool vsm4k_vi_check(DisasContext *s, arg_rmrr *a) +{ + return zvksed_check(s) && + require_align(a->rd, s->lmul) && + require_align(a->rs2, s->lmul); +} + +GEN_VI_UNMASKED_TRANS(vsm4k_vi, vsm4k_vi_check, ZVKSED_EGS) + +static bool vsm4r_vv_check(DisasContext *s, arg_rmr *a) +{ + return zvksed_check(s) && + require_align(a->rd, s->lmul) && + require_align(a->rs2, s->lmul); +} + +GEN_V_UNMASKED_TRANS(vsm4r_vv, vsm4r_vv_check, ZVKSED_EGS) + +static bool vsm4r_vs_check(DisasContext *s, arg_rmr *a) +{ + return zvksed_check(s) && + !is_overlapped(a->rd, 1 << MAX(s->lmul, 0), a->rs2, 1) && + require_align(a->rd, s->lmul); +} + +GEN_V_UNMASKED_TRANS(vsm4r_vs, vsm4r_vs_check, ZVKSED_EGS) diff --git a/target/riscv/vcrypto_helper.c b/target/riscv/vcrypto_helper.c index 04e6374211..9bdd564438 100644 --- a/target/riscv/vcrypto_helper.c +++ b/target/riscv/vcrypto_helper.c @@ -23,6 +23,7 @@ #include "qemu/bswap.h" #include "cpu.h" #include "crypto/aes.h" +#include "crypto/sm4.h" #include "exec/memop.h" #include "exec/exec-all.h" #include "exec/helper-proto.h" @@ -923,3 +924,129 @@ void HELPER(vgmul_vv)(void *vd_vptr, void *vs2_vptr, CPURISCVState *env, vext_set_elems_1s(vd, vta, env->vl * 4, total_elems * 4); env->vstart = 0; } + +void HELPER(vsm4k_vi)(void *vd, void *vs2, uint32_t uimm5, CPURISCVState *env, + uint32_t desc) +{ + const uint32_t egs = 4; + uint32_t rnd = uimm5 & 0x7; + uint32_t group_start = env->vstart / egs; + uint32_t group_end = env->vl / egs; + uint32_t esz = sizeof(uint32_t); + uint32_t total_elems = vext_get_total_elems(env, desc, esz); + + for (uint32_t i = group_start; i < group_end; ++i) { + uint32_t vstart = i * egs; + uint32_t vend = (i + 1) * egs; + uint32_t rk[4] = {0}; + uint32_t tmp[8] = {0}; + + for (uint32_t j = vstart; j < vend; ++j) { + rk[j - vstart] = *((uint32_t *)vs2 + H4(j)); + } + + for (uint32_t j = 0; j < egs; ++j) { + tmp[j] = rk[j]; + } + + for (uint32_t j = 0; j < egs; ++j) { + uint32_t b, s; + b = tmp[j + 1] ^ tmp[j + 2] ^ tmp[j + 3] ^ sm4_ck[rnd * 4 + j]; + + s = sm4_subword(b); + + tmp[j + 4] = tmp[j] ^ (s ^ rol32(s, 13) ^ rol32(s, 23)); + } + + for (uint32_t j = vstart; j < vend; ++j) { + *((uint32_t *)vd + H4(j)) = tmp[egs + (j - vstart)]; + } + } + + env->vstart = 0; + /* set tail elements to 1s */ + vext_set_elems_1s(vd, vext_vta(desc), env->vl * esz, total_elems * esz); +} + +static void do_sm4_round(uint32_t *rk, uint32_t *buf) +{ + const uint32_t egs = 4; + uint32_t s, b; + + for (uint32_t j = egs; j < egs * 2; ++j) { + b = buf[j - 3] ^ buf[j - 2] ^ buf[j - 1] ^ rk[j - 4]; + + s = sm4_subword(b); + + buf[j] = buf[j - 4] ^ (s ^ rol32(s, 2) ^ rol32(s, 10) ^ rol32(s, 18) ^ + rol32(s, 24)); + } +} + +void HELPER(vsm4r_vv)(void *vd, void *vs2, CPURISCVState *env, uint32_t desc) +{ + const uint32_t egs = 4; + uint32_t group_start = env->vstart / egs; + uint32_t group_end = env->vl / egs; + uint32_t esz = sizeof(uint32_t); + uint32_t total_elems = vext_get_total_elems(env, desc, esz); + + for (uint32_t i = group_start; i < group_end; ++i) { + uint32_t vstart = i * egs; + uint32_t vend = (i + 1) * egs; + uint32_t rk[4] = {0}; + uint32_t tmp[8] = {0}; + + for (uint32_t j = vstart; j < vend; ++j) { + rk[j - vstart] = *((uint32_t *)vs2 + H4(j)); + } + + for (uint32_t j = vstart; j < vend; ++j) { + tmp[j - vstart] = *((uint32_t *)vd + H4(j)); + } + + do_sm4_round(rk, tmp); + + for (uint32_t j = vstart; j < vend; ++j) { + *((uint32_t *)vd + H4(j)) = tmp[egs + (j - vstart)]; + } + } + + env->vstart = 0; + /* set tail elements to 1s */ + vext_set_elems_1s(vd, vext_vta(desc), env->vl * esz, total_elems * esz); +} + +void HELPER(vsm4r_vs)(void *vd, void *vs2, CPURISCVState *env, uint32_t desc) +{ + const uint32_t egs = 4; + uint32_t group_start = env->vstart / egs; + uint32_t group_end = env->vl / egs; + uint32_t esz = sizeof(uint32_t); + uint32_t total_elems = vext_get_total_elems(env, desc, esz); + + for (uint32_t i = group_start; i < group_end; ++i) { + uint32_t vstart = i * egs; + uint32_t vend = (i + 1) * egs; + uint32_t rk[4] = {0}; + uint32_t tmp[8] = {0}; + + for (uint32_t j = 0; j < egs; ++j) { + rk[j] = *((uint32_t *)vs2 + H4(j)); + } + + for (uint32_t j = vstart; j < vend; ++j) { + tmp[j - vstart] = *((uint32_t *)vd + H4(j)); + } + + do_sm4_round(rk, tmp); + + for (uint32_t j = vstart; j < vend; ++j) { + *((uint32_t *)vd + H4(j)) = tmp[egs + (j - vstart)]; + } + } + + env->vstart = 0; + /* set tail elements to 1s */ + vext_set_elems_1s(vd, vext_vta(desc), env->vl * esz, total_elems * esz); +}