From patchwork Wed Sep 21 19:46:25 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Chris Stillson X-Patchwork-Id: 1680811 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; spf=none (no SPF record) smtp.mailfrom=lists.infradead.org (client-ip=2607:7c80:54:3::133; helo=bombadil.infradead.org; envelope-from=kvm-riscv-bounces+incoming=patchwork.ozlabs.org@lists.infradead.org; receiver=) Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; secure) header.d=lists.infradead.org header.i=@lists.infradead.org header.a=rsa-sha256 header.s=bombadil.20210309 header.b=crV/hgjD; dkim=fail reason="signature verification failed" (2048-bit key; secure) header.d=infradead.org header.i=@infradead.org header.a=rsa-sha256 header.s=desiato.20200630 header.b=dFw8HkAY; dkim=fail reason="signature verification failed" (2048-bit key; unprotected) header.d=rivosinc-com.20210112.gappssmtp.com header.i=@rivosinc-com.20210112.gappssmtp.com header.a=rsa-sha256 header.s=20210112 header.b=4zRZG8Zb; dkim-atps=neutral Received: from bombadil.infradead.org (bombadil.infradead.org [IPv6:2607:7c80:54:3::133]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (P-384) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4MXpqr6jbzz1yqL for ; Thu, 22 Sep 2022 05:48:24 +1000 (AEST) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=k61y7agGpTZUrLQNzSN2ZE91cZJEkHyLKATNLicFVNY=; b=crV/hgjDKEEjke LaF6MlEph/gsbldy7cC2loRkoDhmtN+pJ2kYV9gujBllr77qjnJwW1lap2ODHm5d8gtRVYAtZuVx4 ofyKDQLrgMVMYPhwM88at32/CYNSvNP+SPGj+2XJDpYUz3tw1d3ELj5a/Q7V1f1Al60UJj9b57JNz QeZwa15WWIIU+bjWt/+2+fALUCN/8lXsNdkpH4KINft4J1iKkkL/I3TSKD4dg6eEfE3icGr/yNaw3 dPvjIx/8iakMn36SVe1eRBuQLicdMHBs/8KKF5LuEVsnKppz+AO7qejgAWNMkryOXggXpVdpR+J3D RAThisHnwXULVqJrUbIw==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.94.2 #2 (Red Hat Linux)) id 1ob5hk-00CVKU-Iu; Wed, 21 Sep 2022 19:48:20 +0000 Received: from desiato.infradead.org ([2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by bombadil.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1ob5hg-00CVHd-RW for kvm-riscv@bombadil.infradead.org; Wed, 21 Sep 2022 19:48:16 +0000 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Transfer-Encoding:MIME-Version :References:In-Reply-To:Message-Id:Date:Subject:Cc:To:From:Sender:Reply-To: Content-Type:Content-ID:Content-Description; bh=KNTXllz/7lqM8/dxY7FGxyptHmO7d+SD0mjK3UzDW10=; b=dFw8HkAYL+ngtaS9/EBjHJP3PT RoiwMwfGR2HycEXRFcS4VpinIf1PWfIAvxiqk6vXnLgw71B0nELMJ5MshUMuAug78wkDRfL8k/7iz esOjFiVdEaaLkvcEodBXrEWdlMrElzLnV+yEwR5MH6YD0hJKMKc63xesJXFbgKBRr4awItFmJldcK j+3hMT/5rV3ACZ0kesUNREvXAhngAOWjuyp5H9FEvudf+LvGVBGExnsrGrTA6BbDxSHewgumB/nok zKIrIhJCWi18DuokdHgPq3gp+duGA9TwHOBCuU3AHs8ROaae5xc1mkVpZmSWtqkl6e8b+3/LY+rWB vcacOxCQ==; Received: from mail-pl1-x631.google.com ([2607:f8b0:4864:20::631]) by desiato.infradead.org with esmtps (Exim 4.94.2 #2 (Red Hat Linux)) id 1ob5gb-00EkAD-3V for kvm-riscv@lists.infradead.org; Wed, 21 Sep 2022 19:48:12 +0000 Received: by mail-pl1-x631.google.com with SMTP id w13so6721697plp.1 for ; Wed, 21 Sep 2022 12:46:59 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rivosinc-com.20210112.gappssmtp.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date; bh=KNTXllz/7lqM8/dxY7FGxyptHmO7d+SD0mjK3UzDW10=; b=4zRZG8Zb4e4mxlIrhpf7G2WQOZPiUjk1qdo8dZ5w3+IzLTgdXoZh804On2UcjYATeg MFw/iAtfRZk4BXnweCdoMf5FF4d6gxT3tPOIvPukXNQBWaiKiLc9C0l3NR1KAc4+wk0A /+fzryq2L9LblD2JI3whjLfQMiQ1bUIzxRbpFyXgC/udjG2alYh71XpyLyBtAji9e2Ti I/8HKAuZIPOjTobAc0xYCsXZMaTpDY7xpAbjA0P5UllYIdRqkkAF43K+ICiFN0VaEnt+ Qdibsg+R6nJJB1+w0ZTaZf/ZfgUxY8Y8A1F0S4x8qhbP9MimRqgMxOPpUVs8rL6FlLuB iFAQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date; bh=KNTXllz/7lqM8/dxY7FGxyptHmO7d+SD0mjK3UzDW10=; b=Y0bAH8vlkXzLLeSVVwtJYH6K9Rp4oLPc8dh+6nGX3SvbhNABdKiZPgcMoPm2OfmTBs Vm5zBO6hKM3BnIEJl/OSWlq64JXG+RmO0mcHsi3otRXm2eV+yAJsx/pp3VcDgox6L/h+ 0s+D6c+TNKbVLznTIuj7SuPjwDRuVQRR7EDiPhwtS8ZBVUOdenbrQV1CujSxQ1cz6G7t UXnH2IpS/PbmEsGhriaGmJZHO7llVFm/nIwLK3dXwotUPF7BuSKaDoQGANhjKEeiS82m 2m+heSPV9eeWZjall2jDZ6ET1QJACFoEBUlDy+cv5emRhtDkjOM+hpHMpAIGtggj64wd HCeA== X-Gm-Message-State: ACrzQf0BJJv0njWmraWIVonbf46909eX5ehwCCPPj1D5QLPmlv+7T7uj 3XHlorwRW0xEzrO8qIR1RM+gfUAqSItX9A== X-Google-Smtp-Source: AMsMyM6ZnNI7nM3T2R503jdBu9k5h0fDD5BgYE/7x2ylGfqZ4AJNwYiLoaX6AVs+en71otXOOBw/TA== X-Received: by 2002:a17:902:d552:b0:178:5b6d:629 with SMTP id z18-20020a170902d55200b001785b6d0629mr6371639plf.17.1663789618079; Wed, 21 Sep 2022 12:46:58 -0700 (PDT) Received: from stillson.ba.rivosinc.com ([66.220.2.162]) by smtp.gmail.com with ESMTPSA id o2-20020aa79782000000b0054aa69bc192sm2551057pfp.72.2022.09.21.12.46.57 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 21 Sep 2022 12:46:57 -0700 (PDT) From: Chris Stillson To: linux-riscv@lists.infradead.org, jpalmer@dabbelt.com, kvm-riscv@lists.infradead.org Cc: Greentime Hu , Han-Kuan Chen Subject: [PATCH 13/17] riscv: Add vector extension XOR implementation Date: Wed, 21 Sep 2022 12:46:25 -0700 Message-Id: <20220921194629.1480202-14-stillson@rivosinc.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20220921194629.1480202-1-stillson@rivosinc.com> References: <20220921194629.1480202-1-stillson@rivosinc.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20220921_204730_184544_E4595F2F X-CRM114-Status: GOOD ( 15.49 ) X-Spam-Score: -0.0 (/) X-Spam-Report: Spam detection software, running on the system "desiato.infradead.org", has NOT identified this incoming email as spam. The original message has been attached to this so you can view it or label similar future email. If you have any questions, see the administrator of that system for details. Content preview: From: Greentime Hu This patch adds support for vector optimized XOR and it is tested in qemu. Co-developed-by: Han-Kuan Chen Signed-off-by: Han-Kuan Chen Signed-off-by: Greentime Hu --- arch/riscv/include/asm/xor.h | [...] Content analysis details: (-0.0 points, 5.0 required) pts rule name description ---- ---------------------- -------------------------------------------------- -0.0 RCVD_IN_DNSWL_NONE RBL: Sender listed at https://www.dnswl.org/, no trust [2607:f8b0:4864:20:0:0:0:631 listed in] [list.dnswl.org] 0.0 SPF_HELO_NONE SPF: HELO does not publish an SPF Record -0.0 SPF_PASS SPF: sender matches SPF record -0.1 DKIM_VALID Message has at least one valid DKIM or DK signature 0.1 DKIM_SIGNED Message has a DKIM or DK signature, not necessarily valid -0.0 T_SCC_BODY_TEXT_LINE No description available. X-BeenThere: kvm-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "kvm-riscv" Errors-To: kvm-riscv-bounces+incoming=patchwork.ozlabs.org@lists.infradead.org From: Greentime Hu This patch adds support for vector optimized XOR and it is tested in qemu. Co-developed-by: Han-Kuan Chen Signed-off-by: Han-Kuan Chen Signed-off-by: Greentime Hu --- arch/riscv/include/asm/xor.h | 82 ++++++++++++++++++++++++++++++++++++ arch/riscv/lib/Makefile | 1 + arch/riscv/lib/xor.S | 81 +++++++++++++++++++++++++++++++++++ 3 files changed, 164 insertions(+) create mode 100644 arch/riscv/include/asm/xor.h create mode 100644 arch/riscv/lib/xor.S diff --git a/arch/riscv/include/asm/xor.h b/arch/riscv/include/asm/xor.h new file mode 100644 index 000000000000..d1f2eeb14afb --- /dev/null +++ b/arch/riscv/include/asm/xor.h @@ -0,0 +1,82 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* + * Copyright (C) 2021 SiFive + */ + +#include +#include +#ifdef CONFIG_VECTOR +#include +#include + +void xor_regs_2_(unsigned long bytes, unsigned long * __restrict p1, + const unsigned long * __restrict p2); +void xor_regs_3_(unsigned long bytes, unsigned long * __restrict p1, + const unsigned long * __restrict p2, + const unsigned long * __restrict p3); +void xor_regs_4_(unsigned long bytes, unsigned long * __restrict p1, + const unsigned long * __restrict p2, + const unsigned long * __restrict p3, + const unsigned long * __restrict p4); +void xor_regs_5_(unsigned long bytes, unsigned long * __restrict p1, + const unsigned long * __restrict p2, + const unsigned long * __restrict p3, + const unsigned long * __restrict p4, + const unsigned long * __restrict p5); + +static void xor_rvv_2(unsigned long bytes, unsigned long * __restrict p1, + const unsigned long * __restrict p2) +{ + kernel_rvv_begin(); + xor_regs_2_(bytes, p1, p2); + kernel_rvv_end(); +} + +static void xor_rvv_3(unsigned long bytes, unsigned long * __restrict p1, + const unsigned long * __restrict p2, + const unsigned long * __restrict p3) +{ + kernel_rvv_begin(); + xor_regs_3_(bytes, p1, p2, p3); + kernel_rvv_end(); +} + +static void xor_rvv_4(unsigned long bytes, unsigned long * __restrict p1, + const unsigned long * __restrict p2, + const unsigned long * __restrict p3, + const unsigned long * __restrict p4) +{ + kernel_rvv_begin(); + xor_regs_4_(bytes, p1, p2, p3, p4); + kernel_rvv_end(); +} + +static void xor_rvv_5(unsigned long bytes, unsigned long * __restrict p1, + const unsigned long * __restrict p2, + const unsigned long * __restrict p3, + const unsigned long * __restrict p4, + const unsigned long * __restrict p5) +{ + kernel_rvv_begin(); + xor_regs_5_(bytes, p1, p2, p3, p4, p5); + kernel_rvv_end(); +} + +static struct xor_block_template xor_block_rvv = { + .name = "rvv", + .do_2 = xor_rvv_2, + .do_3 = xor_rvv_3, + .do_4 = xor_rvv_4, + .do_5 = xor_rvv_5 +}; + +#undef XOR_TRY_TEMPLATES +#define XOR_TRY_TEMPLATES \ + do { \ + xor_speed(&xor_block_8regs); \ + xor_speed(&xor_block_32regs); \ + if (has_vector()) { \ + xor_speed(&xor_block_rvv);\ + } \ + } while (0) +#endif diff --git a/arch/riscv/lib/Makefile b/arch/riscv/lib/Makefile index 25d5c9664e57..acd87ac86d24 100644 --- a/arch/riscv/lib/Makefile +++ b/arch/riscv/lib/Makefile @@ -7,3 +7,4 @@ lib-$(CONFIG_MMU) += uaccess.o lib-$(CONFIG_64BIT) += tishift.o obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o +lib-$(CONFIG_VECTOR) += xor.o diff --git a/arch/riscv/lib/xor.S b/arch/riscv/lib/xor.S new file mode 100644 index 000000000000..3bc059e18171 --- /dev/null +++ b/arch/riscv/lib/xor.S @@ -0,0 +1,81 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* + * Copyright (C) 2021 SiFive + */ +#include +#include +#include + +ENTRY(xor_regs_2_) + vsetvli a3, a0, e8, m8, ta, ma + vle8.v v0, (a1) + vle8.v v8, (a2) + sub a0, a0, a3 + vxor.vv v16, v0, v8 + add a2, a2, a3 + vse8.v v16, (a1) + add a1, a1, a3 + bnez a0, xor_regs_2_ + ret +END(xor_regs_2_) +EXPORT_SYMBOL(xor_regs_2_) + +ENTRY(xor_regs_3_) + vsetvli a4, a0, e8, m8, ta, ma + vle8.v v0, (a1) + vle8.v v8, (a2) + sub a0, a0, a4 + vxor.vv v0, v0, v8 + vle8.v v16, (a3) + add a2, a2, a4 + vxor.vv v16, v0, v16 + add a3, a3, a4 + vse8.v v16, (a1) + add a1, a1, a4 + bnez a0, xor_regs_3_ + ret +END(xor_regs_3_) +EXPORT_SYMBOL(xor_regs_3_) + +ENTRY(xor_regs_4_) + vsetvli a5, a0, e8, m8, ta, ma + vle8.v v0, (a1) + vle8.v v8, (a2) + sub a0, a0, a5 + vxor.vv v0, v0, v8 + vle8.v v16, (a3) + add a2, a2, a5 + vxor.vv v0, v0, v16 + vle8.v v24, (a4) + add a3, a3, a5 + vxor.vv v16, v0, v24 + add a4, a4, a5 + vse8.v v16, (a1) + add a1, a1, a5 + bnez a0, xor_regs_4_ + ret +END(xor_regs_4_) +EXPORT_SYMBOL(xor_regs_4_) + +ENTRY(xor_regs_5_) + vsetvli a6, a0, e8, m8, ta, ma + vle8.v v0, (a1) + vle8.v v8, (a2) + sub a0, a0, a6 + vxor.vv v0, v0, v8 + vle8.v v16, (a3) + add a2, a2, a6 + vxor.vv v0, v0, v16 + vle8.v v24, (a4) + add a3, a3, a6 + vxor.vv v0, v0, v24 + vle8.v v8, (a5) + add a4, a4, a6 + vxor.vv v16, v0, v8 + add a5, a5, a6 + vse8.v v16, (a1) + add a1, a1, a6 + bnez a0, xor_regs_5_ + ret +END(xor_regs_5_) +EXPORT_SYMBOL(xor_regs_5_)