From patchwork Wed Oct 23 10:45:15 2024
From: pan2.li@intel.com
To: gcc-patches@gcc.gnu.org
Cc: richard.guenther@gmail.com, Tamar.Christina@arm.com, juzhe.zhong@rivai.ai,
    kito.cheng@gmail.com, jeffreyalaw@gmail.com, rdapp.gcc@gmail.com,
    Pan Li <pan2.li@intel.com>
Subject: [PATCH 4/5] RISC-V: Implement the MASK_LEN_STRIDED_LOAD{STORE}
Date: Wed, 23 Oct 2024 18:45:15 +0800
Message-ID: <20241023104516.2818244-4-pan2.li@intel.com>
In-Reply-To: <20241023104516.2818244-1-pan2.li@intel.com>
References: <20241023104516.2818244-1-pan2.li@intel.com>

From: Pan Li <pan2.li@intel.com>

This patch implements MASK_LEN_STRIDED_LOAD{STORE} in the RISC-V backend
by leveraging the vector strided load/store insns.

For example:

void
foo (int * __restrict a, int * __restrict b, int stride, int n)
{
  for (int i = 0; i < n; i++)
    a[i*stride] = b[i*stride] + 100;
}

Before this patch:

  38 │ vsetvli    a5,a3,e32,m1,ta,ma
  39 │ vluxei64.v v1,(a1),v4
  40 │ mul        a4,a2,a5
  41 │ sub        a3,a3,a5
  42 │ vadd.vv    v1,v1,v2
  43 │ vsuxei64.v v1,(a0),v4
  44 │ add        a1,a1,a4
  45 │ add        a0,a0,a4

After this patch:

  33 │ vsetvli  a5,a3,e32,m1,ta,ma
  34 │ vlse32.v v1,0(a1),a2
  35 │ mul      a4,a2,a5
  36 │ sub      a3,a3,a5
  37 │ vadd.vv  v1,v1,v2
  38 │ vsse32.v v1,0(a0),a2
  39 │ add      a1,a1,a4
  40 │ add      a0,a0,a4

The below test suites are passed for this patch:
* The riscv fully regression test.

gcc/ChangeLog:

	* config/riscv/autovec.md (mask_len_strided_load_<mode>): Add new
	pattern for MASK_LEN_STRIDED_LOAD.
	(mask_len_strided_store_<mode>): Ditto but for store.
	* config/riscv/riscv-protos.h (expand_strided_load): Add new func
	decl to expand strided load.
	(expand_strided_store): Ditto but for store.
	* config/riscv/riscv-v.cc (expand_strided_load): Add new func impl
	to expand strided load.
	(expand_strided_store): Ditto but for store.

Signed-off-by: Pan Li <pan2.li@intel.com>
Co-Authored-By: Juzhe-Zhong <juzhe.zhong@rivai.ai>
---
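Note for reviewers: the patch itself does not add a dedicated test case beyond
the full regression run mentioned above.  A minimal dg scan test along the
following lines could be used to double-check the code generation shown in the
commit message.  The file placement (somewhere under
gcc.target/riscv/rvv/autovec/), the options and the scan patterns are only
illustrative assumptions of this note, not part of the patch, and the strided
form is only selected once the middle-end pieces of this series are applied:

/* Illustrative test only, not included in this patch.  */
/* { dg-do compile } */
/* { dg-options "-O3 -march=rv64gcv -mabi=lp64d" } */

void
foo (int * __restrict a, int * __restrict b, int stride, int n)
{
  /* Same strided loop as in the commit message above.  */
  for (int i = 0; i < n; i++)
    a[i*stride] = b[i*stride] + 100;
}

/* { dg-final { scan-assembler {vlse32\.v} } } */
/* { dg-final { scan-assembler {vsse32\.v} } } */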
 gcc/config/riscv/autovec.md     | 29 ++++++++++++++++++
 gcc/config/riscv/riscv-protos.h |  2 ++
 gcc/config/riscv/riscv-v.cc     | 52 +++++++++++++++++++++++++++++++++
 3 files changed, 83 insertions(+)

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index a34f63c9651..85a915bd65f 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -2855,3 +2855,32 @@ (define_expand "v3"
     DONE;
   }
 )
+
+;; =========================================================================
+;; == Strided Load/Store
+;; =========================================================================
+(define_expand "mask_len_strided_load_<mode>"
+  [(match_operand:V 0 "register_operand")
+   (match_operand 1 "pmode_reg_or_0_operand")
+   (match_operand 2 "pmode_reg_or_0_operand")
+   (match_operand:<VM> 3 "vector_mask_operand")
+   (match_operand 4 "autovec_length_operand")
+   (match_operand 5 "const_0_operand")]
+  "TARGET_VECTOR"
+  {
+    riscv_vector::expand_strided_load (<MODE>mode, operands);
+    DONE;
+  })
+
+(define_expand "mask_len_strided_store_<mode>"
+  [(match_operand 0 "pmode_reg_or_0_operand")
+   (match_operand 1 "pmode_reg_or_0_operand")
+   (match_operand:V 2 "register_operand")
+   (match_operand:<VM> 3 "vector_mask_operand")
+   (match_operand 4 "autovec_length_operand")
+   (match_operand 5 "const_0_operand")]
+  "TARGET_VECTOR"
+  {
+    riscv_vector::expand_strided_store (<MODE>mode, operands);
+    DONE;
+  })
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index d690162bb0c..47c9494ff2b 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -696,6 +696,8 @@ bool expand_strcmp (rtx, rtx, rtx, rtx, unsigned HOST_WIDE_INT, bool);
 void emit_vec_extract (rtx, rtx, rtx);
 bool expand_vec_setmem (rtx, rtx, rtx);
 bool expand_vec_cmpmem (rtx, rtx, rtx, rtx);
+void expand_strided_load (machine_mode, rtx *);
+void expand_strided_store (machine_mode, rtx *);
 
 /* Rounding mode bitfield for fixed point VXRM.  */
 enum fixed_point_rounding_mode
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index 630fbd80e94..ae028e8928a 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -3833,6 +3833,58 @@ expand_load_store (rtx *ops, bool is_load)
     }
 }
 
+/* Expand MASK_LEN_STRIDED_LOAD.  */
+void
+expand_strided_load (machine_mode mode, rtx *ops)
+{
+  rtx v_reg = ops[0];
+  rtx base = ops[1];
+  rtx stride = ops[2];
+  rtx mask = ops[3];
+  rtx len = ops[4];
+  poly_int64 len_val;
+
+  insn_code icode = code_for_pred_strided_load (mode);
+  rtx emit_ops[] = {v_reg, mask, gen_rtx_MEM (mode, base), stride};
+
+  if (poly_int_rtx_p (len, &len_val)
+      && known_eq (len_val, GET_MODE_NUNITS (mode)))
+    emit_vlmax_insn (icode, BINARY_OP_TAMA, emit_ops);
+  else
+    {
+      len = satisfies_constraint_K (len) ? len : force_reg (Pmode, len);
+      emit_nonvlmax_insn (icode, BINARY_OP_TAMA, emit_ops, len);
+    }
+}
+
+/* Expand MASK_LEN_STRIDED_STORE.  */
+void
+expand_strided_store (machine_mode mode, rtx *ops)
+{
+  rtx v_reg = ops[2];
+  rtx base = ops[0];
+  rtx stride = ops[1];
+  rtx mask = ops[3];
+  rtx len = ops[4];
+  poly_int64 len_val;
+  rtx vl_type;
+
+  if (poly_int_rtx_p (len, &len_val)
+      && known_eq (len_val, GET_MODE_NUNITS (mode)))
+    {
+      len = gen_reg_rtx (Pmode);
+      emit_vlmax_vsetvl (mode, len);
+      vl_type = get_avl_type_rtx (VLMAX);
+    }
+  else
+    {
+      len = satisfies_constraint_K (len) ? len : force_reg (Pmode, len);
+      vl_type = get_avl_type_rtx (NONVLMAX);
+    }
+
+  emit_insn (gen_pred_strided_store (mode, gen_rtx_MEM (mode, base),
+				     mask, stride, v_reg, len, vl_type));
+}
+
 /* Return true if the operation is the floating-point operation need FRM.  */
 static bool
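
For completeness: the mask operand (operand 3 of the new expand patterns) comes
into play once the loop body is conditional.  The sketch below is a variant of
the earlier example; the function name is made up for illustration, and whether
the vectorizer actually takes the masked strided path for it depends on the
cost model and on the middle-end patches earlier in this series:

void
foo_masked (int * __restrict a, int * __restrict b, int stride, int n)
{
  for (int i = 0; i < n; i++)
    /* After if-conversion this condition would typically become the
       vector mask guarding the strided store.  */
    if (b[i*stride] > 0)
      a[i*stride] = b[i*stride] + 100;
}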