From patchwork Fri Oct 18 07:24:58 2024
X-Patchwork-Submitter: Feng Wang
X-Patchwork-Id: 1998936
From: Feng Wang
To: gcc-patches@gcc.gnu.org
Cc: kito.cheng@gmail.com, juzhe.zhong@rivai.ai, Feng Wang
Subject: [PATCH v2] RISC-V:Auto vect for vector-bfloat16
Date: Fri, 18 Oct 2024 07:24:58 +0000
Message-Id: <20241018072458.22223-1-wangfeng@eswincomputing.com>
X-Mailer: git-send-email 2.17.1

This patch adds auto-vect patterns for the vector-bfloat16 extension.
Similar to the other vector extensions, these patterns let the
auto-vectorizer use vector BF16 instructions when vectorizing for loops.

gcc/ChangeLog:

	* config/riscv/autovec-opt.md (*widen_bf16_fma): Add vfwmacc
	auto-vect opt pattern for vector-bfloat16.
	* config/riscv/vector-bfloat16.md (extend2): Add auto-vect
	pattern for Zvfbfmin extension.
	(trunc2): Ditto.
	* config/riscv/vector-iterators.md: Move vector-bfloat16
	iterator definitions from vector-bfloat16.md.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/vfncvt-auto-vect.c: New test.
	* gcc.target/riscv/rvv/autovec/vfwcvt-auto-vect.c: New test.
	* gcc.target/riscv/rvv/autovec/vfwmacc-auto-vect.c: New test.

Signed-off-by: Feng Wang
---
 gcc/config/riscv/autovec-opt.md               |  23 ++++
 gcc/config/riscv/vector-bfloat16.md           | 116 +++++++++++++-----
 gcc/config/riscv/vector-iterators.md          |  32 +++++
 .../riscv/rvv/autovec/vfncvt-auto-vect.c      |  19 +++
 .../riscv/rvv/autovec/vfwcvt-auto-vect.c      |  19 +++
 .../riscv/rvv/autovec/vfwmacc-auto-vect.c     |  14 +++
 6 files changed, 195 insertions(+), 28 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vfncvt-auto-vect.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vfwcvt-auto-vect.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/vfwmacc-auto-vect.c

diff --git a/gcc/config/riscv/autovec-opt.md b/gcc/config/riscv/autovec-opt.md
index 4b33a145c17..0c6722601ff 100644
--- a/gcc/config/riscv/autovec-opt.md
+++ b/gcc/config/riscv/autovec-opt.md
@@ -1009,6 +1009,29 @@
   }
   [(set_attr "type" "vfwmuladd")])
 
+;; vfwmacc for vector_bfloat16
+(define_insn_and_split "*widen_bf16_fma"
+  [(set (match_operand:VWEXTF_ZVFBF 0 "register_operand")
+    (plus:VWEXTF_ZVFBF
+      (mult:VWEXTF_ZVFBF
+        (float_extend:VWEXTF_ZVFBF
+          (match_operand: 2 "register_operand"))
+        (float_extend:VWEXTF_ZVFBF
+          (match_operand: 3 "register_operand")))
+      (match_operand:VWEXTF_ZVFBF 1 "register_operand")))]
+  "TARGET_ZVFBFWMA && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+  {
+    rtx ops[] = {operands[0], operands[1], operands[2], operands[3]};
+    riscv_vector::emit_vlmax_insn (code_for_pred_widen_bf16_mul (mode),
+                                   riscv_vector::WIDEN_TERNARY_OP_FRM_DYN, ops);
+    DONE;
+  }
+  [(set_attr "type" "vfwmaccbf16")
+   (set_attr "mode" "")])
+
 ;; This combine pattern does not correspond to an single instruction.
 ;; This is a temporary pattern produced by a combine pass and if there
 ;; is no further combine into widen pattern, then fall back to extend
diff --git a/gcc/config/riscv/vector-bfloat16.md b/gcc/config/riscv/vector-bfloat16.md
index 562aa8ee5ed..90b174be2e7 100644
--- a/gcc/config/riscv/vector-bfloat16.md
+++ b/gcc/config/riscv/vector-bfloat16.md
@@ -17,26 +17,11 @@
 ;; along with GCC; see the file COPYING3. If not see
 ;; .
 
-(define_mode_iterator VWEXTF_ZVFBF [
-  (RVVM8SF "TARGET_VECTOR_ELEN_BF_16 && TARGET_VECTOR_ELEN_FP_32")
-  (RVVM4SF "TARGET_VECTOR_ELEN_BF_16 && TARGET_VECTOR_ELEN_FP_32")
-  (RVVM2SF "TARGET_VECTOR_ELEN_BF_16 && TARGET_VECTOR_ELEN_FP_32")
-  (RVVM1SF "TARGET_VECTOR_ELEN_BF_16 && TARGET_VECTOR_ELEN_FP_32")
-  (RVVMF2SF "TARGET_VECTOR_ELEN_BF_16 && TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN > 32")
-])
-
-(define_mode_attr V_FP32TOBF16_TRUNC [
-  (RVVM8SF "RVVM4BF") (RVVM4SF "RVVM2BF") (RVVM2SF "RVVM1BF") (RVVM1SF "RVVMF2BF") (RVVMF2SF "RVVMF4BF")
-])
-
-(define_mode_attr VF32_SUBEL [
-  (RVVM8SF "BF") (RVVM4SF "BF") (RVVM2SF "BF") (RVVM1SF "BF") (RVVMF2SF "BF")])
-
 ;; Zvfbfmin extension
 
 (define_insn "@pred_trunc_to_bf16"
-  [(set (match_operand: 0 "register_operand" "=vd, vd, vr, vr, &vr, &vr")
-     (if_then_else:
+  [(set (match_operand: 0 "register_operand" "=vd, vd, vr, vr, &vr, &vr")
+     (if_then_else:
        (unspec:
          [(match_operand: 1 "vector_mask_operand" " vm, vm,Wc1,Wc1,vmWc1,vmWc1")
           (match_operand 4 "vector_length_operand" " rK, rK, rK, rK, rK, rK")
@@ -47,13 +32,13 @@
           (reg:SI VL_REGNUM)
           (reg:SI VTYPE_REGNUM)
           (reg:SI FRM_REGNUM)] UNSPEC_VPREDICATE)
-       (float_truncate:
+       (float_truncate:
          (match_operand:VWEXTF_ZVFBF 3 "register_operand" " 0, 0, 0, 0, vr, vr"))
-       (match_operand: 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0")))]
+       (match_operand: 2 "vector_merge_operand" " vu, 0, vu, 0, vu, 0")))]
   "TARGET_ZVFBFMIN"
   "vfncvtbf16.f.f.w\t%0,%3%p1"
   [(set_attr "type" "vfncvtbf16")
-   (set_attr "mode" "")
+   (set_attr "mode" "")
    (set (attr "frm_mode")
         (symbol_ref "riscv_vector::get_frm_mode (operands[8])"))])
 
@@ -69,12 +54,12 @@
           (reg:SI VL_REGNUM)
           (reg:SI VTYPE_REGNUM)] UNSPEC_VPREDICATE)
        (float_extend:VWEXTF_ZVFBF
-         (match_operand: 3 "register_operand" " vr, vr"))
+         (match_operand: 3 "register_operand" " vr, vr"))
        (match_operand:VWEXTF_ZVFBF 2 "vector_merge_operand" " vu, 0")))]
   "TARGET_ZVFBFMIN"
   "vfwcvtbf16.f.f.v\t%0,%3%p1"
   [(set_attr "type" "vfwcvtbf16")
-   (set_attr "mode" "")])
+   (set_attr "mode" "")])
 
 
 (define_insn "@pred_widen_bf16_mul_"
@@ -93,15 +78,15 @@
        (plus:VWEXTF_ZVFBF
          (mult:VWEXTF_ZVFBF
            (float_extend:VWEXTF_ZVFBF
-             (match_operand: 3 "register_operand" " vr"))
+             (match_operand: 3 "register_operand" " vr"))
            (float_extend:VWEXTF_ZVFBF
-             (match_operand: 4 "register_operand" " vr")))
+             (match_operand: 4 "register_operand" " vr")))
          (match_operand:VWEXTF_ZVFBF 2 "register_operand" " 0"))
        (match_dup 2)))]
   "TARGET_ZVFBFWMA"
   "vfwmaccbf16.vv\t%0,%3,%4%p1"
   [(set_attr "type" "vfwmaccbf16")
-   (set_attr "mode" "")
+   (set_attr "mode" "")
    (set (attr "frm_mode")
         (symbol_ref "riscv_vector::get_frm_mode (operands[9])"))])
 
@@ -121,15 +106,90 @@
        (plus:VWEXTF_ZVFBF
          (mult:VWEXTF_ZVFBF
            (float_extend:VWEXTF_ZVFBF
-             (vec_duplicate:
+             (vec_duplicate:
                (match_operand: 3 "register_operand" " f")))
            (float_extend:VWEXTF_ZVFBF
-             (match_operand: 4 "register_operand" " vr")))
+             (match_operand: 4 "register_operand" " vr")))
          (match_operand:VWEXTF_ZVFBF 2 "register_operand" " 0"))
        (match_dup 2)))]
   "TARGET_ZVFBFWMA"
   "vfwmaccbf16.vf\t%0,%3,%4%p1"
   [(set_attr "type" "vfwmaccbf16")
-   (set_attr "mode" "")
+   (set_attr "mode" "")
    (set (attr "frm_mode")
         (symbol_ref "riscv_vector::get_frm_mode (operands[9])"))])
+
+;; Auto vect pattern
+
+;; -------------------------------------------------------------------------
+;; ---- [BF16] Widening.
+;; -------------------------------------------------------------------------
+;; - vfwcvtbf16.f.f.v
+;; -------------------------------------------------------------------------
+(define_insn_and_split "extend2"
+  [(set (match_operand:VWEXTF_ZVFBF 0 "register_operand" "=&vr")
+    (float_extend:VWEXTF_ZVFBF
+      (match_operand: 1 "register_operand" " vr")))]
+  "TARGET_ZVFBFMIN && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+{
+  insn_code icode = code_for_pred_extend_bf16_to (mode);
+  riscv_vector::emit_vlmax_insn (icode, riscv_vector::UNARY_OP, operands);
+  DONE;
+}
+  [(set_attr "type" "vfwcvtbf16")
+   (set_attr "mode" "")])
+
+(define_expand "extend2"
+  [(set (match_operand:VDF 0 "register_operand")
+    (float_extend:VDF
+      (match_operand: 1 "register_operand")))]
+  "TARGET_ZVFBFMIN"
+{
+  rtx dblw = gen_reg_rtx (mode);
+  emit_insn (gen_extend2 (dblw, operands[1]));
+  emit_insn (gen_extend2 (operands[0], dblw));
+  DONE;
+})
+
+;; -------------------------------------------------------------------------
+;; ---- [BF16] Narrowing.
+;; -------------------------------------------------------------------------
+;; - vfncvtbf16.f.f.w
+;; -------------------------------------------------------------------------
+(define_insn_and_split "trunc2"
+  [(set (match_operand: 0 "register_operand" "=vr")
+    (float_truncate:
+      (match_operand:VSF 1 "register_operand" " vr")))]
+  "TARGET_ZVFBFMIN && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+{
+  insn_code icode = code_for_pred_trunc_to_bf16 (mode);
+  riscv_vector::emit_vlmax_insn (icode, riscv_vector::UNARY_OP_FRM_DYN, operands);
+  DONE;
+}
+  [(set_attr "type" "vfncvtbf16")
+   (set_attr "mode" "")])
+
+(define_expand "trunc2"
+  [(set (match_operand: 0 "register_operand")
+    (float_truncate:
+      (match_operand:VDF 1 "register_operand")))]
+  "TARGET_ZVFBFMIN"
+{
+  rtx half = gen_reg_rtx (mode);
+  rtx opshalf[] = {half, operands[1]};
+
+  /* According to the RISC-V V Spec 13.19. we need to use
+     vfncvt.rod.f.f.w for all steps but the last. */
+  insn_code icode = code_for_pred_rod_trunc (mode);
+  riscv_vector::emit_vlmax_insn (icode, riscv_vector::UNARY_OP, opshalf);
+
+  emit_insn (gen_trunc2 (operands[0], half));
+  DONE;
+})
+
diff --git a/gcc/config/riscv/vector-iterators.md b/gcc/config/riscv/vector-iterators.md
index 43325d1ba87..a53c5233839 100644
--- a/gcc/config/riscv/vector-iterators.md
+++ b/gcc/config/riscv/vector-iterators.md
@@ -4512,3 +4512,35 @@
   (V256DF "v64df")
   (V512DF "v128df")
 ])
+
+;;vector bfloat16
+(define_mode_iterator VWEXTF_ZVFBF [
+  (RVVM8SF "TARGET_VECTOR_ELEN_BF_16 && TARGET_VECTOR_ELEN_FP_32")
+  (RVVM4SF "TARGET_VECTOR_ELEN_BF_16 && TARGET_VECTOR_ELEN_FP_32")
+  (RVVM2SF "TARGET_VECTOR_ELEN_BF_16 && TARGET_VECTOR_ELEN_FP_32")
+  (RVVM1SF "TARGET_VECTOR_ELEN_BF_16 && TARGET_VECTOR_ELEN_FP_32")
+  (RVVMF2SF "TARGET_VECTOR_ELEN_BF_16 && TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN > 32")
+])
+
+(define_mode_iterator VSF [
+  (RVVM8SF "TARGET_VECTOR_ELEN_FP_32") (RVVM4SF "TARGET_VECTOR_ELEN_FP_32") (RVVM2SF "TARGET_VECTOR_ELEN_FP_32")
+  (RVVM1SF "TARGET_VECTOR_ELEN_FP_32") (RVVMF2SF "TARGET_VECTOR_ELEN_FP_32 && TARGET_MIN_VLEN > 32")
+])
+
+(define_mode_iterator VDF [
+  (RVVM8DF "TARGET_VECTOR_ELEN_FP_64") (RVVM4DF "TARGET_VECTOR_ELEN_FP_64")
+  (RVVM2DF "TARGET_VECTOR_ELEN_FP_64") (RVVM1DF "TARGET_VECTOR_ELEN_FP_64")
+])
+
+(define_mode_attr V_FPWIDETOBF16_TRUNC [
+  (RVVM8SF "RVVM4BF") (RVVM4SF "RVVM2BF") (RVVM2SF "RVVM1BF") (RVVM1SF "RVVMF2BF") (RVVMF2SF "RVVMF4BF")
+  (RVVM8DF "RVVM2BF") (RVVM4DF "RVVM1BF") (RVVM2DF "RVVMF2BF") (RVVM1DF "RVVMF4BF")
+])
+
+(define_mode_attr v_fpwidetobf16_trunc [
+  (RVVM8SF "rvvm4bf") (RVVM4SF "rvvm2bf") (RVVM2SF "rvvm1bf") (RVVM1SF "rvvmf2bf") (RVVMF2SF "rvvmf4bf")
+  (RVVM8DF "rvvm2bf") (RVVM4DF "rvvm1bf") (RVVM2DF "rvvmf2bf") (RVVM1DF "rvvmf4bf")
+])
+
+(define_mode_attr VF32_SUBEL [
+  (RVVM8SF "BF") (RVVM4SF "BF") (RVVM2SF "BF") (RVVM1SF "BF") (RVVMF2SF "BF")])
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vfncvt-auto-vect.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vfncvt-auto-vect.c
new file mode 100644
index 00000000000..7ba3615ccf1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vfncvt-auto-vect.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfbfmin -mabi=ilp32d" } */
+
+__attribute__((noipa))
+void vfncvt_float_BFloat16 (__bf16 *dst, float *a, int n)
+{
+  for (int i = 0; i < n; i++)
+    dst[i] = (__bf16)a[i];
+}
+
+__attribute__((noipa))
+void vfncvt_double_BFloat16 (__bf16 *dst, double *a, int n)
+{
+  for (int i = 0; i < n; i++)
+    dst[i] = (__bf16)a[i];
+}
+
+/* { dg-final { scan-assembler-times {\tvfncvtbf16\.f\.f\.w} 2 } } */
+/* { dg-final { scan-assembler-times {\tvfncvt\.rod\.f\.f\.w} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vfwcvt-auto-vect.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vfwcvt-auto-vect.c
new file mode 100644
index 00000000000..6629dd909a0
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vfwcvt-auto-vect.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfbfmin -mabi=ilp32d" } */
+
+__attribute__((noipa))
+void vfwcvt__BFloat16float (float *dst, __bf16 *a, int n)
+{
+  for (int i = 0; i < n; i++)
+    dst[i] = (float)a[i];
+}
+
+__attribute__((noipa))
+void vfwcvt__BFloat16double (double *dst, __bf16 *a, int n)
+{
+  for (int i = 0; i < n; i++)
+    dst[i] = (double)a[i];
+}
+
+/* { dg-final { scan-assembler-times {\tvfwcvtbf16\.f\.f\.v} 2 } } */
+/* { dg-final { scan-assembler-times {\tvfwcvt\.f\.f\.v} 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/vfwmacc-auto-vect.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vfwmacc-auto-vect.c
new file mode 100644
index 00000000000..a767f2c8ef8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/vfwmacc-auto-vect.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfbfwma -mabi=ilp32d -ffast-math" } */
+
+__attribute__ ((noipa))
+void vwmacc_float_bf16 (float *__restrict dst,
+                        __bf16 *__restrict a,
+                        __bf16 *__restrict b,
+                        int n)
+{
+  for (int i = 0; i < n; i++)
+    dst[i] += (float) (a[i] * b[i]);
+}
+
+/* { dg-final { scan-assembler-times {\tvfwmaccbf16\.vv} 1 } } */