Message ID | 20240711071028.9044-1-wangfeng@eswincomputing.com |
---|---|
State | New |
Series | [1/3,v3] RISC-V: Add vector type of BFloat16 format |
OK for this patch set. I know you already got LGTM from JuZhe or me before, so this is just an explicit ack to let you know it's still OK once CI has passed.

On Thu, Jul 11, 2024 at 3:11 PM Feng Wang <wangfeng@eswincomputing.com> wrote:
>
> v3: Rebase
> v2: Rebase
> The vector type of BFloat16 format is added in this patch;
> subsequent extensions to zvfbfmin and zvfbfwma need to be based
> on this patch.
>
> Signed-off-by: Feng Wang <wangfeng@eswincomputing.com>
> gcc/ChangeLog:
>
> * config/riscv/genrvv-type-indexer.cc (bfloat16_type):
> Generate bf16 vector_type and scalar_type in DEF_RVV_TYPE_INDEX.
> (bfloat16_wide_type): Ditto.
> (same_ratio_eew_bf16_type): Ditto.
> (main): Ditto.
> * config/riscv/riscv-modes.def (ADJUST_BYTESIZE):
> (RVV_WHOLE_MODES): Add vector type for BFloat16.
> (RVV_FRACT_MODE): Ditto.
> (RVV_NF4_MODES): Ditto.
> (RVV_NF8_MODES): Ditto.
> (RVV_NF2_MODES): Ditto.
> * config/riscv/riscv-vector-builtins-types.def (vbfloat16mf4_t):
> (vbfloat16mf2_t): Add builtin vector type for BFloat16.
> (vbfloat16m1_t): Ditto.
> (vbfloat16m2_t): Ditto.
> (vbfloat16m4_t): Ditto.
> (vbfloat16m8_t): Ditto.
> (vbfloat16mf4x2_t): Ditto.
> (vbfloat16mf4x3_t): Ditto.
> (vbfloat16mf4x4_t): Ditto.
> (vbfloat16mf4x5_t): Ditto.
> (vbfloat16mf4x6_t): Ditto.
> (vbfloat16mf4x7_t): Ditto.
> (vbfloat16mf4x8_t): Ditto.
> (vbfloat16mf2x2_t): Ditto.
> (vbfloat16mf2x3_t): Ditto.
> (vbfloat16mf2x4_t): Ditto.
> (vbfloat16mf2x5_t): Ditto.
> (vbfloat16mf2x6_t): Ditto.
> (vbfloat16mf2x7_t): Ditto.
> (vbfloat16mf2x8_t): Ditto.
> (vbfloat16m1x2_t): Ditto.
> (vbfloat16m1x3_t): Ditto.
> (vbfloat16m1x4_t): Ditto.
> (vbfloat16m1x5_t): Ditto.
> (vbfloat16m1x6_t): Ditto.
> (vbfloat16m1x7_t): Ditto.
> (vbfloat16m1x8_t): Ditto.
> (vbfloat16m2x2_t): Ditto.
> (vbfloat16m2x3_t): Ditto.
> (vbfloat16m2x4_t): Ditto.
> (vbfloat16m4x2_t): Ditto.
> * config/riscv/riscv-vector-builtins.cc (check_required_extensions):
> Add required_ext checking for BFloat16.
> * config/riscv/riscv-vector-builtins.def (vbfloat16mf4_t):
> Add vector_type for BFloat16 in builtins.def.
> (vbfloat16mf4x2_t): Ditto.
> (vbfloat16mf4x3_t): Ditto.
> (vbfloat16mf4x4_t): Ditto.
> (vbfloat16mf4x5_t): Ditto.
> (vbfloat16mf4x6_t): Ditto.
> (vbfloat16mf4x7_t): Ditto.
> (vbfloat16mf4x8_t): Ditto.
> (vbfloat16mf2_t): Ditto.
> (vbfloat16mf2x2_t): Ditto.
> (vbfloat16mf2x3_t): Ditto.
> (vbfloat16mf2x4_t): Ditto.
> (vbfloat16mf2x5_t): Ditto.
> (vbfloat16mf2x6_t): Ditto.
> (vbfloat16mf2x7_t): Ditto.
> (vbfloat16mf2x8_t): Ditto.
> (vbfloat16m1_t): Ditto.
> (vbfloat16m1x2_t): Ditto.
> (vbfloat16m1x3_t): Ditto.
> (vbfloat16m1x4_t): Ditto.
> (vbfloat16m1x5_t): Ditto.
> (vbfloat16m1x6_t): Ditto.
> (vbfloat16m1x7_t): Ditto.
> (vbfloat16m1x8_t): Ditto.
> (vbfloat16m2_t): Ditto.
> (vbfloat16m2x2_t): Ditto.
> (vbfloat16m2x3_t): Ditto.
> (vbfloat16m2x4_t): Ditto.
> (vbfloat16m4_t): Ditto.
> (vbfloat16m4x2_t): Ditto.
> (vbfloat16m8_t): Ditto.
> (double_trunc_bfloat_scalar): Add scalar_type def for BFloat16.
> (double_trunc_bfloat_vector): Add vector_type def for BFloat16.
> * config/riscv/riscv-vector-builtins.h (RVV_REQUIRE_ELEN_BF_16):
> Add required definition of BFloat16 ext.
> * config/riscv/riscv-vector-switch.def (ENTRY):
> Add vector_type information for BFloat16.
> (TUPLE_ENTRY): Add tuple vector_type information for BFloat16.
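
For anyone cross-checking the type list in the ChangeLog, here is a minimal standalone sketch of how the `vbfloat16*` names appear to be derived from LMUL and NF. This is not the generator code from the patch: `lmul_suffix`, `bf16_valid`, `bf16_type_name` and `bf16_narrow_of_f32` are hypothetical names, and the validity rule (LMUL * NF <= 8 for whole-register LMUL, LMUL no smaller than 1/4) is an assumption inferred from the types the ChangeLog lists, not taken from the generator's own checks.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical stand-in for the generator's LMUL formatting helper:
// lmul_log2 = -2, -1, 0, 1, 2, 3 maps to mf4, mf2, m1, m2, m4, m8.
std::string lmul_suffix(int lmul_log2) {
  assert(lmul_log2 >= -3 && lmul_log2 <= 3);
  std::ostringstream s;
  if (lmul_log2 >= 0)
    s << "m" << (1 << lmul_log2);
  else
    s << "mf" << (1 << -lmul_log2);
  return s.str();
}

// Assumed validity rule, inferred from the ChangeLog type list:
// LMUL ranges from 1/4 to 8, NF from 1 to 8, and a tuple of
// whole-register groups may not span more than 8 registers.
bool bf16_valid(int lmul_log2, unsigned nf) {
  if (lmul_log2 < -2 || lmul_log2 > 3 || nf < 1 || nf > 8)
    return false;
  if (lmul_log2 > 0)
    return (nf << lmul_log2) <= 8u;
  return true;
}

// Builds names such as vbfloat16mf4_t or vbfloat16m2x4_t,
// or "INVALID" when the LMUL/NF combination is rejected.
std::string bf16_type_name(int lmul_log2, unsigned nf = 1) {
  if (!bf16_valid(lmul_log2, nf))
    return "INVALID";
  std::ostringstream s;
  s << "vbfloat16" << lmul_suffix(lmul_log2);
  if (nf > 1)
    s << "x" << nf;
  s << "_t";
  return s.str();
}

// Same SEW/LMUL ratio, half the element width: only a 32-bit float
// vector has a BF16 narrowing partner, at half its LMUL.
std::string bf16_narrow_of_f32(unsigned sew, int lmul_log2) {
  if (sew != 32)
    return "INVALID";
  return bf16_type_name(lmul_log2 - 1);
}
```

Under these assumptions, `bf16_type_name(1, 4)` gives `vbfloat16m2x4_t` while `bf16_type_name(1, 5)` is rejected, which matches the absence of `vbfloat16m2x5_t` from the ChangeLog.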
>
> ---
>  gcc/config/riscv/genrvv-type-indexer.cc       | 115 ++++++++++++++++++
>  gcc/config/riscv/riscv-modes.def              |  30 ++++-
>  .../riscv/riscv-vector-builtins-types.def     |  50 ++++++++
>  gcc/config/riscv/riscv-vector-builtins.cc     |   7 +-
>  gcc/config/riscv/riscv-vector-builtins.def    |  55 ++++++++-
>  gcc/config/riscv/riscv-vector-builtins.h      |   1 +
>  gcc/config/riscv/riscv-vector-switch.def      |  36 ++++++
>  7 files changed, 291 insertions(+), 3 deletions(-)
>
> diff --git a/gcc/config/riscv/genrvv-type-indexer.cc b/gcc/config/riscv/genrvv-type-indexer.cc
> index 27cbd14982c..8626ddeaaa8 100644
> --- a/gcc/config/riscv/genrvv-type-indexer.cc
> +++ b/gcc/config/riscv/genrvv-type-indexer.cc
> @@ -117,6 +117,42 @@ inttype (unsigned sew, int lmul_log2, unsigned nf, bool unsigned_p)
>    return mode.str ();
>  }
>
> +std::string
> +bfloat16_type (int lmul_log2)
> +{
> +  if (!valid_type (16, lmul_log2, /*float_t*/ true))
> +    return "INVALID";
> +
> +  std::stringstream mode;
> +  mode << "vbfloat16" << to_lmul (lmul_log2) << "_t";
> +  return mode.str ();
> +}
> +
> +std::string
> +bfloat16_wide_type (int lmul_log2)
> +{
> +  if (!valid_type (32, lmul_log2, /*float_t*/ true))
> +    return "INVALID";
> +
> +  std::stringstream mode;
> +  mode << "vfloat32" << to_lmul (lmul_log2) << "_t";
> +  return mode.str ();
> +}
> +
> +std::string
> +bfloat16_type (int lmul_log2, unsigned nf)
> +{
> +  if (!valid_type (16, lmul_log2, nf, /*float_t*/ true))
> +    return "INVALID";
> +
> +  std::stringstream mode;
> +  mode << "vbfloat16" << to_lmul (lmul_log2);
> +  if (nf > 1)
> +    mode << "x" << nf;
> +  mode << "_t";
> +  return mode.str ();
> +}
> +
>  std::string
>  floattype (unsigned sew, int lmul_log2)
>  {
> @@ -182,6 +218,15 @@ same_ratio_eew_type (unsigned sew, int lmul_log2, unsigned eew, bool unsigned_p,
>    return inttype (eew, elmul_log2, unsigned_p);
>  }
>
> +std::string
> +same_ratio_eew_bf16_type (unsigned sew, int lmul_log2)
> +{
> +  if (sew != 32)
> +    return "INVALID";
> +  int elmul_log2 = lmul_log2 - 1;
> +  return bfloat16_type (elmul_log2);
> +}
> +
>  int
>  main (int argc, const char **argv)
>  {
> @@ -215,6 +260,8 @@ main (int argc, const char **argv)
>        fprintf (fp, " /*DOUBLE_TRUNC_SIGNED*/ INVALID,\n");
>        fprintf (fp, " /*DOUBLE_TRUNC_UNSIGNED*/ INVALID,\n");
>        fprintf (fp, " /*DOUBLE_TRUNC_UNSIGNED_SCALAR*/ INVALID,\n");
> +      fprintf (fp, " /*DOUBLE_TRUNC_BFLOAT_SCALAR*/ INVALID,\n");
> +      fprintf (fp, " /*DOUBLE_TRUNC_BFLOAT*/ INVALID,\n");
>        fprintf (fp, " /*DOUBLE_TRUNC_FLOAT*/ INVALID,\n");
>        fprintf (fp, " /*FLOAT*/ INVALID,\n");
>        fprintf (fp, " /*LMUL1*/ INVALID,\n");
> @@ -294,6 +341,8 @@ main (int argc, const char **argv)
>                       same_ratio_eew_type (sew, lmul_log2, sew / 2, true,
>                                            false)
>                         .c_str ());
> +            fprintf (fp, " /*DOUBLE_TRUNC_BFLOAT_SCALAR*/ INVALID,\n");
> +            fprintf (fp, " /*DOUBLE_TRUNC_BFLOAT*/ INVALID,\n");
>              fprintf (fp, " /*DOUBLE_TRUNC_FLOAT*/ %s,\n",
>                       same_ratio_eew_type (sew, lmul_log2, sew / 2, false, true)
>                         .c_str ());
> @@ -341,6 +390,68 @@ main (int argc, const char **argv)
>                       inttype (sew, lmul_log2, 1, unsigned_p).c_str ());
>              fprintf (fp, ")\n");
>            }
> +  // Build for vbfloat16
> +  for (int lmul_log2 : {-2, -1, 0, 1, 2, 3})
> +    for (unsigned nf : {1, 2, 3, 4, 5, 6, 7, 8})
> +      {
> +        if (!valid_type (16, lmul_log2, nf, /*float_t*/ true))
> +          continue;
> +
> +        fprintf (fp, "DEF_RVV_TYPE_INDEX (\n");
> +        fprintf (fp, " /*VECTOR*/ %s,\n",
> +                 bfloat16_type (lmul_log2, nf).c_str ());
> +        fprintf (fp, " /*MASK*/ %s,\n", maskmode (16, lmul_log2).c_str ());
> +        fprintf (fp, " /*SIGNED*/ %s,\n",
> +                 inttype (16, lmul_log2, /*unsigned_p*/ false).c_str ());
> +        fprintf (fp, " /*UNSIGNED*/ %s,\n",
> +                 inttype (16, lmul_log2, /*unsigned_p*/ true).c_str ());
> +        for (unsigned eew : {8, 16, 32, 64})
> +          fprintf (
> +            fp, " /*EEW%d_INDEX*/ %s,\n", eew,
> +            same_ratio_eew_type (16, lmul_log2, eew, true, false).c_str ());
> +        fprintf (fp, " /*SHIFT*/ INVALID,\n");
> +        fprintf (fp, " /*DOUBLE_TRUNC*/ %s,\n",
> +                 same_ratio_eew_type (16, lmul_log2, 8, false, true).c_str ());
> +        fprintf (fp, " /*QUAD_TRUNC*/ INVALID,\n");
> +        fprintf (fp, " /*OCT_TRUNC*/ INVALID,\n");
> +        fprintf (fp, " /*DOUBLE_TRUNC_SCALAR*/ %s,\n",
> +                 same_ratio_eew_type (16, lmul_log2, 8, false, true).c_str ());
> +        fprintf (fp, " /*DOUBLE_TRUNC_SIGNED*/ %s,\n",
> +                 same_ratio_eew_type (16, lmul_log2, 8, false, false).c_str ());
> +        fprintf (fp, " /*DOUBLE_TRUNC_UNSIGNED*/ %s,\n",
> +                 same_ratio_eew_type (16, lmul_log2, 8, true, false).c_str ());
> +        fprintf (fp, " /*DOUBLE_TRUNC_UNSIGNED_SCALAR*/ INVALID,\n");
> +        fprintf (fp, " /*DOUBLE_TRUNC_BFLOAT_SCALAR*/ INVALID,\n");
> +        fprintf (fp, " /*DOUBLE_TRUNC_BFLOAT*/ INVALID,\n");
> +        fprintf (fp, " /*DOUBLE_TRUNC_FLOAT*/ %s,\n",
> +                 same_ratio_eew_type (16, lmul_log2, 8, false, true).c_str ());
> +        fprintf (fp, " /*FLOAT*/ INVALID,\n");
> +        fprintf (fp, " /*LMUL1*/ %s,\n",
> +                 bfloat16_type (/*lmul_log2*/ 0).c_str ());
> +        fprintf (fp, " /*WLMUL1*/ %s,\n",
> +                 bfloat16_wide_type (/*lmul_log2*/ 0).c_str ());
> +        for (unsigned eew : {8, 16, 32, 64})
> +          fprintf (fp, " /*EEW%d_INTERPRET*/ INVALID,\n", eew);
> +
> +        for (unsigned boolsize : BOOL_SIZE_LIST)
> +          fprintf (fp, " /*BOOL%d_INTERPRET*/ INVALID,\n", boolsize);
> +
> +        for (unsigned eew : EEW_SIZE_LIST)
> +          fprintf (fp, " /*SIGNED_EEW%d_LMUL1_INTERPRET*/ INVALID,\n", eew);
> +
> +        for (unsigned eew : EEW_SIZE_LIST)
> +          fprintf (fp, " /*UNSIGNED_EEW%d_LMUL1_INTERPRET*/ INVALID,\n", eew);
> +
> +        for (unsigned lmul_log2_offset : {1, 2, 3, 4, 5, 6})
> +          {
> +            unsigned multiple_of_lmul = 1 << lmul_log2_offset;
> +            fprintf (fp, " /*X%d_VLMUL_EXT*/ %s,\n", multiple_of_lmul,
> +                     bfloat16_type (lmul_log2 + lmul_log2_offset).c_str ());
> +          }
> +        fprintf (fp, " /*TUPLE_SUBPART*/ %s\n",
> +                 bfloat16_type (lmul_log2, 1U).c_str ());
> +        fprintf (fp, ")\n");
> +      }
>    // Build for vfloat
>    for (unsigned sew : {16, 32, 64})
>      for (int lmul_log2 : {-3, -2, -1, 0, 1, 2, 3})
> @@ -378,6 +489,10 @@ main (int argc, const char **argv)
>                     same_ratio_eew_type (sew, lmul_log2, sew / 2, true, false)
>                       .c_str ());
>            fprintf (fp, " /*DOUBLE_TRUNC_UNSIGNED_SCALAR*/ INVALID,\n");
> +          fprintf (fp, " /*DOUBLE_TRUNC_BFLOAT_SCALAR*/ %s,\n",
> +                   same_ratio_eew_bf16_type (sew, lmul_log2).c_str ());
> +          fprintf (fp, " /*DOUBLE_TRUNC_BFLOAT*/ %s,\n",
> +                   same_ratio_eew_bf16_type (sew, lmul_log2).c_str ());
>            fprintf (fp, " /*DOUBLE_TRUNC_FLOAT*/ %s,\n",
>                     same_ratio_eew_type (sew, lmul_log2, sew / 2, false, true)
>                       .c_str ());
> diff --git a/gcc/config/riscv/riscv-modes.def b/gcc/config/riscv/riscv-modes.def
> index 6de4e440298..b0a78f72754 100644
> --- a/gcc/config/riscv/riscv-modes.def
> +++ b/gcc/config/riscv/riscv-modes.def
> @@ -93,6 +93,7 @@ ADJUST_BYTESIZE (RVVMF64BI, riscv_v_adjust_bytesize (RVVMF64BImode, 1));
>     |DF |RVVM1DF|RVVM2DF|RVVM4DF|RVVM8DF|N/A |N/A |N/A |
>     |SF |RVVM1SF|RVVM2SF|RVVM4SF|RVVM8SF|RVVMF2SF|N/A |N/A |
>     |HF |RVVM1HF|RVVM2HF|RVVM4HF|RVVM8HF|RVVMF2HF|RVVMF4HF|N/A |
> +   |BF |RVVM1BF|RVVM2BF|RVVM4BF|RVVM8BF|RVVMF2BF|RVVMF4BF|N/A |
>
>  There are the following data types for ELEN = 32.
>
> @@ -101,11 +102,13 @@ There are the following data types for ELEN = 32.
>     |HI |RVVM1HI|RVVM2HI|RVVM4HI|RVVM8HI|RVVMF2HI|N/A |N/A |
>     |QI |RVVM1QI|RVVM2QI|RVVM4QI|RVVM8QI|RVVMF2QI|RVVMF4QI|N/A |
>     |SF |RVVM1SF|RVVM2SF|RVVM4SF|RVVM8SF|N/A |N/A |N/A |
> -   |HF |RVVM1HF|RVVM2HF|RVVM4HF|RVVM8HF|RVVMF2HF|N/A |N/A | */
> +   |HF |RVVM1HF|RVVM2HF|RVVM4HF|RVVM8HF|RVVMF2HF|N/A |N/A |
> +   |BF |RVVM1BF|RVVM2BF|RVVM4BF|RVVM8BF|RVVMF2BF|N/A |N/A | */
>
>  #define RVV_WHOLE_MODES(LMUL) \
>    VECTOR_MODE_WITH_PREFIX (RVVM, INT, QI, LMUL, 0); \
>    VECTOR_MODE_WITH_PREFIX (RVVM, INT, HI, LMUL, 0); \
> +  VECTOR_MODE_WITH_PREFIX (RVVM, FLOAT, BF, LMUL, 0); \
>    VECTOR_MODE_WITH_PREFIX (RVVM, FLOAT, HF, LMUL, 0); \
>    VECTOR_MODE_WITH_PREFIX (RVVM, INT, SI, LMUL, 0); \
>    VECTOR_MODE_WITH_PREFIX (RVVM, FLOAT, SF, LMUL, 0); \
> @@ -120,6 +123,8 @@ There are the following data types for ELEN = 32.
>      riscv_v_adjust_nunits (RVVM##LMUL##SImode, false, LMUL, 1)); \
>    ADJUST_NUNITS (RVVM##LMUL##DI, \
>      riscv_v_adjust_nunits (RVVM##LMUL##DImode, false, LMUL, 1)); \
> +  ADJUST_NUNITS (RVVM##LMUL##BF, \
> +    riscv_v_adjust_nunits (RVVM##LMUL##BFmode, false, LMUL, 1)); \
>    ADJUST_NUNITS (RVVM##LMUL##HF, \
>      riscv_v_adjust_nunits (RVVM##LMUL##HFmode, false, LMUL, 1)); \
>    ADJUST_NUNITS (RVVM##LMUL##SF, \
> @@ -131,6 +136,7 @@ There are the following data types for ELEN = 32.
>    ADJUST_ALIGNMENT (RVVM##LMUL##HI, 2); \
>    ADJUST_ALIGNMENT (RVVM##LMUL##SI, 4); \
>    ADJUST_ALIGNMENT (RVVM##LMUL##DI, 8); \
> +  ADJUST_ALIGNMENT (RVVM##LMUL##BF, 2); \
>    ADJUST_ALIGNMENT (RVVM##LMUL##HF, 2); \
>    ADJUST_ALIGNMENT (RVVM##LMUL##SF, 4); \
>    ADJUST_ALIGNMENT (RVVM##LMUL##DF, 8);
> @@ -153,6 +159,8 @@ RVV_FRACT_MODE (INT, QI, 4, 1)
>  RVV_FRACT_MODE (INT, QI, 8, 1)
>  RVV_FRACT_MODE (INT, HI, 2, 2)
>  RVV_FRACT_MODE (INT, HI, 4, 2)
> +RVV_FRACT_MODE (FLOAT, BF, 2, 2)
> +RVV_FRACT_MODE (FLOAT, BF, 4, 2)
>  RVV_FRACT_MODE (FLOAT, HF, 2, 2)
>  RVV_FRACT_MODE (FLOAT, HF, 4, 2)
>  RVV_FRACT_MODE (INT, SI, 2, 4)
> @@ -174,6 +182,9 @@ RVV_FRACT_MODE (FLOAT, SF, 2, 4)
>    VECTOR_MODE_WITH_PREFIX (RVVMF4x, INT, HI, NF, 1); \
>    VECTOR_MODE_WITH_PREFIX (RVVMF2x, INT, HI, NF, 1); \
>    VECTOR_MODE_WITH_PREFIX (RVVM1x, INT, HI, NF, 1); \
> +  VECTOR_MODE_WITH_PREFIX (RVVMF4x, FLOAT, BF, NF, 1); \
> +  VECTOR_MODE_WITH_PREFIX (RVVMF2x, FLOAT, BF, NF, 1); \
> +  VECTOR_MODE_WITH_PREFIX (RVVM1x, FLOAT, BF, NF, 1); \
>    VECTOR_MODE_WITH_PREFIX (RVVMF4x, FLOAT, HF, NF, 1); \
>    VECTOR_MODE_WITH_PREFIX (RVVMF2x, FLOAT, HF, NF, 1); \
>    VECTOR_MODE_WITH_PREFIX (RVVM1x, FLOAT, HF, NF, 1); \
> @@ -198,6 +209,12 @@ RVV_FRACT_MODE (FLOAT, SF, 2, 4)
>      riscv_v_adjust_nunits (RVVMF2x##NF##HImode, true, 2, NF)); \
>    ADJUST_NUNITS (RVVM1x##NF##HI, \
>      riscv_v_adjust_nunits (RVVM1x##NF##HImode, false, 1, NF)); \
> +  ADJUST_NUNITS (RVVMF4x##NF##BF, \
> +    riscv_v_adjust_nunits (RVVMF4x##NF##BFmode, true, 4, NF)); \
> +  ADJUST_NUNITS (RVVMF2x##NF##BF, \
> +    riscv_v_adjust_nunits (RVVMF2x##NF##BFmode, true, 2, NF)); \
> +  ADJUST_NUNITS (RVVM1x##NF##BF, \
> +    riscv_v_adjust_nunits (RVVM1x##NF##BFmode, false, 1, NF)); \
>    ADJUST_NUNITS (RVVMF4x##NF##HF, \
>      riscv_v_adjust_nunits (RVVMF4x##NF##HFmode, true, 4, NF)); \
>    ADJUST_NUNITS (RVVMF2x##NF##HF, \
> @@ -224,6 +241,9 @@ RVV_FRACT_MODE (FLOAT, SF, 2, 4)
>    ADJUST_ALIGNMENT (RVVMF4x##NF##HI, 2); \
>    ADJUST_ALIGNMENT (RVVMF2x##NF##HI, 2); \
>    ADJUST_ALIGNMENT (RVVM1x##NF##HI, 2); \
> +  ADJUST_ALIGNMENT (RVVMF4x##NF##BF, 2); \
> +  ADJUST_ALIGNMENT (RVVMF2x##NF##BF, 2); \
> +  ADJUST_ALIGNMENT (RVVM1x##NF##BF, 2); \
>    ADJUST_ALIGNMENT (RVVMF4x##NF##HF, 2); \
>    ADJUST_ALIGNMENT (RVVMF2x##NF##HF, 2); \
>    ADJUST_ALIGNMENT (RVVM1x##NF##HF, 2); \
> @@ -245,6 +265,7 @@ RVV_NF8_MODES (2)
>  #define RVV_NF4_MODES(NF) \
>    VECTOR_MODE_WITH_PREFIX (RVVM2x, INT, QI, NF, 1); \
>    VECTOR_MODE_WITH_PREFIX (RVVM2x, INT, HI, NF, 1); \
> +  VECTOR_MODE_WITH_PREFIX (RVVM2x, FLOAT, BF, NF, 1); \
>    VECTOR_MODE_WITH_PREFIX (RVVM2x, FLOAT, HF, NF, 1); \
>    VECTOR_MODE_WITH_PREFIX (RVVM2x, INT, SI, NF, 1); \
>    VECTOR_MODE_WITH_PREFIX (RVVM2x, FLOAT, SF, NF, 1); \
> @@ -255,6 +276,8 @@ RVV_NF8_MODES (2)
>      riscv_v_adjust_nunits (RVVM2x##NF##QImode, false, 2, NF)); \
>    ADJUST_NUNITS (RVVM2x##NF##HI, \
>      riscv_v_adjust_nunits (RVVM2x##NF##HImode, false, 2, NF)); \
> +  ADJUST_NUNITS (RVVM2x##NF##BF, \
> +    riscv_v_adjust_nunits (RVVM2x##NF##BFmode, false, 2, NF)); \
>    ADJUST_NUNITS (RVVM2x##NF##HF, \
>      riscv_v_adjust_nunits (RVVM2x##NF##HFmode, false, 2, NF)); \
>    ADJUST_NUNITS (RVVM2x##NF##SI, \
> @@ -268,6 +291,7 @@ RVV_NF8_MODES (2)
>    \
>    ADJUST_ALIGNMENT (RVVM2x##NF##QI, 1); \
>    ADJUST_ALIGNMENT (RVVM2x##NF##HI, 2); \
> +  ADJUST_ALIGNMENT (RVVM2x##NF##BF, 2); \
>    ADJUST_ALIGNMENT (RVVM2x##NF##HF, 2); \
>    ADJUST_ALIGNMENT (RVVM2x##NF##SI, 4); \
>    ADJUST_ALIGNMENT (RVVM2x##NF##SF, 4); \
> @@ -281,6 +305,7 @@ RVV_NF4_MODES (4)
>  #define RVV_NF2_MODES(NF) \
>    VECTOR_MODE_WITH_PREFIX (RVVM4x, INT, QI, NF, 1); \
>    VECTOR_MODE_WITH_PREFIX (RVVM4x, INT, HI, NF, 1); \
> +  VECTOR_MODE_WITH_PREFIX (RVVM4x, FLOAT, BF, NF, 1); \
>    VECTOR_MODE_WITH_PREFIX (RVVM4x, FLOAT, HF, NF, 1); \
>    VECTOR_MODE_WITH_PREFIX (RVVM4x, INT, SI, NF, 1); \
>    VECTOR_MODE_WITH_PREFIX (RVVM4x, FLOAT, SF, NF, 1); \
> @@ -291,6 +316,8 @@ RVV_NF4_MODES (4)
>      riscv_v_adjust_nunits (RVVM4x##NF##QImode, false, 4, NF)); \
>    ADJUST_NUNITS (RVVM4x##NF##HI, \
>      riscv_v_adjust_nunits (RVVM4x##NF##HImode, false, 4, NF)); \
> +  ADJUST_NUNITS (RVVM4x##NF##BF, \
> +    riscv_v_adjust_nunits (RVVM4x##NF##BFmode, false, 4, NF)); \
>    ADJUST_NUNITS (RVVM4x##NF##HF, \
>      riscv_v_adjust_nunits (RVVM4x##NF##HFmode, false, 4, NF)); \
>    ADJUST_NUNITS (RVVM4x##NF##SI, \
> @@ -304,6 +331,7 @@ RVV_NF4_MODES (4)
>    \
>    ADJUST_ALIGNMENT (RVVM4x##NF##QI, 1); \
>    ADJUST_ALIGNMENT (RVVM4x##NF##HI, 2); \
> +  ADJUST_ALIGNMENT (RVVM4x##NF##BF, 2); \
>    ADJUST_ALIGNMENT (RVVM4x##NF##HF, 2); \
>    ADJUST_ALIGNMENT (RVVM4x##NF##SI, 4); \
>    ADJUST_ALIGNMENT (RVVM4x##NF##SF, 4); \
> diff --git a/gcc/config/riscv/riscv-vector-builtins-types.def b/gcc/config/riscv/riscv-vector-builtins-types.def
> index 61019a56844..e7fca4cca79 100644
> --- a/gcc/config/riscv/riscv-vector-builtins-types.def
> +++ b/gcc/config/riscv/riscv-vector-builtins-types.def
> @@ -397,6 +397,13 @@ DEF_RVV_U_OPS (vuint64m2_t, RVV_REQUIRE_ELEN_64)
>  DEF_RVV_U_OPS (vuint64m4_t, RVV_REQUIRE_ELEN_64)
>  DEF_RVV_U_OPS (vuint64m8_t, RVV_REQUIRE_ELEN_64)
>
> +DEF_RVV_F_OPS (vbfloat16mf4_t, RVV_REQUIRE_ELEN_BF_16 | RVV_REQUIRE_MIN_VLEN_64)
> +DEF_RVV_F_OPS (vbfloat16mf2_t, RVV_REQUIRE_ELEN_BF_16)
> +DEF_RVV_F_OPS (vbfloat16m1_t, RVV_REQUIRE_ELEN_BF_16)
> +DEF_RVV_F_OPS (vbfloat16m2_t, RVV_REQUIRE_ELEN_BF_16)
> +DEF_RVV_F_OPS (vbfloat16m4_t, RVV_REQUIRE_ELEN_BF_16)
> +DEF_RVV_F_OPS (vbfloat16m8_t, RVV_REQUIRE_ELEN_BF_16)
> +
>  DEF_RVV_F_OPS (vfloat16mf4_t, RVV_REQUIRE_ELEN_FP_16 | RVV_REQUIRE_MIN_VLEN_64)
>  DEF_RVV_F_OPS (vfloat16mf2_t, RVV_REQUIRE_ELEN_FP_16)
>  DEF_RVV_F_OPS (vfloat16m1_t, RVV_REQUIRE_ELEN_FP_16)
> @@ -999,6 +1006,11 @@ DEF_RVV_X2_VLMUL_EXT_OPS (vuint32m4_t, 0)
>  DEF_RVV_X2_VLMUL_EXT_OPS (vuint64m1_t, RVV_REQUIRE_ELEN_64)
>  DEF_RVV_X2_VLMUL_EXT_OPS (vuint64m2_t, RVV_REQUIRE_ELEN_64)
>  DEF_RVV_X2_VLMUL_EXT_OPS (vuint64m4_t, RVV_REQUIRE_ELEN_64)
> +DEF_RVV_X2_VLMUL_EXT_OPS (vbfloat16mf4_t, RVV_REQUIRE_ELEN_BF_16 | RVV_REQUIRE_MIN_VLEN_64)
> +DEF_RVV_X2_VLMUL_EXT_OPS (vbfloat16mf2_t, RVV_REQUIRE_ELEN_BF_16)
> +DEF_RVV_X2_VLMUL_EXT_OPS (vbfloat16m1_t, RVV_REQUIRE_ELEN_BF_16)
> +DEF_RVV_X2_VLMUL_EXT_OPS (vbfloat16m2_t, RVV_REQUIRE_ELEN_BF_16)
> +DEF_RVV_X2_VLMUL_EXT_OPS (vbfloat16m4_t, RVV_REQUIRE_ELEN_BF_16)
>  DEF_RVV_X2_VLMUL_EXT_OPS (vfloat16mf4_t, RVV_REQUIRE_ELEN_FP_16 | RVV_REQUIRE_MIN_VLEN_64)
>  DEF_RVV_X2_VLMUL_EXT_OPS (vfloat16mf2_t, RVV_REQUIRE_ELEN_FP_16)
>  DEF_RVV_X2_VLMUL_EXT_OPS (vfloat16m1_t, RVV_REQUIRE_ELEN_FP_16)
> @@ -1040,6 +1052,10 @@ DEF_RVV_X4_VLMUL_EXT_OPS (vuint32m1_t, 0)
>  DEF_RVV_X4_VLMUL_EXT_OPS (vuint32m2_t, 0)
>  DEF_RVV_X4_VLMUL_EXT_OPS (vuint64m1_t, RVV_REQUIRE_ELEN_64)
>  DEF_RVV_X4_VLMUL_EXT_OPS (vuint64m2_t, RVV_REQUIRE_ELEN_64)
> +DEF_RVV_X4_VLMUL_EXT_OPS (vbfloat16mf4_t, RVV_REQUIRE_ELEN_BF_16 | RVV_REQUIRE_MIN_VLEN_64)
> +DEF_RVV_X4_VLMUL_EXT_OPS (vbfloat16mf2_t, RVV_REQUIRE_ELEN_BF_16)
> +DEF_RVV_X4_VLMUL_EXT_OPS (vbfloat16m1_t, RVV_REQUIRE_ELEN_BF_16)
> +DEF_RVV_X4_VLMUL_EXT_OPS (vbfloat16m2_t, RVV_REQUIRE_ELEN_BF_16)
>  DEF_RVV_X4_VLMUL_EXT_OPS (vfloat16mf4_t, RVV_REQUIRE_ELEN_FP_16 | RVV_REQUIRE_MIN_VLEN_64)
>  DEF_RVV_X4_VLMUL_EXT_OPS (vfloat16mf2_t, RVV_REQUIRE_ELEN_FP_16)
>  DEF_RVV_X4_VLMUL_EXT_OPS (vfloat16m1_t, RVV_REQUIRE_ELEN_FP_16)
> @@ -1070,6 +1086,9 @@ DEF_RVV_X8_VLMUL_EXT_OPS (vuint16m1_t, 0)
>  DEF_RVV_X8_VLMUL_EXT_OPS (vuint32mf2_t, RVV_REQUIRE_MIN_VLEN_64)
>  DEF_RVV_X8_VLMUL_EXT_OPS (vuint32m1_t, 0)
>  DEF_RVV_X8_VLMUL_EXT_OPS (vuint64m1_t, RVV_REQUIRE_ELEN_64)
> +DEF_RVV_X8_VLMUL_EXT_OPS (vbfloat16mf4_t, RVV_REQUIRE_ELEN_BF_16 | RVV_REQUIRE_MIN_VLEN_64)
> +DEF_RVV_X8_VLMUL_EXT_OPS (vbfloat16mf2_t, RVV_REQUIRE_ELEN_BF_16)
> +DEF_RVV_X8_VLMUL_EXT_OPS (vbfloat16m1_t, RVV_REQUIRE_ELEN_BF_16)
>  DEF_RVV_X8_VLMUL_EXT_OPS (vfloat16mf4_t, RVV_REQUIRE_ELEN_FP_16 | RVV_REQUIRE_MIN_VLEN_64)
>  DEF_RVV_X8_VLMUL_EXT_OPS (vfloat16mf2_t, RVV_REQUIRE_ELEN_FP_16)
>  DEF_RVV_X8_VLMUL_EXT_OPS (vfloat16m1_t, RVV_REQUIRE_ELEN_FP_16)
> @@ -1089,6 +1108,8 @@ DEF_RVV_X16_VLMUL_EXT_OPS (vuint8mf2_t, 0)
>  DEF_RVV_X16_VLMUL_EXT_OPS (vuint16mf4_t, RVV_REQUIRE_MIN_VLEN_64)
>  DEF_RVV_X16_VLMUL_EXT_OPS (vuint16mf2_t, 0)
>  DEF_RVV_X16_VLMUL_EXT_OPS (vuint32mf2_t, RVV_REQUIRE_MIN_VLEN_64)
> +DEF_RVV_X16_VLMUL_EXT_OPS (vbfloat16mf4_t, RVV_REQUIRE_ELEN_BF_16 | RVV_REQUIRE_MIN_VLEN_64)
> +DEF_RVV_X16_VLMUL_EXT_OPS (vbfloat16mf2_t, RVV_REQUIRE_ELEN_BF_16)
>  DEF_RVV_X16_VLMUL_EXT_OPS (vfloat16mf4_t, RVV_REQUIRE_ELEN_FP_16 | RVV_REQUIRE_MIN_VLEN_64)
>  DEF_RVV_X16_VLMUL_EXT_OPS (vfloat16mf2_t, RVV_REQUIRE_ELEN_FP_16)
>  DEF_RVV_X16_VLMUL_EXT_OPS (vfloat32mf2_t, RVV_REQUIRE_ELEN_FP_32 | RVV_REQUIRE_MIN_VLEN_64)
> @@ -1099,6 +1120,7 @@ DEF_RVV_X32_VLMUL_EXT_OPS (vint16mf4_t, RVV_REQUIRE_MIN_VLEN_64)
>  DEF_RVV_X32_VLMUL_EXT_OPS (vuint8mf8_t, RVV_REQUIRE_MIN_VLEN_64)
>  DEF_RVV_X32_VLMUL_EXT_OPS (vuint8mf4_t, 0)
>  DEF_RVV_X32_VLMUL_EXT_OPS (vuint16mf4_t, RVV_REQUIRE_MIN_VLEN_64)
> +DEF_RVV_X32_VLMUL_EXT_OPS (vbfloat16mf4_t, RVV_REQUIRE_ELEN_BF_16 | RVV_REQUIRE_MIN_VLEN_64)
>  DEF_RVV_X32_VLMUL_EXT_OPS (vfloat16mf4_t, RVV_REQUIRE_ELEN_FP_16 | RVV_REQUIRE_MIN_VLEN_64)
>
>  DEF_RVV_X64_VLMUL_EXT_OPS (vint8mf8_t, RVV_REQUIRE_MIN_VLEN_64)
> @@ -1112,6 +1134,7 @@ DEF_RVV_LMUL1_OPS (vuint8m1_t, 0)
>  DEF_RVV_LMUL1_OPS (vuint16m1_t, 0)
>  DEF_RVV_LMUL1_OPS (vuint32m1_t, 0)
>  DEF_RVV_LMUL1_OPS (vuint64m1_t, RVV_REQUIRE_ELEN_64)
> +DEF_RVV_LMUL1_OPS (vbfloat16m1_t, RVV_REQUIRE_ELEN_BF_16)
>  DEF_RVV_LMUL1_OPS (vfloat16m1_t, RVV_REQUIRE_ELEN_FP_16)
>  DEF_RVV_LMUL1_OPS (vfloat32m1_t, RVV_REQUIRE_ELEN_FP_32)
>  DEF_RVV_LMUL1_OPS (vfloat64m1_t, RVV_REQUIRE_ELEN_FP_64)
> @@ -1124,6 +1147,7 @@ DEF_RVV_LMUL2_OPS (vuint8m2_t, 0)
>  DEF_RVV_LMUL2_OPS (vuint16m2_t, 0)
>  DEF_RVV_LMUL2_OPS (vuint32m2_t, 0)
>  DEF_RVV_LMUL2_OPS (vuint64m2_t, RVV_REQUIRE_ELEN_64)
> +DEF_RVV_LMUL2_OPS (vbfloat16m2_t, RVV_REQUIRE_ELEN_BF_16)
>  DEF_RVV_LMUL2_OPS (vfloat16m2_t, RVV_REQUIRE_ELEN_FP_16)
>  DEF_RVV_LMUL2_OPS (vfloat32m2_t, RVV_REQUIRE_ELEN_FP_32)
>  DEF_RVV_LMUL2_OPS (vfloat64m2_t, RVV_REQUIRE_ELEN_FP_64)
> @@ -1137,6 +1161,7 @@ DEF_RVV_LMUL4_OPS (vuint16m4_t, 0)
>  DEF_RVV_LMUL4_OPS (vuint32m4_t, 0)
>  DEF_RVV_LMUL4_OPS (vuint64m4_t, RVV_REQUIRE_ELEN_64)
>  DEF_RVV_LMUL4_OPS (vfloat16m4_t, RVV_REQUIRE_ELEN_FP_16)
> +DEF_RVV_LMUL4_OPS (vbfloat16m4_t, RVV_REQUIRE_ELEN_BF_16)
>  DEF_RVV_LMUL4_OPS (vfloat32m4_t, RVV_REQUIRE_ELEN_FP_32)
>  DEF_RVV_LMUL4_OPS (vfloat64m4_t, RVV_REQUIRE_ELEN_FP_64)
>
> @@ -1312,6 +1337,31 @@ DEF_RVV_TUPLE_OPS (vint64m2x4_t, RVV_REQUIRE_ELEN_64)
>  DEF_RVV_TUPLE_OPS (vuint64m2x4_t, RVV_REQUIRE_ELEN_64)
>  DEF_RVV_TUPLE_OPS (vint64m4x2_t, RVV_REQUIRE_ELEN_64)
>  DEF_RVV_TUPLE_OPS (vuint64m4x2_t, RVV_REQUIRE_ELEN_64)
> +DEF_RVV_TUPLE_OPS (vbfloat16mf4x2_t, RVV_REQUIRE_ELEN_BF_16 | RVV_REQUIRE_MIN_VLEN_64)
> +DEF_RVV_TUPLE_OPS (vbfloat16mf4x3_t, RVV_REQUIRE_ELEN_BF_16 | RVV_REQUIRE_MIN_VLEN_64)
> +DEF_RVV_TUPLE_OPS (vbfloat16mf4x4_t, RVV_REQUIRE_ELEN_BF_16 | RVV_REQUIRE_MIN_VLEN_64)
> +DEF_RVV_TUPLE_OPS (vbfloat16mf4x5_t, RVV_REQUIRE_ELEN_BF_16 | RVV_REQUIRE_MIN_VLEN_64)
> +DEF_RVV_TUPLE_OPS (vbfloat16mf4x6_t, RVV_REQUIRE_ELEN_BF_16 | RVV_REQUIRE_MIN_VLEN_64)
> +DEF_RVV_TUPLE_OPS (vbfloat16mf4x7_t, RVV_REQUIRE_ELEN_BF_16 | RVV_REQUIRE_MIN_VLEN_64)
> +DEF_RVV_TUPLE_OPS (vbfloat16mf4x8_t, RVV_REQUIRE_ELEN_BF_16 | RVV_REQUIRE_MIN_VLEN_64)
> +DEF_RVV_TUPLE_OPS (vbfloat16mf2x2_t, RVV_REQUIRE_ELEN_BF_16)
> +DEF_RVV_TUPLE_OPS (vbfloat16mf2x3_t, RVV_REQUIRE_ELEN_BF_16)
> +DEF_RVV_TUPLE_OPS (vbfloat16mf2x4_t, RVV_REQUIRE_ELEN_BF_16)
> +DEF_RVV_TUPLE_OPS (vbfloat16mf2x5_t, RVV_REQUIRE_ELEN_BF_16)
> +DEF_RVV_TUPLE_OPS (vbfloat16mf2x6_t, RVV_REQUIRE_ELEN_BF_16)
> +DEF_RVV_TUPLE_OPS (vbfloat16mf2x7_t, RVV_REQUIRE_ELEN_BF_16)
> +DEF_RVV_TUPLE_OPS (vbfloat16mf2x8_t, RVV_REQUIRE_ELEN_BF_16)
> +DEF_RVV_TUPLE_OPS (vbfloat16m1x2_t, RVV_REQUIRE_ELEN_BF_16)
> +DEF_RVV_TUPLE_OPS (vbfloat16m1x3_t, RVV_REQUIRE_ELEN_BF_16)
> +DEF_RVV_TUPLE_OPS (vbfloat16m1x4_t, RVV_REQUIRE_ELEN_BF_16)
> +DEF_RVV_TUPLE_OPS (vbfloat16m1x5_t, RVV_REQUIRE_ELEN_BF_16)
> +DEF_RVV_TUPLE_OPS (vbfloat16m1x6_t, RVV_REQUIRE_ELEN_BF_16)
> +DEF_RVV_TUPLE_OPS (vbfloat16m1x7_t, RVV_REQUIRE_ELEN_BF_16)
> +DEF_RVV_TUPLE_OPS (vbfloat16m1x8_t, RVV_REQUIRE_ELEN_BF_16)
> +DEF_RVV_TUPLE_OPS (vbfloat16m2x2_t, RVV_REQUIRE_ELEN_BF_16)
> +DEF_RVV_TUPLE_OPS (vbfloat16m2x3_t, RVV_REQUIRE_ELEN_BF_16)
> +DEF_RVV_TUPLE_OPS (vbfloat16m2x4_t, RVV_REQUIRE_ELEN_BF_16)
> +DEF_RVV_TUPLE_OPS (vbfloat16m4x2_t, RVV_REQUIRE_ELEN_BF_16)
>  DEF_RVV_TUPLE_OPS (vfloat16mf4x2_t, RVV_REQUIRE_ELEN_FP_16 | RVV_REQUIRE_MIN_VLEN_64)
>  DEF_RVV_TUPLE_OPS (vfloat16mf4x3_t, RVV_REQUIRE_ELEN_FP_16 | RVV_REQUIRE_MIN_VLEN_64)
>  DEF_RVV_TUPLE_OPS (vfloat16mf4x4_t, RVV_REQUIRE_ELEN_FP_16 | RVV_REQUIRE_MIN_VLEN_64)
> diff --git a/gcc/config/riscv/riscv-vector-builtins.cc b/gcc/config/riscv/riscv-vector-builtins.cc
> index c08d87a2680..720436dfbc9 100644
> --- a/gcc/config/riscv/riscv-vector-builtins.cc
> +++ b/gcc/config/riscv/riscv-vector-builtins.cc
> @@ -2808,7 +2808,8 @@ static CONSTEXPR const function_type_info function_types[] = {
>    VECTOR, MASK, SIGNED, UNSIGNED, EEW8_INDEX, EEW16_INDEX, EEW32_INDEX, \
>    EEW64_INDEX, SHIFT, DOUBLE_TRUNC, QUAD_TRUNC, OCT_TRUNC, \
>    DOUBLE_TRUNC_SCALAR, DOUBLE_TRUNC_SIGNED, DOUBLE_TRUNC_UNSIGNED, \
> -  DOUBLE_TRUNC_UNSIGNED_SCALAR, DOUBLE_TRUNC_FLOAT, FLOAT, LMUL1, WLMUL1, \
> +  DOUBLE_TRUNC_UNSIGNED_SCALAR, DOUBLE_TRUNC_BFLOAT_SCALAR, \
> +  DOUBLE_TRUNC_BFLOAT, DOUBLE_TRUNC_FLOAT, FLOAT, LMUL1, WLMUL1, \
>    EEW8_INTERPRET, EEW16_INTERPRET, EEW32_INTERPRET, EEW64_INTERPRET, \
>    BOOL1_INTERPRET, BOOL2_INTERPRET, BOOL4_INTERPRET, BOOL8_INTERPRET, \
>    BOOL16_INTERPRET, BOOL32_INTERPRET, BOOL64_INTERPRET, \
> @@ -2845,6 +2846,8 @@ static CONSTEXPR const function_type_info function_types[] = {
>    VECTOR_TYPE_##DOUBLE_TRUNC_SIGNED, \
>    VECTOR_TYPE_##DOUBLE_TRUNC_UNSIGNED, \
>    VECTOR_TYPE_##DOUBLE_TRUNC_UNSIGNED_SCALAR, \
> +  VECTOR_TYPE_##DOUBLE_TRUNC_BFLOAT_SCALAR, \
> +  VECTOR_TYPE_##DOUBLE_TRUNC_BFLOAT, \
>    VECTOR_TYPE_##DOUBLE_TRUNC_FLOAT, \
>    VECTOR_TYPE_##FLOAT, \
>    VECTOR_TYPE_##LMUL1, \
> @@ -3284,6 +3287,8 @@ check_required_extensions (const function_instance &instance)
>
>    uint64_t riscv_isa_flags = 0;
>
> +  if (TARGET_VECTOR_ELEN_BF_16)
> +    riscv_isa_flags |= RVV_REQUIRE_ELEN_BF_16;
>    if (TARGET_VECTOR_ELEN_FP_16)
>      riscv_isa_flags |= RVV_REQUIRE_ELEN_FP_16;
>    if (TARGET_VECTOR_ELEN_FP_32)
> diff --git a/gcc/config/riscv/riscv-vector-builtins.def b/gcc/config/riscv/riscv-vector-builtins.def
> index 784b54c81a4..97f329d11eb 100644
> --- a/gcc/config/riscv/riscv-vector-builtins.def
> +++ b/gcc/config/riscv/riscv-vector-builtins.def
> @@ -72,7 +72,8 @@ along with GCC; see the file COPYING3. If not see
>    VECTOR, MASK, SIGNED, UNSIGNED, EEW8_INDEX, EEW16_INDEX, EEW32_INDEX, \
>    EEW64_INDEX, SHIFT, DOUBLE_TRUNC, QUAD_TRUNC, OCT_TRUNC, \
>    DOUBLE_TRUNC_SCALAR, DOUBLE_TRUNC_SIGNED, DOUBLE_TRUNC_UNSIGNED, \
> -  DOUBLE_TRUNC_UNSIGNED_SCALAR, DOUBLE_TRUNC_FLOAT, FLOAT, LMUL1, WLMUL1, \
> +  DOUBLE_TRUNC_UNSIGNED_SCALAR, DOUBLE_TRUNC_BFLOAT_SCALAR, \
> +  DOUBLE_TRUNC_BFLOAT, DOUBLE_TRUNC_FLOAT, FLOAT, LMUL1, WLMUL1, \
>    EEW8_INTERPRET, EEW16_INTERPRET, EEW32_INTERPRET, EEW64_INTERPRET, \
>    BOOL1_INTERPRET, BOOL2_INTERPRET, BOOL4_INTERPRET, BOOL8_INTERPRET, \
>    BOOL16_INTERPRET, BOOL32_INTERPRET, BOOL64_INTERPRET, \
> @@ -436,6 +437,56 @@ DEF_RVV_TYPE (vint64m8_t, 15, __rvv_int64m8_t, int64, RVVM8DI, _i64m8, _i64,
>  DEF_RVV_TYPE (vuint64m8_t, 16, __rvv_uint64m8_t, uint64, RVVM8DI, _u64m8, _u64,
>                _e64m8)
>
> +/* Enabled if TARGET_VECTOR_ELEN_BF_16 && (TARGET_ZVFBFMIN or TARGET_ZVFBFWMA). */
> +/* LMUL = 1/4. */
> +DEF_RVV_TYPE (vbfloat16mf4_t, 19, __rvv_bfloat16mf4_t, bfloat16, RVVMF4BF, _bf16mf4,
> +              _bf16, _e16mf4)
> +/* Define tuple types for SEW = 16, LMUL = MF4. */
> +DEF_RVV_TUPLE_TYPE (vbfloat16mf4x2_t, 21, __rvv_bfloat16mf4x2_t, vbfloat16mf4_t, bfloat16, 2, _bf16mf4x2)
> +DEF_RVV_TUPLE_TYPE (vbfloat16mf4x3_t, 21, __rvv_bfloat16mf4x3_t, vbfloat16mf4_t, bfloat16, 3, _bf16mf4x3)
> +DEF_RVV_TUPLE_TYPE (vbfloat16mf4x4_t, 21, __rvv_bfloat16mf4x4_t, vbfloat16mf4_t, bfloat16, 4, _bf16mf4x4)
> +DEF_RVV_TUPLE_TYPE (vbfloat16mf4x5_t, 21, __rvv_bfloat16mf4x5_t, vbfloat16mf4_t, bfloat16, 5, _bf16mf4x5)
> +DEF_RVV_TUPLE_TYPE (vbfloat16mf4x6_t, 21, __rvv_bfloat16mf4x6_t, vbfloat16mf4_t, bfloat16, 6, _bf16mf4x6)
> +DEF_RVV_TUPLE_TYPE (vbfloat16mf4x7_t, 21, __rvv_bfloat16mf4x7_t, vbfloat16mf4_t, bfloat16, 7, _bf16mf4x7)
> +DEF_RVV_TUPLE_TYPE (vbfloat16mf4x8_t, 21, __rvv_bfloat16mf4x8_t, vbfloat16mf4_t, bfloat16, 8, _bf16mf4x8)
> +/* LMUL = 1/2. */
> +DEF_RVV_TYPE (vbfloat16mf2_t, 19, __rvv_bfloat16mf2_t, bfloat16, RVVMF2BF, _bf16mf2,
> +              _bf16, _e16mf2)
> +/* Define tuple types for SEW = 16, LMUL = MF2. */
> +DEF_RVV_TUPLE_TYPE (vbfloat16mf2x2_t, 21, __rvv_bfloat16mf2x2_t, vbfloat16mf2_t, bfloat16, 2, _bf16mf2x2)
> +DEF_RVV_TUPLE_TYPE (vbfloat16mf2x3_t, 21, __rvv_bfloat16mf2x3_t, vbfloat16mf2_t, bfloat16, 3, _bf16mf2x3)
> +DEF_RVV_TUPLE_TYPE (vbfloat16mf2x4_t, 21, __rvv_bfloat16mf2x4_t, vbfloat16mf2_t, bfloat16, 4, _bf16mf2x4)
> +DEF_RVV_TUPLE_TYPE (vbfloat16mf2x5_t, 21, __rvv_bfloat16mf2x5_t, vbfloat16mf2_t, bfloat16, 5, _bf16mf2x5)
> +DEF_RVV_TUPLE_TYPE (vbfloat16mf2x6_t, 21, __rvv_bfloat16mf2x6_t, vbfloat16mf2_t, bfloat16, 6, _bf16mf2x6)
> +DEF_RVV_TUPLE_TYPE (vbfloat16mf2x7_t, 21, __rvv_bfloat16mf2x7_t, vbfloat16mf2_t, bfloat16, 7, _bf16mf2x7)
> +DEF_RVV_TUPLE_TYPE (vbfloat16mf2x8_t, 21, __rvv_bfloat16mf2x8_t, vbfloat16mf2_t, bfloat16, 8, _bf16mf2x8)
> +/* LMUL = 1. */
> +DEF_RVV_TYPE (vbfloat16m1_t, 18, __rvv_bfloat16m1_t, bfloat16, RVVM1BF, _bf16m1,
> +              _bf16, _e16m1)
> +/* Define tuple types for SEW = 16, LMUL = M1. */
> +DEF_RVV_TUPLE_TYPE (vbfloat16m1x2_t, 20, __rvv_bfloat16m1x2_t, vbfloat16m1_t, bfloat16, 2, _bf16m1x2)
> +DEF_RVV_TUPLE_TYPE (vbfloat16m1x3_t, 20, __rvv_bfloat16m1x3_t, vbfloat16m1_t, bfloat16, 3, _bf16m1x3)
> +DEF_RVV_TUPLE_TYPE (vbfloat16m1x4_t, 20, __rvv_bfloat16m1x4_t, vbfloat16m1_t, bfloat16, 4, _bf16m1x4)
> +DEF_RVV_TUPLE_TYPE (vbfloat16m1x5_t, 20, __rvv_bfloat16m1x5_t, vbfloat16m1_t, bfloat16, 5, _bf16m1x5)
> +DEF_RVV_TUPLE_TYPE (vbfloat16m1x6_t, 20, __rvv_bfloat16m1x6_t, vbfloat16m1_t, bfloat16, 6, _bf16m1x6)
> +DEF_RVV_TUPLE_TYPE (vbfloat16m1x7_t, 20, __rvv_bfloat16m1x7_t, vbfloat16m1_t, bfloat16, 7, _bf16m1x7)
> +DEF_RVV_TUPLE_TYPE (vbfloat16m1x8_t, 20, __rvv_bfloat16m1x8_t, vbfloat16m1_t, bfloat16, 8, _bf16m1x8)
> +/* LMUL = 2. */
> +DEF_RVV_TYPE (vbfloat16m2_t, 18, __rvv_bfloat16m2_t, bfloat16, RVVM2BF, _bf16m2,
> +              _bf16, _e16m2)
> +/* Define tuple types for SEW = 16, LMUL = M2. */
> +DEF_RVV_TUPLE_TYPE (vbfloat16m2x2_t, 20, __rvv_bfloat16m2x2_t, vbfloat16m2_t, bfloat16, 2, _bf16m2x2)
> +DEF_RVV_TUPLE_TYPE (vbfloat16m2x3_t, 20, __rvv_bfloat16m2x3_t, vbfloat16m2_t, bfloat16, 3, _bf16m2x3)
> +DEF_RVV_TUPLE_TYPE (vbfloat16m2x4_t, 20, __rvv_bfloat16m2x4_t, vbfloat16m2_t, bfloat16, 4, _bf16m2x4)
> +/* LMUL = 4. */
> +DEF_RVV_TYPE (vbfloat16m4_t, 18, __rvv_bfloat16m4_t, bfloat16, RVVM4BF, _bf16m4,
> +              _bf16, _e16m4)
> +/* Define tuple types for SEW = 16, LMUL = M4. */
> +DEF_RVV_TUPLE_TYPE (vbfloat16m4x2_t, 20, __rvv_bfloat16m4x2_t, vbfloat16m4_t, bfloat16, 2, _bf16m4x2)
> +/* LMUL = 8. */
> +DEF_RVV_TYPE (vbfloat16m8_t, 18, __rvv_bfloat16m8_t, bfloat16, RVVM8BF, _bf16m8,
> +              _bf16, _e16m8)
> +
>  /* Enabled if TARGET_VECTOR_ELEN_FP_16 && (TARGET_ZVFH or TARGET_ZVFHMIN). */
>  /* LMUL = 1/4. */
>  DEF_RVV_TYPE (vfloat16mf4_t, 18, __rvv_float16mf4_t, float16, RVVMF4HF, _f16mf4,
> @@ -630,6 +681,8 @@ DEF_RVV_BASE_TYPE (double_trunc_scalar, get_scalar_type (type_idx))
>  DEF_RVV_BASE_TYPE (double_trunc_signed_vector, get_vector_type (type_idx))
>  DEF_RVV_BASE_TYPE (double_trunc_unsigned_vector, get_vector_type (type_idx))
>  DEF_RVV_BASE_TYPE (double_trunc_unsigned_scalar, get_scalar_type (type_idx))
> +DEF_RVV_BASE_TYPE (double_trunc_bfloat_scalar, get_scalar_type (type_idx))
> +DEF_RVV_BASE_TYPE (double_trunc_bfloat_vector, get_vector_type (type_idx))
>  DEF_RVV_BASE_TYPE (double_trunc_float_vector, get_vector_type (type_idx))
>  DEF_RVV_BASE_TYPE (float_vector, get_vector_type (type_idx))
>  DEF_RVV_BASE_TYPE (lmul1_vector, get_vector_type (type_idx))
> diff --git a/gcc/config/riscv/riscv-vector-builtins.h b/gcc/config/riscv/riscv-vector-builtins.h
> index 05d18ae1322..56dbe2cf0e2 100644
> --- a/gcc/config/riscv/riscv-vector-builtins.h
> +++ b/gcc/config/riscv/riscv-vector-builtins.h
> @@ -109,6 +109,7 @@ static const unsigned int CP_WRITE_CSR = 1U << 5;
>  #define RVV_REQUIRE_FULL_V (1 << 4) /* Require Full 'V' extension. */
>  #define RVV_REQUIRE_MIN_VLEN_64 (1 << 5) /* Require TARGET_MIN_VLEN >= 64. */
>  #define RVV_REQUIRE_ELEN_FP_16 (1 << 6) /* Require FP ELEN >= 32. */
> +#define RVV_REQUIRE_ELEN_BF_16 (1 << 7) /* Require BF16. */
>
>  /* Enumerates the required extensions. */
>  enum required_ext
> diff --git a/gcc/config/riscv/riscv-vector-switch.def b/gcc/config/riscv/riscv-vector-switch.def
> index 452283b7416..de72e415fe8 100644
> --- a/gcc/config/riscv/riscv-vector-switch.def
> +++ b/gcc/config/riscv/riscv-vector-switch.def
> @@ -43,6 +43,7 @@ Encode SEW and LMUL into data types.
>     |DF |RVVM1DF|RVVM2DF|RVVM4DF|RVVM8DF|N/A |N/A |N/A |
>     |SF |RVVM1SF|RVVM2SF|RVVM4SF|RVVM8SF|RVVMF2SF|N/A |N/A |
>     |HF |RVVM1HF|RVVM2HF|RVVM4HF|RVVM8HF|RVVMF2HF|RVVMF4HF|N/A |
> +   |BF |RVVM1BF|RVVM2BF|RVVM4BF|RVVM8BF|RVVMF2BF|RVVMF4BF|N/A |
>
>  There are the following data types for ELEN = 32.
>
> @@ -52,6 +53,7 @@ There are the following data types for ELEN = 32.
>     |QI |RVVM1QI|RVVM2QI|RVVM4QI|RVVM8QI|RVVMF2QI|RVVMF4QI|N/A |
>     |SF |RVVM1SF|RVVM2SF|RVVM4SF|RVVM8SF|N/A |N/A |N/A |
>     |HF |RVVM1HF|RVVM2HF|RVVM4HF|RVVM8HF|RVVMF2HF|N/A |N/A |
> +   |BF |RVVM1BF|RVVM2BF|RVVM4BF|RVVM8BF|RVVMF2BF|N/A |N/A |
>
>  Encode the ratio of SEW/LMUL into the mask types.
>  There are the following mask types.
> @@ -93,6 +95,14 @@ ENTRY (RVVM1HI, true, LMUL_1, 16)
>  ENTRY (RVVMF2HI, !TARGET_XTHEADVECTOR, LMUL_F2, 32)
>  ENTRY (RVVMF4HI, TARGET_MIN_VLEN > 32 && !TARGET_XTHEADVECTOR, LMUL_F4, 64)
>
> +/* Disable modes if TARGET_MIN_VLEN == 32 or !TARGET_VECTOR_ELEN_BF_16. */
> +ENTRY (RVVM8BF, TARGET_VECTOR_ELEN_BF_16, LMUL_8, 2)
> +ENTRY (RVVM4BF, TARGET_VECTOR_ELEN_BF_16, LMUL_4, 4)
> +ENTRY (RVVM2BF, TARGET_VECTOR_ELEN_BF_16, LMUL_2, 8)
> +ENTRY (RVVM1BF, TARGET_VECTOR_ELEN_BF_16, LMUL_1, 16)
> +ENTRY (RVVMF2BF, TARGET_VECTOR_ELEN_BF_16, LMUL_F2, 32)
> +ENTRY (RVVMF4BF, TARGET_VECTOR_ELEN_BF_16 && TARGET_MIN_VLEN > 32, LMUL_F4, 64)
> +
>  /* Disable modes if TARGET_MIN_VLEN == 32 or !TARGET_VECTOR_ELEN_FP_16. */
>  ENTRY (RVVM8HF, TARGET_VECTOR_ELEN_FP_16, LMUL_8, 2)
>  ENTRY (RVVM4HF, TARGET_VECTOR_ELEN_FP_16, LMUL_4, 4)
> @@ -198,6 +208,32 @@ TUPLE_ENTRY (RVVM1x2HI, true, RVVM1HI, 2, LMUL_1, 16)
>  TUPLE_ENTRY (RVVMF2x2HI, !TARGET_XTHEADVECTOR, RVVMF2HI, 2, LMUL_F2, 32)
>  TUPLE_ENTRY (RVVMF4x2HI, TARGET_MIN_VLEN > 32 && !TARGET_XTHEADVECTOR, RVVMF4HI, 2, LMUL_F4, 64)
>
> +TUPLE_ENTRY (RVVM1x8BF, TARGET_VECTOR_ELEN_BF_16, RVVM1BF, 8, LMUL_1, 16)
> +TUPLE_ENTRY (RVVMF2x8BF, TARGET_VECTOR_ELEN_BF_16, RVVMF2BF, 8, LMUL_F2, 32)
> +TUPLE_ENTRY (RVVMF4x8BF, TARGET_VECTOR_ELEN_BF_16 && TARGET_MIN_VLEN > 32, RVVMF4BF, 8, LMUL_F4, 64)
> +TUPLE_ENTRY (RVVM1x7BF, TARGET_VECTOR_ELEN_BF_16, RVVM1BF, 7, LMUL_1, 16)
> +TUPLE_ENTRY (RVVMF2x7BF, TARGET_VECTOR_ELEN_BF_16, RVVMF2BF, 7, LMUL_F2, 32)
> +TUPLE_ENTRY (RVVMF4x7BF, TARGET_VECTOR_ELEN_BF_16 && TARGET_MIN_VLEN > 32, RVVMF4BF, 7, LMUL_F4, 64)
> +TUPLE_ENTRY (RVVM1x6BF, TARGET_VECTOR_ELEN_BF_16, RVVM1BF, 6, LMUL_1, 16)
> +TUPLE_ENTRY (RVVMF2x6BF, TARGET_VECTOR_ELEN_BF_16, RVVMF2BF, 6, LMUL_F2, 32)
> +TUPLE_ENTRY (RVVMF4x6BF, TARGET_VECTOR_ELEN_BF_16 && TARGET_MIN_VLEN > 32, RVVMF4BF, 6, LMUL_F4, 64)
> +TUPLE_ENTRY (RVVM1x5BF, TARGET_VECTOR_ELEN_BF_16, RVVM1BF, 5, LMUL_1, 16)
> +TUPLE_ENTRY (RVVMF2x5BF, TARGET_VECTOR_ELEN_BF_16, RVVMF2BF, 5, LMUL_F2, 32)
> +TUPLE_ENTRY (RVVMF4x5BF, TARGET_VECTOR_ELEN_BF_16 && TARGET_MIN_VLEN > 32, RVVMF4BF, 5, LMUL_F4, 64)
> +TUPLE_ENTRY (RVVM2x4BF, TARGET_VECTOR_ELEN_BF_16, RVVM2BF, 4, LMUL_2, 8)
> +TUPLE_ENTRY (RVVM1x4BF, TARGET_VECTOR_ELEN_BF_16, RVVM1BF, 4, LMUL_1, 16)
> +TUPLE_ENTRY (RVVMF2x4BF, TARGET_VECTOR_ELEN_BF_16, RVVMF2BF, 4, LMUL_F2, 32)
> +TUPLE_ENTRY (RVVMF4x4BF, TARGET_VECTOR_ELEN_BF_16 && TARGET_MIN_VLEN > 32, RVVMF4BF, 4, LMUL_F4, 64)
> +TUPLE_ENTRY (RVVM2x3BF, TARGET_VECTOR_ELEN_BF_16, RVVM2BF, 3, LMUL_2, 8)
> +TUPLE_ENTRY (RVVM1x3BF, TARGET_VECTOR_ELEN_BF_16, RVVM1BF, 3, LMUL_1, 16)
> +TUPLE_ENTRY (RVVMF2x3BF, TARGET_VECTOR_ELEN_BF_16, RVVMF2BF, 3, LMUL_F2, 32)
> +TUPLE_ENTRY
(RVVMF4x3BF, TARGET_VECTOR_ELEN_BF_16 && TARGET_MIN_VLEN > 32, RVVMF4BF, 3, LMUL_F4, 64) > +TUPLE_ENTRY (RVVM4x2BF, TARGET_VECTOR_ELEN_BF_16, RVVM4BF, 2, LMUL_4, 4) > +TUPLE_ENTRY (RVVM2x2BF, TARGET_VECTOR_ELEN_BF_16, RVVM2BF, 2, LMUL_2, 8) > +TUPLE_ENTRY (RVVM1x2BF, TARGET_VECTOR_ELEN_BF_16, RVVM1BF, 2, LMUL_1, 16) > +TUPLE_ENTRY (RVVMF2x2BF, TARGET_VECTOR_ELEN_BF_16, RVVMF2BF, 2, LMUL_F2, 32) > +TUPLE_ENTRY (RVVMF4x2BF, TARGET_VECTOR_ELEN_BF_16 && TARGET_MIN_VLEN > 32, RVVMF4BF, 2, LMUL_F4, 64) > + > TUPLE_ENTRY (RVVM1x8HF, TARGET_VECTOR_ELEN_FP_16, RVVM1HF, 8, LMUL_1, 16) > TUPLE_ENTRY (RVVMF2x8HF, TARGET_VECTOR_ELEN_FP_16 && !TARGET_XTHEADVECTOR, RVVMF2HF, 8, LMUL_F2, 32) > TUPLE_ENTRY (RVVMF4x8HF, TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN > 32 && !TARGET_XTHEADVECTOR, RVVMF4HF, 8, LMUL_F4, 64) > -- > 2.17.1 >
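For readers following the ENTRY and TUPLE_ENTRY additions above: the last argument of each entry is the SEW/LMUL ratio for a 16-bit element, and the tuple modes listed appear to be exactly those with NF * LMUL <= 8. The standalone sketch below reproduces that pattern. It is only an illustration under those assumptions, not GCC code; `lmul_name`, `bf16_ratio`, `bf16_mode`, `bf16_tuple_ok` and `bf16_tuple_modes` are hypothetical helper names, and the TARGET_VECTOR_ELEN_BF_16 / TARGET_MIN_VLEN conditions are deliberately left out.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not a GCC function): LMUL is passed as log2,
// so -2 means 1/4 (MF4) and 3 means 8 (M8).
static std::string lmul_name (int lmul_log2)
{
  if (lmul_log2 >= 0)
    return "M" + std::to_string (1 << lmul_log2);
  return "MF" + std::to_string (1 << -lmul_log2);
}

// SEW/LMUL ratio for a 16-bit element; this mirrors the last argument
// of the BF16 ENTRY lines (M8 -> 2, M1 -> 16, MF4 -> 64).
static int bf16_ratio (int lmul_log2)
{
  return lmul_log2 >= 0 ? 16 >> lmul_log2 : 16 << -lmul_log2;
}

// Mode name such as "RVVM1BF" or, for tuples, "RVVMF4x8BF".
static std::string bf16_mode (int lmul_log2, unsigned nf)
{
  std::string name = "RVV" + lmul_name (lmul_log2);
  if (nf > 1)
    name += "x" + std::to_string (nf);
  return name + "BF";
}

// Assumption: a tuple mode exists when NF is in [2, 8] and NF * LMUL <= 8.
// Fractional and unit LMUL always satisfy the second condition.
static bool bf16_tuple_ok (int lmul_log2, unsigned nf)
{
  if (nf < 2 || nf > 8 || lmul_log2 < -2 || lmul_log2 > 3)
    return false;
  return lmul_log2 <= 0 || (nf << lmul_log2) <= 8;
}

// Enumerate every BF16 tuple mode name allowed by the rule above.
static std::vector<std::string> bf16_tuple_modes ()
{
  std::vector<std::string> modes;
  for (int lmul_log2 = -2; lmul_log2 <= 3; ++lmul_log2)
    for (unsigned nf = 2; nf <= 8; ++nf)
      if (bf16_tuple_ok (lmul_log2, nf))
        modes.push_back (bf16_mode (lmul_log2, nf));
  return modes;
}
```

Under these assumptions the enumeration gives 7 tuple modes each for MF4, MF2 and M1, 3 for M2 and 1 for M4, which matches the number of BF16 TUPLE_ENTRY lines added in riscv-vector-switch.def.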
Yes, LGTM from my side too. You can commit it as long as it passes the CI. juzhe.zhong@rivai.ai From: Kito Cheng Date: 2024-07-11 15:26 To: Feng Wang CC: gcc-patches; juzhe.zhong; jinma.contrib Subject: Re: [PATCH 1/3 v3] RISC-V: Add vector type of BFloat16 format OK for this patch set, I know you already got LGTM from JuZhe or me before, so just an explicitly ack to let you know it's still OK once CI is passed. On Thu, Jul 11, 2024 at 3:11 PM Feng Wang <wangfeng@eswincomputing.com> wrote: > > v3: Rebase > v2: Rebase > The vector type of BFloat16 format is added in this patch, > subsequent extensions to zvfbfmin and zvfwma need to be based > on this patch. > > Signed-off-by: Feng Wang <wangfeng@eswincomputing.com> > gcc/ChangeLog: > > * config/riscv/genrvv-type-indexer.cc (bfloat16_type): > Generate bf16 vector_type and scalar_type in DEF_RVV_TYPE_INDEX. > (bfloat16_wide_type): Ditto. > (same_ratio_eew_bf16_type): Ditto. > (main): Ditto. > * config/riscv/riscv-modes.def (ADJUST_BYTESIZE): > (RVV_WHOLE_MODES): Add vector type for BFloat16. > (RVV_FRACT_MODE): Ditto. > (RVV_NF4_MODES): Ditto. > (RVV_NF8_MODES): Ditto. > (RVV_NF2_MODES): Ditto. > * config/riscv/riscv-vector-builtins-types.def (vbfloat16mf4_t): > (vbfloat16mf2_t): Add builtin vector type for BFloat16. > (vbfloat16m1_t): Ditto. > (vbfloat16m2_t): Ditto. > (vbfloat16m4_t): Ditto. > (vbfloat16m8_t): Ditto. > (vbfloat16mf4x2_t): Ditto. > (vbfloat16mf4x3_t): Ditto. > (vbfloat16mf4x4_t): Ditto. > (vbfloat16mf4x5_t): Ditto. > (vbfloat16mf4x6_t): Ditto. > (vbfloat16mf4x7_t): Ditto. > (vbfloat16mf4x8_t): Ditto. > (vbfloat16mf2x2_t): Ditto. > (vbfloat16mf2x3_t): Ditto. > (vbfloat16mf2x4_t): Ditto. > (vbfloat16mf2x5_t): Ditto. > (vbfloat16mf2x6_t): Ditto. > (vbfloat16mf2x7_t): Ditto. > (vbfloat16mf2x8_t): Ditto. > (vbfloat16m1x2_t): Ditto. > (vbfloat16m1x3_t): Ditto. > (vbfloat16m1x4_t): Ditto. > (vbfloat16m1x5_t): Ditto. > (vbfloat16m1x6_t): Ditto. > (vbfloat16m1x7_t): Ditto. > (vbfloat16m1x8_t): Ditto. 
> (vbfloat16m2x2_t): Ditto. > (vbfloat16m2x3_t): Ditto. > (vbfloat16m2x4_t): Ditto. > (vbfloat16m4x2_t): Ditto. > * config/riscv/riscv-vector-builtins.cc (check_required_extensions): > Add required_ext checking for BFloat16. > * config/riscv/riscv-vector-builtins.def (vbfloat16mf4_t): > Add vector_type for BFloat16 in builtins.def. > (vbfloat16mf4x2_t): Ditto. > (vbfloat16mf4x3_t): Ditto. > (vbfloat16mf4x4_t): Ditto. > (vbfloat16mf4x5_t): Ditto. > (vbfloat16mf4x6_t): Ditto. > (vbfloat16mf4x7_t): Ditto. > (vbfloat16mf4x8_t): Ditto. > (vbfloat16mf2_t): Ditto. > (vbfloat16mf2x2_t): Ditto. > (vbfloat16mf2x3_t): Ditto. > (vbfloat16mf2x4_t): Ditto. > (vbfloat16mf2x5_t): Ditto. > (vbfloat16mf2x6_t): Ditto. > (vbfloat16mf2x7_t): Ditto. > (vbfloat16mf2x8_t): Ditto. > (vbfloat16m1_t): Ditto. > (vbfloat16m1x2_t): Ditto. > (vbfloat16m1x3_t): Ditto. > (vbfloat16m1x4_t): Ditto. > (vbfloat16m1x5_t): Ditto. > (vbfloat16m1x6_t): Ditto. > (vbfloat16m1x7_t): Ditto. > (vbfloat16m1x8_t): Ditto. > (vbfloat16m2_t): Ditto. > (vbfloat16m2x2_t): Ditto. > (vbfloat16m2x3_t): Ditto. > (vbfloat16m2x4_t): Ditto. > (vbfloat16m4_t): Ditto. > (vbfloat16m4x2_t): Ditto. > (vbfloat16m8_t): Ditto. > (double_trunc_bfloat_scalar): Add scalar_type def for BFloat16. > (double_trunc_bfloat_vector): Add vector_type def for BFloat16. > * config/riscv/riscv-vector-builtins.h (RVV_REQUIRE_ELEN_BF_16): > Add required definition of BFloat16 ext. > * config/riscv/riscv-vector-switch.def (ENTRY): > Add vector_type information for BFloat16. > (TUPLE_ENTRY): Add tuple vector_type information for BFloat16. 
> > --- > gcc/config/riscv/genrvv-type-indexer.cc | 115 ++++++++++++++++++ > gcc/config/riscv/riscv-modes.def | 30 ++++- > .../riscv/riscv-vector-builtins-types.def | 50 ++++++++ > gcc/config/riscv/riscv-vector-builtins.cc | 7 +- > gcc/config/riscv/riscv-vector-builtins.def | 55 ++++++++- > gcc/config/riscv/riscv-vector-builtins.h | 1 + > gcc/config/riscv/riscv-vector-switch.def | 36 ++++++ > 7 files changed, 291 insertions(+), 3 deletions(-) > > diff --git a/gcc/config/riscv/genrvv-type-indexer.cc b/gcc/config/riscv/genrvv-type-indexer.cc > index 27cbd14982c..8626ddeaaa8 100644 > --- a/gcc/config/riscv/genrvv-type-indexer.cc > +++ b/gcc/config/riscv/genrvv-type-indexer.cc > @@ -117,6 +117,42 @@ inttype (unsigned sew, int lmul_log2, unsigned nf, bool unsigned_p) > return mode.str (); > } > > +std::string > +bfloat16_type (int lmul_log2) > +{ > + if (!valid_type (16, lmul_log2, /*float_t*/ true)) > + return "INVALID"; > + > + std::stringstream mode; > + mode << "vbfloat16" << to_lmul (lmul_log2) << "_t"; > + return mode.str (); > +} > + > +std::string > +bfloat16_wide_type (int lmul_log2) > +{ > + if (!valid_type (32, lmul_log2, /*float_t*/ true)) > + return "INVALID"; > + > + std::stringstream mode; > + mode << "vfloat32" << to_lmul (lmul_log2) << "_t"; > + return mode.str (); > +} > + > +std::string > +bfloat16_type (int lmul_log2, unsigned nf) > +{ > + if (!valid_type (16, lmul_log2, nf, /*float_t*/ true)) > + return "INVALID"; > + > + std::stringstream mode; > + mode << "vbfloat16" << to_lmul (lmul_log2); > + if (nf > 1) > + mode << "x" << nf; > + mode << "_t"; > + return mode.str (); > +} > + > std::string > floattype (unsigned sew, int lmul_log2) > { > @@ -182,6 +218,15 @@ same_ratio_eew_type (unsigned sew, int lmul_log2, unsigned eew, bool unsigned_p, > return inttype (eew, elmul_log2, unsigned_p); > } > > +std::string > +same_ratio_eew_bf16_type (unsigned sew, int lmul_log2) > +{ > + if (sew != 32) > + return "INVALID"; > + int elmul_log2 = lmul_log2 - 1; > 
+ return bfloat16_type (elmul_log2); > +} > + > int > main (int argc, const char **argv) > { > @@ -215,6 +260,8 @@ main (int argc, const char **argv) > fprintf (fp, " /*DOUBLE_TRUNC_SIGNED*/ INVALID,\n"); > fprintf (fp, " /*DOUBLE_TRUNC_UNSIGNED*/ INVALID,\n"); > fprintf (fp, " /*DOUBLE_TRUNC_UNSIGNED_SCALAR*/ INVALID,\n"); > + fprintf (fp, " /*DOUBLE_TRUNC_BFLOAT_SCALAR*/ INVALID,\n"); > + fprintf (fp, " /*DOUBLE_TRUNC_BFLOAT*/ INVALID,\n"); > fprintf (fp, " /*DOUBLE_TRUNC_FLOAT*/ INVALID,\n"); > fprintf (fp, " /*FLOAT*/ INVALID,\n"); > fprintf (fp, " /*LMUL1*/ INVALID,\n"); > @@ -294,6 +341,8 @@ main (int argc, const char **argv) > same_ratio_eew_type (sew, lmul_log2, sew / 2, true, > false) > .c_str ()); > + fprintf (fp, " /*DOUBLE_TRUNC_BFLOAT_SCALAR*/ INVALID,\n"); > + fprintf (fp, " /*DOUBLE_TRUNC_BFLOAT*/ INVALID,\n"); > fprintf (fp, " /*DOUBLE_TRUNC_FLOAT*/ %s,\n", > same_ratio_eew_type (sew, lmul_log2, sew / 2, false, true) > .c_str ()); > @@ -341,6 +390,68 @@ main (int argc, const char **argv) > inttype (sew, lmul_log2, 1, unsigned_p).c_str ()); > fprintf (fp, ")\n"); > } > + // Build for vbfloat16 > + for (int lmul_log2 : {-2, -1, 0, 1, 2, 3}) > + for (unsigned nf : {1, 2, 3, 4, 5, 6, 7, 8}) > + { > + if (!valid_type (16, lmul_log2, nf, /*float_t*/ true)) > + continue; > + > + fprintf (fp, "DEF_RVV_TYPE_INDEX (\n"); > + fprintf (fp, " /*VECTOR*/ %s,\n", > + bfloat16_type (lmul_log2, nf).c_str ()); > + fprintf (fp, " /*MASK*/ %s,\n", maskmode (16, lmul_log2).c_str ()); > + fprintf (fp, " /*SIGNED*/ %s,\n", > + inttype (16, lmul_log2, /*unsigned_p*/ false).c_str ()); > + fprintf (fp, " /*UNSIGNED*/ %s,\n", > + inttype (16, lmul_log2, /*unsigned_p*/ true).c_str ()); > + for (unsigned eew : {8, 16, 32, 64}) > + fprintf ( > + fp, " /*EEW%d_INDEX*/ %s,\n", eew, > + same_ratio_eew_type (16, lmul_log2, eew, true, false).c_str ()); > + fprintf (fp, " /*SHIFT*/ INVALID,\n"); > + fprintf (fp, " /*DOUBLE_TRUNC*/ %s,\n", > + same_ratio_eew_type (16, lmul_log2, 8, 
false, true).c_str ()); > + fprintf (fp, " /*QUAD_TRUNC*/ INVALID,\n"); > + fprintf (fp, " /*OCT_TRUNC*/ INVALID,\n"); > + fprintf (fp, " /*DOUBLE_TRUNC_SCALAR*/ %s,\n", > + same_ratio_eew_type (16, lmul_log2, 8, false, true).c_str ()); > + fprintf (fp, " /*DOUBLE_TRUNC_SIGNED*/ %s,\n", > + same_ratio_eew_type (16, lmul_log2, 8, false, false).c_str ()); > + fprintf (fp, " /*DOUBLE_TRUNC_UNSIGNED*/ %s,\n", > + same_ratio_eew_type (16, lmul_log2, 8, true, false).c_str ()); > + fprintf (fp, " /*DOUBLE_TRUNC_UNSIGNED_SCALAR*/ INVALID,\n"); > + fprintf (fp, " /*DOUBLE_TRUNC_BFLOAT_SCALAR*/ INVALID,\n"); > + fprintf (fp, " /*DOUBLE_TRUNC_BFLOAT*/ INVALID,\n"); > + fprintf (fp, " /*DOUBLE_TRUNC_FLOAT*/ %s,\n", > + same_ratio_eew_type (16, lmul_log2, 8, false, true).c_str ()); > + fprintf (fp, " /*FLOAT*/ INVALID,\n"); > + fprintf (fp, " /*LMUL1*/ %s,\n", > + bfloat16_type (/*lmul_log2*/ 0).c_str ()); > + fprintf (fp, " /*WLMUL1*/ %s,\n", > + bfloat16_wide_type (/*lmul_log2*/ 0).c_str ()); > + for (unsigned eew : {8, 16, 32, 64}) > + fprintf (fp, " /*EEW%d_INTERPRET*/ INVALID,\n", eew); > + > + for (unsigned boolsize : BOOL_SIZE_LIST) > + fprintf (fp, " /*BOOL%d_INTERPRET*/ INVALID,\n", boolsize); > + > + for (unsigned eew : EEW_SIZE_LIST) > + fprintf (fp, " /*SIGNED_EEW%d_LMUL1_INTERPRET*/ INVALID,\n", eew); > + > + for (unsigned eew : EEW_SIZE_LIST) > + fprintf (fp, " /*UNSIGNED_EEW%d_LMUL1_INTERPRET*/ INVALID,\n", eew); > + > + for (unsigned lmul_log2_offset : {1, 2, 3, 4, 5, 6}) > + { > + unsigned multiple_of_lmul = 1 << lmul_log2_offset; > + fprintf (fp, " /*X%d_VLMUL_EXT*/ %s,\n", multiple_of_lmul, > + bfloat16_type (lmul_log2 + lmul_log2_offset).c_str ()); > + } > + fprintf (fp, " /*TUPLE_SUBPART*/ %s\n", > + bfloat16_type (lmul_log2, 1U).c_str ()); > + fprintf (fp, ")\n"); > + } > // Build for vfloat > for (unsigned sew : {16, 32, 64}) > for (int lmul_log2 : {-3, -2, -1, 0, 1, 2, 3}) > @@ -378,6 +489,10 @@ main (int argc, const char **argv) > same_ratio_eew_type 
(sew, lmul_log2, sew / 2, true, false) > .c_str ()); > fprintf (fp, " /*DOUBLE_TRUNC_UNSIGNED_SCALAR*/ INVALID,\n"); > + fprintf (fp, " /*DOUBLE_TRUNC_BFLOAT_SCALAR*/ %s,\n", > + same_ratio_eew_bf16_type (sew, lmul_log2).c_str ()); > + fprintf (fp, " /*DOUBLE_TRUNC_BFLOAT*/ %s,\n", > + same_ratio_eew_bf16_type (sew, lmul_log2).c_str ()); > fprintf (fp, " /*DOUBLE_TRUNC_FLOAT*/ %s,\n", > same_ratio_eew_type (sew, lmul_log2, sew / 2, false, true) > .c_str ()); > diff --git a/gcc/config/riscv/riscv-modes.def b/gcc/config/riscv/riscv-modes.def > index 6de4e440298..b0a78f72754 100644 > --- a/gcc/config/riscv/riscv-modes.def > +++ b/gcc/config/riscv/riscv-modes.def > @@ -93,6 +93,7 @@ ADJUST_BYTESIZE (RVVMF64BI, riscv_v_adjust_bytesize (RVVMF64BImode, 1)); > |DF |RVVM1DF|RVVM2DF|RVVM4DF|RVVM8DF|N/A |N/A |N/A | > |SF |RVVM1SF|RVVM2SF|RVVM4SF|RVVM8SF|RVVMF2SF|N/A |N/A | > |HF |RVVM1HF|RVVM2HF|RVVM4HF|RVVM8HF|RVVMF2HF|RVVMF4HF|N/A | > + |BF |RVVM1BF|RVVM2BF|RVVM4BF|RVVM8BF|RVVMF2BF|RVVMF4BF|N/A | > > There are the following data types for ELEN = 32. > > @@ -101,11 +102,13 @@ There are the following data types for ELEN = 32. > |HI |RVVM1HI|RVVM2HI|RVVM4HI|RVVM8HI|RVVMF2HI|N/A |N/A | > |QI |RVVM1QI|RVVM2QI|RVVM4QI|RVVM8QI|RVVMF2QI|RVVMF4QI|N/A | > |SF |RVVM1SF|RVVM2SF|RVVM4SF|RVVM8SF|N/A |N/A |N/A | > - |HF |RVVM1HF|RVVM2HF|RVVM4HF|RVVM8HF|RVVMF2HF|N/A |N/A | */ > + |HF |RVVM1HF|RVVM2HF|RVVM4HF|RVVM8HF|RVVMF2HF|N/A |N/A | > + |BF |RVVM1BF|RVVM2BF|RVVM4BF|RVVM8BF|RVVMF2BF|N/A |N/A | */ > > #define RVV_WHOLE_MODES(LMUL) \ > VECTOR_MODE_WITH_PREFIX (RVVM, INT, QI, LMUL, 0); \ > VECTOR_MODE_WITH_PREFIX (RVVM, INT, HI, LMUL, 0); \ > + VECTOR_MODE_WITH_PREFIX (RVVM, FLOAT, BF, LMUL, 0); \ > VECTOR_MODE_WITH_PREFIX (RVVM, FLOAT, HF, LMUL, 0); \ > VECTOR_MODE_WITH_PREFIX (RVVM, INT, SI, LMUL, 0); \ > VECTOR_MODE_WITH_PREFIX (RVVM, FLOAT, SF, LMUL, 0); \ > @@ -120,6 +123,8 @@ There are the following data types for ELEN = 32. 
> riscv_v_adjust_nunits (RVVM##LMUL##SImode, false, LMUL, 1)); \ > ADJUST_NUNITS (RVVM##LMUL##DI, \ > riscv_v_adjust_nunits (RVVM##LMUL##DImode, false, LMUL, 1)); \ > + ADJUST_NUNITS (RVVM##LMUL##BF, \ > + riscv_v_adjust_nunits (RVVM##LMUL##BFmode, false, LMUL, 1)); \ > ADJUST_NUNITS (RVVM##LMUL##HF, \ > riscv_v_adjust_nunits (RVVM##LMUL##HFmode, false, LMUL, 1)); \ > ADJUST_NUNITS (RVVM##LMUL##SF, \ > @@ -131,6 +136,7 @@ There are the following data types for ELEN = 32. > ADJUST_ALIGNMENT (RVVM##LMUL##HI, 2); \ > ADJUST_ALIGNMENT (RVVM##LMUL##SI, 4); \ > ADJUST_ALIGNMENT (RVVM##LMUL##DI, 8); \ > + ADJUST_ALIGNMENT (RVVM##LMUL##BF, 2); \ > ADJUST_ALIGNMENT (RVVM##LMUL##HF, 2); \ > ADJUST_ALIGNMENT (RVVM##LMUL##SF, 4); \ > ADJUST_ALIGNMENT (RVVM##LMUL##DF, 8); > @@ -153,6 +159,8 @@ RVV_FRACT_MODE (INT, QI, 4, 1) > RVV_FRACT_MODE (INT, QI, 8, 1) > RVV_FRACT_MODE (INT, HI, 2, 2) > RVV_FRACT_MODE (INT, HI, 4, 2) > +RVV_FRACT_MODE (FLOAT, BF, 2, 2) > +RVV_FRACT_MODE (FLOAT, BF, 4, 2) > RVV_FRACT_MODE (FLOAT, HF, 2, 2) > RVV_FRACT_MODE (FLOAT, HF, 4, 2) > RVV_FRACT_MODE (INT, SI, 2, 4) > @@ -174,6 +182,9 @@ RVV_FRACT_MODE (FLOAT, SF, 2, 4) > VECTOR_MODE_WITH_PREFIX (RVVMF4x, INT, HI, NF, 1); \ > VECTOR_MODE_WITH_PREFIX (RVVMF2x, INT, HI, NF, 1); \ > VECTOR_MODE_WITH_PREFIX (RVVM1x, INT, HI, NF, 1); \ > + VECTOR_MODE_WITH_PREFIX (RVVMF4x, FLOAT, BF, NF, 1); \ > + VECTOR_MODE_WITH_PREFIX (RVVMF2x, FLOAT, BF, NF, 1); \ > + VECTOR_MODE_WITH_PREFIX (RVVM1x, FLOAT, BF, NF, 1); \ > VECTOR_MODE_WITH_PREFIX (RVVMF4x, FLOAT, HF, NF, 1); \ > VECTOR_MODE_WITH_PREFIX (RVVMF2x, FLOAT, HF, NF, 1); \ > VECTOR_MODE_WITH_PREFIX (RVVM1x, FLOAT, HF, NF, 1); \ > @@ -198,6 +209,12 @@ RVV_FRACT_MODE (FLOAT, SF, 2, 4) > riscv_v_adjust_nunits (RVVMF2x##NF##HImode, true, 2, NF)); \ > ADJUST_NUNITS (RVVM1x##NF##HI, \ > riscv_v_adjust_nunits (RVVM1x##NF##HImode, false, 1, NF)); \ > + ADJUST_NUNITS (RVVMF4x##NF##BF, \ > + riscv_v_adjust_nunits (RVVMF4x##NF##BFmode, true, 4, NF)); \ > + ADJUST_NUNITS 
(RVVMF2x##NF##BF, \ > + riscv_v_adjust_nunits (RVVMF2x##NF##BFmode, true, 2, NF)); \ > + ADJUST_NUNITS (RVVM1x##NF##BF, \ > + riscv_v_adjust_nunits (RVVM1x##NF##BFmode, false, 1, NF)); \ > ADJUST_NUNITS (RVVMF4x##NF##HF, \ > riscv_v_adjust_nunits (RVVMF4x##NF##HFmode, true, 4, NF)); \ > ADJUST_NUNITS (RVVMF2x##NF##HF, \ > @@ -224,6 +241,9 @@ RVV_FRACT_MODE (FLOAT, SF, 2, 4) > ADJUST_ALIGNMENT (RVVMF4x##NF##HI, 2); \ > ADJUST_ALIGNMENT (RVVMF2x##NF##HI, 2); \ > ADJUST_ALIGNMENT (RVVM1x##NF##HI, 2); \ > + ADJUST_ALIGNMENT (RVVMF4x##NF##BF, 2); \ > + ADJUST_ALIGNMENT (RVVMF2x##NF##BF, 2); \ > + ADJUST_ALIGNMENT (RVVM1x##NF##BF, 2); \ > ADJUST_ALIGNMENT (RVVMF4x##NF##HF, 2); \ > ADJUST_ALIGNMENT (RVVMF2x##NF##HF, 2); \ > ADJUST_ALIGNMENT (RVVM1x##NF##HF, 2); \ > @@ -245,6 +265,7 @@ RVV_NF8_MODES (2) > #define RVV_NF4_MODES(NF) \ > VECTOR_MODE_WITH_PREFIX (RVVM2x, INT, QI, NF, 1); \ > VECTOR_MODE_WITH_PREFIX (RVVM2x, INT, HI, NF, 1); \ > + VECTOR_MODE_WITH_PREFIX (RVVM2x, FLOAT, BF, NF, 1); \ > VECTOR_MODE_WITH_PREFIX (RVVM2x, FLOAT, HF, NF, 1); \ > VECTOR_MODE_WITH_PREFIX (RVVM2x, INT, SI, NF, 1); \ > VECTOR_MODE_WITH_PREFIX (RVVM2x, FLOAT, SF, NF, 1); \ > @@ -255,6 +276,8 @@ RVV_NF8_MODES (2) > riscv_v_adjust_nunits (RVVM2x##NF##QImode, false, 2, NF)); \ > ADJUST_NUNITS (RVVM2x##NF##HI, \ > riscv_v_adjust_nunits (RVVM2x##NF##HImode, false, 2, NF)); \ > + ADJUST_NUNITS (RVVM2x##NF##BF, \ > + riscv_v_adjust_nunits (RVVM2x##NF##BFmode, false, 2, NF)); \ > ADJUST_NUNITS (RVVM2x##NF##HF, \ > riscv_v_adjust_nunits (RVVM2x##NF##HFmode, false, 2, NF)); \ > ADJUST_NUNITS (RVVM2x##NF##SI, \ > @@ -268,6 +291,7 @@ RVV_NF8_MODES (2) > \ > ADJUST_ALIGNMENT (RVVM2x##NF##QI, 1); \ > ADJUST_ALIGNMENT (RVVM2x##NF##HI, 2); \ > + ADJUST_ALIGNMENT (RVVM2x##NF##BF, 2); \ > ADJUST_ALIGNMENT (RVVM2x##NF##HF, 2); \ > ADJUST_ALIGNMENT (RVVM2x##NF##SI, 4); \ > ADJUST_ALIGNMENT (RVVM2x##NF##SF, 4); \ > @@ -281,6 +305,7 @@ RVV_NF4_MODES (4) > #define RVV_NF2_MODES(NF) \ > VECTOR_MODE_WITH_PREFIX 
(RVVM4x, INT, QI, NF, 1); \ > VECTOR_MODE_WITH_PREFIX (RVVM4x, INT, HI, NF, 1); \ > + VECTOR_MODE_WITH_PREFIX (RVVM4x, FLOAT, BF, NF, 1); \ > VECTOR_MODE_WITH_PREFIX (RVVM4x, FLOAT, HF, NF, 1); \ > VECTOR_MODE_WITH_PREFIX (RVVM4x, INT, SI, NF, 1); \ > VECTOR_MODE_WITH_PREFIX (RVVM4x, FLOAT, SF, NF, 1); \ > @@ -291,6 +316,8 @@ RVV_NF4_MODES (4) > riscv_v_adjust_nunits (RVVM4x##NF##QImode, false, 4, NF)); \ > ADJUST_NUNITS (RVVM4x##NF##HI, \ > riscv_v_adjust_nunits (RVVM4x##NF##HImode, false, 4, NF)); \ > + ADJUST_NUNITS (RVVM4x##NF##BF, \ > + riscv_v_adjust_nunits (RVVM4x##NF##BFmode, false, 4, NF)); \ > ADJUST_NUNITS (RVVM4x##NF##HF, \ > riscv_v_adjust_nunits (RVVM4x##NF##HFmode, false, 4, NF)); \ > ADJUST_NUNITS (RVVM4x##NF##SI, \ > @@ -304,6 +331,7 @@ RVV_NF4_MODES (4) > \ > ADJUST_ALIGNMENT (RVVM4x##NF##QI, 1); \ > ADJUST_ALIGNMENT (RVVM4x##NF##HI, 2); \ > + ADJUST_ALIGNMENT (RVVM4x##NF##BF, 2); \ > ADJUST_ALIGNMENT (RVVM4x##NF##HF, 2); \ > ADJUST_ALIGNMENT (RVVM4x##NF##SI, 4); \ > ADJUST_ALIGNMENT (RVVM4x##NF##SF, 4); \ > diff --git a/gcc/config/riscv/riscv-vector-builtins-types.def b/gcc/config/riscv/riscv-vector-builtins-types.def > index 61019a56844..e7fca4cca79 100644 > --- a/gcc/config/riscv/riscv-vector-builtins-types.def > +++ b/gcc/config/riscv/riscv-vector-builtins-types.def > @@ -397,6 +397,13 @@ DEF_RVV_U_OPS (vuint64m2_t, RVV_REQUIRE_ELEN_64) > DEF_RVV_U_OPS (vuint64m4_t, RVV_REQUIRE_ELEN_64) > DEF_RVV_U_OPS (vuint64m8_t, RVV_REQUIRE_ELEN_64) > > +DEF_RVV_F_OPS (vbfloat16mf4_t, RVV_REQUIRE_ELEN_BF_16 | RVV_REQUIRE_MIN_VLEN_64) > +DEF_RVV_F_OPS (vbfloat16mf2_t, RVV_REQUIRE_ELEN_BF_16) > +DEF_RVV_F_OPS (vbfloat16m1_t, RVV_REQUIRE_ELEN_BF_16) > +DEF_RVV_F_OPS (vbfloat16m2_t, RVV_REQUIRE_ELEN_BF_16) > +DEF_RVV_F_OPS (vbfloat16m4_t, RVV_REQUIRE_ELEN_BF_16) > +DEF_RVV_F_OPS (vbfloat16m8_t, RVV_REQUIRE_ELEN_BF_16) > + > DEF_RVV_F_OPS (vfloat16mf4_t, RVV_REQUIRE_ELEN_FP_16 | RVV_REQUIRE_MIN_VLEN_64) > DEF_RVV_F_OPS (vfloat16mf2_t, RVV_REQUIRE_ELEN_FP_16) > 
DEF_RVV_F_OPS (vfloat16m1_t, RVV_REQUIRE_ELEN_FP_16) > @@ -999,6 +1006,11 @@ DEF_RVV_X2_VLMUL_EXT_OPS (vuint32m4_t, 0) > DEF_RVV_X2_VLMUL_EXT_OPS (vuint64m1_t, RVV_REQUIRE_ELEN_64) > DEF_RVV_X2_VLMUL_EXT_OPS (vuint64m2_t, RVV_REQUIRE_ELEN_64) > DEF_RVV_X2_VLMUL_EXT_OPS (vuint64m4_t, RVV_REQUIRE_ELEN_64) > +DEF_RVV_X2_VLMUL_EXT_OPS (vbfloat16mf4_t, RVV_REQUIRE_ELEN_BF_16 | RVV_REQUIRE_MIN_VLEN_64) > +DEF_RVV_X2_VLMUL_EXT_OPS (vbfloat16mf2_t, RVV_REQUIRE_ELEN_BF_16) > +DEF_RVV_X2_VLMUL_EXT_OPS (vbfloat16m1_t, RVV_REQUIRE_ELEN_BF_16) > +DEF_RVV_X2_VLMUL_EXT_OPS (vbfloat16m2_t, RVV_REQUIRE_ELEN_BF_16) > +DEF_RVV_X2_VLMUL_EXT_OPS (vbfloat16m4_t, RVV_REQUIRE_ELEN_BF_16) > DEF_RVV_X2_VLMUL_EXT_OPS (vfloat16mf4_t, RVV_REQUIRE_ELEN_FP_16 | RVV_REQUIRE_MIN_VLEN_64) > DEF_RVV_X2_VLMUL_EXT_OPS (vfloat16mf2_t, RVV_REQUIRE_ELEN_FP_16) > DEF_RVV_X2_VLMUL_EXT_OPS (vfloat16m1_t, RVV_REQUIRE_ELEN_FP_16) > @@ -1040,6 +1052,10 @@ DEF_RVV_X4_VLMUL_EXT_OPS (vuint32m1_t, 0) > DEF_RVV_X4_VLMUL_EXT_OPS (vuint32m2_t, 0) > DEF_RVV_X4_VLMUL_EXT_OPS (vuint64m1_t, RVV_REQUIRE_ELEN_64) > DEF_RVV_X4_VLMUL_EXT_OPS (vuint64m2_t, RVV_REQUIRE_ELEN_64) > +DEF_RVV_X4_VLMUL_EXT_OPS (vbfloat16mf4_t, RVV_REQUIRE_ELEN_BF_16 | RVV_REQUIRE_MIN_VLEN_64) > +DEF_RVV_X4_VLMUL_EXT_OPS (vbfloat16mf2_t, RVV_REQUIRE_ELEN_BF_16) > +DEF_RVV_X4_VLMUL_EXT_OPS (vbfloat16m1_t, RVV_REQUIRE_ELEN_BF_16) > +DEF_RVV_X4_VLMUL_EXT_OPS (vbfloat16m2_t, RVV_REQUIRE_ELEN_BF_16) > DEF_RVV_X4_VLMUL_EXT_OPS (vfloat16mf4_t, RVV_REQUIRE_ELEN_FP_16 | RVV_REQUIRE_MIN_VLEN_64) > DEF_RVV_X4_VLMUL_EXT_OPS (vfloat16mf2_t, RVV_REQUIRE_ELEN_FP_16) > DEF_RVV_X4_VLMUL_EXT_OPS (vfloat16m1_t, RVV_REQUIRE_ELEN_FP_16) > @@ -1070,6 +1086,9 @@ DEF_RVV_X8_VLMUL_EXT_OPS (vuint16m1_t, 0) > DEF_RVV_X8_VLMUL_EXT_OPS (vuint32mf2_t, RVV_REQUIRE_MIN_VLEN_64) > DEF_RVV_X8_VLMUL_EXT_OPS (vuint32m1_t, 0) > DEF_RVV_X8_VLMUL_EXT_OPS (vuint64m1_t, RVV_REQUIRE_ELEN_64) > +DEF_RVV_X8_VLMUL_EXT_OPS (vbfloat16mf4_t, RVV_REQUIRE_ELEN_BF_16 | RVV_REQUIRE_MIN_VLEN_64) > 
+DEF_RVV_X8_VLMUL_EXT_OPS (vbfloat16mf2_t, RVV_REQUIRE_ELEN_BF_16) > +DEF_RVV_X8_VLMUL_EXT_OPS (vbfloat16m1_t, RVV_REQUIRE_ELEN_BF_16) > DEF_RVV_X8_VLMUL_EXT_OPS (vfloat16mf4_t, RVV_REQUIRE_ELEN_FP_16 | RVV_REQUIRE_MIN_VLEN_64) > DEF_RVV_X8_VLMUL_EXT_OPS (vfloat16mf2_t, RVV_REQUIRE_ELEN_FP_16) > DEF_RVV_X8_VLMUL_EXT_OPS (vfloat16m1_t, RVV_REQUIRE_ELEN_FP_16) > @@ -1089,6 +1108,8 @@ DEF_RVV_X16_VLMUL_EXT_OPS (vuint8mf2_t, 0) > DEF_RVV_X16_VLMUL_EXT_OPS (vuint16mf4_t, RVV_REQUIRE_MIN_VLEN_64) > DEF_RVV_X16_VLMUL_EXT_OPS (vuint16mf2_t, 0) > DEF_RVV_X16_VLMUL_EXT_OPS (vuint32mf2_t, RVV_REQUIRE_MIN_VLEN_64) > +DEF_RVV_X16_VLMUL_EXT_OPS (vbfloat16mf4_t, RVV_REQUIRE_ELEN_BF_16 | RVV_REQUIRE_MIN_VLEN_64) > +DEF_RVV_X16_VLMUL_EXT_OPS (vbfloat16mf2_t, RVV_REQUIRE_ELEN_BF_16) > DEF_RVV_X16_VLMUL_EXT_OPS (vfloat16mf4_t, RVV_REQUIRE_ELEN_FP_16 | RVV_REQUIRE_MIN_VLEN_64) > DEF_RVV_X16_VLMUL_EXT_OPS (vfloat16mf2_t, RVV_REQUIRE_ELEN_FP_16) > DEF_RVV_X16_VLMUL_EXT_OPS (vfloat32mf2_t, RVV_REQUIRE_ELEN_FP_32 | RVV_REQUIRE_MIN_VLEN_64) > @@ -1099,6 +1120,7 @@ DEF_RVV_X32_VLMUL_EXT_OPS (vint16mf4_t, RVV_REQUIRE_MIN_VLEN_64) > DEF_RVV_X32_VLMUL_EXT_OPS (vuint8mf8_t, RVV_REQUIRE_MIN_VLEN_64) > DEF_RVV_X32_VLMUL_EXT_OPS (vuint8mf4_t, 0) > DEF_RVV_X32_VLMUL_EXT_OPS (vuint16mf4_t, RVV_REQUIRE_MIN_VLEN_64) > +DEF_RVV_X32_VLMUL_EXT_OPS (vbfloat16mf4_t, RVV_REQUIRE_ELEN_BF_16 | RVV_REQUIRE_MIN_VLEN_64) > DEF_RVV_X32_VLMUL_EXT_OPS (vfloat16mf4_t, RVV_REQUIRE_ELEN_FP_16 | RVV_REQUIRE_MIN_VLEN_64) > > DEF_RVV_X64_VLMUL_EXT_OPS (vint8mf8_t, RVV_REQUIRE_MIN_VLEN_64) > @@ -1112,6 +1134,7 @@ DEF_RVV_LMUL1_OPS (vuint8m1_t, 0) > DEF_RVV_LMUL1_OPS (vuint16m1_t, 0) > DEF_RVV_LMUL1_OPS (vuint32m1_t, 0) > DEF_RVV_LMUL1_OPS (vuint64m1_t, RVV_REQUIRE_ELEN_64) > +DEF_RVV_LMUL1_OPS (vbfloat16m1_t, RVV_REQUIRE_ELEN_BF_16) > DEF_RVV_LMUL1_OPS (vfloat16m1_t, RVV_REQUIRE_ELEN_FP_16) > DEF_RVV_LMUL1_OPS (vfloat32m1_t, RVV_REQUIRE_ELEN_FP_32) > DEF_RVV_LMUL1_OPS (vfloat64m1_t, RVV_REQUIRE_ELEN_FP_64) > @@ -1124,6 
+1147,7 @@ DEF_RVV_LMUL2_OPS (vuint8m2_t, 0) > DEF_RVV_LMUL2_OPS (vuint16m2_t, 0) > DEF_RVV_LMUL2_OPS (vuint32m2_t, 0) > DEF_RVV_LMUL2_OPS (vuint64m2_t, RVV_REQUIRE_ELEN_64) > +DEF_RVV_LMUL2_OPS (vbfloat16m2_t, RVV_REQUIRE_ELEN_BF_16) > DEF_RVV_LMUL2_OPS (vfloat16m2_t, RVV_REQUIRE_ELEN_FP_16) > DEF_RVV_LMUL2_OPS (vfloat32m2_t, RVV_REQUIRE_ELEN_FP_32) > DEF_RVV_LMUL2_OPS (vfloat64m2_t, RVV_REQUIRE_ELEN_FP_64) > @@ -1137,6 +1161,7 @@ DEF_RVV_LMUL4_OPS (vuint16m4_t, 0) > DEF_RVV_LMUL4_OPS (vuint32m4_t, 0) > DEF_RVV_LMUL4_OPS (vuint64m4_t, RVV_REQUIRE_ELEN_64) > DEF_RVV_LMUL4_OPS (vfloat16m4_t, RVV_REQUIRE_ELEN_FP_16) > +DEF_RVV_LMUL4_OPS (vbfloat16m4_t, RVV_REQUIRE_ELEN_BF_16) > DEF_RVV_LMUL4_OPS (vfloat32m4_t, RVV_REQUIRE_ELEN_FP_32) > DEF_RVV_LMUL4_OPS (vfloat64m4_t, RVV_REQUIRE_ELEN_FP_64) > > @@ -1312,6 +1337,31 @@ DEF_RVV_TUPLE_OPS (vint64m2x4_t, RVV_REQUIRE_ELEN_64) > DEF_RVV_TUPLE_OPS (vuint64m2x4_t, RVV_REQUIRE_ELEN_64) > DEF_RVV_TUPLE_OPS (vint64m4x2_t, RVV_REQUIRE_ELEN_64) > DEF_RVV_TUPLE_OPS (vuint64m4x2_t, RVV_REQUIRE_ELEN_64) > +DEF_RVV_TUPLE_OPS (vbfloat16mf4x2_t, RVV_REQUIRE_ELEN_BF_16 | RVV_REQUIRE_MIN_VLEN_64) > +DEF_RVV_TUPLE_OPS (vbfloat16mf4x3_t, RVV_REQUIRE_ELEN_BF_16 | RVV_REQUIRE_MIN_VLEN_64) > +DEF_RVV_TUPLE_OPS (vbfloat16mf4x4_t, RVV_REQUIRE_ELEN_BF_16 | RVV_REQUIRE_MIN_VLEN_64) > +DEF_RVV_TUPLE_OPS (vbfloat16mf4x5_t, RVV_REQUIRE_ELEN_BF_16 | RVV_REQUIRE_MIN_VLEN_64) > +DEF_RVV_TUPLE_OPS (vbfloat16mf4x6_t, RVV_REQUIRE_ELEN_BF_16 | RVV_REQUIRE_MIN_VLEN_64) > +DEF_RVV_TUPLE_OPS (vbfloat16mf4x7_t, RVV_REQUIRE_ELEN_BF_16 | RVV_REQUIRE_MIN_VLEN_64) > +DEF_RVV_TUPLE_OPS (vbfloat16mf4x8_t, RVV_REQUIRE_ELEN_BF_16 | RVV_REQUIRE_MIN_VLEN_64) > +DEF_RVV_TUPLE_OPS (vbfloat16mf2x2_t, RVV_REQUIRE_ELEN_BF_16) > +DEF_RVV_TUPLE_OPS (vbfloat16mf2x3_t, RVV_REQUIRE_ELEN_BF_16) > +DEF_RVV_TUPLE_OPS (vbfloat16mf2x4_t, RVV_REQUIRE_ELEN_BF_16) > +DEF_RVV_TUPLE_OPS (vbfloat16mf2x5_t, RVV_REQUIRE_ELEN_BF_16) > +DEF_RVV_TUPLE_OPS (vbfloat16mf2x6_t, 
RVV_REQUIRE_ELEN_BF_16) > +DEF_RVV_TUPLE_OPS (vbfloat16mf2x7_t, RVV_REQUIRE_ELEN_BF_16) > +DEF_RVV_TUPLE_OPS (vbfloat16mf2x8_t, RVV_REQUIRE_ELEN_BF_16) > +DEF_RVV_TUPLE_OPS (vbfloat16m1x2_t, RVV_REQUIRE_ELEN_BF_16) > +DEF_RVV_TUPLE_OPS (vbfloat16m1x3_t, RVV_REQUIRE_ELEN_BF_16) > +DEF_RVV_TUPLE_OPS (vbfloat16m1x4_t, RVV_REQUIRE_ELEN_BF_16) > +DEF_RVV_TUPLE_OPS (vbfloat16m1x5_t, RVV_REQUIRE_ELEN_BF_16) > +DEF_RVV_TUPLE_OPS (vbfloat16m1x6_t, RVV_REQUIRE_ELEN_BF_16) > +DEF_RVV_TUPLE_OPS (vbfloat16m1x7_t, RVV_REQUIRE_ELEN_BF_16) > +DEF_RVV_TUPLE_OPS (vbfloat16m1x8_t, RVV_REQUIRE_ELEN_BF_16) > +DEF_RVV_TUPLE_OPS (vbfloat16m2x2_t, RVV_REQUIRE_ELEN_BF_16) > +DEF_RVV_TUPLE_OPS (vbfloat16m2x3_t, RVV_REQUIRE_ELEN_BF_16) > +DEF_RVV_TUPLE_OPS (vbfloat16m2x4_t, RVV_REQUIRE_ELEN_BF_16) > +DEF_RVV_TUPLE_OPS (vbfloat16m4x2_t, RVV_REQUIRE_ELEN_BF_16) > DEF_RVV_TUPLE_OPS (vfloat16mf4x2_t, RVV_REQUIRE_ELEN_FP_16 | RVV_REQUIRE_MIN_VLEN_64) > DEF_RVV_TUPLE_OPS (vfloat16mf4x3_t, RVV_REQUIRE_ELEN_FP_16 | RVV_REQUIRE_MIN_VLEN_64) > DEF_RVV_TUPLE_OPS (vfloat16mf4x4_t, RVV_REQUIRE_ELEN_FP_16 | RVV_REQUIRE_MIN_VLEN_64) > diff --git a/gcc/config/riscv/riscv-vector-builtins.cc b/gcc/config/riscv/riscv-vector-builtins.cc > index c08d87a2680..720436dfbc9 100644 > --- a/gcc/config/riscv/riscv-vector-builtins.cc > +++ b/gcc/config/riscv/riscv-vector-builtins.cc > @@ -2808,7 +2808,8 @@ static CONSTEXPR const function_type_info function_types[] = { > VECTOR, MASK, SIGNED, UNSIGNED, EEW8_INDEX, EEW16_INDEX, EEW32_INDEX, \ > EEW64_INDEX, SHIFT, DOUBLE_TRUNC, QUAD_TRUNC, OCT_TRUNC, \ > DOUBLE_TRUNC_SCALAR, DOUBLE_TRUNC_SIGNED, DOUBLE_TRUNC_UNSIGNED, \ > - DOUBLE_TRUNC_UNSIGNED_SCALAR, DOUBLE_TRUNC_FLOAT, FLOAT, LMUL1, WLMUL1, \ > + DOUBLE_TRUNC_UNSIGNED_SCALAR, DOUBLE_TRUNC_BFLOAT_SCALAR, \ > + DOUBLE_TRUNC_BFLOAT, DOUBLE_TRUNC_FLOAT, FLOAT, LMUL1, WLMUL1, \ > EEW8_INTERPRET, EEW16_INTERPRET, EEW32_INTERPRET, EEW64_INTERPRET, \ > BOOL1_INTERPRET, BOOL2_INTERPRET, BOOL4_INTERPRET, BOOL8_INTERPRET, \ > 
BOOL16_INTERPRET, BOOL32_INTERPRET, BOOL64_INTERPRET, \ > @@ -2845,6 +2846,8 @@ static CONSTEXPR const function_type_info function_types[] = { > VECTOR_TYPE_##DOUBLE_TRUNC_SIGNED, \ > VECTOR_TYPE_##DOUBLE_TRUNC_UNSIGNED, \ > VECTOR_TYPE_##DOUBLE_TRUNC_UNSIGNED_SCALAR, \ > + VECTOR_TYPE_##DOUBLE_TRUNC_BFLOAT_SCALAR, \ > + VECTOR_TYPE_##DOUBLE_TRUNC_BFLOAT, \ > VECTOR_TYPE_##DOUBLE_TRUNC_FLOAT, \ > VECTOR_TYPE_##FLOAT, \ > VECTOR_TYPE_##LMUL1, \ > @@ -3284,6 +3287,8 @@ check_required_extensions (const function_instance &instance) > > uint64_t riscv_isa_flags = 0; > > + if (TARGET_VECTOR_ELEN_BF_16) > + riscv_isa_flags |= RVV_REQUIRE_ELEN_BF_16; > if (TARGET_VECTOR_ELEN_FP_16) > riscv_isa_flags |= RVV_REQUIRE_ELEN_FP_16; > if (TARGET_VECTOR_ELEN_FP_32) > diff --git a/gcc/config/riscv/riscv-vector-builtins.def b/gcc/config/riscv/riscv-vector-builtins.def > index 784b54c81a4..97f329d11eb 100644 > --- a/gcc/config/riscv/riscv-vector-builtins.def > +++ b/gcc/config/riscv/riscv-vector-builtins.def > @@ -72,7 +72,8 @@ along with GCC; see the file COPYING3. If not see > VECTOR, MASK, SIGNED, UNSIGNED, EEW8_INDEX, EEW16_INDEX, EEW32_INDEX, \ > EEW64_INDEX, SHIFT, DOUBLE_TRUNC, QUAD_TRUNC, OCT_TRUNC, \ > DOUBLE_TRUNC_SCALAR, DOUBLE_TRUNC_SIGNED, DOUBLE_TRUNC_UNSIGNED, \ > - DOUBLE_TRUNC_UNSIGNED_SCALAR, DOUBLE_TRUNC_FLOAT, FLOAT, LMUL1, WLMUL1, \ > + DOUBLE_TRUNC_UNSIGNED_SCALAR, DOUBLE_TRUNC_BFLOAT_SCALAR, \ > + DOUBLE_TRUNC_BFLOAT, DOUBLE_TRUNC_FLOAT, FLOAT, LMUL1, WLMUL1, \ > EEW8_INTERPRET, EEW16_INTERPRET, EEW32_INTERPRET, EEW64_INTERPRET, \ > BOOL1_INTERPRET, BOOL2_INTERPRET, BOOL4_INTERPRET, BOOL8_INTERPRET, \ > BOOL16_INTERPRET, BOOL32_INTERPRET, BOOL64_INTERPRET, \ > @@ -436,6 +437,56 @@ DEF_RVV_TYPE (vint64m8_t, 15, __rvv_int64m8_t, int64, RVVM8DI, _i64m8, _i64, > DEF_RVV_TYPE (vuint64m8_t, 16, __rvv_uint64m8_t, uint64, RVVM8DI, _u64m8, _u64, > _e64m8) > > +/* Enabled if TARGET_VECTOR_ELEN_BF_16 && (TARGET_ZVFBFMIN or TARGET_ZVFBFWMA). */ > +/* LMUL = 1/4. 
*/ > +DEF_RVV_TYPE (vbfloat16mf4_t, 19, __rvv_bfloat16mf4_t, bfloat16, RVVMF4BF, _bf16mf4, > + _bf16, _e16mf4) > +/* Define tuple types for SEW = 16, LMUL = MF4. */ > +DEF_RVV_TUPLE_TYPE (vbfloat16mf4x2_t, 21, __rvv_bfloat16mf4x2_t, vbfloat16mf4_t, bfloat16, 2, _bf16mf4x2) > +DEF_RVV_TUPLE_TYPE (vbfloat16mf4x3_t, 21, __rvv_bfloat16mf4x3_t, vbfloat16mf4_t, bfloat16, 3, _bf16mf4x3) > +DEF_RVV_TUPLE_TYPE (vbfloat16mf4x4_t, 21, __rvv_bfloat16mf4x4_t, vbfloat16mf4_t, bfloat16, 4, _bf16mf4x4) > +DEF_RVV_TUPLE_TYPE (vbfloat16mf4x5_t, 21, __rvv_bfloat16mf4x5_t, vbfloat16mf4_t, bfloat16, 5, _bf16mf4x5) > +DEF_RVV_TUPLE_TYPE (vbfloat16mf4x6_t, 21, __rvv_bfloat16mf4x6_t, vbfloat16mf4_t, bfloat16, 6, _bf16mf4x6) > +DEF_RVV_TUPLE_TYPE (vbfloat16mf4x7_t, 21, __rvv_bfloat16mf4x7_t, vbfloat16mf4_t, bfloat16, 7, _bf16mf4x7) > +DEF_RVV_TUPLE_TYPE (vbfloat16mf4x8_t, 21, __rvv_bfloat16mf4x8_t, vbfloat16mf4_t, bfloat16, 8, _bf16mf4x8) > +/* LMUL = 1/2. */ > +DEF_RVV_TYPE (vbfloat16mf2_t, 19, __rvv_bfloat16mf2_t, bfloat16, RVVMF2BF, _bf16mf2, > + _bf16, _e16mf2) > +/* Define tuple types for SEW = 16, LMUL = MF2. */ > +DEF_RVV_TUPLE_TYPE (vbfloat16mf2x2_t, 21, __rvv_bfloat16mf2x2_t, vbfloat16mf2_t, bfloat16, 2, _bf16mf2x2) > +DEF_RVV_TUPLE_TYPE (vbfloat16mf2x3_t, 21, __rvv_bfloat16mf2x3_t, vbfloat16mf2_t, bfloat16, 3, _bf16mf2x3) > +DEF_RVV_TUPLE_TYPE (vbfloat16mf2x4_t, 21, __rvv_bfloat16mf2x4_t, vbfloat16mf2_t, bfloat16, 4, _bf16mf2x4) > +DEF_RVV_TUPLE_TYPE (vbfloat16mf2x5_t, 21, __rvv_bfloat16mf2x5_t, vbfloat16mf2_t, bfloat16, 5, _bf16mf2x5) > +DEF_RVV_TUPLE_TYPE (vbfloat16mf2x6_t, 21, __rvv_bfloat16mf2x6_t, vbfloat16mf2_t, bfloat16, 6, _bf16mf2x6) > +DEF_RVV_TUPLE_TYPE (vbfloat16mf2x7_t, 21, __rvv_bfloat16mf2x7_t, vbfloat16mf2_t, bfloat16, 7, _bf16mf2x7) > +DEF_RVV_TUPLE_TYPE (vbfloat16mf2x8_t, 21, __rvv_bfloat16mf2x8_t, vbfloat16mf2_t, bfloat16, 8, _bf16mf2x8) > +/* LMUL = 1. 
*/ > +DEF_RVV_TYPE (vbfloat16m1_t, 18, __rvv_bfloat16m1_t, bfloat16, RVVM1BF, _bf16m1, > + _bf16, _e16m1) > +/* Define tuple types for SEW = 16, LMUL = M1. */ > +DEF_RVV_TUPLE_TYPE (vbfloat16m1x2_t, 20, __rvv_bfloat16m1x2_t, vbfloat16m1_t, bfloat16, 2, _bf16m1x2) > +DEF_RVV_TUPLE_TYPE (vbfloat16m1x3_t, 20, __rvv_bfloat16m1x3_t, vbfloat16m1_t, bfloat16, 3, _bf16m1x3) > +DEF_RVV_TUPLE_TYPE (vbfloat16m1x4_t, 20, __rvv_bfloat16m1x4_t, vbfloat16m1_t, bfloat16, 4, _bf16m1x4) > +DEF_RVV_TUPLE_TYPE (vbfloat16m1x5_t, 20, __rvv_bfloat16m1x5_t, vbfloat16m1_t, bfloat16, 5, _bf16m1x5) > +DEF_RVV_TUPLE_TYPE (vbfloat16m1x6_t, 20, __rvv_bfloat16m1x6_t, vbfloat16m1_t, bfloat16, 6, _bf16m1x6) > +DEF_RVV_TUPLE_TYPE (vbfloat16m1x7_t, 20, __rvv_bfloat16m1x7_t, vbfloat16m1_t, bfloat16, 7, _bf16m1x7) > +DEF_RVV_TUPLE_TYPE (vbfloat16m1x8_t, 20, __rvv_bfloat16m1x8_t, vbfloat16m1_t, bfloat16, 8, _bf16m1x8) > +/* LMUL = 2. */ > +DEF_RVV_TYPE (vbfloat16m2_t, 18, __rvv_bfloat16m2_t, bfloat16, RVVM2BF, _bf16m2, > + _bf16, _e16m2) > +/* Define tuple types for SEW = 16, LMUL = M2. */ > +DEF_RVV_TUPLE_TYPE (vbfloat16m2x2_t, 20, __rvv_bfloat16m2x2_t, vbfloat16m2_t, bfloat16, 2, _bf16m2x2) > +DEF_RVV_TUPLE_TYPE (vbfloat16m2x3_t, 20, __rvv_bfloat16m2x3_t, vbfloat16m2_t, bfloat16, 3, _bf16m2x3) > +DEF_RVV_TUPLE_TYPE (vbfloat16m2x4_t, 20, __rvv_bfloat16m2x4_t, vbfloat16m2_t, bfloat16, 4, _bf16m2x4) > +/* LMUL = 4. */ > +DEF_RVV_TYPE (vbfloat16m4_t, 18, __rvv_bfloat16m4_t, bfloat16, RVVM4BF, _bf16m4, > + _bf16, _e16m4) > +/* Define tuple types for SEW = 16, LMUL = M4. */ > +DEF_RVV_TUPLE_TYPE (vbfloat16m4x2_t, 20, __rvv_bfloat16m4x2_t, vbfloat16m4_t, bfloat16, 2, _bf16m4x2) > +/* LMUL = 8. */ > +DEF_RVV_TYPE (vbfloat16m8_t, 18, __rvv_bfloat16m8_t, bfloat16, RVVM8BF, _bf16m8, > + _bf16, _e16m8) > + > /* Enabled if TARGET_VECTOR_ELEN_FP_16 && (TARGET_ZVFH or TARGET_ZVFHMIN). */ > /* LMUL = 1/4. 
*/ > DEF_RVV_TYPE (vfloat16mf4_t, 18, __rvv_float16mf4_t, float16, RVVMF4HF, _f16mf4, > @@ -630,6 +681,8 @@ DEF_RVV_BASE_TYPE (double_trunc_scalar, get_scalar_type (type_idx)) > DEF_RVV_BASE_TYPE (double_trunc_signed_vector, get_vector_type (type_idx)) > DEF_RVV_BASE_TYPE (double_trunc_unsigned_vector, get_vector_type (type_idx)) > DEF_RVV_BASE_TYPE (double_trunc_unsigned_scalar, get_scalar_type (type_idx)) > +DEF_RVV_BASE_TYPE (double_trunc_bfloat_scalar, get_scalar_type (type_idx)) > +DEF_RVV_BASE_TYPE (double_trunc_bfloat_vector, get_vector_type (type_idx)) > DEF_RVV_BASE_TYPE (double_trunc_float_vector, get_vector_type (type_idx)) > DEF_RVV_BASE_TYPE (float_vector, get_vector_type (type_idx)) > DEF_RVV_BASE_TYPE (lmul1_vector, get_vector_type (type_idx)) > diff --git a/gcc/config/riscv/riscv-vector-builtins.h b/gcc/config/riscv/riscv-vector-builtins.h > index 05d18ae1322..56dbe2cf0e2 100644 > --- a/gcc/config/riscv/riscv-vector-builtins.h > +++ b/gcc/config/riscv/riscv-vector-builtins.h > @@ -109,6 +109,7 @@ static const unsigned int CP_WRITE_CSR = 1U << 5; > #define RVV_REQUIRE_FULL_V (1 << 4) /* Require Full 'V' extension. */ > #define RVV_REQUIRE_MIN_VLEN_64 (1 << 5) /* Require TARGET_MIN_VLEN >= 64. */ > #define RVV_REQUIRE_ELEN_FP_16 (1 << 6) /* Require FP ELEN >= 32. */ > +#define RVV_REQUIRE_ELEN_BF_16 (1 << 7) /* Require BF16. */ > > /* Enumerates the required extensions. */ > enum required_ext > diff --git a/gcc/config/riscv/riscv-vector-switch.def b/gcc/config/riscv/riscv-vector-switch.def > index 452283b7416..de72e415fe8 100644 > --- a/gcc/config/riscv/riscv-vector-switch.def > +++ b/gcc/config/riscv/riscv-vector-switch.def > @@ -43,6 +43,7 @@ Encode SEW and LMUL into data types. 
> |DF |RVVM1DF|RVVM2DF|RVVM4DF|RVVM8DF|N/A |N/A |N/A | > |SF |RVVM1SF|RVVM2SF|RVVM4SF|RVVM8SF|RVVMF2SF|N/A |N/A | > |HF |RVVM1HF|RVVM2HF|RVVM4HF|RVVM8HF|RVVMF2HF|RVVMF4HF|N/A | > + |BF |RVVM1BF|RVVM2BF|RVVM4BF|RVVM8BF|RVVMF2BF|RVVMF4BF|N/A | > > There are the following data types for ELEN = 32. > > @@ -52,6 +53,7 @@ There are the following data types for ELEN = 32. > |QI |RVVM1QI|RVVM2QI|RVVM4QI|RVVM8QI|RVVMF2QI|RVVMF4QI|N/A | > |SF |RVVM1SF|RVVM2SF|RVVM4SF|RVVM8SF|N/A |N/A |N/A | > |HF |RVVM1HF|RVVM2HF|RVVM4HF|RVVM8HF|RVVMF2HF|N/A |N/A | > + |BF |RVVM1BF|RVVM2BF|RVVM4BF|RVVM8BF|RVVMF2BF|N/A |N/A | > > Encode the ratio of SEW/LMUL into the mask types. > There are the following mask types. > @@ -93,6 +95,14 @@ ENTRY (RVVM1HI, true, LMUL_1, 16) > ENTRY (RVVMF2HI, !TARGET_XTHEADVECTOR, LMUL_F2, 32) > ENTRY (RVVMF4HI, TARGET_MIN_VLEN > 32 && !TARGET_XTHEADVECTOR, LMUL_F4, 64) > > +/* Disable modes if TARGET_MIN_VLEN == 32 or !TARGET_VECTOR_ELEN_BF_16. */ > +ENTRY (RVVM8BF, TARGET_VECTOR_ELEN_BF_16, LMUL_8, 2) > +ENTRY (RVVM4BF, TARGET_VECTOR_ELEN_BF_16, LMUL_4, 4) > +ENTRY (RVVM2BF, TARGET_VECTOR_ELEN_BF_16, LMUL_2, 8) > +ENTRY (RVVM1BF, TARGET_VECTOR_ELEN_BF_16, LMUL_1, 16) > +ENTRY (RVVMF2BF, TARGET_VECTOR_ELEN_BF_16, LMUL_F2, 32) > +ENTRY (RVVMF4BF, TARGET_VECTOR_ELEN_BF_16 && TARGET_MIN_VLEN > 32, LMUL_F4, 64) > + > /* Disable modes if TARGET_MIN_VLEN == 32 or !TARGET_VECTOR_ELEN_FP_16. 
*/ > ENTRY (RVVM8HF, TARGET_VECTOR_ELEN_FP_16, LMUL_8, 2) > ENTRY (RVVM4HF, TARGET_VECTOR_ELEN_FP_16, LMUL_4, 4) > @@ -198,6 +208,32 @@ TUPLE_ENTRY (RVVM1x2HI, true, RVVM1HI, 2, LMUL_1, 16) > TUPLE_ENTRY (RVVMF2x2HI, !TARGET_XTHEADVECTOR, RVVMF2HI, 2, LMUL_F2, 32) > TUPLE_ENTRY (RVVMF4x2HI, TARGET_MIN_VLEN > 32 && !TARGET_XTHEADVECTOR, RVVMF4HI, 2, LMUL_F4, 64) > > +TUPLE_ENTRY (RVVM1x8BF, TARGET_VECTOR_ELEN_BF_16, RVVM1BF, 8, LMUL_1, 16) > +TUPLE_ENTRY (RVVMF2x8BF, TARGET_VECTOR_ELEN_BF_16, RVVMF2BF, 8, LMUL_F2, 32) > +TUPLE_ENTRY (RVVMF4x8BF, TARGET_VECTOR_ELEN_BF_16 && TARGET_MIN_VLEN > 32, RVVMF4BF, 8, LMUL_F4, 64) > +TUPLE_ENTRY (RVVM1x7BF, TARGET_VECTOR_ELEN_BF_16, RVVM1BF, 7, LMUL_1, 16) > +TUPLE_ENTRY (RVVMF2x7BF, TARGET_VECTOR_ELEN_BF_16, RVVMF2BF, 7, LMUL_F2, 32) > +TUPLE_ENTRY (RVVMF4x7BF, TARGET_VECTOR_ELEN_BF_16 && TARGET_MIN_VLEN > 32, RVVMF4BF, 7, LMUL_F4, 64) > +TUPLE_ENTRY (RVVM1x6BF, TARGET_VECTOR_ELEN_BF_16, RVVM1BF, 6, LMUL_1, 16) > +TUPLE_ENTRY (RVVMF2x6BF, TARGET_VECTOR_ELEN_BF_16, RVVMF2BF, 6, LMUL_F2, 32) > +TUPLE_ENTRY (RVVMF4x6BF, TARGET_VECTOR_ELEN_BF_16 && TARGET_MIN_VLEN > 32, RVVMF4BF, 6, LMUL_F4, 64) > +TUPLE_ENTRY (RVVM1x5BF, TARGET_VECTOR_ELEN_BF_16, RVVM1BF, 5, LMUL_1, 16) > +TUPLE_ENTRY (RVVMF2x5BF, TARGET_VECTOR_ELEN_BF_16, RVVMF2BF, 5, LMUL_F2, 32) > +TUPLE_ENTRY (RVVMF4x5BF, TARGET_VECTOR_ELEN_BF_16 && TARGET_MIN_VLEN > 32, RVVMF4BF, 5, LMUL_F4, 64) > +TUPLE_ENTRY (RVVM2x4BF, TARGET_VECTOR_ELEN_BF_16, RVVM2BF, 4, LMUL_2, 8) > +TUPLE_ENTRY (RVVM1x4BF, TARGET_VECTOR_ELEN_BF_16, RVVM1BF, 4, LMUL_1, 16) > +TUPLE_ENTRY (RVVMF2x4BF, TARGET_VECTOR_ELEN_BF_16, RVVMF2BF, 4, LMUL_F2, 32) > +TUPLE_ENTRY (RVVMF4x4BF, TARGET_VECTOR_ELEN_BF_16 && TARGET_MIN_VLEN > 32, RVVMF4BF, 4, LMUL_F4, 64) > +TUPLE_ENTRY (RVVM2x3BF, TARGET_VECTOR_ELEN_BF_16, RVVM2BF, 3, LMUL_2, 8) > +TUPLE_ENTRY (RVVM1x3BF, TARGET_VECTOR_ELEN_BF_16, RVVM1BF, 3, LMUL_1, 16) > +TUPLE_ENTRY (RVVMF2x3BF, TARGET_VECTOR_ELEN_BF_16, RVVMF2BF, 3, LMUL_F2, 32) > +TUPLE_ENTRY 
(RVVMF4x3BF, TARGET_VECTOR_ELEN_BF_16 && TARGET_MIN_VLEN > 32, RVVMF4BF, 3, LMUL_F4, 64) > +TUPLE_ENTRY (RVVM4x2BF, TARGET_VECTOR_ELEN_BF_16, RVVM4BF, 2, LMUL_4, 4) > +TUPLE_ENTRY (RVVM2x2BF, TARGET_VECTOR_ELEN_BF_16, RVVM2BF, 2, LMUL_2, 8) > +TUPLE_ENTRY (RVVM1x2BF, TARGET_VECTOR_ELEN_BF_16, RVVM1BF, 2, LMUL_1, 16) > +TUPLE_ENTRY (RVVMF2x2BF, TARGET_VECTOR_ELEN_BF_16, RVVMF2BF, 2, LMUL_F2, 32) > +TUPLE_ENTRY (RVVMF4x2BF, TARGET_VECTOR_ELEN_BF_16 && TARGET_MIN_VLEN > 32, RVVMF4BF, 2, LMUL_F4, 64) > + > TUPLE_ENTRY (RVVM1x8HF, TARGET_VECTOR_ELEN_FP_16, RVVM1HF, 8, LMUL_1, 16) > TUPLE_ENTRY (RVVMF2x8HF, TARGET_VECTOR_ELEN_FP_16 && !TARGET_XTHEADVECTOR, RVVMF2HF, 8, LMUL_F2, 32) > TUPLE_ENTRY (RVVMF4x8HF, TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN > 32 && !TARGET_XTHEADVECTOR, RVVMF4HF, 8, LMUL_F4, 64) > -- > 2.17.1 >
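As a reading aid for the `riscv-vector-switch.def` hunk quoted above: the last argument of each BF16 `ENTRY`/`TUPLE_ENTRY` matches SEW/LMUL for SEW = 16, and only the MF4 modes carry the extra `TARGET_MIN_VLEN > 32` condition. The sketch below reproduces that arithmetic. The helper names `sew_lmul_ratio` and `bf16_mode_enabled` are hypothetical and are not part of GCC.

```cpp
// Hypothetical helper, not part of GCC: compute the SEW/LMUL ratio that
// appears as the last argument of the BF16 ENTRY/TUPLE_ENTRY lines.
// lmul_log2 is log2(LMUL): -2 for MF4, -1 for MF2, 0 for M1, ..., 3 for M8.
constexpr unsigned
sew_lmul_ratio (unsigned sew, int lmul_log2)
{
  // ratio = SEW / LMUL = SEW * 2^(-lmul_log2)
  return lmul_log2 >= 0 ? sew >> lmul_log2 : sew << -lmul_log2;
}

// Hypothetical mirror of the enable conditions on the BF16 entries:
// every BF mode needs the BF16 element flag, and the MF4 modes
// additionally need a minimum VLEN greater than 32.
constexpr bool
bf16_mode_enabled (int lmul_log2, bool elen_bf16, unsigned min_vlen)
{
  return elen_bf16 && (lmul_log2 != -2 || min_vlen > 32);
}
```

For example, `sew_lmul_ratio (16, 3)` corresponds to the `2` on the `RVVM8BF` entry and `sew_lmul_ratio (16, -2)` to the `64` on `RVVMF4BF`.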
On 7/11/24 1:26 AM, Kito Cheng wrote:
> OK for this patch set, I know you already got LGTM from JuZhe or me
> before, so just an explicitly ack to let you know it's still OK once
> CI is passed.

The tabs vs spaces problem pointed out by the linter should be fixed.
The other lint issues probably should be left as-is.

Given fixing tabs vs spaces is NFC, I'm fine with Feng fixing those and
committing the result rather than doing another review round on the
first patch in this series.

jeff
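For readers following the `check_required_extensions` hunk in this patch: the new `RVV_REQUIRE_ELEN_BF_16` bit is ORed into a mask of what the target provides, and the `*_OPS` tables list the bits each type requires. A minimal sketch of that kind of bitmask test follows. The three bit values are copied from `riscv-vector-builtins.h` as shown in the patch; the function names `available_flags` and `type_usable`, and the exact comparison, are assumptions for illustration rather than the GCC implementation.

```cpp
#include <cstdint>

// Bit values as they appear in riscv-vector-builtins.h in this patch.
constexpr std::uint64_t REQUIRE_MIN_VLEN_64 = 1u << 5;
constexpr std::uint64_t REQUIRE_ELEN_FP_16 = 1u << 6;
constexpr std::uint64_t REQUIRE_ELEN_BF_16 = 1u << 7;

// Hypothetical: collect what the target offers into one mask, in the
// spirit of the riscv_isa_flags accumulation in the patch.
constexpr std::uint64_t
available_flags (bool elen_bf16, bool elen_fp16, unsigned min_vlen)
{
  return (elen_bf16 ? REQUIRE_ELEN_BF_16 : 0)
         | (elen_fp16 ? REQUIRE_ELEN_FP_16 : 0)
         | (min_vlen >= 64 ? REQUIRE_MIN_VLEN_64 : 0);
}

// Hypothetical: a type is usable when none of its required bits are
// missing from the available mask.
constexpr bool
type_usable (std::uint64_t required, std::uint64_t available)
{
  return (required & ~available) == 0;
}
```

Under these assumptions, a type registered with `RVV_REQUIRE_ELEN_BF_16 | RVV_REQUIRE_MIN_VLEN_64` (such as `vbfloat16mf4_t`) would be rejected on a target whose minimum VLEN is 32 even when BF16 elements are supported, while `vbfloat16m1_t` would be accepted.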
diff --git a/gcc/config/riscv/genrvv-type-indexer.cc b/gcc/config/riscv/genrvv-type-indexer.cc index 27cbd14982c..8626ddeaaa8 100644 --- a/gcc/config/riscv/genrvv-type-indexer.cc +++ b/gcc/config/riscv/genrvv-type-indexer.cc @@ -117,6 +117,42 @@ inttype (unsigned sew, int lmul_log2, unsigned nf, bool unsigned_p) return mode.str (); } +std::string +bfloat16_type (int lmul_log2) +{ + if (!valid_type (16, lmul_log2, /*float_t*/ true)) + return "INVALID"; + + std::stringstream mode; + mode << "vbfloat16" << to_lmul (lmul_log2) << "_t"; + return mode.str (); +} + +std::string +bfloat16_wide_type (int lmul_log2) +{ + if (!valid_type (32, lmul_log2, /*float_t*/ true)) + return "INVALID"; + + std::stringstream mode; + mode << "vfloat32" << to_lmul (lmul_log2) << "_t"; + return mode.str (); +} + +std::string +bfloat16_type (int lmul_log2, unsigned nf) +{ + if (!valid_type (16, lmul_log2, nf, /*float_t*/ true)) + return "INVALID"; + + std::stringstream mode; + mode << "vbfloat16" << to_lmul (lmul_log2); + if (nf > 1) + mode << "x" << nf; + mode << "_t"; + return mode.str (); +} + std::string floattype (unsigned sew, int lmul_log2) { @@ -182,6 +218,15 @@ same_ratio_eew_type (unsigned sew, int lmul_log2, unsigned eew, bool unsigned_p, return inttype (eew, elmul_log2, unsigned_p); } +std::string +same_ratio_eew_bf16_type (unsigned sew, int lmul_log2) +{ + if (sew != 32) + return "INVALID"; + int elmul_log2 = lmul_log2 - 1; + return bfloat16_type (elmul_log2); +} + int main (int argc, const char **argv) { @@ -215,6 +260,8 @@ main (int argc, const char **argv) fprintf (fp, " /*DOUBLE_TRUNC_SIGNED*/ INVALID,\n"); fprintf (fp, " /*DOUBLE_TRUNC_UNSIGNED*/ INVALID,\n"); fprintf (fp, " /*DOUBLE_TRUNC_UNSIGNED_SCALAR*/ INVALID,\n"); + fprintf (fp, " /*DOUBLE_TRUNC_BFLOAT_SCALAR*/ INVALID,\n"); + fprintf (fp, " /*DOUBLE_TRUNC_BFLOAT*/ INVALID,\n"); fprintf (fp, " /*DOUBLE_TRUNC_FLOAT*/ INVALID,\n"); fprintf (fp, " /*FLOAT*/ INVALID,\n"); fprintf (fp, " /*LMUL1*/ INVALID,\n"); @@ -294,6 
+341,8 @@ main (int argc, const char **argv) same_ratio_eew_type (sew, lmul_log2, sew / 2, true, false) .c_str ()); + fprintf (fp, " /*DOUBLE_TRUNC_BFLOAT_SCALAR*/ INVALID,\n"); + fprintf (fp, " /*DOUBLE_TRUNC_BFLOAT*/ INVALID,\n"); fprintf (fp, " /*DOUBLE_TRUNC_FLOAT*/ %s,\n", same_ratio_eew_type (sew, lmul_log2, sew / 2, false, true) .c_str ()); @@ -341,6 +390,68 @@ main (int argc, const char **argv) inttype (sew, lmul_log2, 1, unsigned_p).c_str ()); fprintf (fp, ")\n"); } + // Build for vbfloat16 + for (int lmul_log2 : {-2, -1, 0, 1, 2, 3}) + for (unsigned nf : {1, 2, 3, 4, 5, 6, 7, 8}) + { + if (!valid_type (16, lmul_log2, nf, /*float_t*/ true)) + continue; + + fprintf (fp, "DEF_RVV_TYPE_INDEX (\n"); + fprintf (fp, " /*VECTOR*/ %s,\n", + bfloat16_type (lmul_log2, nf).c_str ()); + fprintf (fp, " /*MASK*/ %s,\n", maskmode (16, lmul_log2).c_str ()); + fprintf (fp, " /*SIGNED*/ %s,\n", + inttype (16, lmul_log2, /*unsigned_p*/ false).c_str ()); + fprintf (fp, " /*UNSIGNED*/ %s,\n", + inttype (16, lmul_log2, /*unsigned_p*/ true).c_str ()); + for (unsigned eew : {8, 16, 32, 64}) + fprintf ( + fp, " /*EEW%d_INDEX*/ %s,\n", eew, + same_ratio_eew_type (16, lmul_log2, eew, true, false).c_str ()); + fprintf (fp, " /*SHIFT*/ INVALID,\n"); + fprintf (fp, " /*DOUBLE_TRUNC*/ %s,\n", + same_ratio_eew_type (16, lmul_log2, 8, false, true).c_str ()); + fprintf (fp, " /*QUAD_TRUNC*/ INVALID,\n"); + fprintf (fp, " /*OCT_TRUNC*/ INVALID,\n"); + fprintf (fp, " /*DOUBLE_TRUNC_SCALAR*/ %s,\n", + same_ratio_eew_type (16, lmul_log2, 8, false, true).c_str ()); + fprintf (fp, " /*DOUBLE_TRUNC_SIGNED*/ %s,\n", + same_ratio_eew_type (16, lmul_log2, 8, false, false).c_str ()); + fprintf (fp, " /*DOUBLE_TRUNC_UNSIGNED*/ %s,\n", + same_ratio_eew_type (16, lmul_log2, 8, true, false).c_str ()); + fprintf (fp, " /*DOUBLE_TRUNC_UNSIGNED_SCALAR*/ INVALID,\n"); + fprintf (fp, " /*DOUBLE_TRUNC_BFLOAT_SCALAR*/ INVALID,\n"); + fprintf (fp, " /*DOUBLE_TRUNC_BFLOAT*/ INVALID,\n"); + fprintf (fp, " 
/*DOUBLE_TRUNC_FLOAT*/ %s,\n", + same_ratio_eew_type (16, lmul_log2, 8, false, true).c_str ()); + fprintf (fp, " /*FLOAT*/ INVALID,\n"); + fprintf (fp, " /*LMUL1*/ %s,\n", + bfloat16_type (/*lmul_log2*/ 0).c_str ()); + fprintf (fp, " /*WLMUL1*/ %s,\n", + bfloat16_wide_type (/*lmul_log2*/ 0).c_str ()); + for (unsigned eew : {8, 16, 32, 64}) + fprintf (fp, " /*EEW%d_INTERPRET*/ INVALID,\n", eew); + + for (unsigned boolsize : BOOL_SIZE_LIST) + fprintf (fp, " /*BOOL%d_INTERPRET*/ INVALID,\n", boolsize); + + for (unsigned eew : EEW_SIZE_LIST) + fprintf (fp, " /*SIGNED_EEW%d_LMUL1_INTERPRET*/ INVALID,\n", eew); + + for (unsigned eew : EEW_SIZE_LIST) + fprintf (fp, " /*UNSIGNED_EEW%d_LMUL1_INTERPRET*/ INVALID,\n", eew); + + for (unsigned lmul_log2_offset : {1, 2, 3, 4, 5, 6}) + { + unsigned multiple_of_lmul = 1 << lmul_log2_offset; + fprintf (fp, " /*X%d_VLMUL_EXT*/ %s,\n", multiple_of_lmul, + bfloat16_type (lmul_log2 + lmul_log2_offset).c_str ()); + } + fprintf (fp, " /*TUPLE_SUBPART*/ %s\n", + bfloat16_type (lmul_log2, 1U).c_str ()); + fprintf (fp, ")\n"); + } // Build for vfloat for (unsigned sew : {16, 32, 64}) for (int lmul_log2 : {-3, -2, -1, 0, 1, 2, 3}) @@ -378,6 +489,10 @@ main (int argc, const char **argv) same_ratio_eew_type (sew, lmul_log2, sew / 2, true, false) .c_str ()); fprintf (fp, " /*DOUBLE_TRUNC_UNSIGNED_SCALAR*/ INVALID,\n"); + fprintf (fp, " /*DOUBLE_TRUNC_BFLOAT_SCALAR*/ %s,\n", + same_ratio_eew_bf16_type (sew, lmul_log2).c_str ()); + fprintf (fp, " /*DOUBLE_TRUNC_BFLOAT*/ %s,\n", + same_ratio_eew_bf16_type (sew, lmul_log2).c_str ()); fprintf (fp, " /*DOUBLE_TRUNC_FLOAT*/ %s,\n", same_ratio_eew_type (sew, lmul_log2, sew / 2, false, true) .c_str ()); diff --git a/gcc/config/riscv/riscv-modes.def b/gcc/config/riscv/riscv-modes.def index 6de4e440298..b0a78f72754 100644 --- a/gcc/config/riscv/riscv-modes.def +++ b/gcc/config/riscv/riscv-modes.def @@ -93,6 +93,7 @@ ADJUST_BYTESIZE (RVVMF64BI, riscv_v_adjust_bytesize (RVVMF64BImode, 1)); |DF 
|RVVM1DF|RVVM2DF|RVVM4DF|RVVM8DF|N/A |N/A |N/A | |SF |RVVM1SF|RVVM2SF|RVVM4SF|RVVM8SF|RVVMF2SF|N/A |N/A | |HF |RVVM1HF|RVVM2HF|RVVM4HF|RVVM8HF|RVVMF2HF|RVVMF4HF|N/A | + |BF |RVVM1BF|RVVM2BF|RVVM4BF|RVVM8BF|RVVMF2BF|RVVMF4BF|N/A | There are the following data types for ELEN = 32. @@ -101,11 +102,13 @@ There are the following data types for ELEN = 32. |HI |RVVM1HI|RVVM2HI|RVVM4HI|RVVM8HI|RVVMF2HI|N/A |N/A | |QI |RVVM1QI|RVVM2QI|RVVM4QI|RVVM8QI|RVVMF2QI|RVVMF4QI|N/A | |SF |RVVM1SF|RVVM2SF|RVVM4SF|RVVM8SF|N/A |N/A |N/A | - |HF |RVVM1HF|RVVM2HF|RVVM4HF|RVVM8HF|RVVMF2HF|N/A |N/A | */ + |HF |RVVM1HF|RVVM2HF|RVVM4HF|RVVM8HF|RVVMF2HF|N/A |N/A | + |BF |RVVM1BF|RVVM2BF|RVVM4BF|RVVM8BF|RVVMF2BF|N/A |N/A | */ #define RVV_WHOLE_MODES(LMUL) \ VECTOR_MODE_WITH_PREFIX (RVVM, INT, QI, LMUL, 0); \ VECTOR_MODE_WITH_PREFIX (RVVM, INT, HI, LMUL, 0); \ + VECTOR_MODE_WITH_PREFIX (RVVM, FLOAT, BF, LMUL, 0); \ VECTOR_MODE_WITH_PREFIX (RVVM, FLOAT, HF, LMUL, 0); \ VECTOR_MODE_WITH_PREFIX (RVVM, INT, SI, LMUL, 0); \ VECTOR_MODE_WITH_PREFIX (RVVM, FLOAT, SF, LMUL, 0); \ @@ -120,6 +123,8 @@ There are the following data types for ELEN = 32. riscv_v_adjust_nunits (RVVM##LMUL##SImode, false, LMUL, 1)); \ ADJUST_NUNITS (RVVM##LMUL##DI, \ riscv_v_adjust_nunits (RVVM##LMUL##DImode, false, LMUL, 1)); \ + ADJUST_NUNITS (RVVM##LMUL##BF, \ + riscv_v_adjust_nunits (RVVM##LMUL##BFmode, false, LMUL, 1)); \ ADJUST_NUNITS (RVVM##LMUL##HF, \ riscv_v_adjust_nunits (RVVM##LMUL##HFmode, false, LMUL, 1)); \ ADJUST_NUNITS (RVVM##LMUL##SF, \ @@ -131,6 +136,7 @@ There are the following data types for ELEN = 32. 
ADJUST_ALIGNMENT (RVVM##LMUL##HI, 2); \ ADJUST_ALIGNMENT (RVVM##LMUL##SI, 4); \ ADJUST_ALIGNMENT (RVVM##LMUL##DI, 8); \ + ADJUST_ALIGNMENT (RVVM##LMUL##BF, 2); \ ADJUST_ALIGNMENT (RVVM##LMUL##HF, 2); \ ADJUST_ALIGNMENT (RVVM##LMUL##SF, 4); \ ADJUST_ALIGNMENT (RVVM##LMUL##DF, 8); @@ -153,6 +159,8 @@ RVV_FRACT_MODE (INT, QI, 4, 1) RVV_FRACT_MODE (INT, QI, 8, 1) RVV_FRACT_MODE (INT, HI, 2, 2) RVV_FRACT_MODE (INT, HI, 4, 2) +RVV_FRACT_MODE (FLOAT, BF, 2, 2) +RVV_FRACT_MODE (FLOAT, BF, 4, 2) RVV_FRACT_MODE (FLOAT, HF, 2, 2) RVV_FRACT_MODE (FLOAT, HF, 4, 2) RVV_FRACT_MODE (INT, SI, 2, 4) @@ -174,6 +182,9 @@ RVV_FRACT_MODE (FLOAT, SF, 2, 4) VECTOR_MODE_WITH_PREFIX (RVVMF4x, INT, HI, NF, 1); \ VECTOR_MODE_WITH_PREFIX (RVVMF2x, INT, HI, NF, 1); \ VECTOR_MODE_WITH_PREFIX (RVVM1x, INT, HI, NF, 1); \ + VECTOR_MODE_WITH_PREFIX (RVVMF4x, FLOAT, BF, NF, 1); \ + VECTOR_MODE_WITH_PREFIX (RVVMF2x, FLOAT, BF, NF, 1); \ + VECTOR_MODE_WITH_PREFIX (RVVM1x, FLOAT, BF, NF, 1); \ VECTOR_MODE_WITH_PREFIX (RVVMF4x, FLOAT, HF, NF, 1); \ VECTOR_MODE_WITH_PREFIX (RVVMF2x, FLOAT, HF, NF, 1); \ VECTOR_MODE_WITH_PREFIX (RVVM1x, FLOAT, HF, NF, 1); \ @@ -198,6 +209,12 @@ RVV_FRACT_MODE (FLOAT, SF, 2, 4) riscv_v_adjust_nunits (RVVMF2x##NF##HImode, true, 2, NF)); \ ADJUST_NUNITS (RVVM1x##NF##HI, \ riscv_v_adjust_nunits (RVVM1x##NF##HImode, false, 1, NF)); \ + ADJUST_NUNITS (RVVMF4x##NF##BF, \ + riscv_v_adjust_nunits (RVVMF4x##NF##BFmode, true, 4, NF)); \ + ADJUST_NUNITS (RVVMF2x##NF##BF, \ + riscv_v_adjust_nunits (RVVMF2x##NF##BFmode, true, 2, NF)); \ + ADJUST_NUNITS (RVVM1x##NF##BF, \ + riscv_v_adjust_nunits (RVVM1x##NF##BFmode, false, 1, NF)); \ ADJUST_NUNITS (RVVMF4x##NF##HF, \ riscv_v_adjust_nunits (RVVMF4x##NF##HFmode, true, 4, NF)); \ ADJUST_NUNITS (RVVMF2x##NF##HF, \ @@ -224,6 +241,9 @@ RVV_FRACT_MODE (FLOAT, SF, 2, 4) ADJUST_ALIGNMENT (RVVMF4x##NF##HI, 2); \ ADJUST_ALIGNMENT (RVVMF2x##NF##HI, 2); \ ADJUST_ALIGNMENT (RVVM1x##NF##HI, 2); \ + ADJUST_ALIGNMENT (RVVMF4x##NF##BF, 2); \ + 
ADJUST_ALIGNMENT (RVVMF2x##NF##BF, 2); \ + ADJUST_ALIGNMENT (RVVM1x##NF##BF, 2); \ ADJUST_ALIGNMENT (RVVMF4x##NF##HF, 2); \ ADJUST_ALIGNMENT (RVVMF2x##NF##HF, 2); \ ADJUST_ALIGNMENT (RVVM1x##NF##HF, 2); \ @@ -245,6 +265,7 @@ RVV_NF8_MODES (2) #define RVV_NF4_MODES(NF) \ VECTOR_MODE_WITH_PREFIX (RVVM2x, INT, QI, NF, 1); \ VECTOR_MODE_WITH_PREFIX (RVVM2x, INT, HI, NF, 1); \ + VECTOR_MODE_WITH_PREFIX (RVVM2x, FLOAT, BF, NF, 1); \ VECTOR_MODE_WITH_PREFIX (RVVM2x, FLOAT, HF, NF, 1); \ VECTOR_MODE_WITH_PREFIX (RVVM2x, INT, SI, NF, 1); \ VECTOR_MODE_WITH_PREFIX (RVVM2x, FLOAT, SF, NF, 1); \ @@ -255,6 +276,8 @@ RVV_NF8_MODES (2) riscv_v_adjust_nunits (RVVM2x##NF##QImode, false, 2, NF)); \ ADJUST_NUNITS (RVVM2x##NF##HI, \ riscv_v_adjust_nunits (RVVM2x##NF##HImode, false, 2, NF)); \ + ADJUST_NUNITS (RVVM2x##NF##BF, \ + riscv_v_adjust_nunits (RVVM2x##NF##BFmode, false, 2, NF)); \ ADJUST_NUNITS (RVVM2x##NF##HF, \ riscv_v_adjust_nunits (RVVM2x##NF##HFmode, false, 2, NF)); \ ADJUST_NUNITS (RVVM2x##NF##SI, \ @@ -268,6 +291,7 @@ RVV_NF8_MODES (2) \ ADJUST_ALIGNMENT (RVVM2x##NF##QI, 1); \ ADJUST_ALIGNMENT (RVVM2x##NF##HI, 2); \ + ADJUST_ALIGNMENT (RVVM2x##NF##BF, 2); \ ADJUST_ALIGNMENT (RVVM2x##NF##HF, 2); \ ADJUST_ALIGNMENT (RVVM2x##NF##SI, 4); \ ADJUST_ALIGNMENT (RVVM2x##NF##SF, 4); \ @@ -281,6 +305,7 @@ RVV_NF4_MODES (4) #define RVV_NF2_MODES(NF) \ VECTOR_MODE_WITH_PREFIX (RVVM4x, INT, QI, NF, 1); \ VECTOR_MODE_WITH_PREFIX (RVVM4x, INT, HI, NF, 1); \ + VECTOR_MODE_WITH_PREFIX (RVVM4x, FLOAT, BF, NF, 1); \ VECTOR_MODE_WITH_PREFIX (RVVM4x, FLOAT, HF, NF, 1); \ VECTOR_MODE_WITH_PREFIX (RVVM4x, INT, SI, NF, 1); \ VECTOR_MODE_WITH_PREFIX (RVVM4x, FLOAT, SF, NF, 1); \ @@ -291,6 +316,8 @@ RVV_NF4_MODES (4) riscv_v_adjust_nunits (RVVM4x##NF##QImode, false, 4, NF)); \ ADJUST_NUNITS (RVVM4x##NF##HI, \ riscv_v_adjust_nunits (RVVM4x##NF##HImode, false, 4, NF)); \ + ADJUST_NUNITS (RVVM4x##NF##BF, \ + riscv_v_adjust_nunits (RVVM4x##NF##BFmode, false, 4, NF)); \ ADJUST_NUNITS (RVVM4x##NF##HF, 
\ riscv_v_adjust_nunits (RVVM4x##NF##HFmode, false, 4, NF)); \ ADJUST_NUNITS (RVVM4x##NF##SI, \ @@ -304,6 +331,7 @@ RVV_NF4_MODES (4) \ ADJUST_ALIGNMENT (RVVM4x##NF##QI, 1); \ ADJUST_ALIGNMENT (RVVM4x##NF##HI, 2); \ + ADJUST_ALIGNMENT (RVVM4x##NF##BF, 2); \ ADJUST_ALIGNMENT (RVVM4x##NF##HF, 2); \ ADJUST_ALIGNMENT (RVVM4x##NF##SI, 4); \ ADJUST_ALIGNMENT (RVVM4x##NF##SF, 4); \ diff --git a/gcc/config/riscv/riscv-vector-builtins-types.def b/gcc/config/riscv/riscv-vector-builtins-types.def index 61019a56844..e7fca4cca79 100644 --- a/gcc/config/riscv/riscv-vector-builtins-types.def +++ b/gcc/config/riscv/riscv-vector-builtins-types.def @@ -397,6 +397,13 @@ DEF_RVV_U_OPS (vuint64m2_t, RVV_REQUIRE_ELEN_64) DEF_RVV_U_OPS (vuint64m4_t, RVV_REQUIRE_ELEN_64) DEF_RVV_U_OPS (vuint64m8_t, RVV_REQUIRE_ELEN_64) +DEF_RVV_F_OPS (vbfloat16mf4_t, RVV_REQUIRE_ELEN_BF_16 | RVV_REQUIRE_MIN_VLEN_64) +DEF_RVV_F_OPS (vbfloat16mf2_t, RVV_REQUIRE_ELEN_BF_16) +DEF_RVV_F_OPS (vbfloat16m1_t, RVV_REQUIRE_ELEN_BF_16) +DEF_RVV_F_OPS (vbfloat16m2_t, RVV_REQUIRE_ELEN_BF_16) +DEF_RVV_F_OPS (vbfloat16m4_t, RVV_REQUIRE_ELEN_BF_16) +DEF_RVV_F_OPS (vbfloat16m8_t, RVV_REQUIRE_ELEN_BF_16) + DEF_RVV_F_OPS (vfloat16mf4_t, RVV_REQUIRE_ELEN_FP_16 | RVV_REQUIRE_MIN_VLEN_64) DEF_RVV_F_OPS (vfloat16mf2_t, RVV_REQUIRE_ELEN_FP_16) DEF_RVV_F_OPS (vfloat16m1_t, RVV_REQUIRE_ELEN_FP_16) @@ -999,6 +1006,11 @@ DEF_RVV_X2_VLMUL_EXT_OPS (vuint32m4_t, 0) DEF_RVV_X2_VLMUL_EXT_OPS (vuint64m1_t, RVV_REQUIRE_ELEN_64) DEF_RVV_X2_VLMUL_EXT_OPS (vuint64m2_t, RVV_REQUIRE_ELEN_64) DEF_RVV_X2_VLMUL_EXT_OPS (vuint64m4_t, RVV_REQUIRE_ELEN_64) +DEF_RVV_X2_VLMUL_EXT_OPS (vbfloat16mf4_t, RVV_REQUIRE_ELEN_BF_16 | RVV_REQUIRE_MIN_VLEN_64) +DEF_RVV_X2_VLMUL_EXT_OPS (vbfloat16mf2_t, RVV_REQUIRE_ELEN_BF_16) +DEF_RVV_X2_VLMUL_EXT_OPS (vbfloat16m1_t, RVV_REQUIRE_ELEN_BF_16) +DEF_RVV_X2_VLMUL_EXT_OPS (vbfloat16m2_t, RVV_REQUIRE_ELEN_BF_16) +DEF_RVV_X2_VLMUL_EXT_OPS (vbfloat16m4_t, RVV_REQUIRE_ELEN_BF_16) DEF_RVV_X2_VLMUL_EXT_OPS (vfloat16mf4_t, 
RVV_REQUIRE_ELEN_FP_16 | RVV_REQUIRE_MIN_VLEN_64) DEF_RVV_X2_VLMUL_EXT_OPS (vfloat16mf2_t, RVV_REQUIRE_ELEN_FP_16) DEF_RVV_X2_VLMUL_EXT_OPS (vfloat16m1_t, RVV_REQUIRE_ELEN_FP_16) @@ -1040,6 +1052,10 @@ DEF_RVV_X4_VLMUL_EXT_OPS (vuint32m1_t, 0) DEF_RVV_X4_VLMUL_EXT_OPS (vuint32m2_t, 0) DEF_RVV_X4_VLMUL_EXT_OPS (vuint64m1_t, RVV_REQUIRE_ELEN_64) DEF_RVV_X4_VLMUL_EXT_OPS (vuint64m2_t, RVV_REQUIRE_ELEN_64) +DEF_RVV_X4_VLMUL_EXT_OPS (vbfloat16mf4_t, RVV_REQUIRE_ELEN_BF_16 | RVV_REQUIRE_MIN_VLEN_64) +DEF_RVV_X4_VLMUL_EXT_OPS (vbfloat16mf2_t, RVV_REQUIRE_ELEN_BF_16) +DEF_RVV_X4_VLMUL_EXT_OPS (vbfloat16m1_t, RVV_REQUIRE_ELEN_BF_16) +DEF_RVV_X4_VLMUL_EXT_OPS (vbfloat16m2_t, RVV_REQUIRE_ELEN_BF_16) DEF_RVV_X4_VLMUL_EXT_OPS (vfloat16mf4_t, RVV_REQUIRE_ELEN_FP_16 | RVV_REQUIRE_MIN_VLEN_64) DEF_RVV_X4_VLMUL_EXT_OPS (vfloat16mf2_t, RVV_REQUIRE_ELEN_FP_16) DEF_RVV_X4_VLMUL_EXT_OPS (vfloat16m1_t, RVV_REQUIRE_ELEN_FP_16) @@ -1070,6 +1086,9 @@ DEF_RVV_X8_VLMUL_EXT_OPS (vuint16m1_t, 0) DEF_RVV_X8_VLMUL_EXT_OPS (vuint32mf2_t, RVV_REQUIRE_MIN_VLEN_64) DEF_RVV_X8_VLMUL_EXT_OPS (vuint32m1_t, 0) DEF_RVV_X8_VLMUL_EXT_OPS (vuint64m1_t, RVV_REQUIRE_ELEN_64) +DEF_RVV_X8_VLMUL_EXT_OPS (vbfloat16mf4_t, RVV_REQUIRE_ELEN_BF_16 | RVV_REQUIRE_MIN_VLEN_64) +DEF_RVV_X8_VLMUL_EXT_OPS (vbfloat16mf2_t, RVV_REQUIRE_ELEN_BF_16) +DEF_RVV_X8_VLMUL_EXT_OPS (vbfloat16m1_t, RVV_REQUIRE_ELEN_BF_16) DEF_RVV_X8_VLMUL_EXT_OPS (vfloat16mf4_t, RVV_REQUIRE_ELEN_FP_16 | RVV_REQUIRE_MIN_VLEN_64) DEF_RVV_X8_VLMUL_EXT_OPS (vfloat16mf2_t, RVV_REQUIRE_ELEN_FP_16) DEF_RVV_X8_VLMUL_EXT_OPS (vfloat16m1_t, RVV_REQUIRE_ELEN_FP_16) @@ -1089,6 +1108,8 @@ DEF_RVV_X16_VLMUL_EXT_OPS (vuint8mf2_t, 0) DEF_RVV_X16_VLMUL_EXT_OPS (vuint16mf4_t, RVV_REQUIRE_MIN_VLEN_64) DEF_RVV_X16_VLMUL_EXT_OPS (vuint16mf2_t, 0) DEF_RVV_X16_VLMUL_EXT_OPS (vuint32mf2_t, RVV_REQUIRE_MIN_VLEN_64) +DEF_RVV_X16_VLMUL_EXT_OPS (vbfloat16mf4_t, RVV_REQUIRE_ELEN_BF_16 | RVV_REQUIRE_MIN_VLEN_64) +DEF_RVV_X16_VLMUL_EXT_OPS (vbfloat16mf2_t, RVV_REQUIRE_ELEN_BF_16) 
DEF_RVV_X16_VLMUL_EXT_OPS (vfloat16mf4_t, RVV_REQUIRE_ELEN_FP_16 | RVV_REQUIRE_MIN_VLEN_64) DEF_RVV_X16_VLMUL_EXT_OPS (vfloat16mf2_t, RVV_REQUIRE_ELEN_FP_16) DEF_RVV_X16_VLMUL_EXT_OPS (vfloat32mf2_t, RVV_REQUIRE_ELEN_FP_32 | RVV_REQUIRE_MIN_VLEN_64) @@ -1099,6 +1120,7 @@ DEF_RVV_X32_VLMUL_EXT_OPS (vint16mf4_t, RVV_REQUIRE_MIN_VLEN_64) DEF_RVV_X32_VLMUL_EXT_OPS (vuint8mf8_t, RVV_REQUIRE_MIN_VLEN_64) DEF_RVV_X32_VLMUL_EXT_OPS (vuint8mf4_t, 0) DEF_RVV_X32_VLMUL_EXT_OPS (vuint16mf4_t, RVV_REQUIRE_MIN_VLEN_64) +DEF_RVV_X32_VLMUL_EXT_OPS (vbfloat16mf4_t, RVV_REQUIRE_ELEN_BF_16 | RVV_REQUIRE_MIN_VLEN_64) DEF_RVV_X32_VLMUL_EXT_OPS (vfloat16mf4_t, RVV_REQUIRE_ELEN_FP_16 | RVV_REQUIRE_MIN_VLEN_64) DEF_RVV_X64_VLMUL_EXT_OPS (vint8mf8_t, RVV_REQUIRE_MIN_VLEN_64) @@ -1112,6 +1134,7 @@ DEF_RVV_LMUL1_OPS (vuint8m1_t, 0) DEF_RVV_LMUL1_OPS (vuint16m1_t, 0) DEF_RVV_LMUL1_OPS (vuint32m1_t, 0) DEF_RVV_LMUL1_OPS (vuint64m1_t, RVV_REQUIRE_ELEN_64) +DEF_RVV_LMUL1_OPS (vbfloat16m1_t, RVV_REQUIRE_ELEN_BF_16) DEF_RVV_LMUL1_OPS (vfloat16m1_t, RVV_REQUIRE_ELEN_FP_16) DEF_RVV_LMUL1_OPS (vfloat32m1_t, RVV_REQUIRE_ELEN_FP_32) DEF_RVV_LMUL1_OPS (vfloat64m1_t, RVV_REQUIRE_ELEN_FP_64) @@ -1124,6 +1147,7 @@ DEF_RVV_LMUL2_OPS (vuint8m2_t, 0) DEF_RVV_LMUL2_OPS (vuint16m2_t, 0) DEF_RVV_LMUL2_OPS (vuint32m2_t, 0) DEF_RVV_LMUL2_OPS (vuint64m2_t, RVV_REQUIRE_ELEN_64) +DEF_RVV_LMUL2_OPS (vbfloat16m2_t, RVV_REQUIRE_ELEN_BF_16) DEF_RVV_LMUL2_OPS (vfloat16m2_t, RVV_REQUIRE_ELEN_FP_16) DEF_RVV_LMUL2_OPS (vfloat32m2_t, RVV_REQUIRE_ELEN_FP_32) DEF_RVV_LMUL2_OPS (vfloat64m2_t, RVV_REQUIRE_ELEN_FP_64) @@ -1137,6 +1161,7 @@ DEF_RVV_LMUL4_OPS (vuint16m4_t, 0) DEF_RVV_LMUL4_OPS (vuint32m4_t, 0) DEF_RVV_LMUL4_OPS (vuint64m4_t, RVV_REQUIRE_ELEN_64) DEF_RVV_LMUL4_OPS (vfloat16m4_t, RVV_REQUIRE_ELEN_FP_16) +DEF_RVV_LMUL4_OPS (vbfloat16m4_t, RVV_REQUIRE_ELEN_BF_16) DEF_RVV_LMUL4_OPS (vfloat32m4_t, RVV_REQUIRE_ELEN_FP_32) DEF_RVV_LMUL4_OPS (vfloat64m4_t, RVV_REQUIRE_ELEN_FP_64) @@ -1312,6 +1337,31 @@ DEF_RVV_TUPLE_OPS 
(vint64m2x4_t, RVV_REQUIRE_ELEN_64) DEF_RVV_TUPLE_OPS (vuint64m2x4_t, RVV_REQUIRE_ELEN_64) DEF_RVV_TUPLE_OPS (vint64m4x2_t, RVV_REQUIRE_ELEN_64) DEF_RVV_TUPLE_OPS (vuint64m4x2_t, RVV_REQUIRE_ELEN_64) +DEF_RVV_TUPLE_OPS (vbfloat16mf4x2_t, RVV_REQUIRE_ELEN_BF_16 | RVV_REQUIRE_MIN_VLEN_64) +DEF_RVV_TUPLE_OPS (vbfloat16mf4x3_t, RVV_REQUIRE_ELEN_BF_16 | RVV_REQUIRE_MIN_VLEN_64) +DEF_RVV_TUPLE_OPS (vbfloat16mf4x4_t, RVV_REQUIRE_ELEN_BF_16 | RVV_REQUIRE_MIN_VLEN_64) +DEF_RVV_TUPLE_OPS (vbfloat16mf4x5_t, RVV_REQUIRE_ELEN_BF_16 | RVV_REQUIRE_MIN_VLEN_64) +DEF_RVV_TUPLE_OPS (vbfloat16mf4x6_t, RVV_REQUIRE_ELEN_BF_16 | RVV_REQUIRE_MIN_VLEN_64) +DEF_RVV_TUPLE_OPS (vbfloat16mf4x7_t, RVV_REQUIRE_ELEN_BF_16 | RVV_REQUIRE_MIN_VLEN_64) +DEF_RVV_TUPLE_OPS (vbfloat16mf4x8_t, RVV_REQUIRE_ELEN_BF_16 | RVV_REQUIRE_MIN_VLEN_64) +DEF_RVV_TUPLE_OPS (vbfloat16mf2x2_t, RVV_REQUIRE_ELEN_BF_16) +DEF_RVV_TUPLE_OPS (vbfloat16mf2x3_t, RVV_REQUIRE_ELEN_BF_16) +DEF_RVV_TUPLE_OPS (vbfloat16mf2x4_t, RVV_REQUIRE_ELEN_BF_16) +DEF_RVV_TUPLE_OPS (vbfloat16mf2x5_t, RVV_REQUIRE_ELEN_BF_16) +DEF_RVV_TUPLE_OPS (vbfloat16mf2x6_t, RVV_REQUIRE_ELEN_BF_16) +DEF_RVV_TUPLE_OPS (vbfloat16mf2x7_t, RVV_REQUIRE_ELEN_BF_16) +DEF_RVV_TUPLE_OPS (vbfloat16mf2x8_t, RVV_REQUIRE_ELEN_BF_16) +DEF_RVV_TUPLE_OPS (vbfloat16m1x2_t, RVV_REQUIRE_ELEN_BF_16) +DEF_RVV_TUPLE_OPS (vbfloat16m1x3_t, RVV_REQUIRE_ELEN_BF_16) +DEF_RVV_TUPLE_OPS (vbfloat16m1x4_t, RVV_REQUIRE_ELEN_BF_16) +DEF_RVV_TUPLE_OPS (vbfloat16m1x5_t, RVV_REQUIRE_ELEN_BF_16) +DEF_RVV_TUPLE_OPS (vbfloat16m1x6_t, RVV_REQUIRE_ELEN_BF_16) +DEF_RVV_TUPLE_OPS (vbfloat16m1x7_t, RVV_REQUIRE_ELEN_BF_16) +DEF_RVV_TUPLE_OPS (vbfloat16m1x8_t, RVV_REQUIRE_ELEN_BF_16) +DEF_RVV_TUPLE_OPS (vbfloat16m2x2_t, RVV_REQUIRE_ELEN_BF_16) +DEF_RVV_TUPLE_OPS (vbfloat16m2x3_t, RVV_REQUIRE_ELEN_BF_16) +DEF_RVV_TUPLE_OPS (vbfloat16m2x4_t, RVV_REQUIRE_ELEN_BF_16) +DEF_RVV_TUPLE_OPS (vbfloat16m4x2_t, RVV_REQUIRE_ELEN_BF_16) DEF_RVV_TUPLE_OPS (vfloat16mf4x2_t, RVV_REQUIRE_ELEN_FP_16 | 
RVV_REQUIRE_MIN_VLEN_64) DEF_RVV_TUPLE_OPS (vfloat16mf4x3_t, RVV_REQUIRE_ELEN_FP_16 | RVV_REQUIRE_MIN_VLEN_64) DEF_RVV_TUPLE_OPS (vfloat16mf4x4_t, RVV_REQUIRE_ELEN_FP_16 | RVV_REQUIRE_MIN_VLEN_64) diff --git a/gcc/config/riscv/riscv-vector-builtins.cc b/gcc/config/riscv/riscv-vector-builtins.cc index c08d87a2680..720436dfbc9 100644 --- a/gcc/config/riscv/riscv-vector-builtins.cc +++ b/gcc/config/riscv/riscv-vector-builtins.cc @@ -2808,7 +2808,8 @@ static CONSTEXPR const function_type_info function_types[] = { VECTOR, MASK, SIGNED, UNSIGNED, EEW8_INDEX, EEW16_INDEX, EEW32_INDEX, \ EEW64_INDEX, SHIFT, DOUBLE_TRUNC, QUAD_TRUNC, OCT_TRUNC, \ DOUBLE_TRUNC_SCALAR, DOUBLE_TRUNC_SIGNED, DOUBLE_TRUNC_UNSIGNED, \ - DOUBLE_TRUNC_UNSIGNED_SCALAR, DOUBLE_TRUNC_FLOAT, FLOAT, LMUL1, WLMUL1, \ + DOUBLE_TRUNC_UNSIGNED_SCALAR, DOUBLE_TRUNC_BFLOAT_SCALAR, \ + DOUBLE_TRUNC_BFLOAT, DOUBLE_TRUNC_FLOAT, FLOAT, LMUL1, WLMUL1, \ EEW8_INTERPRET, EEW16_INTERPRET, EEW32_INTERPRET, EEW64_INTERPRET, \ BOOL1_INTERPRET, BOOL2_INTERPRET, BOOL4_INTERPRET, BOOL8_INTERPRET, \ BOOL16_INTERPRET, BOOL32_INTERPRET, BOOL64_INTERPRET, \ @@ -2845,6 +2846,8 @@ static CONSTEXPR const function_type_info function_types[] = { VECTOR_TYPE_##DOUBLE_TRUNC_SIGNED, \ VECTOR_TYPE_##DOUBLE_TRUNC_UNSIGNED, \ VECTOR_TYPE_##DOUBLE_TRUNC_UNSIGNED_SCALAR, \ + VECTOR_TYPE_##DOUBLE_TRUNC_BFLOAT_SCALAR, \ + VECTOR_TYPE_##DOUBLE_TRUNC_BFLOAT, \ VECTOR_TYPE_##DOUBLE_TRUNC_FLOAT, \ VECTOR_TYPE_##FLOAT, \ VECTOR_TYPE_##LMUL1, \ @@ -3284,6 +3287,8 @@ check_required_extensions (const function_instance &instance) uint64_t riscv_isa_flags = 0; + if (TARGET_VECTOR_ELEN_BF_16) + riscv_isa_flags |= RVV_REQUIRE_ELEN_BF_16; if (TARGET_VECTOR_ELEN_FP_16) riscv_isa_flags |= RVV_REQUIRE_ELEN_FP_16; if (TARGET_VECTOR_ELEN_FP_32) diff --git a/gcc/config/riscv/riscv-vector-builtins.def b/gcc/config/riscv/riscv-vector-builtins.def index 784b54c81a4..97f329d11eb 100644 --- a/gcc/config/riscv/riscv-vector-builtins.def +++ 
b/gcc/config/riscv/riscv-vector-builtins.def
@@ -72,7 +72,8 @@ along with GCC; see the file COPYING3. If not see
 VECTOR, MASK, SIGNED, UNSIGNED, EEW8_INDEX, EEW16_INDEX, EEW32_INDEX, \
 EEW64_INDEX, SHIFT, DOUBLE_TRUNC, QUAD_TRUNC, OCT_TRUNC, \
 DOUBLE_TRUNC_SCALAR, DOUBLE_TRUNC_SIGNED, DOUBLE_TRUNC_UNSIGNED, \
- DOUBLE_TRUNC_UNSIGNED_SCALAR, DOUBLE_TRUNC_FLOAT, FLOAT, LMUL1, WLMUL1, \
+ DOUBLE_TRUNC_UNSIGNED_SCALAR, DOUBLE_TRUNC_BFLOAT_SCALAR, \
+ DOUBLE_TRUNC_BFLOAT, DOUBLE_TRUNC_FLOAT, FLOAT, LMUL1, WLMUL1, \
 EEW8_INTERPRET, EEW16_INTERPRET, EEW32_INTERPRET, EEW64_INTERPRET, \
 BOOL1_INTERPRET, BOOL2_INTERPRET, BOOL4_INTERPRET, BOOL8_INTERPRET, \
 BOOL16_INTERPRET, BOOL32_INTERPRET, BOOL64_INTERPRET, \
@@ -436,6 +437,56 @@ DEF_RVV_TYPE (vint64m8_t, 15, __rvv_int64m8_t, int64, RVVM8DI, _i64m8, _i64,
 DEF_RVV_TYPE (vuint64m8_t, 16, __rvv_uint64m8_t, uint64, RVVM8DI, _u64m8, _u64,
 _e64m8)
+/* Enabled if TARGET_VECTOR_ELEN_BF_16 && (TARGET_ZVFBFMIN or TARGET_ZVFBFWMA). */
+/* LMUL = 1/4. */
+DEF_RVV_TYPE (vbfloat16mf4_t, 19, __rvv_bfloat16mf4_t, bfloat16, RVVMF4BF, _bf16mf4,
+ _bf16, _e16mf4)
+/* Define tuple types for SEW = 16, LMUL = MF4. */
+DEF_RVV_TUPLE_TYPE (vbfloat16mf4x2_t, 21, __rvv_bfloat16mf4x2_t, vbfloat16mf4_t, bfloat16, 2, _bf16mf4x2)
+DEF_RVV_TUPLE_TYPE (vbfloat16mf4x3_t, 21, __rvv_bfloat16mf4x3_t, vbfloat16mf4_t, bfloat16, 3, _bf16mf4x3)
+DEF_RVV_TUPLE_TYPE (vbfloat16mf4x4_t, 21, __rvv_bfloat16mf4x4_t, vbfloat16mf4_t, bfloat16, 4, _bf16mf4x4)
+DEF_RVV_TUPLE_TYPE (vbfloat16mf4x5_t, 21, __rvv_bfloat16mf4x5_t, vbfloat16mf4_t, bfloat16, 5, _bf16mf4x5)
+DEF_RVV_TUPLE_TYPE (vbfloat16mf4x6_t, 21, __rvv_bfloat16mf4x6_t, vbfloat16mf4_t, bfloat16, 6, _bf16mf4x6)
+DEF_RVV_TUPLE_TYPE (vbfloat16mf4x7_t, 21, __rvv_bfloat16mf4x7_t, vbfloat16mf4_t, bfloat16, 7, _bf16mf4x7)
+DEF_RVV_TUPLE_TYPE (vbfloat16mf4x8_t, 21, __rvv_bfloat16mf4x8_t, vbfloat16mf4_t, bfloat16, 8, _bf16mf4x8)
+/* LMUL = 1/2. */
+DEF_RVV_TYPE (vbfloat16mf2_t, 19, __rvv_bfloat16mf2_t, bfloat16, RVVMF2BF, _bf16mf2,
+ _bf16, _e16mf2)
+/* Define tuple types for SEW = 16, LMUL = MF2. */
+DEF_RVV_TUPLE_TYPE (vbfloat16mf2x2_t, 21, __rvv_bfloat16mf2x2_t, vbfloat16mf2_t, bfloat16, 2, _bf16mf2x2)
+DEF_RVV_TUPLE_TYPE (vbfloat16mf2x3_t, 21, __rvv_bfloat16mf2x3_t, vbfloat16mf2_t, bfloat16, 3, _bf16mf2x3)
+DEF_RVV_TUPLE_TYPE (vbfloat16mf2x4_t, 21, __rvv_bfloat16mf2x4_t, vbfloat16mf2_t, bfloat16, 4, _bf16mf2x4)
+DEF_RVV_TUPLE_TYPE (vbfloat16mf2x5_t, 21, __rvv_bfloat16mf2x5_t, vbfloat16mf2_t, bfloat16, 5, _bf16mf2x5)
+DEF_RVV_TUPLE_TYPE (vbfloat16mf2x6_t, 21, __rvv_bfloat16mf2x6_t, vbfloat16mf2_t, bfloat16, 6, _bf16mf2x6)
+DEF_RVV_TUPLE_TYPE (vbfloat16mf2x7_t, 21, __rvv_bfloat16mf2x7_t, vbfloat16mf2_t, bfloat16, 7, _bf16mf2x7)
+DEF_RVV_TUPLE_TYPE (vbfloat16mf2x8_t, 21, __rvv_bfloat16mf2x8_t, vbfloat16mf2_t, bfloat16, 8, _bf16mf2x8)
+/* LMUL = 1. */
+DEF_RVV_TYPE (vbfloat16m1_t, 18, __rvv_bfloat16m1_t, bfloat16, RVVM1BF, _bf16m1,
+ _bf16, _e16m1)
+/* Define tuple types for SEW = 16, LMUL = M1. */
+DEF_RVV_TUPLE_TYPE (vbfloat16m1x2_t, 20, __rvv_bfloat16m1x2_t, vbfloat16m1_t, bfloat16, 2, _bf16m1x2)
+DEF_RVV_TUPLE_TYPE (vbfloat16m1x3_t, 20, __rvv_bfloat16m1x3_t, vbfloat16m1_t, bfloat16, 3, _bf16m1x3)
+DEF_RVV_TUPLE_TYPE (vbfloat16m1x4_t, 20, __rvv_bfloat16m1x4_t, vbfloat16m1_t, bfloat16, 4, _bf16m1x4)
+DEF_RVV_TUPLE_TYPE (vbfloat16m1x5_t, 20, __rvv_bfloat16m1x5_t, vbfloat16m1_t, bfloat16, 5, _bf16m1x5)
+DEF_RVV_TUPLE_TYPE (vbfloat16m1x6_t, 20, __rvv_bfloat16m1x6_t, vbfloat16m1_t, bfloat16, 6, _bf16m1x6)
+DEF_RVV_TUPLE_TYPE (vbfloat16m1x7_t, 20, __rvv_bfloat16m1x7_t, vbfloat16m1_t, bfloat16, 7, _bf16m1x7)
+DEF_RVV_TUPLE_TYPE (vbfloat16m1x8_t, 20, __rvv_bfloat16m1x8_t, vbfloat16m1_t, bfloat16, 8, _bf16m1x8)
+/* LMUL = 2. */
+DEF_RVV_TYPE (vbfloat16m2_t, 18, __rvv_bfloat16m2_t, bfloat16, RVVM2BF, _bf16m2,
+ _bf16, _e16m2)
+/* Define tuple types for SEW = 16, LMUL = M2. */
+DEF_RVV_TUPLE_TYPE (vbfloat16m2x2_t, 20, __rvv_bfloat16m2x2_t, vbfloat16m2_t, bfloat16, 2, _bf16m2x2)
+DEF_RVV_TUPLE_TYPE (vbfloat16m2x3_t, 20, __rvv_bfloat16m2x3_t, vbfloat16m2_t, bfloat16, 3, _bf16m2x3)
+DEF_RVV_TUPLE_TYPE (vbfloat16m2x4_t, 20, __rvv_bfloat16m2x4_t, vbfloat16m2_t, bfloat16, 4, _bf16m2x4)
+/* LMUL = 4. */
+DEF_RVV_TYPE (vbfloat16m4_t, 18, __rvv_bfloat16m4_t, bfloat16, RVVM4BF, _bf16m4,
+ _bf16, _e16m4)
+/* Define tuple types for SEW = 16, LMUL = M4. */
+DEF_RVV_TUPLE_TYPE (vbfloat16m4x2_t, 20, __rvv_bfloat16m4x2_t, vbfloat16m4_t, bfloat16, 2, _bf16m4x2)
+/* LMUL = 8. */
+DEF_RVV_TYPE (vbfloat16m8_t, 18, __rvv_bfloat16m8_t, bfloat16, RVVM8BF, _bf16m8,
+ _bf16, _e16m8)
+
 /* Enabled if TARGET_VECTOR_ELEN_FP_16 && (TARGET_ZVFH or TARGET_ZVFHMIN). */
 /* LMUL = 1/4. */
 DEF_RVV_TYPE (vfloat16mf4_t, 18, __rvv_float16mf4_t, float16, RVVMF4HF, _f16mf4,
@@ -630,6 +681,8 @@ DEF_RVV_BASE_TYPE (double_trunc_scalar, get_scalar_type (type_idx))
 DEF_RVV_BASE_TYPE (double_trunc_signed_vector, get_vector_type (type_idx))
 DEF_RVV_BASE_TYPE (double_trunc_unsigned_vector, get_vector_type (type_idx))
 DEF_RVV_BASE_TYPE (double_trunc_unsigned_scalar, get_scalar_type (type_idx))
+DEF_RVV_BASE_TYPE (double_trunc_bfloat_scalar, get_scalar_type (type_idx))
+DEF_RVV_BASE_TYPE (double_trunc_bfloat_vector, get_vector_type (type_idx))
 DEF_RVV_BASE_TYPE (double_trunc_float_vector, get_vector_type (type_idx))
 DEF_RVV_BASE_TYPE (float_vector, get_vector_type (type_idx))
 DEF_RVV_BASE_TYPE (lmul1_vector, get_vector_type (type_idx))
diff --git a/gcc/config/riscv/riscv-vector-builtins.h b/gcc/config/riscv/riscv-vector-builtins.h
index 05d18ae1322..56dbe2cf0e2 100644
--- a/gcc/config/riscv/riscv-vector-builtins.h
+++ b/gcc/config/riscv/riscv-vector-builtins.h
@@ -109,6 +109,7 @@ static const unsigned int CP_WRITE_CSR = 1U << 5;
 #define RVV_REQUIRE_FULL_V (1 << 4) /* Require Full 'V' extension. */
 #define RVV_REQUIRE_MIN_VLEN_64 (1 << 5) /* Require TARGET_MIN_VLEN >= 64. */
 #define RVV_REQUIRE_ELEN_FP_16 (1 << 6) /* Require FP ELEN >= 32. */
+#define RVV_REQUIRE_ELEN_BF_16 (1 << 7) /* Require BF16. */

 /* Enumerates the required extensions. */
 enum required_ext
diff --git a/gcc/config/riscv/riscv-vector-switch.def b/gcc/config/riscv/riscv-vector-switch.def
index 452283b7416..de72e415fe8 100644
--- a/gcc/config/riscv/riscv-vector-switch.def
+++ b/gcc/config/riscv/riscv-vector-switch.def
@@ -43,6 +43,7 @@ Encode SEW and LMUL into data types.
 |DF |RVVM1DF|RVVM2DF|RVVM4DF|RVVM8DF|N/A |N/A |N/A |
 |SF |RVVM1SF|RVVM2SF|RVVM4SF|RVVM8SF|RVVMF2SF|N/A |N/A |
 |HF |RVVM1HF|RVVM2HF|RVVM4HF|RVVM8HF|RVVMF2HF|RVVMF4HF|N/A |
+ |BF |RVVM1BF|RVVM2BF|RVVM4BF|RVVM8BF|RVVMF2BF|RVVMF4BF|N/A |

 There are the following data types for ELEN = 32.

@@ -52,6 +53,7 @@ There are the following data types for ELEN = 32.
 |QI |RVVM1QI|RVVM2QI|RVVM4QI|RVVM8QI|RVVMF2QI|RVVMF4QI|N/A |
 |SF |RVVM1SF|RVVM2SF|RVVM4SF|RVVM8SF|N/A |N/A |N/A |
 |HF |RVVM1HF|RVVM2HF|RVVM4HF|RVVM8HF|RVVMF2HF|N/A |N/A |
+ |BF |RVVM1BF|RVVM2BF|RVVM4BF|RVVM8BF|RVVMF2BF|N/A |N/A |

 Encode the ratio of SEW/LMUL into the mask types.
 There are the following mask types.
@@ -93,6 +95,14 @@ ENTRY (RVVM1HI, true, LMUL_1, 16)
 ENTRY (RVVMF2HI, !TARGET_XTHEADVECTOR, LMUL_F2, 32)
 ENTRY (RVVMF4HI, TARGET_MIN_VLEN > 32 && !TARGET_XTHEADVECTOR, LMUL_F4, 64)

+/* Disable modes if TARGET_MIN_VLEN == 32 or !TARGET_VECTOR_ELEN_BF_16. */
+ENTRY (RVVM8BF, TARGET_VECTOR_ELEN_BF_16, LMUL_8, 2)
+ENTRY (RVVM4BF, TARGET_VECTOR_ELEN_BF_16, LMUL_4, 4)
+ENTRY (RVVM2BF, TARGET_VECTOR_ELEN_BF_16, LMUL_2, 8)
+ENTRY (RVVM1BF, TARGET_VECTOR_ELEN_BF_16, LMUL_1, 16)
+ENTRY (RVVMF2BF, TARGET_VECTOR_ELEN_BF_16, LMUL_F2, 32)
+ENTRY (RVVMF4BF, TARGET_VECTOR_ELEN_BF_16 && TARGET_MIN_VLEN > 32, LMUL_F4, 64)
+
 /* Disable modes if TARGET_MIN_VLEN == 32 or !TARGET_VECTOR_ELEN_FP_16. */
 ENTRY (RVVM8HF, TARGET_VECTOR_ELEN_FP_16, LMUL_8, 2)
 ENTRY (RVVM4HF, TARGET_VECTOR_ELEN_FP_16, LMUL_4, 4)
@@ -198,6 +208,32 @@ TUPLE_ENTRY (RVVM1x2HI, true, RVVM1HI, 2, LMUL_1, 16)
 TUPLE_ENTRY (RVVMF2x2HI, !TARGET_XTHEADVECTOR, RVVMF2HI, 2, LMUL_F2, 32)
 TUPLE_ENTRY (RVVMF4x2HI, TARGET_MIN_VLEN > 32 && !TARGET_XTHEADVECTOR, RVVMF4HI, 2, LMUL_F4, 64)

+TUPLE_ENTRY (RVVM1x8BF, TARGET_VECTOR_ELEN_BF_16, RVVM1BF, 8, LMUL_1, 16)
+TUPLE_ENTRY (RVVMF2x8BF, TARGET_VECTOR_ELEN_BF_16, RVVMF2BF, 8, LMUL_F2, 32)
+TUPLE_ENTRY (RVVMF4x8BF, TARGET_VECTOR_ELEN_BF_16 && TARGET_MIN_VLEN > 32, RVVMF4BF, 8, LMUL_F4, 64)
+TUPLE_ENTRY (RVVM1x7BF, TARGET_VECTOR_ELEN_BF_16, RVVM1BF, 7, LMUL_1, 16)
+TUPLE_ENTRY (RVVMF2x7BF, TARGET_VECTOR_ELEN_BF_16, RVVMF2BF, 7, LMUL_F2, 32)
+TUPLE_ENTRY (RVVMF4x7BF, TARGET_VECTOR_ELEN_BF_16 && TARGET_MIN_VLEN > 32, RVVMF4BF, 7, LMUL_F4, 64)
+TUPLE_ENTRY (RVVM1x6BF, TARGET_VECTOR_ELEN_BF_16, RVVM1BF, 6, LMUL_1, 16)
+TUPLE_ENTRY (RVVMF2x6BF, TARGET_VECTOR_ELEN_BF_16, RVVMF2BF, 6, LMUL_F2, 32)
+TUPLE_ENTRY (RVVMF4x6BF, TARGET_VECTOR_ELEN_BF_16 && TARGET_MIN_VLEN > 32, RVVMF4BF, 6, LMUL_F4, 64)
+TUPLE_ENTRY (RVVM1x5BF, TARGET_VECTOR_ELEN_BF_16, RVVM1BF, 5, LMUL_1, 16)
+TUPLE_ENTRY (RVVMF2x5BF, TARGET_VECTOR_ELEN_BF_16, RVVMF2BF, 5, LMUL_F2, 32)
+TUPLE_ENTRY (RVVMF4x5BF, TARGET_VECTOR_ELEN_BF_16 && TARGET_MIN_VLEN > 32, RVVMF4BF, 5, LMUL_F4, 64)
+TUPLE_ENTRY (RVVM2x4BF, TARGET_VECTOR_ELEN_BF_16, RVVM2BF, 4, LMUL_2, 8)
+TUPLE_ENTRY (RVVM1x4BF, TARGET_VECTOR_ELEN_BF_16, RVVM1BF, 4, LMUL_1, 16)
+TUPLE_ENTRY (RVVMF2x4BF, TARGET_VECTOR_ELEN_BF_16, RVVMF2BF, 4, LMUL_F2, 32)
+TUPLE_ENTRY (RVVMF4x4BF, TARGET_VECTOR_ELEN_BF_16 && TARGET_MIN_VLEN > 32, RVVMF4BF, 4, LMUL_F4, 64)
+TUPLE_ENTRY (RVVM2x3BF, TARGET_VECTOR_ELEN_BF_16, RVVM2BF, 3, LMUL_2, 8)
+TUPLE_ENTRY (RVVM1x3BF, TARGET_VECTOR_ELEN_BF_16, RVVM1BF, 3, LMUL_1, 16)
+TUPLE_ENTRY (RVVMF2x3BF, TARGET_VECTOR_ELEN_BF_16, RVVMF2BF, 3, LMUL_F2, 32)
+TUPLE_ENTRY (RVVMF4x3BF, TARGET_VECTOR_ELEN_BF_16 && TARGET_MIN_VLEN > 32, RVVMF4BF, 3, LMUL_F4, 64)
+TUPLE_ENTRY (RVVM4x2BF, TARGET_VECTOR_ELEN_BF_16, RVVM4BF, 2, LMUL_4, 4)
+TUPLE_ENTRY (RVVM2x2BF, TARGET_VECTOR_ELEN_BF_16, RVVM2BF, 2, LMUL_2, 8)
+TUPLE_ENTRY (RVVM1x2BF, TARGET_VECTOR_ELEN_BF_16, RVVM1BF, 2, LMUL_1, 16)
+TUPLE_ENTRY (RVVMF2x2BF, TARGET_VECTOR_ELEN_BF_16, RVVMF2BF, 2, LMUL_F2, 32)
+TUPLE_ENTRY (RVVMF4x2BF, TARGET_VECTOR_ELEN_BF_16 && TARGET_MIN_VLEN > 32, RVVMF4BF, 2, LMUL_F4, 64)
+
 TUPLE_ENTRY (RVVM1x8HF, TARGET_VECTOR_ELEN_FP_16, RVVM1HF, 8, LMUL_1, 16)
 TUPLE_ENTRY (RVVMF2x8HF, TARGET_VECTOR_ELEN_FP_16 && !TARGET_XTHEADVECTOR, RVVMF2HF, 8, LMUL_F2, 32)
 TUPLE_ENTRY (RVVMF4x8HF, TARGET_VECTOR_ELEN_FP_16 && TARGET_MIN_VLEN > 32 && !TARGET_XTHEADVECTOR, RVVMF4HF, 8, LMUL_F4, 64)
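The trailing number in each BF16 ENTRY and TUPLE_ENTRY line appears to be the SEW/LMUL ratio that riscv-vector-switch.def describes for the mask types. As a rough illustration only, the arithmetic can be sketched as below; `sew_lmul_ratio` and its parameters are hypothetical names for this note and are not part of GCC. LMUL is passed as a fraction so that MF2 and MF4 can be expressed with integers.

```c
#include <assert.h>

/* Hypothetical helper, not GCC code: compute SEW / LMUL where
   LMUL = lmul_num / lmul_den (e.g. MF4 is 1/4, M8 is 8/1).  */
static unsigned
sew_lmul_ratio (unsigned sew, unsigned lmul_num, unsigned lmul_den)
{
  /* Reject LMUL = 0 and any combination that is not a whole ratio.  */
  assert (lmul_num != 0 && lmul_den != 0);
  assert ((sew * lmul_den) % lmul_num == 0);
  return sew * lmul_den / lmul_num;
}
```

With SEW = 16 for BFloat16, `sew_lmul_ratio (16, 8, 1)` gives 2 and `sew_lmul_ratio (16, 1, 4)` gives 64, which matches the values listed for RVVM8BF and RVVMF4BF.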
v3: Rebase
v2: Rebase

This patch adds the vector type for the BFloat16 format. Subsequent
patches for the zvfbfmin and zvfbfwma extensions need to be based
on this patch.

Signed-off-by: Feng Wang <wangfeng@eswincomputing.com>

gcc/ChangeLog:

	* config/riscv/genrvv-type-indexer.cc (bfloat16_type):
	Generate bf16 vector_type and scalar_type in DEF_RVV_TYPE_INDEX.
	(bfloat16_wide_type): Ditto.
	(same_ratio_eew_bf16_type): Ditto.
	(main): Ditto.
	* config/riscv/riscv-modes.def (ADJUST_BYTESIZE):
	(RVV_WHOLE_MODES): Add vector type for BFloat16.
	(RVV_FRACT_MODE): Ditto.
	(RVV_NF4_MODES): Ditto.
	(RVV_NF8_MODES): Ditto.
	(RVV_NF2_MODES): Ditto.
	* config/riscv/riscv-vector-builtins-types.def (vbfloat16mf4_t):
	(vbfloat16mf2_t): Add builtin vector type for BFloat16.
	(vbfloat16m1_t): Ditto.
	(vbfloat16m2_t): Ditto.
	(vbfloat16m4_t): Ditto.
	(vbfloat16m8_t): Ditto.
	(vbfloat16mf4x2_t): Ditto.
	(vbfloat16mf4x3_t): Ditto.
	(vbfloat16mf4x4_t): Ditto.
	(vbfloat16mf4x5_t): Ditto.
	(vbfloat16mf4x6_t): Ditto.
	(vbfloat16mf4x7_t): Ditto.
	(vbfloat16mf4x8_t): Ditto.
	(vbfloat16mf2x2_t): Ditto.
	(vbfloat16mf2x3_t): Ditto.
	(vbfloat16mf2x4_t): Ditto.
	(vbfloat16mf2x5_t): Ditto.
	(vbfloat16mf2x6_t): Ditto.
	(vbfloat16mf2x7_t): Ditto.
	(vbfloat16mf2x8_t): Ditto.
	(vbfloat16m1x2_t): Ditto.
	(vbfloat16m1x3_t): Ditto.
	(vbfloat16m1x4_t): Ditto.
	(vbfloat16m1x5_t): Ditto.
	(vbfloat16m1x6_t): Ditto.
	(vbfloat16m1x7_t): Ditto.
	(vbfloat16m1x8_t): Ditto.
	(vbfloat16m2x2_t): Ditto.
	(vbfloat16m2x3_t): Ditto.
	(vbfloat16m2x4_t): Ditto.
	(vbfloat16m4x2_t): Ditto.
	* config/riscv/riscv-vector-builtins.cc (check_required_extensions):
	Add required_ext checking for BFloat16.
	* config/riscv/riscv-vector-builtins.def (vbfloat16mf4_t):
	Add vector_type for BFloat16 in builtins.def.
	(vbfloat16mf4x2_t): Ditto.
	(vbfloat16mf4x3_t): Ditto.
	(vbfloat16mf4x4_t): Ditto.
	(vbfloat16mf4x5_t): Ditto.
	(vbfloat16mf4x6_t): Ditto.
	(vbfloat16mf4x7_t): Ditto.
	(vbfloat16mf4x8_t): Ditto.
	(vbfloat16mf2_t): Ditto.
	(vbfloat16mf2x2_t): Ditto.
	(vbfloat16mf2x3_t): Ditto.
	(vbfloat16mf2x4_t): Ditto.
	(vbfloat16mf2x5_t): Ditto.
	(vbfloat16mf2x6_t): Ditto.
	(vbfloat16mf2x7_t): Ditto.
	(vbfloat16mf2x8_t): Ditto.
	(vbfloat16m1_t): Ditto.
	(vbfloat16m1x2_t): Ditto.
	(vbfloat16m1x3_t): Ditto.
	(vbfloat16m1x4_t): Ditto.
	(vbfloat16m1x5_t): Ditto.
	(vbfloat16m1x6_t): Ditto.
	(vbfloat16m1x7_t): Ditto.
	(vbfloat16m1x8_t): Ditto.
	(vbfloat16m2_t): Ditto.
	(vbfloat16m2x2_t): Ditto.
	(vbfloat16m2x3_t): Ditto.
	(vbfloat16m2x4_t): Ditto.
	(vbfloat16m4_t): Ditto.
	(vbfloat16m4x2_t): Ditto.
	(vbfloat16m8_t): Ditto.
	(double_trunc_bfloat_scalar): Add scalar_type def for BFloat16.
	(double_trunc_bfloat_vector): Add vector_type def for BFloat16.
	* config/riscv/riscv-vector-builtins.h (RVV_REQUIRE_ELEN_BF_16):
	Add required definition of BFloat16 ext.
	* config/riscv/riscv-vector-switch.def (ENTRY):
	Add vector_type information for BFloat16.
	(TUPLE_ENTRY): Add tuple vector_type information for BFloat16.
---
 gcc/config/riscv/genrvv-type-indexer.cc       | 115 ++++++++++++++++++
 gcc/config/riscv/riscv-modes.def              |  30 ++++-
 .../riscv/riscv-vector-builtins-types.def     |  50 ++++++++
 gcc/config/riscv/riscv-vector-builtins.cc     |   7 +-
 gcc/config/riscv/riscv-vector-builtins.def    |  55 ++++++++-
 gcc/config/riscv/riscv-vector-builtins.h      |   1 +
 gcc/config/riscv/riscv-vector-switch.def      |  36 ++++++
 7 files changed, 291 insertions(+), 3 deletions(-)
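The ChangeLog mentions a new RVV_REQUIRE_ELEN_BF_16 bit and a matching check in check_required_extensions. The general bitmask technique can be sketched as follows. This is an assumption-laden illustration, not the code from riscv-vector-builtins.cc: `requirements_met` and its boolean parameters are hypothetical stand-ins for the real target flags, and only the two bits quoted in the patch are modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit values as they appear in the patch to riscv-vector-builtins.h.  */
#define RVV_REQUIRE_ELEN_FP_16 (1 << 6)
#define RVV_REQUIRE_ELEN_BF_16 (1 << 7)

/* Hypothetical helper, not GCC code: return true when every required
   bit is provided by the (assumed) target capabilities.  */
static bool
requirements_met (uint64_t required, bool have_fp16, bool have_bf16)
{
  const uint64_t modelled = RVV_REQUIRE_ELEN_FP_16 | RVV_REQUIRE_ELEN_BF_16;
  uint64_t available = 0;

  /* This sketch only understands the two bits above.  */
  assert ((required & ~modelled) == 0);

  if (have_fp16)
    available |= RVV_REQUIRE_ELEN_FP_16;
  if (have_bf16)
    available |= RVV_REQUIRE_ELEN_BF_16;

  /* Any required bit missing from the available set means failure.  */
  return (required & ~available) == 0;
}
```

Under these assumptions, a type tagged with RVV_REQUIRE_ELEN_BF_16 is rejected on a target that only provides FP16 vectors, while a type with no requirement bits is always accepted.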