Message ID | 20230920023059.1728132-1-pan2.li@intel.com |
---|---|
State | New |
Series | [v1] RISC-V: Support ceil and ceilf auto-vectorization |
+;; -------------------------------------------------------------------------
+;; ---- [FP] Math.h.
+;; -------------------------------------------------------------------------
+;; Includes:
+;; - ceil/ceilf
+;; -------------------------------------------------------------------------
+(define_expand "ceil<mode>2"
+  [(match_operand:VF 0 "register_operand")
+   (match_operand:VF 1 "register_operand")]
+  "TARGET_VECTOR"
+  {
+    rtx tmp = gen_reg_rtx (<VCONVERT>mode);
+    rtx ops_1[] = {tmp, operands[1]};
+    insn_code icode = code_for_pred_fcvt_x_f (UNSPEC_VFCVT, <MODE>mode);
+
+    /* vfcvt.x.f with rounding up (aka ceil). */
+    riscv_vector::emit_vlmax_insn (icode, riscv_vector::UNARY_OP_FRM_RUP, ops_1);
+
+    rtx ops_2[] = {operands[0], tmp};
+    icode = code_for_pred (FLOAT, <MODE>mode);
+
+    /* vfcvt.f.x for the final result. To avoid unnecessary frm register
+       access, we use RUP here and it will never do the rounding up because
+       the tmp rtx comes from the float to int conversion. */
+    riscv_vector::emit_vlmax_insn (icode, riscv_vector::UNARY_OP_FRM_RUP, ops_2);
+
+    DONE;
+  }
+)

It should be "V_VLSF" instead of "VF" so that you can also support VLS CEIL.

Besides, I would like to see the following case:

a[i] = cond[i] ? CEIL (b[i]) : c[i];

Ideally, we should be able to combine vfcvt + vmerge into a masked vfcvt.

juzhe.zhong@rivai.ai

From: pan2.li
Date: 2023-09-20 10:30
To: gcc-patches
CC: juzhe.zhong; pan2.li; yanzhang.wang; kito.cheng
Subject: [PATCH v1] RISC-V: Support ceil and ceilf auto-vectorization

From: Pan Li <pan2.li@intel.com>

This patch supports auto-vectorization of both ceil and ceilf from math.h. It depends on the -ffast-math option.

When ceil/ceilf is called, as in v2 = ceil (v1), we convert it into the insns below (with reference to the LLVM implementation).

* vfcvt.x.f v3, v1, RUP
* vfcvt.f.x v2, v3

Conditional auto-vectorization of ceil/ceilf is also supported and covered by test cases.

Before this patch:
math-ceil-1.c:21:1: missed: couldn't vectorize loop
	...
.L3:
	flw	fa0,0(s0)
	addi	s0,s0,4
	addi	s1,s1,4
	call	ceilf
	fsw	fa0,-4(s1)
	bne	s0,s2,.L3

After this patch:
	...
	fsrmi	3
.L4:
	vsetvli	a5,a2,e32,m1,ta,ma
	vle32.v	v1,0(a1)
	vsetvli	a3,zero,e32,m1,ta,ma
	slli	a4,a5,2
	vfcvt.x.f.v	v1,v1
	sub	a2,a2,a5
	vfcvt.f.x.v	v1,v1
	vsetvli	zero,a5,e32,m1,ta,ma
	vse32.v	v1,0(a0)
	add	a1,a1,a4
	add	a0,a0,a4
	bne	a2,zero,.L4
.L14:
	fsrm	a6
	ret

Please note that VLS mode is not involved in this patch and will be taken care of in the following patches soon.

gcc/ChangeLog:

	* config/riscv/autovec.md (ceil<mode>2): New pattern.
	* config/riscv/riscv-protos.h (enum insn_flags): New enum type.
	(enum insn_type): Ditto.
	* config/riscv/riscv-v.cc: Handle rounding up.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/rvv/autovec/math-ceil-1.c: New test.
	* gcc.target/riscv/rvv/autovec/math-ceil-2.c: New test.
	* gcc.target/riscv/rvv/autovec/math-ceil-3.c: New test.
	* gcc.target/riscv/rvv/autovec/math-ceil-4.c: New test.
	* gcc.target/riscv/rvv/autovec/math-ceil-run-1.c: New test.
	* gcc.target/riscv/rvv/autovec/math-ceil-run-2.c: New test.
	* gcc.target/riscv/rvv/autovec/test-math.h: New test.

Signed-off-by: Pan Li <pan2.li@intel.com>
---
 gcc/config/riscv/autovec.md                   | 30 +++++++++++++
 gcc/config/riscv/riscv-protos.h               |  4 ++
 gcc/config/riscv/riscv-v.cc                   |  2 +
 .../riscv/rvv/autovec/math-ceil-1.c           | 21 +++++++++
 .../riscv/rvv/autovec/math-ceil-2.c           | 21 +++++++++
 .../riscv/rvv/autovec/math-ceil-3.c           | 24 ++++++++++
 .../riscv/rvv/autovec/math-ceil-4.c           | 24 ++++++++++
 .../riscv/rvv/autovec/math-ceil-run-1.c       | 24 ++++++++++
 .../riscv/rvv/autovec/math-ceil-run-2.c       | 24 ++++++++++
 .../gcc.target/riscv/rvv/autovec/test-math.h  | 45 +++++++++++++++++++
 10 files changed, 219 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-4.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-run-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-run-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/test-math.h

diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 493d5745485..ea508d81047 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -2374,3 +2374,33 @@ (define_expand "<u>avg<v_double_trunc>3_ceil"
   riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_OP, ops3);
   DONE;
 })
+
+;; -------------------------------------------------------------------------
+;; ---- [FP] Math.h.
+;; -------------------------------------------------------------------------
+;; Includes:
+;; - ceil/ceilf
+;; -------------------------------------------------------------------------
+(define_expand "ceil<mode>2"
+  [(match_operand:VF 0 "register_operand")
+   (match_operand:VF 1 "register_operand")]
+  "TARGET_VECTOR"
+  {
+    rtx tmp = gen_reg_rtx (<VCONVERT>mode);
+    rtx ops_1[] = {tmp, operands[1]};
+    insn_code icode = code_for_pred_fcvt_x_f (UNSPEC_VFCVT, <MODE>mode);
+
+    /* vfcvt.x.f with rounding up (aka ceil). */
+    riscv_vector::emit_vlmax_insn (icode, riscv_vector::UNARY_OP_FRM_RUP, ops_1);
+
+    rtx ops_2[] = {operands[0], tmp};
+    icode = code_for_pred (FLOAT, <MODE>mode);
+
+    /* vfcvt.f.x for the final result. To avoid unnecessary frm register
+       access, we use RUP here and it will never do the rounding up because
+       the tmp rtx comes from the float to int conversion. */
+    riscv_vector::emit_vlmax_insn (icode, riscv_vector::UNARY_OP_FRM_RUP, ops_2);
+
+    DONE;
+  }
+)
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 5a2d218d67b..833f1efbaf4 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -250,6 +250,9 @@ enum insn_flags : unsigned int
   /* flags for the floating-point rounding mode. */
   /* Means INSN has FRM operand and the value is FRM_DYN. */
   FRM_DYN_P = 1 << 15,
+
+  /* Means INSN has FRM operand and the value is FRM_RUP. */
+  FRM_RUP_P = 1 << 16,
 };
 
 enum insn_type : unsigned int
@@ -290,6 +293,7 @@ enum insn_type : unsigned int
   UNARY_OP_TAMA = __MASK_OP_TAMA | UNARY_OP_P,
   UNARY_OP_TAMU = __MASK_OP_TAMU | UNARY_OP_P,
   UNARY_OP_FRM_DYN = UNARY_OP | FRM_DYN_P,
+  UNARY_OP_FRM_RUP = UNARY_OP | FRM_RUP_P,
 
   /* Binary operator. */
   BINARY_OP = __NORMAL_OP | BINARY_OP_P,
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index a9287e5d671..4192f988648 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -323,6 +323,8 @@ public:
     /* Add rounding mode operand. */
     if (m_insn_flags & FRM_DYN_P)
       add_rounding_mode_operand (FRM_DYN);
+    if (m_insn_flags & FRM_RUP_P)
+      add_rounding_mode_operand (FRM_RUP);
 
     gcc_assert (insn_data[(int) icode].n_operands == m_opno);
     expand (icode, any_mem_p);
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-1.c
new file mode 100644
index 00000000000..8f0f09609eb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-1.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "test-math.h"
+
+/*
+** test_float_ceilf:
+**   frrm\s+[atx][0-9]+
+**   ...
+**   fsrmi\s+3
+**   ...
+**   vsetvli\s+[atx][0-9]+,\s*zero,\s*e32,\s*m1,\s*ta,\s*ma
+**   vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+
+**   ...
+**   vfcvt\.f\.x\.v\s+v[0-9]+,\s*v[0-9]+
+**   ...
+**   fsrm\s+[atx][0-9]+
+**   ...
+*/
+TEST_CEIL(float, ceilf)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-2.c
new file mode 100644
index 00000000000..73395d30d7a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-2.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "test-math.h"
+
+/*
+** test_double_ceil:
+**   frrm\s+[atx][0-9]+
+**   ...
+**   fsrmi\s+3
+**   ...
+**   vsetvli\s+[atx][0-9]+,\s*zero,\s*e64,\s*m1,\s*ta,\s*ma
+**   vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+
+**   ...
+**   vfcvt\.f\.x\.v\s+v[0-9]+,\s*v[0-9]+
+**   ...
+**   fsrm\s+[atx][0-9]+
+**   ...
+*/
+TEST_CEIL(double, ceil)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-3.c
new file mode 100644
index 00000000000..eb0f3a3db78
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-3.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "test-math.h"
+
+/*
+** test_float_ceilf:
+**   frrm\s+[atx][0-9]+
+**   ...
+**   fsrmi\s+3
+**   ...
+**   vsetvli\s+[atx][0-9]+,\s*zero,\s*e32,\s*m1,\s*ta,\s*ma
+**   ...
+**   vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+
+**   ...
+**   vfcvt\.f\.x\.v\s+v[0-9]+,\s*v[0-9]+
+**   ...
+**   vmerge\.vvm\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+,\s*v0
+**   ...
+**   fsrm\s+[atx][0-9]+
+**   ...
+*/
+TEST_COND_CEIL(float, ceilf)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-4.c
new file mode 100644
index 00000000000..b9a3c8ebf84
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-4.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "test-math.h"
+
+/*
+** test_double_ceil:
+**   frrm\s+[atx][0-9]+
+**   ...
+**   fsrmi\s+3
+**   ...
+**   vsetvli\s+[atx][0-9]+,\s*zero,\s*e64,\s*m1,\s*ta,\s*ma
+**   ...
+**   vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+
+**   ...
+**   vfcvt\.f\.x\.v\s+v[0-9]+,\s*v[0-9]+
+**   ...
+**   vmerge\.vvm\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+,\s*v0
+**   ...
+**   fsrm\s+[atx][0-9]+
+**   ...
+*/
+TEST_COND_CEIL(double, ceil)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-run-1.c
new file mode 100644
index 00000000000..014c4c3ac0a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-run-1.c
@@ -0,0 +1,24 @@
+/* { dg-do run { target { riscv_vector && riscv_zvfh_hw } } } */
+/* { dg-additional-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -lm" } */
+#include "test-math.h"
+
+#define ARRAY_SIZE 128
+
+float in[ARRAY_SIZE];
+float out[ARRAY_SIZE];
+float ref[ARRAY_SIZE];
+
+// Test function declaration
+TEST_CEIL(float, ceilf)
+TEST_INIT(float)
+TEST_ASSERT(float)
+
+int
+main ()
+{
+  test_float_init (in, ref, ARRAY_SIZE);
+  test_float_ceilf (out, in, ARRAY_SIZE);
+  test_float_assert (out, ref, ARRAY_SIZE);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-run-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-run-2.c
new file mode 100644
index 00000000000..ae361e11144
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-run-2.c
@@ -0,0 +1,24 @@
+/* { dg-do run { target { riscv_vector && riscv_zvfh_hw } } } */
+/* { dg-additional-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -lm" } */
+#include "test-math.h"
+
+#define ARRAY_SIZE 128
+
+double in[ARRAY_SIZE];
+double out[ARRAY_SIZE];
+double ref[ARRAY_SIZE];
+
+// Test function declaration
+TEST_CEIL(double, ceil)
+TEST_INIT(double)
+TEST_ASSERT(double)
+
+int
+main ()
+{
+  test_double_init (in, ref, ARRAY_SIZE);
+  test_double_ceil (out, in, ARRAY_SIZE);
+  test_double_assert (out, ref, ARRAY_SIZE);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/test-math.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/test-math.h
new file mode 100644
index 00000000000..57dd5e0e460
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/test-math.h
@@ -0,0 +1,45 @@
+#include <math.h>
+
+#define TEST_CEIL(TYPE, CALL) \
+  void test_##TYPE##_##CALL (TYPE *out, TYPE *in, unsigned count) \
+  { \
+    for (unsigned i = 0; i < count; i++) \
+      out[i] = CALL (in[i]); \
+  }
+
+#define TEST_COND_CEIL(TYPE, CALL) \
+  void test_##TYPE##_##CALL (TYPE *out, int *cond, TYPE *in, unsigned count) \
+  { \
+    for (unsigned i = 0; i < count; i++) \
+      out[i] = cond[i] ? CALL (in[i]) : in[i]; \
+  }
+
+#define TEST_INIT(TYPE) \
+  void test_##TYPE##_init (TYPE *in, TYPE *ref, unsigned size) \
+  { \
+    for (unsigned i = 0; i < size; i++) \
+      { \
+	TYPE tmp = (TYPE)i; \
+ \
+	if (i % 2 == 0) \
+	  { \
+	    in[i] = 1.5f + (TYPE)i; \
+	    ref[i] = (TYPE)(i + 2); \
+	  } \
+	else \
+	  { \
+	    in[i] = (TYPE)i; \
+	    ref[i] = (TYPE)i; \
+	  } \
+      } \
+  }
+
+#define TEST_ASSERT(TYPE) \
+  void test_##TYPE##_assert (TYPE *out, TYPE *ref, unsigned size) \
+  { \
+    for (unsigned i = 0; i < size; i++) \
+      { \
+	if (out[i] != ref[i]) \
+	  __builtin_abort (); \
+      } \
+  }
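
For readers following the commit message, the two-step lowering (vfcvt.x.f with RUP, then vfcvt.f.x) can be modeled per element in scalar C. This is only a sketch under stated assumptions, not the code GCC emits: `cvt_x_f_rup` and `ceil_model` are hypothetical names, the round-up conversion is emulated with C truncation plus a correction, and the input is assumed to be finite and within `int32_t` range.

```c
#include <assert.h>
#include <stdint.h>

/* Emulates vfcvt.x.f with the round-up (RUP) mode for one element.
   C casts truncate toward zero, so bump the result when truncation
   moved the value downward.  Hypothetical helper, not a GCC API.  */
static int32_t
cvt_x_f_rup (float x)
{
  int32_t i = (int32_t) x;
  if ((float) i < x)
    i += 1;
  return i;
}

/* Scalar model of:  vfcvt.x.f v3, v1, RUP ; vfcvt.f.x v2, v3.  */
static float
ceil_model (float x)
{
  /* Assumption of this sketch: x is finite and fits in int32_t.
     NaN also fails this check because both comparisons are false.  */
  assert (x > -2147483648.0f && x < 2147483648.0f);

  int32_t tmp = cvt_x_f_rup (x);  /* float -> int, rounding up */
  return (float) tmp;             /* int -> float, exact for a float-derived value */
}
```

Note that `ceil_model (-0.5f)` yields +0.0 where `ceilf` would return -0.0, and that values outside the integer range are rejected by the assertion; the sketch makes no attempt to handle either case.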
> It should be "V_VLSF" instead of "VF" so that you could also support VLS CEIL.

This is in preparation and will be included in V2 of this patch rather than sent as a separate patch.

> a[i] = cond[i] ? CEIL (b[i]): c[i];

Sure.

Pan

I have checked LLVM: https://godbolt.org/z/4jWG5vjMT

Their code sequence appears to be as follows:

vfabs.v
vmflt.vf
vfcvt.x.f.v -> static rounding mode
vfcvt.f.x.v -> dynamic rounding mode
vfsgnj.vv

How is it that only two static vfcvt insns are enough here?

juzhe.zhong@rivai.ai

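
One way to read the five-instruction LLVM sequence listed above is as a guarded version of the conversion pair: take the magnitude, mask off lanes that are already too large to have a fractional part, convert only the active lanes, then copy the original sign back. The scalar C sketch below follows that reading. It is an interpretation, not LLVM's code: the helper names are hypothetical, the 2^23 threshold is an assumption based on the width of the float mantissa, and the round-up conversion is emulated with truncation plus a correction.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Raw IEEE-754 bit access, used here for fabs and copysign without libm.  */
static uint32_t
bits_of (float x)
{
  uint32_t u;
  memcpy (&u, &x, sizeof u);
  return u;
}

static float
float_of (uint32_t u)
{
  float x;
  memcpy (&x, &u, sizeof x);
  return x;
}

/* Hypothetical scalar model of the guarded ceil sequence.  */
static float
ceil_guarded (float x)
{
  const float limit = 8388608.0f;  /* 2^23, assumed threshold for float */

  float mag = float_of (bits_of (x) & 0x7fffffffu);  /* vfabs.v */
  int active = mag < limit;                          /* vmflt.vf, false for NaN */
  float r = x;                                       /* inactive lanes pass through */

  if (active)
    {
      int32_t i = (int32_t) x;  /* vfcvt.x.f.v, round-up emulated below */
      if ((float) i < x)
        i += 1;
      r = (float) i;            /* vfcvt.f.x.v */
    }

  /* vfsgnj.vv: magnitude from r, sign from x, so -0.0 survives.  */
  return float_of ((bits_of (r) & 0x7fffffffu) | (bits_of (x) & 0x80000000u));
}

/* A few spot checks on the cases a bare conversion pair would miss.  */
static void
self_check (void)
{
  assert (ceil_guarded (1.5f) == 2.0f);
  assert (ceil_guarded (-1.5f) == -1.0f);
  assert (bits_of (ceil_guarded (-0.5f)) == 0x80000000u);  /* -0.0 */
  assert (ceil_guarded (1e30f) == 1e30f);                  /* too large to convert */
}
```

Under this reading, the extra instructions cover signed zeros, NaN, and values beyond the integer range, which a plain float-to-int-to-float round trip does not preserve.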
I just checked the LLVM implementation. This is their code for rounding auto-vectorization. They handle CEIL/FLOOR/FROUND/FROUNDEVEN/FROUNDTOZERO in the same way:

  switch (Op.getOpcode()) {
  default:
    llvm_unreachable("Unexpected opcode");
  case ISD::FCEIL:
  case ISD::VP_FCEIL:
  case ISD::FFLOOR:
  case ISD::VP_FFLOOR:
  case ISD::FROUND:
  case ISD::FROUNDEVEN:
  case ISD::VP_FROUND:
  case ISD::VP_FROUNDEVEN:
  case ISD::VP_FROUNDTOZERO: {
    RISCVFPRndMode::RoundingMode FRM = matchRoundingOp(Op.getOpcode());
    assert(FRM != RISCVFPRndMode::Invalid);
    Truncated = DAG.getNode(RISCVISD::VFCVT_RM_X_F_VL, DL, IntVT, Src, Mask,
                            DAG.getTargetConstant(FRM, DL, XLenVT), VL);
    break;
  }
  case ISD::FTRUNC:
    Truncated = DAG.getNode(RISCVISD::VFCVT_RTZ_X_F_VL, DL, IntVT, Src,
                            Mask, VL);
    break;
  case ISD::VP_FRINT:
    Truncated = DAG.getNode(RISCVISD::VFCVT_X_F_VL, DL, IntVT, Src, Mask, VL);
    break;
  case ISD::VP_FNEARBYINT:
    Truncated = DAG.getNode(RISCVISD::VFROUND_NOEXCEPT_VL, DL, ContainerVT, Src,
                            Mask, VL);
    break;
  }

  // VFROUND_NOEXCEPT_VL includes SINT_TO_FP_VL.
  if (Op.getOpcode() != ISD::VP_FNEARBYINT)
    Truncated = DAG.getNode(RISCVISD::SINT_TO_FP_VL, DL, ContainerVT,
                            Truncated, Mask, VL);

  // Restore the original sign so that -0.0 is preserved.
  Truncated = DAG.getNode(RISCVISD::FCOPYSIGN_VL, DL, ContainerVT, Truncated,
                          Src, Src, Mask, VL);

I think you could just take the LLVM implementation and translate it into GCC code. It is quite simple: create a function called "expand_rounding". The LLVM code is very easy to read, and I believe you could leverage it quickly.

juzhe.zhong@rivai.ai
+;; ------------------------------------------------------------------------- +;; Includes: +;; - ceil/ceilf +;; ------------------------------------------------------------------------- +(define_expand "ceil<mode>2" + [(match_operand:VF 0 "register_operand") + (match_operand:VF 1 "register_operand")] + "TARGET_VECTOR" + { + rtx tmp = gen_reg_rtx (<VCONVERT>mode); + rtx ops_1[] = {tmp, operands[1]}; + insn_code icode = code_for_pred_fcvt_x_f (UNSPEC_VFCVT, <MODE>mode); + + /* vfcvt.x.f with rounding up (aka ceil). */ + riscv_vector::emit_vlmax_insn (icode, riscv_vector::UNARY_OP_FRM_RUP, ops_1); + + rtx ops_2[] = {operands[0], tmp}; + icode = code_for_pred (FLOAT, <MODE>mode); + + /* vfcvt.f.x for the final result. To avoid unnecessary frm register + access, we use RUP here and it will never do the rounding up because + the tmp rtx comes from the float to int conversion. */ + riscv_vector::emit_vlmax_insn (icode, riscv_vector::UNARY_OP_FRM_RUP, ops_2); + + DONE; + } +) diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h index 5a2d218d67b..833f1efbaf4 100644 --- a/gcc/config/riscv/riscv-protos.h +++ b/gcc/config/riscv/riscv-protos.h @@ -250,6 +250,9 @@ enum insn_flags : unsigned int /* flags for the floating-point rounding mode. */ /* Means INSN has FRM operand and the value is FRM_DYN. */ FRM_DYN_P = 1 << 15, + + /* Means INSN has FRM operand and the value is FRM_RUP. */ + FRM_RUP_P = 1 << 16, }; enum insn_type : unsigned int @@ -290,6 +293,7 @@ enum insn_type : unsigned int UNARY_OP_TAMA = __MASK_OP_TAMA | UNARY_OP_P, UNARY_OP_TAMU = __MASK_OP_TAMU | UNARY_OP_P, UNARY_OP_FRM_DYN = UNARY_OP | FRM_DYN_P, + UNARY_OP_FRM_RUP = UNARY_OP | FRM_RUP_P, /* Binary operator. */ BINARY_OP = __NORMAL_OP | BINARY_OP_P, diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc index a9287e5d671..4192f988648 100644 --- a/gcc/config/riscv/riscv-v.cc +++ b/gcc/config/riscv/riscv-v.cc @@ -323,6 +323,8 @@ public: /* Add rounding mode operand. 
*/ if (m_insn_flags & FRM_DYN_P) add_rounding_mode_operand (FRM_DYN); + if (m_insn_flags & FRM_RUP_P) + add_rounding_mode_operand (FRM_RUP); gcc_assert (insn_data[(int) icode].n_operands == m_opno); expand (icode, any_mem_p); diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-1.c new file mode 100644 index 00000000000..8f0f09609eb --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-1.c @@ -0,0 +1,21 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "test-math.h" + +/* +** test_float_ceilf: +** frrm\s+[atx][0-9]+ +** ... +** fsrmi\s+3 +** ... +** vsetvli\s+[atx][0-9]+,\s*zero,\s*e32,\s*m1,\s*ta,\s*ma +** vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+ +** ... +** vfcvt\.f\.x\.v\s+v[0-9]+,\s*v[0-9]+ +** ... +** fsrm\s+[atx][0-9]+ +** ... +*/ +TEST_CEIL(float, ceilf) diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-2.c new file mode 100644 index 00000000000..73395d30d7a --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-2.c @@ -0,0 +1,21 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "test-math.h" + +/* +** test_double_ceil: +** frrm\s+[atx][0-9]+ +** ... +** fsrmi\s+3 +** ... +** vsetvli\s+[atx][0-9]+,\s*zero,\s*e64,\s*m1,\s*ta,\s*ma +** vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+ +** ... +** vfcvt\.f\.x\.v\s+v[0-9]+,\s*v[0-9]+ +** ... +** fsrm\s+[atx][0-9]+ +** ... 
+*/ +TEST_CEIL(double, ceil) diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-3.c new file mode 100644 index 00000000000..eb0f3a3db78 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-3.c @@ -0,0 +1,24 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "test-math.h" + +/* +** test_float_ceilf: +** frrm\s+[atx][0-9]+ +** ... +** fsrmi\s+3 +** ... +** vsetvli\s+[atx][0-9]+,\s*zero,\s*e32,\s*m1,\s*ta,\s*ma +** ... +** vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+ +** ... +** vfcvt\.f\.x\.v\s+v[0-9]+,\s*v[0-9]+ +** ... +** vmerge\.vvm\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+,\s*v0 +** ... +** fsrm\s+[atx][0-9]+ +** ... +*/ +TEST_COND_CEIL(float, ceilf) diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-4.c new file mode 100644 index 00000000000..b9a3c8ebf84 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-4.c @@ -0,0 +1,24 @@ +/* { dg-do compile } */ +/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */ +/* { dg-final { check-function-bodies "**" "" } } */ + +#include "test-math.h" + +/* +** test_double_ceil: +** frrm\s+[atx][0-9]+ +** ... +** fsrmi\s+3 +** ... +** vsetvli\s+[atx][0-9]+,\s*zero,\s*e64,\s*m1,\s*ta,\s*ma +** ... +** vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+ +** ... +** vfcvt\.f\.x\.v\s+v[0-9]+,\s*v[0-9]+ +** ... +** vmerge\.vvm\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+,\s*v0 +** ... +** fsrm\s+[atx][0-9]+ +** ... 
+*/ +TEST_COND_CEIL(double, ceil) diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-run-1.c new file mode 100644 index 00000000000..014c4c3ac0a --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-run-1.c @@ -0,0 +1,24 @@ +/* { dg-do run { target { riscv_vector && riscv_zvfh_hw } } } */ +/* { dg-additional-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -lm" } */ +#include "test-math.h" + +#define ARRAY_SIZE 128 + +float in[ARRAY_SIZE]; +float out[ARRAY_SIZE]; +float ref[ARRAY_SIZE]; + +// Test function declaration +TEST_CEIL(float, ceilf) +TEST_INIT(float) +TEST_ASSERT(float) + +int +main () +{ + test_float_init (in, ref, ARRAY_SIZE); + test_float_ceilf (out, in, ARRAY_SIZE); + test_float_assert (out, ref, ARRAY_SIZE); + + return 0; +} diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-run-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-run-2.c new file mode 100644 index 00000000000..ae361e11144 --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-run-2.c @@ -0,0 +1,24 @@ +/* { dg-do run { target { riscv_vector && riscv_zvfh_hw } } } */ +/* { dg-additional-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -lm" } */ +#include "test-math.h" + +#define ARRAY_SIZE 128 + +double in[ARRAY_SIZE]; +double out[ARRAY_SIZE]; +double ref[ARRAY_SIZE]; + +// Test function declaration +TEST_CEIL(double, ceil) +TEST_INIT(double) +TEST_ASSERT(double) + +int +main () +{ + test_double_init (in, ref, ARRAY_SIZE); + test_double_ceil (out, in, ARRAY_SIZE); + test_double_assert (out, ref, ARRAY_SIZE); + + return 0; +} diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/test-math.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/test-math.h new file mode 100644 index 00000000000..57dd5e0e460 --- /dev/null +++ 
b/gcc/testsuite/gcc.target/riscv/rvv/autovec/test-math.h @@ -0,0 +1,45 @@ +#include <math.h> + +#define TEST_CEIL(TYPE, CALL) \ + void test_##TYPE##_##CALL (TYPE *out, TYPE *in, unsigned count) \ + { \ + for (unsigned i = 0; i < count; i++) \ + out[i] = CALL (in[i]); \ + } + +#define TEST_COND_CEIL(TYPE, CALL) \ + void test_##TYPE##_##CALL (TYPE *out, int *cond, TYPE *in, unsigned count) \ + { \ + for (unsigned i = 0; i < count; i++) \ + out[i] = cond[i] ? CALL (in[i]) : in[i]; \ + } + +#define TEST_INIT(TYPE) \ + void test_##TYPE##_init (TYPE *in, TYPE *ref, unsigned size) \ + { \ + for (unsigned i = 0; i < size; i++) \ + { \ + TYPE tmp = (TYPE)i; \ + \ + if (i % 2 == 0) \ + { \ + in[i] = 1.5f + (TYPE)i; \ + ref[i] = (TYPE)(i + 2); \ + } \ + else \ + { \ + in[i] = (TYPE)i; \ + ref[i] = (TYPE)i; \ + } \ + } \ + } + +#define TEST_ASSERT(TYPE) \ + void test_##TYPE##_assert (TYPE *out, TYPE *ref, unsigned size) \ + { \ + for (unsigned i = 0; i < size; i++) \ + { \ + if (out[i] != ref[i]) \ + __builtin_abort (); \ + } \ + }
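For the conditional case raised in the review, a[i] = cond[i] ? CEIL (b[i]) : c[i], the two code shapes under discussion should produce the same values: convert every lane and then select (vfcvt followed by vmerge), or convert only the active lanes while inactive lanes keep the fallback value (a masked vfcvt). A rough scalar sketch in C, where `small_ceil`, `cond_ceil_merge` and `cond_ceil_masked` are hypothetical names and `small_ceil` is only a stand-in for ceilf on small finite inputs:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for ceilf, valid only for small finite inputs; it avoids
   depending on libm in this sketch.  */
static float
small_ceil (float x)
{
  int i = (int) x;              /* Truncates toward zero.  */
  return (float) (i + ((float) i < x));
}

/* Strategy 1: convert every lane, then select (models vfcvt + vmerge).  */
static void
cond_ceil_merge (float *a, const int *cond, const float *b, const float *c,
                 size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      float all = small_ceil (b[i]);    /* Computed even if unused.  */
      a[i] = cond[i] ? all : c[i];
    }
}

/* Strategy 2: start from the fallback value and convert only the active
   lanes (models a masked vfcvt whose inactive lanes keep C).  */
static void
cond_ceil_masked (float *a, const int *cond, const float *b, const float *c,
                  size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      a[i] = c[i];
      if (cond[i])
        a[i] = small_ceil (b[i]);
    }
}
```

Both functions give identical results for any mask; the second form simply skips the separate select step, which is the saving the masked instruction would bring.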
Thanks Juzhe, let me check and keep you posted.
Pan
From: juzhe.zhong@rivai.ai <juzhe.zhong@rivai.ai>
Sent: Wednesday, September 20, 2023 11:37 AM
To: Li, Pan2 <pan2.li@intel.com>; gcc-patches <gcc-patches@gcc.gnu.org>
Cc: Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng <kito.cheng@gmail.com>
Subject: Re: RE: [PATCH v1] RISC-V: Support ceil and ceilf auto-vectorization
I just checked the LLVM implementation.
This is their code for rounding auto-vectorization.
They handle CEIL/FLOOR/FROUND/FROUNDEVEN/FROUNDTOZERO in the same way:
switch (Op.getOpcode()) {
default:
  llvm_unreachable("Unexpected opcode");
case ISD::FCEIL:
case ISD::VP_FCEIL:
case ISD::FFLOOR:
case ISD::VP_FFLOOR:
case ISD::FROUND:
case ISD::FROUNDEVEN:
case ISD::VP_FROUND:
case ISD::VP_FROUNDEVEN:
case ISD::VP_FROUNDTOZERO: {
  RISCVFPRndMode::RoundingMode FRM = matchRoundingOp(Op.getOpcode());
  assert(FRM != RISCVFPRndMode::Invalid);
  Truncated = DAG.getNode(RISCVISD::VFCVT_RM_X_F_VL, DL, IntVT, Src, Mask,
                          DAG.getTargetConstant(FRM, DL, XLenVT), VL);
  break;
}
case ISD::FTRUNC:
  Truncated = DAG.getNode(RISCVISD::VFCVT_RTZ_X_F_VL, DL, IntVT, Src,
                          Mask, VL);
  break;
case ISD::VP_FRINT:
  Truncated = DAG.getNode(RISCVISD::VFCVT_X_F_VL, DL, IntVT, Src, Mask, VL);
  break;
case ISD::VP_FNEARBYINT:
  Truncated = DAG.getNode(RISCVISD::VFROUND_NOEXCEPT_VL, DL, ContainerVT, Src,
                          Mask, VL);
  break;
}
// VFROUND_NOEXCEPT_VL includes SINT_TO_FP_VL.
if (Op.getOpcode() != ISD::VP_FNEARBYINT)
  Truncated = DAG.getNode(RISCVISD::SINT_TO_FP_VL, DL, ContainerVT, Truncated,
                          Mask, VL);
// Restore the original sign so that -0.0 is preserved.
Truncated = DAG.getNode(RISCVISD::FCOPYSIGN_VL, DL, ContainerVT, Truncated,
                        Src, Src, Mask, VL);
I think you could just take the LLVM implementation and translate it into GCC code.
It is quite simple.
Create a function called 'expand_rounding'.
The LLVM code is very easy to read, so I believe you could adapt it quickly.
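To make the quoted sequence concrete, here is a scalar sketch in C of what one lane goes through for ceil: float to integer with rounding up, integer back to float, then copying the source sign so that inputs such as -0.5 yield -0.0. This is only a model under stated assumptions, not the LLVM or GCC code: `ceil_model`, `sign_bit_set` and `ceil_model_self_check` are hypothetical names, rounding up is emulated with a truncating cast plus an adjustment because a plain C cast cannot select a rounding mode, and the range guard for NaN and large magnitudes is an assumption of this sketch (the v1 patch relies on -ffast-math instead).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return 1 if the sign bit of X is set (true for -0.0f as well).  */
static int
sign_bit_set (float x)
{
  uint32_t bits;
  memcpy (&bits, &x, sizeof bits);
  return (bits >> 31) != 0;
}

/* Scalar model of one lane: float -> int (rounding up), int -> float,
   then copy the sign of the source onto the result.  */
static float
ceil_model (float x)
{
  /* Assumption of this sketch: NaN and lanes whose magnitude is at least
     2^23 are returned unchanged, since such floats are already integral
     (or not numbers) and may not fit in the integer type.  */
  if (!(x > -8388608.0f && x < 8388608.0f))
    return x;

  int32_t i = (int32_t) x;      /* C casts truncate toward zero.  */
  if ((float) i < x)
    i += 1;                     /* Adjust to emulate rounding up (RUP).  */

  float r = (float) i;          /* Models the int -> float conversion.  */

  /* Models the copysign step: -0.5 rounds to 0, which must become -0.0.  */
  if (sign_bit_set (x) && !sign_bit_set (r))
    r = -r;
  return r;
}

/* A few spot checks of the model.  */
static void
ceil_model_self_check (void)
{
  assert (ceil_model (1.5f) == 2.0f);
  assert (ceil_model (-1.5f) == -1.0f);
  assert (ceil_model (4.0f) == 4.0f);
  assert (sign_bit_set (ceil_model (-0.5f)));
}
```

Without the final sign step the -0.5 lane would come back as +0.0, which is why the LLVM snippet ends with a copysign node; whether GCC needs the same step under -ffast-math is a separate question for the patch.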
diff --git a/gcc/config/riscv/autovec.md b/gcc/config/riscv/autovec.md
index 493d5745485..ea508d81047 100644
--- a/gcc/config/riscv/autovec.md
+++ b/gcc/config/riscv/autovec.md
@@ -2374,3 +2374,33 @@ (define_expand "<u>avg<v_double_trunc>3_ceil"
   riscv_vector::emit_vlmax_insn (icode, riscv_vector::BINARY_OP, ops3);
   DONE;
 })
+
+;; -------------------------------------------------------------------------
+;; ---- [FP] Math.h.
+;; -------------------------------------------------------------------------
+;; Includes:
+;; - ceil/ceilf
+;; -------------------------------------------------------------------------
+(define_expand "ceil<mode>2"
+  [(match_operand:VF 0 "register_operand")
+   (match_operand:VF 1 "register_operand")]
+  "TARGET_VECTOR"
+  {
+    rtx tmp = gen_reg_rtx (<VCONVERT>mode);
+    rtx ops_1[] = {tmp, operands[1]};
+    insn_code icode = code_for_pred_fcvt_x_f (UNSPEC_VFCVT, <MODE>mode);
+
+    /* vfcvt.x.f with rounding up (aka ceil). */
+    riscv_vector::emit_vlmax_insn (icode, riscv_vector::UNARY_OP_FRM_RUP, ops_1);
+
+    rtx ops_2[] = {operands[0], tmp};
+    icode = code_for_pred (FLOAT, <MODE>mode);
+
+    /* vfcvt.f.x for the final result. To avoid unnecessary frm register
+       access, we use RUP here and it will never do the rounding up because
+       the tmp rtx comes from the float to int conversion. */
+    riscv_vector::emit_vlmax_insn (icode, riscv_vector::UNARY_OP_FRM_RUP, ops_2);
+
+    DONE;
+  }
+)
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 5a2d218d67b..833f1efbaf4 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -250,6 +250,9 @@ enum insn_flags : unsigned int
   /* flags for the floating-point rounding mode. */
   /* Means INSN has FRM operand and the value is FRM_DYN. */
   FRM_DYN_P = 1 << 15,
+
+  /* Means INSN has FRM operand and the value is FRM_RUP. */
+  FRM_RUP_P = 1 << 16,
 };
 
 enum insn_type : unsigned int
@@ -290,6 +293,7 @@ enum insn_type : unsigned int
   UNARY_OP_TAMA = __MASK_OP_TAMA | UNARY_OP_P,
   UNARY_OP_TAMU = __MASK_OP_TAMU | UNARY_OP_P,
   UNARY_OP_FRM_DYN = UNARY_OP | FRM_DYN_P,
+  UNARY_OP_FRM_RUP = UNARY_OP | FRM_RUP_P,
 
   /* Binary operator. */
   BINARY_OP = __NORMAL_OP | BINARY_OP_P,
diff --git a/gcc/config/riscv/riscv-v.cc b/gcc/config/riscv/riscv-v.cc
index a9287e5d671..4192f988648 100644
--- a/gcc/config/riscv/riscv-v.cc
+++ b/gcc/config/riscv/riscv-v.cc
@@ -323,6 +323,8 @@ public:
     /* Add rounding mode operand. */
     if (m_insn_flags & FRM_DYN_P)
       add_rounding_mode_operand (FRM_DYN);
+    if (m_insn_flags & FRM_RUP_P)
+      add_rounding_mode_operand (FRM_RUP);
 
     gcc_assert (insn_data[(int) icode].n_operands == m_opno);
     expand (icode, any_mem_p);
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-1.c
new file mode 100644
index 00000000000..8f0f09609eb
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-1.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "test-math.h"
+
+/*
+** test_float_ceilf:
+** frrm\s+[atx][0-9]+
+** ...
+** fsrmi\s+3
+** ...
+** vsetvli\s+[atx][0-9]+,\s*zero,\s*e32,\s*m1,\s*ta,\s*ma
+** vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+
+** ...
+** vfcvt\.f\.x\.v\s+v[0-9]+,\s*v[0-9]+
+** ...
+** fsrm\s+[atx][0-9]+
+** ...
+*/
+TEST_CEIL(float, ceilf)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-2.c
new file mode 100644
index 00000000000..73395d30d7a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-2.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "test-math.h"
+
+/*
+** test_double_ceil:
+** frrm\s+[atx][0-9]+
+** ...
+** fsrmi\s+3
+** ...
+** vsetvli\s+[atx][0-9]+,\s*zero,\s*e64,\s*m1,\s*ta,\s*ma
+** vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+
+** ...
+** vfcvt\.f\.x\.v\s+v[0-9]+,\s*v[0-9]+
+** ...
+** fsrm\s+[atx][0-9]+
+** ...
+*/
+TEST_CEIL(double, ceil)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-3.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-3.c
new file mode 100644
index 00000000000..eb0f3a3db78
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-3.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "test-math.h"
+
+/*
+** test_float_ceilf:
+** frrm\s+[atx][0-9]+
+** ...
+** fsrmi\s+3
+** ...
+** vsetvli\s+[atx][0-9]+,\s*zero,\s*e32,\s*m1,\s*ta,\s*ma
+** ...
+** vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+
+** ...
+** vfcvt\.f\.x\.v\s+v[0-9]+,\s*v[0-9]+
+** ...
+** vmerge\.vvm\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+,\s*v0
+** ...
+** fsrm\s+[atx][0-9]+
+** ...
+*/
+TEST_COND_CEIL(float, ceilf)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-4.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-4.c
new file mode 100644
index 00000000000..b9a3c8ebf84
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-4.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -fno-schedule-insns -fno-schedule-insns2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "test-math.h"
+
+/*
+** test_double_ceil:
+** frrm\s+[atx][0-9]+
+** ...
+** fsrmi\s+3
+** ...
+** vsetvli\s+[atx][0-9]+,\s*zero,\s*e64,\s*m1,\s*ta,\s*ma
+** ...
+** vfcvt\.x\.f\.v\s+v[0-9]+,\s*v[0-9]+
+** ...
+** vfcvt\.f\.x\.v\s+v[0-9]+,\s*v[0-9]+
+** ...
+** vmerge\.vvm\s+v[0-9]+,\s*v[0-9]+,\s*v[0-9]+,\s*v0
+** ...
+** fsrm\s+[atx][0-9]+
+** ...
+*/
+TEST_COND_CEIL(double, ceil)
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-run-1.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-run-1.c
new file mode 100644
index 00000000000..014c4c3ac0a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-run-1.c
@@ -0,0 +1,24 @@
+/* { dg-do run { target { riscv_vector && riscv_zvfh_hw } } } */
+/* { dg-additional-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -lm" } */
+#include "test-math.h"
+
+#define ARRAY_SIZE 128
+
+float in[ARRAY_SIZE];
+float out[ARRAY_SIZE];
+float ref[ARRAY_SIZE];
+
+// Test function declaration
+TEST_CEIL(float, ceilf)
+TEST_INIT(float)
+TEST_ASSERT(float)
+
+int
+main ()
+{
+  test_float_init (in, ref, ARRAY_SIZE);
+  test_float_ceilf (out, in, ARRAY_SIZE);
+  test_float_assert (out, ref, ARRAY_SIZE);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-run-2.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-run-2.c
new file mode 100644
index 00000000000..ae361e11144
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/math-ceil-run-2.c
@@ -0,0 +1,24 @@
+/* { dg-do run { target { riscv_vector && riscv_zvfh_hw } } } */
+/* { dg-additional-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize -fno-vect-cost-model -ffast-math -lm" } */
+#include "test-math.h"
+
+#define ARRAY_SIZE 128
+
+double in[ARRAY_SIZE];
+double out[ARRAY_SIZE];
+double ref[ARRAY_SIZE];
+
+// Test function declaration
+TEST_CEIL(double, ceil)
+TEST_INIT(double)
+TEST_ASSERT(double)
+
+int
+main ()
+{
+  test_double_init (in, ref, ARRAY_SIZE);
+  test_double_ceil (out, in, ARRAY_SIZE);
+  test_double_assert (out, ref, ARRAY_SIZE);
+
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/test-math.h b/gcc/testsuite/gcc.target/riscv/rvv/autovec/test-math.h
new file mode 100644
index 00000000000..57dd5e0e460
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/test-math.h
@@ -0,0 +1,45 @@
+#include <math.h>
+
+#define TEST_CEIL(TYPE, CALL) \
+  void test_##TYPE##_##CALL (TYPE *out, TYPE *in, unsigned count) \
+  { \
+    for (unsigned i = 0; i < count; i++) \
+      out[i] = CALL (in[i]); \
+  }
+
+#define TEST_COND_CEIL(TYPE, CALL) \
+  void test_##TYPE##_##CALL (TYPE *out, int *cond, TYPE *in, unsigned count) \
+  { \
+    for (unsigned i = 0; i < count; i++) \
+      out[i] = cond[i] ? CALL (in[i]) : in[i]; \
+  }
+
+#define TEST_INIT(TYPE) \
+  void test_##TYPE##_init (TYPE *in, TYPE *ref, unsigned size) \
+  { \
+    for (unsigned i = 0; i < size; i++) \
+      { \
+        TYPE tmp = (TYPE)i; \
+        \
+        if (i % 2 == 0) \
+          { \
+            in[i] = 1.5f + (TYPE)i; \
+            ref[i] = (TYPE)(i + 2); \
+          } \
+        else \
+          { \
+            in[i] = (TYPE)i; \
+            ref[i] = (TYPE)i; \
+          } \
+      } \
+  }
+
+#define TEST_ASSERT(TYPE) \
+  void test_##TYPE##_assert (TYPE *out, TYPE *ref, unsigned size) \
+  { \
+    for (unsigned i = 0; i < size; i++) \
+      { \
+        if (out[i] != ref[i]) \
+          __builtin_abort (); \
+      } \
+  }
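The flag plumbing that the patch adds in riscv-protos.h and riscv-v.cc can be pictured with a small standalone C model: an insn type carries a flag bit, and the expander turns that bit into a rounding-mode operand. This is a sketch only. The bit positions of FRM_DYN_P and FRM_RUP_P follow the patch and the frm values follow the RISC-V encoding (3 is round-up, matching the `fsrmi 3` in the generated code), but the `UNARY_OP` value and the function name `rounding_mode_for` are hypothetical, and the sketch assumes the two FRM flags are never set together.

```c
#include <assert.h>

/* Rounding-mode encodings of the RISC-V frm field.  */
enum frm_mode
{
  FRM_RNE = 0,
  FRM_RTZ = 1,
  FRM_RDN = 2,
  FRM_RUP = 3,
  FRM_RMM = 4,
  FRM_DYN = 7
};

/* Flag bits as in the patch; UNARY_OP is a placeholder value here.  */
enum
{
  UNARY_OP = 1u << 0,           /* Hypothetical stand-in.  */
  FRM_DYN_P = 1u << 15,
  FRM_RUP_P = 1u << 16,
  UNARY_OP_FRM_DYN = UNARY_OP | FRM_DYN_P,
  UNARY_OP_FRM_RUP = UNARY_OP | FRM_RUP_P
};

/* Return 1 and store the rounding mode in *MODE when FLAGS asks for a
   rounding-mode operand; return 0 when no such operand is needed.  */
static int
rounding_mode_for (unsigned flags, enum frm_mode *mode)
{
  /* Assumption of this sketch: the two FRM flags are mutually exclusive.  */
  assert (!((flags & FRM_DYN_P) && (flags & FRM_RUP_P)));

  if (flags & FRM_DYN_P)
    {
      *mode = FRM_DYN;
      return 1;
    }
  if (flags & FRM_RUP_P)
    {
      *mode = FRM_RUP;
      return 1;
    }
  return 0;
}
```

Supporting floor or round this way would mean one more flag bit and one more branch per rounding mode, which is one reason a shared helper along the lines of the suggested expand_rounding may scale better.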