Message ID | 20240531013759.23672-1-quic_pzheng@quicinc.com |
---|---|
State | New |
Series | [v2] aarch64: Add vector floating point extend pattern [PR113880, PR113869] |
Pengxuan Zheng <quic_pzheng@quicinc.com> writes:
> This patch adds vector floating point extend pattern for V2SF->V2DF and
> V4HF->V4SF conversions by renaming the existing aarch64_float_extend_lo_<Vwide>
> pattern to the standard optab one, i.e., extend<mode><Vwide>2. This allows the
> vectorizer to vectorize certain floating point widening operations for the
> aarch64 target.
>
> PR target/113880
> PR target/113869
>
> gcc/ChangeLog:
>
> * config/aarch64/aarch64-builtins.cc (VAR1): Remap float_extend_lo_
> builtin codes to standard optab ones.
> * config/aarch64/aarch64-simd.md (aarch64_float_extend_lo_<Vwide>): Rename
> to...
> (extend<mode><Vwide>2): ... This.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/aarch64/extend-vec.c: New test.

OK, thanks, and sorry for the slow review.

Richard

> Signed-off-by: Pengxuan Zheng <quic_pzheng@quicinc.com>
> ---
>  gcc/config/aarch64/aarch64-builtins.cc | 9 ++++++++
>  gcc/config/aarch64/aarch64-simd.md | 2 +-
>  gcc/testsuite/gcc.target/aarch64/extend-vec.c | 21 +++++++++++++++++++
>  3 files changed, 31 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/extend-vec.c
>
> diff --git a/gcc/config/aarch64/aarch64-builtins.cc b/gcc/config/aarch64/aarch64-builtins.cc
> index f8eeccb554d..25189888d17 100644
> --- a/gcc/config/aarch64/aarch64-builtins.cc
> +++ b/gcc/config/aarch64/aarch64-builtins.cc
> @@ -534,6 +534,15 @@ BUILTIN_VDQ_BHSI (urhadd, uavg, _ceil, 0)
>  BUILTIN_VDQ_BHSI (shadd, avg, _floor, 0)
>  BUILTIN_VDQ_BHSI (uhadd, uavg, _floor, 0)
>
> +/* The builtins below should be expanded through the standard optabs
> +   CODE_FOR_extend<mode><Vwide>2. */
> +#undef VAR1
> +#define VAR1(F,T,N,M) \
> +  constexpr insn_code CODE_FOR_aarch64_##F##M = CODE_FOR_##T##N##M##2;
> +
> +VAR1 (float_extend_lo_, extend, v2sf, v2df)
> +VAR1 (float_extend_lo_, extend, v4hf, v4sf)
> +
>  #undef VAR1
>  #define VAR1(T, N, MAP, FLAG, A) \
>    {#N #A, UP (A), CF##MAP (N, A), 0, TYPES_##T, FLAG_##FLAG},
> diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
> index 868f4486218..c5e2c9f00d0 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -3132,7 +3132,7 @@
>      DONE;
>    }
>  )
> -(define_insn "aarch64_float_extend_lo_<Vwide>"
> +(define_insn "extend<mode><Vwide>2"
>    [(set (match_operand:<VWIDE> 0 "register_operand" "=w")
>          (float_extend:<VWIDE>
>            (match_operand:VDF 1 "register_operand" "w")))]
> diff --git a/gcc/testsuite/gcc.target/aarch64/extend-vec.c b/gcc/testsuite/gcc.target/aarch64/extend-vec.c
> new file mode 100644
> index 00000000000..f62418888d5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/extend-vec.c
> @@ -0,0 +1,21 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +/* { dg-final { scan-assembler-times {fcvtl\tv[0-9]+.2d, v[0-9]+.2s} 1 } } */
> +void
> +f (float *__restrict a, double *__restrict b)
> +{
> +  b[0] = a[0];
> +  b[1] = a[1];
> +}
> +
> +/* { dg-final { scan-assembler-times {fcvtl\tv[0-9]+.4s, v[0-9]+.4h} 1 } } */
> +void
> +f1 (_Float16 *__restrict a, float *__restrict b)
> +{
> +
> +  b[0] = a[0];
> +  b[1] = a[1];
> +  b[2] = a[2];
> +  b[3] = a[3];
> +}
> Pengxuan Zheng <quic_pzheng@quicinc.com> writes:
> > This patch adds vector floating point extend pattern for V2SF->V2DF and
> > V4HF->V4SF conversions by renaming the existing aarch64_float_extend_lo_<Vwide>
> > pattern to the standard optab one, i.e., extend<mode><Vwide>2. This allows the
> > vectorizer to vectorize certain floating point widening operations for the
> > aarch64 target.
> >
> > PR target/113880
> > PR target/113869
> >
> > gcc/ChangeLog:
> >
> > * config/aarch64/aarch64-builtins.cc (VAR1): Remap float_extend_lo_
> > builtin codes to standard optab ones.
> > * config/aarch64/aarch64-simd.md (aarch64_float_extend_lo_<Vwide>): Rename
> > to...
> > (extend<mode><Vwide>2): ... This.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/aarch64/extend-vec.c: New test.
>
> OK, thanks, and sorry for the slow review.
>
> Richard

Thanks, Richard! Pushed as r15-1079-g230d62a2cdd16c.

Thanks,
Pengxuan

> > Signed-off-by: Pengxuan Zheng <quic_pzheng@quicinc.com>
> > ---
> >  gcc/config/aarch64/aarch64-builtins.cc | 9 ++++++++
> >  gcc/config/aarch64/aarch64-simd.md | 2 +-
> >  gcc/testsuite/gcc.target/aarch64/extend-vec.c | 21 +++++++++++++++++++
> >  3 files changed, 31 insertions(+), 1 deletion(-)
> >  create mode 100644 gcc/testsuite/gcc.target/aarch64/extend-vec.c
> >
> > diff --git a/gcc/config/aarch64/aarch64-builtins.cc b/gcc/config/aarch64/aarch64-builtins.cc
> > index f8eeccb554d..25189888d17 100644
> > --- a/gcc/config/aarch64/aarch64-builtins.cc
> > +++ b/gcc/config/aarch64/aarch64-builtins.cc
> > @@ -534,6 +534,15 @@ BUILTIN_VDQ_BHSI (urhadd, uavg, _ceil, 0)
> >  BUILTIN_VDQ_BHSI (shadd, avg, _floor, 0)
> >  BUILTIN_VDQ_BHSI (uhadd, uavg, _floor, 0)
> >
> > +/* The builtins below should be expanded through the standard optabs
> > +   CODE_FOR_extend<mode><Vwide>2. */
> > +#undef VAR1
> > +#define VAR1(F,T,N,M) \
> > +  constexpr insn_code CODE_FOR_aarch64_##F##M = CODE_FOR_##T##N##M##2;
> > +
> > +VAR1 (float_extend_lo_, extend, v2sf, v2df)
> > +VAR1 (float_extend_lo_, extend, v4hf, v4sf)
> > +
> >  #undef VAR1
> >  #define VAR1(T, N, MAP, FLAG, A) \
> >    {#N #A, UP (A), CF##MAP (N, A), 0, TYPES_##T, FLAG_##FLAG},
> > diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
> > index 868f4486218..c5e2c9f00d0 100644
> > --- a/gcc/config/aarch64/aarch64-simd.md
> > +++ b/gcc/config/aarch64/aarch64-simd.md
> > @@ -3132,7 +3132,7 @@
> >      DONE;
> >    }
> >  )
> > -(define_insn "aarch64_float_extend_lo_<Vwide>"
> > +(define_insn "extend<mode><Vwide>2"
> >    [(set (match_operand:<VWIDE> 0 "register_operand" "=w")
> >          (float_extend:<VWIDE>
> >            (match_operand:VDF 1 "register_operand" "w")))]
> > diff --git a/gcc/testsuite/gcc.target/aarch64/extend-vec.c b/gcc/testsuite/gcc.target/aarch64/extend-vec.c
> > new file mode 100644
> > index 00000000000..f62418888d5
> > --- /dev/null
> > +++ b/gcc/testsuite/gcc.target/aarch64/extend-vec.c
> > @@ -0,0 +1,21 @@
> > +/* { dg-do compile } */
> > +/* { dg-options "-O2" } */
> > +
> > +/* { dg-final { scan-assembler-times {fcvtl\tv[0-9]+.2d, v[0-9]+.2s} 1 } } */
> > +void
> > +f (float *__restrict a, double *__restrict b)
> > +{
> > +  b[0] = a[0];
> > +  b[1] = a[1];
> > +}
> > +
> > +/* { dg-final { scan-assembler-times {fcvtl\tv[0-9]+.4s, v[0-9]+.4h} 1 } } */
> > +void
> > +f1 (_Float16 *__restrict a, float *__restrict b)
> > +{
> > +
> > +  b[0] = a[0];
> > +  b[1] = a[1];
> > +  b[2] = a[2];
> > +  b[3] = a[3];
> > +}
diff --git a/gcc/config/aarch64/aarch64-builtins.cc b/gcc/config/aarch64/aarch64-builtins.cc
index f8eeccb554d..25189888d17 100644
--- a/gcc/config/aarch64/aarch64-builtins.cc
+++ b/gcc/config/aarch64/aarch64-builtins.cc
@@ -534,6 +534,15 @@ BUILTIN_VDQ_BHSI (urhadd, uavg, _ceil, 0)
 BUILTIN_VDQ_BHSI (shadd, avg, _floor, 0)
 BUILTIN_VDQ_BHSI (uhadd, uavg, _floor, 0)
 
+/* The builtins below should be expanded through the standard optabs
+   CODE_FOR_extend<mode><Vwide>2. */
+#undef VAR1
+#define VAR1(F,T,N,M) \
+  constexpr insn_code CODE_FOR_aarch64_##F##M = CODE_FOR_##T##N##M##2;
+
+VAR1 (float_extend_lo_, extend, v2sf, v2df)
+VAR1 (float_extend_lo_, extend, v4hf, v4sf)
+
 #undef VAR1
 #define VAR1(T, N, MAP, FLAG, A) \
   {#N #A, UP (A), CF##MAP (N, A), 0, TYPES_##T, FLAG_##FLAG},
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 868f4486218..c5e2c9f00d0 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -3132,7 +3132,7 @@
     DONE;
   }
 )
-(define_insn "aarch64_float_extend_lo_<Vwide>"
+(define_insn "extend<mode><Vwide>2"
   [(set (match_operand:<VWIDE> 0 "register_operand" "=w")
         (float_extend:<VWIDE>
           (match_operand:VDF 1 "register_operand" "w")))]
diff --git a/gcc/testsuite/gcc.target/aarch64/extend-vec.c b/gcc/testsuite/gcc.target/aarch64/extend-vec.c
new file mode 100644
index 00000000000..f62418888d5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/extend-vec.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+/* { dg-final { scan-assembler-times {fcvtl\tv[0-9]+.2d, v[0-9]+.2s} 1 } } */
+void
+f (float *__restrict a, double *__restrict b)
+{
+  b[0] = a[0];
+  b[1] = a[1];
+}
+
+/* { dg-final { scan-assembler-times {fcvtl\tv[0-9]+.4s, v[0-9]+.4h} 1 } } */
+void
+f1 (_Float16 *__restrict a, float *__restrict b)
+{
+
+  b[0] = a[0];
+  b[1] = a[1];
+  b[2] = a[2];
+  b[3] = a[3];
+}
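
For readers unfamiliar with the token pasting in the new VAR1 definition, here is a minimal standalone sketch in plain C of how the two invocations make the legacy builtin codes aliases of the standard optab codes. It is an illustration only, not GCC source: the enum values are made up, and an anonymous enum stands in for the patch's constexpr insn_code variables.

```c
#include <assert.h>

/* Stand-in for GCC's generated insn_code enum; the numeric values are
   made up for this illustration.  */
enum insn_code
{
  CODE_FOR_extendv2sfv2df2 = 101,
  CODE_FOR_extendv4hfv4sf2 = 102
};

/* Same token pasting as the patch: F##M forms the legacy builtin name
   and T##N##M##2 forms the standard optab name.  An anonymous enum is
   an assumption of this sketch, replacing the patch's constexpr.  */
#define VAR1(F, T, N, M) \
  enum { CODE_FOR_aarch64_##F##M = CODE_FOR_##T##N##M##2 };

VAR1 (float_extend_lo_, extend, v2sf, v2df)
VAR1 (float_extend_lo_, extend, v4hf, v4sf)

#undef VAR1

/* Hypothetical helper: returns 1 if both legacy names resolve to the
   standard optab codes.  */
int
legacy_codes_alias_optab_codes (void)
{
  assert (CODE_FOR_aarch64_float_extend_lo_v2df == CODE_FOR_extendv2sfv2df2);
  assert (CODE_FOR_aarch64_float_extend_lo_v4sf == CODE_FOR_extendv4hfv4sf2);
  return 1;
}
```

The point of the alias is that existing users of the float_extend_lo_ builtin codes keep working after the define_insn is renamed.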
This patch adds vector floating point extend pattern for V2SF->V2DF and
V4HF->V4SF conversions by renaming the existing aarch64_float_extend_lo_<Vwide>
pattern to the standard optab one, i.e., extend<mode><Vwide>2. This allows the
vectorizer to vectorize certain floating point widening operations for the
aarch64 target.

PR target/113880
PR target/113869

gcc/ChangeLog:

* config/aarch64/aarch64-builtins.cc (VAR1): Remap float_extend_lo_
builtin codes to standard optab ones.
* config/aarch64/aarch64-simd.md (aarch64_float_extend_lo_<Vwide>): Rename
to...
(extend<mode><Vwide>2): ... This.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/extend-vec.c: New test.

Signed-off-by: Pengxuan Zheng <quic_pzheng@quicinc.com>
---
 gcc/config/aarch64/aarch64-builtins.cc | 9 ++++++++
 gcc/config/aarch64/aarch64-simd.md | 2 +-
 gcc/testsuite/gcc.target/aarch64/extend-vec.c | 21 +++++++++++++++++++
 3 files changed, 31 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/extend-vec.c
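
The new test is compile-only. As a complement, a runtime sanity check of the same two-element kernel might look like the sketch below. The names widen2 and widen2_self_check are hypothetical and not part of the patch. The check relies only on float to double extension being exact, so it should hold whether or not the compiler emits fcvtl for the kernel.

```c
#include <assert.h>

/* Same shape as f in extend-vec.c: two adjacent float -> double
   conversions that can be treated as one V2SF -> V2DF extend.  */
void
widen2 (const float *restrict a, double *restrict b)
{
  b[0] = a[0];
  b[1] = a[1];
}

/* float -> double extension is exact, so each result must compare
   equal to its input regardless of how widen2 was compiled.
   Returns 1 on success.  */
int
widen2_self_check (void)
{
  const float in[2] = { 1.5f, -0.1f };
  double out[2] = { 0.0, 0.0 };

  widen2 (in, out);
  assert (out[0] == (double) in[0]);
  assert (out[1] == (double) in[1]);
  return 1;
}
```

Whether a given compiler actually selects fcvtl here depends on its version and options; the scan-assembler-times directives in extend-vec.c are what check that on AArch64.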