Message ID | 00b637b5-eb0e-4f22-b655-7b4dd49fb0f6@linux.ibm.com |
---|---|
State | New |
Headers | show |
Series | rs6000, remove __builtin_vsx_xvcmp* built-ins | expand |
Hi Carl, on 2024/7/17 23:52, Carl Love wrote: > GCC maintainers: > > The following patch removes the three __builtin_vsx_xvcmp[eq|ge|gt]sp builtins as they similar to the overloaded vec_cmp[eq|ge|gt] built-ins. The difference is the overloaded built-ins return a vector of boolean or a vector of long long booleans where as the removed built-ins returned a vector of floats or vector of doubles. > > The tests for __builtin_vsx_xvcmp[eq|ge|gt]sp and __builtin_vsx_xvcmp[eq|ge|gt]dp are updated to use the overloaded vec_cmp[eq|ge|gt] built-in with the required changes for the return type. Note __builtin_vsx_xvcmp[eq|ge|gt]dp are used internally. > > The patches have been tested on a Power 10 LE system with no regressions. > > Please let me know if the patch is acceptable for mainline. Thanks. > > Carl > --------------------------------------------------------------------------------- > rs6000, remove __builtin_vsx_xvcmp* built-ins > > This patch removes the built-ins: > __builtin_vsx_xvcmpeqsp, __builtin_vsx_xvcmpgesp, > __builtin_vsx_xvcmpgtsp. > > which are similar to the overloaded vec_cmpeq, vec_cmpgt and vec_cmpge > built-ins. The important thing is that vec_cmp{eq,ge,gt} are PVIPR built-in functions, which are more recommended. > > The difference is that the overloaded built-ins return a vector of > booleans or a vector of long long boolean depending if the inputs were a > vector of floats or a vector of doubles. The removed built-ins > returned a vector of floats or vector of double for the vector float and > vector double inputs respectively. This paragraph is a bit confusing, as the previous paragraph is saying to remove those *sp built-ins, then they can only return a bool int vector (not a bool long long vector). It needs some rewording. > > The __builtin_vsx_xvcmpeqdp, __builtin_vsx_xvcmpgedp and > __builtin_vsx_xvcmpgtdp are not removed as they are used by the > overloaded vec_cmpeq, vec_cmpgt and vec_cmpge built-ins. > > The test cases for the __builtin_vsx_xvcmpeqsp, __builtin_vsx_xvcmpgesp, > __builtin_vsx_xvcmpgtsp, __builtin_vsx_xvcmpeqdp, > __builtin_vsx_xvcmpgedp and __builtin_vsx_xvcmpgtdp are changed to use > the overloaded vec_cmpeq, vec_cmpgt, vec_cmpge built-ins. Use of the > overloaded built-ins requires the result to be stored in a vector of > boolean of the appropriate size or the result must be cast to the return > type used by the original __builtin_vsx_xvcmp* built-ins. It is missing a changelog here. > --- > gcc/config/rs6000/rs6000-builtins.def | 10 ------- > .../gcc.target/powerpc/vsx-builtin-3.c | 28 ++++++++++++++----- > 2 files changed, 21 insertions(+), 17 deletions(-) > > diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def > index 77eb0f7e406..896d9686ac6 100644 > --- a/gcc/config/rs6000/rs6000-builtins.def > +++ b/gcc/config/rs6000/rs6000-builtins.def > @@ -1579,30 +1579,20 @@ > const signed int __builtin_vsx_xvcmpeqdp_p (signed int, vd, vd); > XVCMPEQDP_P vector_eq_v2df_p {pred} > > - const vf __builtin_vsx_xvcmpeqsp (vf, vf); > - XVCMPEQSP vector_eqv4sf {} > - > const vd __builtin_vsx_xvcmpgedp (vd, vd); > XVCMPGEDP vector_gev2df {} > > const signed int __builtin_vsx_xvcmpgedp_p (signed int, vd, vd); > XVCMPGEDP_P vector_ge_v2df_p {pred} > > - const vf __builtin_vsx_xvcmpgesp (vf, vf); > - XVCMPGESP vector_gev4sf {} > - > const signed int __builtin_vsx_xvcmpgesp_p (signed int, vf, vf); > XVCMPGESP_P vector_ge_v4sf_p {pred} > > const vd __builtin_vsx_xvcmpgtdp (vd, vd); > XVCMPGTDP vector_gtv2df {} > - Unexpected remove. > const signed int __builtin_vsx_xvcmpgtdp_p (signed int, vd, vd); > XVCMPGTDP_P vector_gt_v2df_p {pred} > > - const vf __builtin_vsx_xvcmpgtsp (vf, vf); > - XVCMPGTSP vector_gtv4sf {} > - > const signed int __builtin_vsx_xvcmpgtsp_p (signed int, vf, vf); > XVCMPGTSP_P vector_gt_v4sf_p {pred} > > diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c > index 60f91aad23c..d67f97c8011 100644 > --- a/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c > +++ b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c > @@ -156,13 +156,27 @@ int do_cmp (void) > { > int i = 0; > > - d[i][0] = __builtin_vsx_xvcmpeqdp (d[i][1], d[i][2]); i++; > - d[i][0] = __builtin_vsx_xvcmpgtdp (d[i][1], d[i][2]); i++; > - d[i][0] = __builtin_vsx_xvcmpgedp (d[i][1], d[i][2]); i++; > - > - f[i][0] = __builtin_vsx_xvcmpeqsp (f[i][1], f[i][2]); i++; > - f[i][0] = __builtin_vsx_xvcmpgtsp (f[i][1], f[i][2]); i++; > - f[i][0] = __builtin_vsx_xvcmpgesp (f[i][1], f[i][2]); i++; > + /* The __builtin_vsx_xvcmp[gt|ge|eq]dp and __builtin_vsx_xvcmp[gt|ge|eq]sp > + have been removed in favor of the overloaded vec_cmpeq, vec_cmpgt and > + vec_cmpge built-ins. The __builtin_vsx_xvcmp* builtins returned a vector > + result of the same type as the arguments. The vec_cmp* built-ins return > + a vector of boolenas of the same size as the arguments. Thus the result > + assignment must be to a boolean or cast to a boolean. Test both cases. > + */ > + > + d[i][0] = (vector double) vec_cmpeq (d[i][1], d[i][2]); i++; > + d[i][0] = (vector double) vec_cmpgt (d[i][1], d[i][2]); i++; > + d[i][0] = (vector double) vec_cmpge (d[i][1], d[i][2]); i++; > + bl[i][0] = vec_cmpeq (d[i][1], d[i][2]); i++; > + bl[i][0] = vec_cmpgt (d[i][1], d[i][2]); i++; > + bl[i][0] = vec_cmpge (d[i][1], d[i][2]); i++; > + > + f[i][0] = (vector float) vec_cmpeq (f[i][1], f[i][2]); i++; > + f[i][0] = (vector float) vec_cmpgt (f[i][1], f[i][2]); i++; > + f[i][0] = (vector float) vec_cmpge (f[i][1], f[i][2]); i++; > + bi[i][0] = vec_cmpeq (f[i][1], f[i][2]); i++; > + bi[i][0] = vec_cmpgt (f[i][1], f[i][2]); i++; > + bi[i][0] = vec_cmpge (f[i][1], f[i][2]); i++; > return i; > } > BR, Kewen
diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def index 77eb0f7e406..896d9686ac6 100644 --- a/gcc/config/rs6000/rs6000-builtins.def +++ b/gcc/config/rs6000/rs6000-builtins.def @@ -1579,30 +1579,20 @@ const signed int __builtin_vsx_xvcmpeqdp_p (signed int, vd, vd); XVCMPEQDP_P vector_eq_v2df_p {pred} - const vf __builtin_vsx_xvcmpeqsp (vf, vf); - XVCMPEQSP vector_eqv4sf {} - const vd __builtin_vsx_xvcmpgedp (vd, vd); XVCMPGEDP vector_gev2df {} const signed int __builtin_vsx_xvcmpgedp_p (signed int, vd, vd); XVCMPGEDP_P vector_ge_v2df_p {pred} - const vf __builtin_vsx_xvcmpgesp (vf, vf); - XVCMPGESP vector_gev4sf {} - const signed int __builtin_vsx_xvcmpgesp_p (signed int, vf, vf); XVCMPGESP_P vector_ge_v4sf_p {pred} const vd __builtin_vsx_xvcmpgtdp (vd, vd); XVCMPGTDP vector_gtv2df {} - const signed int __builtin_vsx_xvcmpgtdp_p (signed int, vd, vd); XVCMPGTDP_P vector_gt_v2df_p {pred} - const vf __builtin_vsx_xvcmpgtsp (vf, vf); - XVCMPGTSP vector_gtv4sf {} - const signed int __builtin_vsx_xvcmpgtsp_p (signed int, vf, vf); XVCMPGTSP_P vector_gt_v4sf_p {pred} diff --git a/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c index 60f91aad23c..d67f97c8011 100644 --- a/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c +++ b/gcc/testsuite/gcc.target/powerpc/vsx-builtin-3.c @@ -156,13 +156,27 @@ int do_cmp (void) { int i = 0; - d[i][0] = __builtin_vsx_xvcmpeqdp (d[i][1], d[i][2]); i++; - d[i][0] = __builtin_vsx_xvcmpgtdp (d[i][1], d[i][2]); i++; - d[i][0] = __builtin_vsx_xvcmpgedp (d[i][1], d[i][2]); i++; - - f[i][0] = __builtin_vsx_xvcmpeqsp (f[i][1], f[i][2]); i++; - f[i][0] = __builtin_vsx_xvcmpgtsp (f[i][1], f[i][2]); i++; - f[i][0] = __builtin_vsx_xvcmpgesp (f[i][1], f[i][2]); i++; + /* The __builtin_vsx_xvcmp[gt|ge|eq]dp and __builtin_vsx_xvcmp[gt|ge|eq]sp + have been removed in favor of the overloaded vec_cmpeq, vec_cmpgt and + vec_cmpge built-ins. The __builtin_vsx_xvcmp* builtins returned a vector + result of the same type as the arguments. The vec_cmp* built-ins return + a vector of boolenas of the same size as the arguments. Thus the result + assignment must be to a boolean or cast to a boolean. Test both cases. + */ + + d[i][0] = (vector double) vec_cmpeq (d[i][1], d[i][2]); i++; + d[i][0] = (vector double) vec_cmpgt (d[i][1], d[i][2]); i++; + d[i][0] = (vector double) vec_cmpge (d[i][1], d[i][2]); i++; + bl[i][0] = vec_cmpeq (d[i][1], d[i][2]); i++; + bl[i][0] = vec_cmpgt (d[i][1], d[i][2]); i++; + bl[i][0] = vec_cmpge (d[i][1], d[i][2]); i++; + + f[i][0] = (vector float) vec_cmpeq (f[i][1], f[i][2]); i++; + f[i][0] = (vector float) vec_cmpgt (f[i][1], f[i][2]); i++; + f[i][0] = (vector float) vec_cmpge (f[i][1], f[i][2]); i++; + bi[i][0] = vec_cmpeq (f[i][1], f[i][2]); i++;