Message ID | 1575297977-2589-14-git-send-email-stli@linux.ibm.com |
---|---|
State | New |
Headers | show |
Series | Use GCC builtins for some math functions if desired. | expand |
On Mon, 2 Dec 2019, Stefan Liebler wrote: > This patch is using the corresponding GCC builtin for copysignf, copysign, > copysignl and copysignf128 if the USE_FUNCTION_BUILTIN macros are defined to one > in math-use-builtins.h. I believe this is always safe for these implementations (the only case where GCC might not expand copysign functions inline is copysignl for IBM long double, in the soft-float case).
On 12/2/19 10:00 PM, Joseph Myers wrote: > On Mon, 2 Dec 2019, Stefan Liebler wrote: > >> This patch is using the corresponding GCC builtin for copysignf, copysign, >> copysignl and copysignf128 if the USE_FUNCTION_BUILTIN macros are defined to one >> in math-use-builtins.h. > > I believe this is always safe for these implementations (the only case > where GCC might not expand copysign functions inline is copysignl for IBM > long double, in the soft-float case). > Thus you mean we can do the following preset in sysdeps/generic/math-use-builtins.h? #define USE_COPYSIGN_BUILTIN 1 #define USE_COPYSIGNF_BUILTIN 1 #define USE_COPYSIGNL_BUILTIN 0 #define USE_COPYSIGNF128_BUILTIN 0 Or even also set USE_COPYSIGNL_BUILTIN to one as IBM long double has its own implementation in ./sysdeps/ieee754/ldbl-128ibm/s_copysignl.c. On the other hand, I'm not able to run tests on the different architectures. Thus the safest way would be to leave all those macros set to zero and let the architectures decide. In each case, I would prefer to set those macros to one in a separate patch. Then this patch could be reverted in case of failing on one architecture.
On Tue, 3 Dec 2019, Stefan Liebler wrote: > On 12/2/19 10:00 PM, Joseph Myers wrote: > > On Mon, 2 Dec 2019, Stefan Liebler wrote: > > > > > This patch is using the corresponding GCC builtin for copysignf, copysign, > > > copysignl and copysignf128 if the USE_FUNCTION_BUILTIN macros are defined > > > to one > > > in math-use-builtins.h. > > > > I believe this is always safe for these implementations (the only case > > where GCC might not expand copysign functions inline is copysignl for IBM > > long double, in the soft-float case). > > > Thus you mean we can do the following preset in > sysdeps/generic/math-use-builtins.h? > #define USE_COPYSIGN_BUILTIN 1 > #define USE_COPYSIGNF_BUILTIN 1 > #define USE_COPYSIGNL_BUILTIN 0 > #define USE_COPYSIGNF128_BUILTIN 0 > > Or even also set USE_COPYSIGNL_BUILTIN to one as IBM long double has its own > implementation in ./sysdeps/ieee754/ldbl-128ibm/s_copysignl.c. I think they can all be 1, given that IBM long double has its own implementation. Except that you shouldn't need the indirection through these macros at all. Just use __builtin_copysign etc. directly in the implementations (other than IBM long double), unconditionally, the same way that sysdeps/ieee754/dbl-64/s_fabs.c uses __builtin_fabs directly without such macros being needed.
On 12/3/19 5:50 PM, Joseph Myers wrote: > On Tue, 3 Dec 2019, Stefan Liebler wrote: > >> On 12/2/19 10:00 PM, Joseph Myers wrote: >>> On Mon, 2 Dec 2019, Stefan Liebler wrote: >>> >>>> This patch is using the corresponding GCC builtin for copysignf, copysign, >>>> copysignl and copysignf128 if the USE_FUNCTION_BUILTIN macros are defined >>>> to one >>>> in math-use-builtins.h. >>> >>> I believe this is always safe for these implementations (the only case >>> where GCC might not expand copysign functions inline is copysignl for IBM >>> long double, in the soft-float case). >>> >> Thus you mean we can do the following preset in >> sysdeps/generic/math-use-builtins.h? >> #define USE_COPYSIGN_BUILTIN 1 >> #define USE_COPYSIGNF_BUILTIN 1 >> #define USE_COPYSIGNL_BUILTIN 0 >> #define USE_COPYSIGNF128_BUILTIN 0 >> >> Or even also set USE_COPYSIGNL_BUILTIN to one as IBM long double has its own >> implementation in ./sysdeps/ieee754/ldbl-128ibm/s_copysignl.c. > > I think they can all be 1, given that IBM long double has its own > implementation. Except that you shouldn't need the indirection through > these macros at all. Just use __builtin_copysign etc. directly in the > implementations (other than IBM long double), unconditionally, the same > way that sysdeps/ieee754/dbl-64/s_fabs.c uses __builtin_fabs directly > without such macros being needed. > Okay. I've set them to 1 and also used build-many-glibcs.py and had a look into the s_copysign*.os files. There is no function call to copysign* itself. If one architecture encounters issues with those builtins, an architecture specific math-use-builtins.sh file can set the macros to zero. Bye, Stefan
On Wed, 4 Dec 2019, Stefan Liebler wrote: > > I think they can all be 1, given that IBM long double has its own > > implementation. Except that you shouldn't need the indirection through > > these macros at all. Just use __builtin_copysign etc. directly in the > > implementations (other than IBM long double), unconditionally, the same > > way that sysdeps/ieee754/dbl-64/s_fabs.c uses __builtin_fabs directly > > without such macros being needed. > > > > Okay. I've set them to 1 and also used build-many-glibcs.py and had a look > into the s_copysign*.os files. There is no function call to copysign* itself. I think we should handle this like fabs and not have the macros at all, just call __builtin_copysign etc. unconditionally.
On 12/4/19 2:20 PM, Joseph Myers wrote: > On Wed, 4 Dec 2019, Stefan Liebler wrote: > >>> I think they can all be 1, given that IBM long double has its own >>> implementation. Except that you shouldn't need the indirection through >>> these macros at all. Just use __builtin_copysign etc. directly in the >>> implementations (other than IBM long double), unconditionally, the same >>> way that sysdeps/ieee754/dbl-64/s_fabs.c uses __builtin_fabs directly >>> without such macros being needed. >>> >> >> Okay. I've set them to 1 and also used build-many-glibcs.py and had a look >> into the s_copysign*.os files. There is no function call to copysign* itself. > > I think we should handle this like fabs and not have the macros at all, > just call __builtin_copysign etc. unconditionally. > Okay. Then I've removed the macros for float and double and call the builtins unconditionally. But I still need the long double version as the f128 builtin is not available with every GCC version. Bye, Stefan
On Wed, 4 Dec 2019, Stefan Liebler wrote: > Okay. Then I've removed the macros for float and double and call the builtins > unconditionally. > But I still need the long double version as the f128 builtin is not available > with every GCC version. I think the _Float128 version could be defined to 1 for GCC 7 and later (so both long double and _Float128 versions could be removed once we require GCC 7 or later to build glibc).
On 12/4/19 9:43 PM, Joseph Myers wrote: > On Wed, 4 Dec 2019, Stefan Liebler wrote: > >> Okay. Then I've removed the macros for float and double and call the builtins >> unconditionally. >> But I still need the long double version as the f128 builtin is not available >> with every GCC version. > > I think the _Float128 version could be defined to 1 for GCC 7 and later > (so both long double and _Float128 versions could be removed once we > require GCC 7 or later to build glibc). > Okay. I've defined it to 1 for GCC 7 and later. Bye Stefan
diff --git a/sysdeps/generic/math-use-builtins.h b/sysdeps/generic/math-use-builtins.h index 34ca438a8c..a22d787791 100644 --- a/sysdeps/generic/math-use-builtins.h +++ b/sysdeps/generic/math-use-builtins.h @@ -51,4 +51,9 @@ #define USE_ROUNDL_BUILTIN 0 #define USE_ROUNDF128_BUILTIN 0 +#define USE_COPYSIGN_BUILTIN 0 +#define USE_COPYSIGNF_BUILTIN 0 +#define USE_COPYSIGNL_BUILTIN 0 +#define USE_COPYSIGNF128_BUILTIN 0 + #endif /* math-use-builtins.h */ diff --git a/sysdeps/ieee754/dbl-64/s_copysign.c b/sysdeps/ieee754/dbl-64/s_copysign.c index 589b088c95..0be1a6420f 100644 --- a/sysdeps/ieee754/dbl-64/s_copysign.c +++ b/sysdeps/ieee754/dbl-64/s_copysign.c @@ -10,7 +10,9 @@ * ==================================================== */ -#if defined(LIBM_SCCS) && !defined(lint) +#include <math-use-builtins.h> + +#if ! USE_COPYSIGN_BUILTIN && defined (LIBM_SCCS) && ! defined (lint) static char rcsid[] = "$NetBSD: s_copysign.c,v 1.8 1995/05/10 20:46:57 jtc Exp $"; #endif @@ -28,10 +30,14 @@ static char rcsid[] = "$NetBSD: s_copysign.c,v 1.8 1995/05/10 20:46:57 jtc Exp $ double __copysign (double x, double y) { +#if USE_COPYSIGN_BUILTIN + return __builtin_copysign (x, y); +#else uint32_t hx, hy; GET_HIGH_WORD (hx, x); GET_HIGH_WORD (hy, y); SET_HIGH_WORD (x, (hx & 0x7fffffff) | (hy & 0x80000000)); return x; +#endif } libm_alias_double (__copysign, copysign) diff --git a/sysdeps/ieee754/float128/float128_private.h b/sysdeps/ieee754/float128/float128_private.h index a6c76ce364..7f7f904152 100644 --- a/sysdeps/ieee754/float128/float128_private.h +++ b/sysdeps/ieee754/float128/float128_private.h @@ -152,6 +152,8 @@ #define USE_TRUNCL_BUILTIN USE_TRUNCF128_BUILTIN #undef USE_ROUNDL_BUILTIN #define USE_ROUNDL_BUILTIN USE_ROUNDF128_BUILTIN +#undef USE_COPYSIGNL_BUILTIN +#define USE_COPYSIGNL_BUILTIN USE_COPYSIGNF128_BUILTIN /* IEEE function renames. */ #define __ieee754_acoshl __ieee754_acoshf128 @@ -361,6 +363,7 @@ #define __builtin_ceill __builtin_ceilf128 #define __builtin_truncl __builtin_truncf128 #define __builtin_roundl __builtin_roundf128 +#define __builtin_copysignl __builtin_copysignf128 /* Get the constant suffix from bits/floatn-compat.h. */ #define L(x) __f128 (x) diff --git a/sysdeps/ieee754/flt-32/s_copysignf.c b/sysdeps/ieee754/flt-32/s_copysignf.c index 77d1d90e92..9a9e6389cd 100644 --- a/sysdeps/ieee754/flt-32/s_copysignf.c +++ b/sysdeps/ieee754/flt-32/s_copysignf.c @@ -13,7 +13,9 @@ * ==================================================== */ -#if defined(LIBM_SCCS) && !defined(lint) +#include <math-use-builtins.h> + +#if ! USE_COPYSIGNF_BUILTIN && defined (LIBM_SCCS) && ! defined (lint) static char rcsid[] = "$NetBSD: s_copysignf.c,v 1.4 1995/05/10 20:46:59 jtc Exp $"; #endif @@ -28,12 +30,17 @@ static char rcsid[] = "$NetBSD: s_copysignf.c,v 1.4 1995/05/10 20:46:59 jtc Exp #include <math_private.h> #include <libm-alias-float.h> -float __copysignf(float x, float y) +float +__copysignf (float x, float y) { - uint32_t ix,iy; - GET_FLOAT_WORD(ix,x); - GET_FLOAT_WORD(iy,y); - SET_FLOAT_WORD(x,(ix&0x7fffffff)|(iy&0x80000000)); - return x; +#if USE_COPYSIGNF_BUILTIN + return __builtin_copysignf (x, y); +#else + uint32_t ix, iy; + GET_FLOAT_WORD (ix, x); + GET_FLOAT_WORD (iy, y); + SET_FLOAT_WORD (x, (ix & 0x7fffffff) | (iy & 0x80000000)); + return x; +#endif } libm_alias_float (__copysign, copysign) diff --git a/sysdeps/ieee754/ldbl-128/s_copysignl.c b/sysdeps/ieee754/ldbl-128/s_copysignl.c index a501139f71..6095b7fc73 100644 --- a/sysdeps/ieee754/ldbl-128/s_copysignl.c +++ b/sysdeps/ieee754/ldbl-128/s_copysignl.c @@ -13,7 +13,9 @@ * ==================================================== */ -#if defined(LIBM_SCCS) && !defined(lint) +#include <math-use-builtins.h> + +#if ! USE_COPYSIGNL_BUILTIN && defined (LIBM_SCCS) && ! defined (lint) static char rcsid[] = "$NetBSD: $"; #endif @@ -28,13 +30,18 @@ static char rcsid[] = "$NetBSD: $"; #include <math_private.h> #include <libm-alias-ldouble.h> -_Float128 __copysignl(_Float128 x, _Float128 y) +_Float128 +__copysignl (_Float128 x, _Float128 y) { - uint64_t hx,hy; - GET_LDOUBLE_MSW64(hx,x); - GET_LDOUBLE_MSW64(hy,y); - SET_LDOUBLE_MSW64(x,(hx&0x7fffffffffffffffULL) - |(hy&0x8000000000000000ULL)); - return x; +#if USE_COPYSIGNL_BUILTIN + return __builtin_copysignl (x, y); +#else + uint64_t hx, hy; + GET_LDOUBLE_MSW64 (hx, x); + GET_LDOUBLE_MSW64 (hy, y); + SET_LDOUBLE_MSW64 (x, (hx & 0x7fffffffffffffffULL) + | (hy & 0x8000000000000000ULL)); + return x; +#endif } libm_alias_ldouble (__copysign, copysign) diff --git a/sysdeps/s390/fpu/math-use-builtins.h b/sysdeps/s390/fpu/math-use-builtins.h index 5838a31c50..ed2a05775f 100644 --- a/sysdeps/s390/fpu/math-use-builtins.h +++ b/sysdeps/s390/fpu/math-use-builtins.h @@ -50,6 +50,13 @@ # define USE_ROUNDF_BUILTIN 1 # define USE_ROUNDL_BUILTIN 1 +/* GCC emits the z9-ec zarch "copy sign" instruction for these + builtins if build with at least --march=z196 -mzarch. + Note: z196 is no typo but a bug in GCC! */ +# define USE_COPYSIGN_BUILTIN 1 +# define USE_COPYSIGNF_BUILTIN 1 +# define USE_COPYSIGNL_BUILTIN 1 + # if __GNUC_PREREQ (8, 1) # define USE_NEARBYINTF128_BUILTIN 1 # define USE_RINTF128_BUILTIN 1 @@ -57,6 +64,7 @@ # define USE_CEILF128_BUILTIN 1 # define USE_TRUNCF128_BUILTIN 1 # define USE_ROUNDF128_BUILTIN 1 +# define USE_COPYSIGNF128_BUILTIN 1 # else # define USE_NEARBYINTF128_BUILTIN 0 # define USE_RINTF128_BUILTIN 0 @@ -64,6 +72,7 @@ # define USE_CEILF128_BUILTIN 0 # define USE_TRUNCF128_BUILTIN 0 # define USE_ROUNDF128_BUILTIN 0 +# define USE_COPYSIGNF128_BUILTIN 0 # endif #else