Message ID | 1559773022-26575-1-git-send-email-pc@us.ibm.com |
---|---|
State | New |
Headers | show |
Series | [powerpc] get_rounding_mode: utilize faster method to get rounding mode | expand |
"Paul A. Clarke" <pc@us.ibm.com> writes: > Add support to use 'mffsl' instruction if compiled for POWER9 (or later). > > Also, mask the result to avoid bleeding unrelated bits into the result of > _FPU_GET_RC(). > > 2019-06-05 Paul A. Clarke <pc@us.ibm.com> > > * sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Replace code > with call to equivalent function. This file is missing in this patch. > diff --git a/sysdeps/powerpc/fpu/fenv_libc.h b/sysdeps/powerpc/fpu/fenv_libc.h > index 8a0bace..621e57f 100644 > --- a/sysdeps/powerpc/fpu/fenv_libc.h > +++ b/sysdeps/powerpc/fpu/fenv_libc.h > @@ -34,6 +34,21 @@ extern const fenv_t *__fe_mask_env (void) attribute_hidden; > pointer. */ > #define fegetenv_register() __builtin_mffs() > > +/* Equivalent to fegetenv_register, but only returns bits for > + status, exception enables, and mode. */ > +#ifdef _ARCH_PWR9 > +#define fegetenv_status() \ > + ({union { double __d; unsigned long long __ll; } __u; \ > + register double __fr; \ > + __asm__ ("mffsl %0" : "=f" (__fr)); \ > + __u.__d = __fr; \ > + __u.__ll; \ > + }) > +#else > +#define fegetenv_status() __builtin_mffs() > +#error "power8" > +#endif Is this macro used in fegetexcept.c? Is the error intentional?
On 6/5/19 5:42 PM, Tulio Magno Quites Machado Filho wrote: > "Paul A. Clarke" <pc@us.ibm.com> writes: > >> Add support to use 'mffsl' instruction if compiled for POWER9 (or later). >> >> Also, mask the result to avoid bleeding unrelated bits into the result of >> _FPU_GET_RC(). >> >> 2019-06-05 Paul A. Clarke <pc@us.ibm.com> >> >> * sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Replace code >> with call to equivalent function. > > This file is missing in this patch. Copy and paste error from the previous patch I submitted earlier today. Sorry. I will fix when committing, presuming the rest of this patch is OK. PC
On 6/5/19 5:42 PM, Tulio Magno Quites Machado Filho wrote: > "Paul A. Clarke" <pc@us.ibm.com> writes: > >> diff --git a/sysdeps/powerpc/fpu/fenv_libc.h b/sysdeps/powerpc/fpu/fenv_libc.h >> index 8a0bace..621e57f 100644 >> --- a/sysdeps/powerpc/fpu/fenv_libc.h >> +++ b/sysdeps/powerpc/fpu/fenv_libc.h >> @@ -34,6 +34,21 @@ extern const fenv_t *__fe_mask_env (void) attribute_hidden; >> pointer. */ >> #define fegetenv_register() __builtin_mffs() >> >> +/* Equivalent to fegetenv_register, but only returns bits for >> + status, exception enables, and mode. */ >> +#ifdef _ARCH_PWR9 >> +#define fegetenv_status() \ >> + ({union { double __d; unsigned long long __ll; } __u; \ >> + register double __fr; \ >> + __asm__ ("mffsl %0" : "=f" (__fr)); \ >> + __u.__d = __fr; \ >> + __u.__ll; \ >> + }) >> +#else >> +#define fegetenv_status() __builtin_mffs() >> +#error "power8" >> +#endif > > Is this macro used in fegetexcept.c? > Is the error intentional? This hunk should not be in this patch submission. I shall remove it. Sorry I missed this comment before. PC
"Paul A. Clarke" <pc@us.ibm.com> writes: > From: "Paul A. Clarke" <pc@us.ibm.com> > > Add support to use 'mffsl' instruction if compiled for POWER9 (or later). > > Also, mask the result to avoid bleeding unrelated bits into the result of > _FPU_GET_RC(). > > 2019-06-05 Paul A. Clarke <pc@us.ibm.com> > > * sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Replace code > with call to equivalent function. > * sysdeps/powerpc/fpu/fenv_libc.h (fegetenv_status): New function. As you explained, both entries have to be removed. > * sysdeps/powerpc/fpu_control.h (_FPU_MASK_RC): New. > (__FPU_MFFS): New. > (__FPU_MFFSL): New. > (_FPU_GET_RC): New. > (_FPU_GETCW): Use __FPU_MFFS(). ^ Extra spaces here. Should be only 1 tab. > * sysdeps/powerpc/powerpc64/get-rounding-mode.h: New file. I also think this file is unnecessary. > diff --git a/sysdeps/powerpc/powerpc64/get-rounding-mode.h b/sysdeps/powerpc/powerpc64/get-rounding-mode.h > new file mode 100644 > index 0000000..e2fdbbb > --- /dev/null > +++ b/sysdeps/powerpc/powerpc64/get-rounding-mode.h > @@ -0,0 +1,33 @@ > +/* Determine floating-point rounding mode within libc. powerpc64 version. > + Copyright (C) 2019 Free Software Foundation, Inc. > + This file is part of the GNU C Library. > + > + The GNU C Library is free software; you can redistribute it and/or > + modify it under the terms of the GNU Lesser General Public > + License as published by the Free Software Foundation; either > + version 2.1 of the License, or (at your option) any later version. > + > + The GNU C Library is distributed in the hope that it will be useful, > + but WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + Lesser General Public License for more details. > + > + You should have received a copy of the GNU Lesser General Public > + License along with the GNU C Library; if not, see > + <http://www.gnu.org/licenses/>. */ > + > +#ifndef _POWERPC64_GET_ROUNDING_MODE_H > +#define _POWERPC64_GET_ROUNDING_MODE_H 1 > + > +#include <fenv.h> > +#include <fpu_control.h> > + > +/* Return the floating-point rounding mode. */ > + > +static inline int > +get_rounding_mode (void) > +{ > + return _FPU_GET_RC (); > +} > + > +#endif /* get-rounding-mode.h */ I don't understand why this file/function is needed. Looks like another file that should be removed from this patch. LGTM if you fix the ChangeLog and keep only the changes to sysdeps/powerpc/fpu_control.h. Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>
On 6/6/19 9:17 AM, Tulio Magno Quites Machado Filho wrote: > "Paul A. Clarke" <pc@us.ibm.com> writes: >> Add support to use 'mffsl' instruction if compiled for POWER9 (or later). >> >> Also, mask the result to avoid bleeding unrelated bits into the result of >> _FPU_GET_RC(). >> >> 2019-06-05 Paul A. Clarke <pc@us.ibm.com> >> >> * sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Replace code >> with call to equivalent function. >> * sysdeps/powerpc/fpu/fenv_libc.h (fegetenv_status): New function. > > As you explained, both entries have to be removed. Yes, sorry about the cruft. >> * sysdeps/powerpc/fpu_control.h (_FPU_MASK_RC): New. >> (__FPU_MFFS): New. >> (__FPU_MFFSL): New. >> (_FPU_GET_RC): New. >> (_FPU_GETCW): Use __FPU_MFFS(). > > ^ Extra spaces here. Should be only 1 tab. OK. >> * sysdeps/powerpc/powerpc64/get-rounding-mode.h: New file. > > I also think this file is unnecessary. (discussed below) >> diff --git a/sysdeps/powerpc/powerpc64/get-rounding-mode.h b/sysdeps/powerpc/powerpc64/get-rounding-mode.h >> new file mode 100644 >> index 0000000..e2fdbbb >> --- /dev/null >> +++ b/sysdeps/powerpc/powerpc64/get-rounding-mode.h >> @@ -0,0 +1,33 @@ >> +/* Determine floating-point rounding mode within libc. powerpc64 version. <snip> >> +static inline int >> +get_rounding_mode (void) >> +{ >> + return _FPU_GET_RC (); >> +} >> + >> +#endif /* get-rounding-mode.h */ > > I don't understand why this file/function is needed. > Looks like another file that should be removed from this patch. Without this file, the generic file is used, which calls _FPU_GETCW(), which is used generically to get the entire floating point status control register (FPSCR) and must resolve to using the slower "mffs". get_rounding_mode() only needs the rounding mode, and the new-in-this-patch _FPU_GET_RC() used by this new file resolves to the faster "mffsl" (if compiled for POWER9). > LGTM if you fix the ChangeLog and keep only the changes to > sysdeps/powerpc/fpu_control.h. > > Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com> PC
Paul Clarke <pc@us.ibm.com> writes: > On 6/6/19 9:17 AM, Tulio Magno Quites Machado Filho wrote: >> "Paul A. Clarke" <pc@us.ibm.com> writes: >>> diff --git a/sysdeps/powerpc/powerpc64/get-rounding-mode.h b/sysdeps/powerpc/powerpc64/get-rounding-mode.h >>> new file mode 100644 >>> index 0000000..e2fdbbb >>> --- /dev/null >>> +++ b/sysdeps/powerpc/powerpc64/get-rounding-mode.h >>> @@ -0,0 +1,33 @@ >>> +/* Determine floating-point rounding mode within libc. powerpc64 version. > <snip> >>> +static inline int >>> +get_rounding_mode (void) >>> +{ >>> + return _FPU_GET_RC (); >>> +} >>> + >>> +#endif /* get-rounding-mode.h */ >> >> I don't understand why this file/function is needed. >> Looks like another file that should be removed from this patch. > > Without this file, the generic file is used, which calls _FPU_GETCW(), which is used generically to get the entire floating point status control register (FPSCR) and must resolve to using the slower "mffs". get_rounding_mode() only needs the rounding mode, and the new-in-this-patch _FPU_GET_RC() used by this new file resolves to the faster "mffsl" (if compiled for POWER9). Very good point. I had missed this. In that case, I think you should move this file to sysdeps/powerpc/fpu/get-rounding-mode.h allowing 32-bits builds to use it and restricting its usage to fpu.
diff --git a/sysdeps/powerpc/fpu/fenv_libc.h b/sysdeps/powerpc/fpu/fenv_libc.h index 8a0bace..621e57f 100644 --- a/sysdeps/powerpc/fpu/fenv_libc.h +++ b/sysdeps/powerpc/fpu/fenv_libc.h @@ -34,6 +34,21 @@ extern const fenv_t *__fe_mask_env (void) attribute_hidden; pointer. */ #define fegetenv_register() __builtin_mffs() +/* Equivalent to fegetenv_register, but only returns bits for + status, exception enables, and mode. */ +#ifdef _ARCH_PWR9 +#define fegetenv_status() \ + ({union { double __d; unsigned long long __ll; } __u; \ + register double __fr; \ + __asm__ ("mffsl %0" : "=f" (__fr)); \ + __u.__d = __fr; \ + __u.__ll; \ + }) +#else +#define fegetenv_status() __builtin_mffs() +#error "power8" +#endif + /* Equivalent to fesetenv, but takes a fenv_t instead of a pointer. */ #define fesetenv_register(env) \ do { \ diff --git a/sysdeps/powerpc/fpu_control.h b/sysdeps/powerpc/fpu_control.h index e0c5cf6..22c52b7 100644 --- a/sysdeps/powerpc/fpu_control.h +++ b/sysdeps/powerpc/fpu_control.h @@ -71,6 +71,8 @@ extern fpu_control_t __fpu_control; # define _FPU_RC_UP 0x02 # define _FPU_RC_ZERO 0x01 +# define _FPU_MASK_RC (_FPU_RC_NEAREST|_FPU_RC_DOWN|_FPU_RC_UP|_FPU_RC_ZERO) + # define _FPU_MASK_NI 0x04 /* non-ieee mode */ /* masking of interrupts */ @@ -94,15 +96,36 @@ extern fpu_control_t __fpu_control; typedef unsigned int fpu_control_t; /* Macros for accessing the hardware control word. */ +# define __FPU_MFFS() \ + ({register double __fr; \ + __asm__ ("mffs %0" : "=f" (__fr)); \ + __fr; \ + }) + # define _FPU_GETCW(cw) \ ({union { double __d; unsigned long long __ll; } __u; \ - register double __fr; \ - __asm__ ("mffs %0" : "=f" (__fr)); \ - __u.__d = __fr; \ + __u.__d = __FPU_MFFS(); \ (cw) = (fpu_control_t) __u.__ll; \ (fpu_control_t) __u.__ll; \ }) +#ifdef _ARCH_PWR9 +# define __FPU_MFFSL() \ + ({register double __fr; \ + __asm__ ("mffsl %0" : "=f" (__fr)); \ + __fr; \ + }) +#else +# define __FPU_MFFSL() __FPU_MFFS() +#endif + +# define _FPU_GET_RC() \ + ({union { double __d; unsigned long long __ll; } __u; \ + __u.__d = __FPU_MFFSL(); \ + __u.__ll &= _FPU_MASK_RC; \ + (fpu_control_t) __u.__ll; \ + }) + # define _FPU_SETCW(cw) \ { union { double __d; unsigned long long __ll; } __u; \ register double __fr; \ diff --git a/sysdeps/powerpc/powerpc64/get-rounding-mode.h b/sysdeps/powerpc/powerpc64/get-rounding-mode.h new file mode 100644 index 0000000..e2fdbbb --- /dev/null +++ b/sysdeps/powerpc/powerpc64/get-rounding-mode.h @@ -0,0 +1,33 @@ +/* Determine floating-point rounding mode within libc. powerpc64 version. + Copyright (C) 2019 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +#ifndef _POWERPC64_GET_ROUNDING_MODE_H +#define _POWERPC64_GET_ROUNDING_MODE_H 1 + +#include <fenv.h> +#include <fpu_control.h> + +/* Return the floating-point rounding mode. */ + +static inline int +get_rounding_mode (void) +{ + return _FPU_GET_RC (); +} + +#endif /* get-rounding-mode.h */
From: "Paul A. Clarke" <pc@us.ibm.com> Add support to use 'mffsl' instruction if compiled for POWER9 (or later). Also, mask the result to avoid bleeding unrelated bits into the result of _FPU_GET_RC(). 2019-06-05 Paul A. Clarke <pc@us.ibm.com> * sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Replace code with call to equivalent function. * sysdeps/powerpc/fpu/fenv_libc.h (fegetenv_status): New function. * sysdeps/powerpc/fpu_control.h (_FPU_MASK_RC): New. (__FPU_MFFS): New. (__FPU_MFFSL): New. (_FPU_GET_RC): New. (_FPU_GETCW): Use __FPU_MFFS(). * sysdeps/powerpc/powerpc64/get-rounding-mode.h: New file. --- sysdeps/powerpc/fpu/fenv_libc.h | 15 ++++++++++++ sysdeps/powerpc/fpu_control.h | 29 ++++++++++++++++++++--- sysdeps/powerpc/powerpc64/get-rounding-mode.h | 33 +++++++++++++++++++++++++++ 3 files changed, 74 insertions(+), 3 deletions(-) create mode 100644 sysdeps/powerpc/powerpc64/get-rounding-mode.h