Message ID | 1559870599-2079-1-git-send-email-pc@us.ibm.com |
---|---|
State | New |
Headers | show |
Series | [powerpc] fegetmode: utilize faster method to get rounding mode | expand |
On 06/06/2019 22:23, Paul A. Clarke wrote: > From: "Paul A. Clarke" <pc@us.ibm.com> > > Add support to use 'mffsl' instruction if compiled for POWER9 (or later). What about using dl_hwcap to check for ISA 3.0 for fegetenv_register (as fesetenv_register does for fdp)? Not sure how performance-wise it would be, but in the other hand it would not require to build glibc using -mcpu=power9 to actually use this instruction. > > 2019-06-06 Paul A. Clarke <pc@us.ibm.com> > > * sysdeps/powerpc/fpu/fenv_libc.h (fegetenv_status): New. > * sysdeps/powerpc/fpu/fegetmode.c (fegetmode): Use fegetenv_status() > instead of fegetenv_register(). > --- > sysdeps/powerpc/fpu/fegetmode.c | 2 +- > sysdeps/powerpc/fpu/fenv_libc.h | 12 ++++++++++++ > 2 files changed, 13 insertions(+), 1 deletion(-) > > diff --git a/sysdeps/powerpc/fpu/fegetmode.c b/sysdeps/powerpc/fpu/fegetmode.c > index f43ab60..466f5b7 100644 > --- a/sysdeps/powerpc/fpu/fegetmode.c > +++ b/sysdeps/powerpc/fpu/fegetmode.c > @@ -21,6 +21,6 @@ > int > fegetmode (femode_t *modep) > { > - *modep = fegetenv_register (); > + *modep = fegetenv_status (); > return 0; > } > diff --git a/sysdeps/powerpc/fpu/fenv_libc.h b/sysdeps/powerpc/fpu/fenv_libc.h > index f8dd1b7..64f7398 100644 > --- a/sysdeps/powerpc/fpu/fenv_libc.h > +++ b/sysdeps/powerpc/fpu/fenv_libc.h > @@ -34,6 +34,18 @@ extern const fenv_t *__fe_mask_env (void) attribute_hidden; > pointer. */ > #define fegetenv_register() __builtin_mffs() > > +/* Equivalent to fegetenv_register, but only returns bits for > + status, exception enables, and mode. */ > +#ifdef _ARCH_PWR9 > +#define fegetenv_status() \ > + ({register double __fr; \ > + __asm__ ("mffsl %0" : "=f" (__fr)); \ > + __fr; \ > + }) > +#else > +#define fegetenv_status() fegetenv_register() > +#endif > + > /* Equivalent to fesetenv, but takes a fenv_t instead of a pointer. */ > #define fesetenv_register(env) \ > do { \ >
On Thu, 6 Jun 2019, Paul A. Clarke wrote: > From: "Paul A. Clarke" <pc@us.ibm.com> > > Add support to use 'mffsl' instruction if compiled for POWER9 (or later). The patch subject says "rounding mode". fegetmode should save all control modes, not just the rounding mode. (The C2x wording is "all the dynamic floating-point control modes".) My understanding is that the change is in fact correct - mffsl does save all control modes (e.g. exception enable and non-IEEE bit), not just the rounding mode (and includes both binary and decimal rounding modes) - and so this is just an issue with the patch description, not the code changes.
Adhemerval, On 6/7/19 9:20 AM, Adhemerval Zanella wrote: > On 06/06/2019 22:23, Paul A. Clarke wrote: >> From: "Paul A. Clarke" <pc@us.ibm.com> >> Add support to use 'mffsl' instruction if compiled for POWER9 (or later). > > What about using dl_hwcap to check for ISA 3.0 for fegetenv_register > (as fesetenv_register does for fdp)? Not sure how performance-wise it would > be, but in the other hand it would not require to build glibc using -mcpu=power9 > to actually use this instruction. Good idea. I think I can do both, where the POWER9 instruction is used conditional on hwcap2, and unconditionally if compiled with -mcpu=power9 (pseudocode, not actual reviewable code, follows): #ifdef _ARCH_PWR9 #define IS_ISA300() 1 #else #define IS_ISA300() (hwcap2 & PPC_FEATURE2_ARCH_3_00) #endif rn = ({ union { double __d; unsigned long long __ll; } __u; if (__builtin_expect(IS_ISA300(),1)) __asm__ volatile ("mffsl %0" : "=f" (__u.__d)); else __u.__d = __builtin_mffs (); __u.__ll & 0x0000000000000003LL; }); PC
On 6/10/19 10:36 AM, Joseph Myers wrote: > On Thu, 6 Jun 2019, Paul A. Clarke wrote: >> From: "Paul A. Clarke" <pc@us.ibm.com> >> Add support to use 'mffsl' instruction if compiled for POWER9 (or later). > > The patch subject says "rounding mode". fegetmode should save all control > modes, not just the rounding mode. (The C2x wording is "all the dynamic > floating-point control modes".) > > My understanding is that the change is in fact correct - mffsl does save > all control modes (e.g. exception enable and non-IEEE bit), not just the > rounding mode (and includes both binary and decimal rounding modes) - and > so this is just an issue with the patch description, not the code changes. Good catch, and your understanding is correct. I will update the description in v2, which will also incorporate Adhemerval's suggestion. Coming soon. PC
On 10/06/2019 15:49, Paul Clarke wrote: > Adhemerval, > > On 6/7/19 9:20 AM, Adhemerval Zanella wrote: >> On 06/06/2019 22:23, Paul A. Clarke wrote: >>> From: "Paul A. Clarke" <pc@us.ibm.com> >>> Add support to use 'mffsl' instruction if compiled for POWER9 (or later). >> >> What about using dl_hwcap to check for ISA 3.0 for fegetenv_register >> (as fesetenv_register does for fdp)? Not sure how performance-wise it would >> be, but in the other hand it would not require to build glibc using -mcpu=power9 >> to actually use this instruction. > > Good idea. I think I can do both, where the POWER9 instruction is used conditional > on hwcap2, and unconditionally if compiled with -mcpu=power9 (pseudocode, not actual > reviewable code, follows): > > #ifdef _ARCH_PWR9 > #define IS_ISA300() 1 > #else > #define IS_ISA300() (hwcap2 & PPC_FEATURE2_ARCH_3_00) > #endif > > rn = ({ > union { double __d; unsigned long long __ll; } __u; > if (__builtin_expect(IS_ISA300(),1)) > __asm__ volatile ("mffsl %0" : "=f" (__u.__d)); > else > __u.__d = __builtin_mffs (); > __u.__ll & 0x0000000000000003LL; > }); > > PC > Yeah, something like that.
diff --git a/sysdeps/powerpc/fpu/fegetmode.c b/sysdeps/powerpc/fpu/fegetmode.c index f43ab60..466f5b7 100644 --- a/sysdeps/powerpc/fpu/fegetmode.c +++ b/sysdeps/powerpc/fpu/fegetmode.c @@ -21,6 +21,6 @@ int fegetmode (femode_t *modep) { - *modep = fegetenv_register (); + *modep = fegetenv_status (); return 0; } diff --git a/sysdeps/powerpc/fpu/fenv_libc.h b/sysdeps/powerpc/fpu/fenv_libc.h index f8dd1b7..64f7398 100644 --- a/sysdeps/powerpc/fpu/fenv_libc.h +++ b/sysdeps/powerpc/fpu/fenv_libc.h @@ -34,6 +34,18 @@ extern const fenv_t *__fe_mask_env (void) attribute_hidden; pointer. */ #define fegetenv_register() __builtin_mffs() +/* Equivalent to fegetenv_register, but only returns bits for + status, exception enables, and mode. */ +#ifdef _ARCH_PWR9 +#define fegetenv_status() \ + ({register double __fr; \ + __asm__ ("mffsl %0" : "=f" (__fr)); \ + __fr; \ + }) +#else +#define fegetenv_status() fegetenv_register() +#endif + /* Equivalent to fesetenv, but takes a fenv_t instead of a pointer. */ #define fesetenv_register(env) \ do { \
From: "Paul A. Clarke" <pc@us.ibm.com> Add support to use 'mffsl' instruction if compiled for POWER9 (or later). 2019-06-06 Paul A. Clarke <pc@us.ibm.com> * sysdeps/powerpc/fpu/fenv_libc.h (fegetenv_status): New. * sysdeps/powerpc/fpu/fegetmode.c (fegetmode): Use fegetenv_status() instead of fegetenv_register(). --- sysdeps/powerpc/fpu/fegetmode.c | 2 +- sysdeps/powerpc/fpu/fenv_libc.h | 12 ++++++++++++ 2 files changed, 13 insertions(+), 1 deletion(-)