[powerpc] get_rounding_mode: utilize faster method to get rounding mode

Message ID	1559773022-26575-1-git-send-email-pc@us.ibm.com
State	New
Headers	From: "Paul A. Clarke" <pc@us.ibm.com>
	To: libc-alpha@sourceware.org
	Cc: tuliom@ascii.art.br
	Subject: [PATCH] [powerpc] get_rounding_mode: utilize faster method to get rounding mode
	Date: Wed, 5 Jun 2019 17:17:02 -0500
Series	[powerpc] get_rounding_mode: utilize faster method to get rounding mode

Paul A. Clarke June 5, 2019, 10:17 p.m. UTC

From: "Paul A. Clarke" <pc@us.ibm.com>

Add support to use 'mffsl' instruction if compiled for POWER9 (or later).

Also, mask the result to avoid bleeding unrelated bits into the result of
_FPU_GET_RC().
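
The masking step can be illustrated with a minimal sketch; this is not the
patch's code.  It assumes the rounding-control field is the two
least-significant bits of the 64-bit FPSCR image, and SKETCH_MASK_RC and
sketch_get_rc() are hypothetical names used only for illustration.

```c
#include <stdint.h>

/* Assumption: the rounding-control field occupies the two
   least-significant bits of the 64-bit FPSCR image.  */
#define SKETCH_MASK_RC 0x3ULL

/* Hypothetical helper: keep only the rounding-mode bits of a value read
   from the FPSCR, so that status and exception-enable bits do not leak
   into the result.  */
static inline unsigned int
sketch_get_rc (uint64_t fpscr_image)
{
  return (unsigned int) (fpscr_image & SKETCH_MASK_RC);
}
```

Without the mask, a caller comparing the result against a rounding-mode
constant could see a mismatch whenever any other bit of the register
happens to be set.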

2019-06-05  Paul A. Clarke  <pc@us.ibm.com>

	* sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Replace code
	with call to equivalent function.
	* sysdeps/powerpc/fpu/fenv_libc.h (fegetenv_status): New function.
	* sysdeps/powerpc/fpu_control.h (_FPU_MASK_RC): New.
	  (__FPU_MFFS): New.
	  (__FPU_MFFSL): New.
	  (_FPU_GET_RC): New.
	  (_FPU_GETCW): Use __FPU_MFFS().
	* sysdeps/powerpc/powerpc64/get-rounding-mode.h: New file.
---
 sysdeps/powerpc/fpu/fenv_libc.h               | 15 ++++++++++++
 sysdeps/powerpc/fpu_control.h                 | 29 ++++++++++++++++++++---
 sysdeps/powerpc/powerpc64/get-rounding-mode.h | 33 +++++++++++++++++++++++++++
 3 files changed, 74 insertions(+), 3 deletions(-)
 create mode 100644 sysdeps/powerpc/powerpc64/get-rounding-mode.h

Tulio Magno Quites Machado Filho June 5, 2019, 10:42 p.m. UTC | #1

"Paul A. Clarke" <pc@us.ibm.com> writes:

> Add support to use 'mffsl' instruction if compiled for POWER9 (or later).
>
> Also, mask the result to avoid bleeding unrelated bits into the result of
> _FPU_GET_RC().
>
> 2019-06-05  Paul A. Clarke  <pc@us.ibm.com>
>
> 	* sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Replace code
> 	with call to equivalent function.

This file is missing in this patch.

> diff --git a/sysdeps/powerpc/fpu/fenv_libc.h b/sysdeps/powerpc/fpu/fenv_libc.h
> index 8a0bace..621e57f 100644
> --- a/sysdeps/powerpc/fpu/fenv_libc.h
> +++ b/sysdeps/powerpc/fpu/fenv_libc.h
> @@ -34,6 +34,21 @@ extern const fenv_t *__fe_mask_env (void) attribute_hidden;
>     pointer.  */
>  #define fegetenv_register() __builtin_mffs()
>  
> +/* Equivalent to fegetenv_register, but only returns bits for
> +   status, exception enables, and mode.  */
> +#ifdef _ARCH_PWR9
> +#define fegetenv_status() \
> +  ({union { double __d; unsigned long long __ll; } __u;	\
> +    register double __fr;				\
> +    __asm__ ("mffsl %0" : "=f" (__fr));			\
> +    __u.__d = __fr;					\
> +    __u.__ll;						\
> +  })
> +#else
> +#define fegetenv_status() __builtin_mffs()
> +#error "power8"
> +#endif

Is this macro used in fegetexcept.c?
Is the error intentional?

Paul A. Clarke June 5, 2019, 10:54 p.m. UTC | #2

On 6/5/19 5:42 PM, Tulio Magno Quites Machado Filho wrote:
> "Paul A. Clarke" <pc@us.ibm.com> writes:
> 
>> Add support to use 'mffsl' instruction if compiled for POWER9 (or later).
>>
>> Also, mask the result to avoid bleeding unrelated bits into the result of
>> _FPU_GET_RC().
>>
>> 2019-06-05  Paul A. Clarke  <pc@us.ibm.com>
>>
>> 	* sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Replace code
>> 	with call to equivalent function.
> 
> This file is missing in this patch.

Copy and paste error from the previous patch I submitted earlier today.  Sorry.
I will fix when committing, presuming the rest of this patch is OK.

PC

Paul A. Clarke June 5, 2019, 11:04 p.m. UTC | #3

On 6/5/19 5:42 PM, Tulio Magno Quites Machado Filho wrote:
> "Paul A. Clarke" <pc@us.ibm.com> writes:
> 
>> diff --git a/sysdeps/powerpc/fpu/fenv_libc.h b/sysdeps/powerpc/fpu/fenv_libc.h
>> index 8a0bace..621e57f 100644
>> --- a/sysdeps/powerpc/fpu/fenv_libc.h
>> +++ b/sysdeps/powerpc/fpu/fenv_libc.h
>> @@ -34,6 +34,21 @@ extern const fenv_t *__fe_mask_env (void) attribute_hidden;
>>     pointer.  */
>>  #define fegetenv_register() __builtin_mffs()
>>  
>> +/* Equivalent to fegetenv_register, but only returns bits for
>> +   status, exception enables, and mode.  */
>> +#ifdef _ARCH_PWR9
>> +#define fegetenv_status() \
>> +  ({union { double __d; unsigned long long __ll; } __u;	\
>> +    register double __fr;				\
>> +    __asm__ ("mffsl %0" : "=f" (__fr));			\
>> +    __u.__d = __fr;					\
>> +    __u.__ll;						\
>> +  })
>> +#else
>> +#define fegetenv_status() __builtin_mffs()
>> +#error "power8"
>> +#endif
> 
> Is this macro used in fegetexcept.c?
> Is the error intentional?

This hunk should not be in this patch submission.  I shall remove it.

Sorry I missed this comment before.

PC

Tulio Magno Quites Machado Filho June 6, 2019, 2:17 p.m. UTC | #4

"Paul A. Clarke" <pc@us.ibm.com> writes:

> From: "Paul A. Clarke" <pc@us.ibm.com>
>
> Add support to use 'mffsl' instruction if compiled for POWER9 (or later).
>
> Also, mask the result to avoid bleeding unrelated bits into the result of
> _FPU_GET_RC().
>
> 2019-06-05  Paul A. Clarke  <pc@us.ibm.com>
>
> 	* sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Replace code
> 	with call to equivalent function.
> 	* sysdeps/powerpc/fpu/fenv_libc.h (fegetenv_status): New function.

As you explained, both entries have to be removed.

> 	* sysdeps/powerpc/fpu_control.h (_FPU_MASK_RC): New.
> 	  (__FPU_MFFS): New.
> 	  (__FPU_MFFSL): New.
> 	  (_FPU_GET_RC): New.
> 	  (_FPU_GETCW): Use __FPU_MFFS().

        ^ Extra spaces here.  Should be only 1 tab.

> 	* sysdeps/powerpc/powerpc64/get-rounding-mode.h: New file.

I also think this file is unnecessary.

> diff --git a/sysdeps/powerpc/powerpc64/get-rounding-mode.h b/sysdeps/powerpc/powerpc64/get-rounding-mode.h
> new file mode 100644
> index 0000000..e2fdbbb
> --- /dev/null
> +++ b/sysdeps/powerpc/powerpc64/get-rounding-mode.h
> @@ -0,0 +1,33 @@
> +/* Determine floating-point rounding mode within libc.  powerpc64 version.
> +   Copyright (C) 2019 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#ifndef _POWERPC64_GET_ROUNDING_MODE_H
> +#define _POWERPC64_GET_ROUNDING_MODE_H	1
> +
> +#include <fenv.h>
> +#include <fpu_control.h>
> +
> +/* Return the floating-point rounding mode.  */
> +
> +static inline int
> +get_rounding_mode (void)
> +{
> +  return _FPU_GET_RC ();
> +}
> +
> +#endif /* get-rounding-mode.h */

I don't understand why this file/function is needed.
Looks like another file that should be removed from this patch.

LGTM if you fix the ChangeLog and keep only the changes to
sysdeps/powerpc/fpu_control.h.

Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>

Paul A. Clarke June 6, 2019, 5:41 p.m. UTC | #5

On 6/6/19 9:17 AM, Tulio Magno Quites Machado Filho wrote:
> "Paul A. Clarke" <pc@us.ibm.com> writes:
>> Add support to use 'mffsl' instruction if compiled for POWER9 (or later).
>>
>> Also, mask the result to avoid bleeding unrelated bits into the result of
>> _FPU_GET_RC().
>>
>> 2019-06-05  Paul A. Clarke  <pc@us.ibm.com>
>>
>> 	* sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Replace code
>> 	with call to equivalent function.
>> 	* sysdeps/powerpc/fpu/fenv_libc.h (fegetenv_status): New function.
> 
> As you explained, both entries have to be removed.

Yes, sorry about the cruft.

>> 	* sysdeps/powerpc/fpu_control.h (_FPU_MASK_RC): New.
>> 	  (__FPU_MFFS): New.
>> 	  (__FPU_MFFSL): New.
>> 	  (_FPU_GET_RC): New.
>> 	  (_FPU_GETCW): Use __FPU_MFFS().
> 
>         ^ Extra spaces here.  Should be only 1 tab.

OK.

>> 	* sysdeps/powerpc/powerpc64/get-rounding-mode.h: New file.
> 
> I also think this file is unnecessary.

(discussed below)

>> diff --git a/sysdeps/powerpc/powerpc64/get-rounding-mode.h b/sysdeps/powerpc/powerpc64/get-rounding-mode.h
>> new file mode 100644
>> index 0000000..e2fdbbb
>> --- /dev/null
>> +++ b/sysdeps/powerpc/powerpc64/get-rounding-mode.h
>> @@ -0,0 +1,33 @@
>> +/* Determine floating-point rounding mode within libc.  powerpc64 version.
<snip>
>> +static inline int
>> +get_rounding_mode (void)
>> +{
>> +  return _FPU_GET_RC ();
>> +}
>> +
>> +#endif /* get-rounding-mode.h */
> 
> I don't understand why this file/function is needed.
> Looks like another file that should be removed from this patch.

Without this file, the generic file is used.  It calls _FPU_GETCW(), which
reads the entire Floating-Point Status and Control Register (FPSCR) and so
must resolve to the slower "mffs".  get_rounding_mode() only needs the
rounding mode, and the _FPU_GET_RC() macro added by this patch, which this
new file uses, resolves to the faster "mffsl" (if compiled for POWER9).
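
A rough sketch of the distinction, as an illustration under assumptions
rather than the patch's code: sketch_read_fpscr_for_rc() and
sketch_get_rounding_mode() are hypothetical names, the rounding-control
field is assumed to be the two low bits of the FPSCR image, and on
non-PowerPC targets the reader returns a placeholder of 0 only so that the
sketch compiles.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: rounding-control field is the two low bits of the FPSCR.  */
#define SKETCH_MASK_RC 0x3ULL

/* Hypothetical reader: use "mffsl" when compiled for POWER9 or later,
   fall back to the full "mffs" on older PowerPC targets.  */
static inline uint64_t
sketch_read_fpscr_for_rc (void)
{
#if defined (__powerpc__) || defined (__powerpc64__)
  double fr;
# ifdef _ARCH_PWR9
  __asm__ __volatile__ ("mffsl %0" : "=f" (fr));
# else
  __asm__ __volatile__ ("mffs %0" : "=f" (fr));
# endif
  uint64_t bits;
  memcpy (&bits, &fr, sizeof bits);	/* Reinterpret the FPR contents.  */
  return bits;
#else
  /* Not PowerPC: there is no FPSCR to read.  Placeholder value only.  */
  return 0;
#endif
}

/* Hypothetical counterpart of get_rounding_mode(): only the rounding
   bits are needed, so the lighter read plus a mask is sufficient.  */
static inline int
sketch_get_rounding_mode (void)
{
  return (int) (sketch_read_fpscr_for_rc () & SKETCH_MASK_RC);
}
```

The same source builds either way; the choice of instruction is made at
compile time by the target CPU selection.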

> LGTM if you fix the ChangeLog and keep only the changes to
> sysdeps/powerpc/fpu_control.h.
> 
> Reviewed-by: Tulio Magno Quites Machado Filho <tuliom@linux.ibm.com>

PC

Tulio Magno Quites Machado Filho June 6, 2019, 6:05 p.m. UTC | #6

Paul Clarke <pc@us.ibm.com> writes:

> On 6/6/19 9:17 AM, Tulio Magno Quites Machado Filho wrote:
>> "Paul A. Clarke" <pc@us.ibm.com> writes:

>>> diff --git a/sysdeps/powerpc/powerpc64/get-rounding-mode.h b/sysdeps/powerpc/powerpc64/get-rounding-mode.h
>>> new file mode 100644
>>> index 0000000..e2fdbbb
>>> --- /dev/null
>>> +++ b/sysdeps/powerpc/powerpc64/get-rounding-mode.h
>>> @@ -0,0 +1,33 @@
>>> +/* Determine floating-point rounding mode within libc.  powerpc64 version.
> <snip>
>>> +static inline int
>>> +get_rounding_mode (void)
>>> +{
>>> +  return _FPU_GET_RC ();
>>> +}
>>> +
>>> +#endif /* get-rounding-mode.h */
>> 
>> I don't understand why this file/function is needed.
>> Looks like another file that should be removed from this patch.
>
> Without this file, the generic file is used.  It calls _FPU_GETCW(), which
> reads the entire Floating-Point Status and Control Register (FPSCR) and so
> must resolve to the slower "mffs".  get_rounding_mode() only needs the
> rounding mode, and the _FPU_GET_RC() macro added by this patch, which this
> new file uses, resolves to the faster "mffsl" (if compiled for POWER9).

Very good point.  I had missed this.

In that case, I think you should move this file to
sysdeps/powerpc/fpu/get-rounding-mode.h, which allows 32-bit builds to use
it and restricts its usage to fpu.
