From patchwork Thu Jun 13 18:57:20 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Paul A. Clarke"
X-Patchwork-Id: 1115564
From: "Paul A. Clarke"
To: libc-alpha@sourceware.org
Cc: tuliom@ascii.art.br, fweimer@redhat.com, joseph@codesourcery.com
Subject: [PATCH 1/2 v2] [powerpc] add 'volatile' to asm
Date: Thu, 13 Jun 2019 13:57:20 -0500
Message-Id: <1560452241-11638-2-git-send-email-pc@us.ibm.com>
In-Reply-To: <1560452241-11638-1-git-send-email-pc@us.ibm.com>
References: <1560452241-11638-1-git-send-email-pc@us.ibm.com>

From: "Paul A. Clarke"

Add the 'volatile' keyword to a few asm statements, to force the
compiler to generate the instructions therein.

Some instances were already implicitly volatile; the keyword is added
to those as well, for consistency.

2019-06-13  Paul A. Clarke

	* sysdeps/powerpc/fpu/fenv_libc.h (relax_fenv_state): Add 'volatile'.
	* sysdeps/powerpc/fpu_control.h (__FPU_MFFS): Likewise.
	(__FPU_MFFSL): Likewise.
	(_FPU_SETCW): Likewise.

v2: This fixes issues seen by Tulio in my earlier posted patch
"[powerpc] fegetround: utilize faster method to get rounding mode",
which was not committed.
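
Note for readers, not part of the patch: a minimal sketch of what the
keyword changes. GCC treats an asm statement with no output operands as
implicitly volatile, but an asm that has outputs may be deleted when its
result is unused, or merged with an identical statement. The helper name
opaque_copy is hypothetical, and the empty template is used only so that
the sketch builds on any GCC or Clang target; the patch itself wraps
real instructions such as 'mffs'.

```c
/* Hypothetical helper, not from the patch.  The asm has an output
   operand, so without 'volatile' the compiler would be allowed to drop
   it when the result is unused, or to merge identical instances.  */
static inline unsigned long
opaque_copy (unsigned long v)
{
  /* Empty template: no instruction is emitted, but the compiler must
     assume that v is read and rewritten at this point.  */
  __asm__ __volatile__ ("" : "+r" (v));
  return v;
}
```

With a real template in place of the empty string, the same qualifier
is what keeps two back-to-back reads of a status register from being
collapsed into one.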
---
 sysdeps/powerpc/fpu/fenv_libc.h | 4 ++--
 sysdeps/powerpc/fpu_control.h   | 6 +++---
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/sysdeps/powerpc/fpu/fenv_libc.h b/sysdeps/powerpc/fpu/fenv_libc.h
index f8dd1b7..f66bf24 100644
--- a/sysdeps/powerpc/fpu/fenv_libc.h
+++ b/sysdeps/powerpc/fpu/fenv_libc.h
@@ -56,9 +56,9 @@ extern const fenv_t *__fe_mask_env (void) attribute_hidden;
 #define relax_fenv_state() \
   do { \
     if (GLRO(dl_hwcap) & PPC_FEATURE_HAS_DFP) \
-      asm (".machine push; .machine \"power6\"; " \
+      asm volatile (".machine push; .machine \"power6\"; " \
            "mtfsfi 7,0,1; .machine pop"); \
-    asm ("mtfsfi 7,0"); \
+    asm volatile ("mtfsfi 7,0"); \
   } while(0)
 
 /* Set/clear a particular FPSCR bit (for instance,
diff --git a/sysdeps/powerpc/fpu_control.h b/sysdeps/powerpc/fpu_control.h
index 07ccc84..0ab9331 100644
--- a/sysdeps/powerpc/fpu_control.h
+++ b/sysdeps/powerpc/fpu_control.h
@@ -67,7 +67,7 @@ typedef unsigned int fpu_control_t;
 /* Macros for accessing the hardware control word. */
 # define __FPU_MFFS() \
   ({register double __fr; \
-    __asm__ ("mffs %0" : "=f" (__fr)); \
+    __asm__ __volatile__("mffs %0" : "=f" (__fr)); \
     __fr; \
   })
 
@@ -81,7 +81,7 @@ typedef unsigned int fpu_control_t;
 #ifdef _ARCH_PWR9
 # define __FPU_MFFSL() \
   ({register double __fr; \
-    __asm__ ("mffsl %0" : "=f" (__fr)); \
+    __asm__ __volatile__("mffsl %0" : "=f" (__fr)); \
     __fr; \
   })
 #else
@@ -101,7 +101,7 @@ typedef unsigned int fpu_control_t;
     __u.__ll = 0xfff80000LL << 32; /* This is a QNaN. */ \
     __u.__ll |= (cw) & 0xffffffffLL; \
     __fr = __u.__d; \
-    __asm__ ("mtfsf 255,%0" : : "f" (__fr)); \
+    __asm__ __volatile__("mtfsf 255,%0" : : "f" (__fr)); \
   }
 
 /* Default control word set at startup. */

From patchwork Thu Jun 13 18:57:21 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Paul A. Clarke"
X-Patchwork-Id: 1115565
From: "Paul A. Clarke"
To: libc-alpha@sourceware.org
Cc: tuliom@ascii.art.br, fweimer@redhat.com, joseph@codesourcery.com
Subject: [PATCH 2/2 v2] [powerpc] Use faster means to access FPSCR when possible in some cases
Date: Thu, 13 Jun 2019 13:57:21 -0500
Message-Id: <1560452241-11638-3-git-send-email-pc@us.ibm.com>
In-Reply-To: <1560452241-11638-1-git-send-email-pc@us.ibm.com>
References: <1560452241-11638-1-git-send-email-pc@us.ibm.com>

From: "Paul A. Clarke"

Using the 'mffs' instruction to read the Floating-Point Status and
Control Register (FPSCR) can force a processor flush in some cases,
with an undesirable performance impact.  If the values of the FPSCR
bits that force the flush are not needed, 'mffsl', an instruction new
in POWER9 (ISA version 3.0), can be used instead.

Cases included: get_rounding_mode, fegetround, fegetmode, fegetexcept.

2019-06-13  Paul A. Clarke

	* sysdeps/powerpc/bits/fenvinline.h (__fegetround): Use 'mffsl' when
	possible.
	* sysdeps/powerpc/fpu_control.h (IS_ISA300): New.
	(__FPU_MFFS): Move implementation...
	(_FPU_GETCW): Here.
	(__FPU_MFFSL): Move implementation...
	(_FPU_GET_RC_FAST): Here.  New.
	(_FPU_GET_RC): Change to use _FPU_GET_RC_FAST or _FPU_GETCW.
	* sysdeps/powerpc/fpu/fenv_libc.h (IS_ISA300): New.
	(fegetenv_status): New.
	* sysdeps/powerpc/fpu/fegetmode.c (fegetmode): Use fegetenv_status()
	instead of fegetenv_register().
	* sysdeps/powerpc/fpu/fegetexcept.c (__fegetexcept): Likewise.
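
Note for readers, not part of the patch: the rounding-mode extraction
described above can be sketched in portable C. This is an illustration
under stated assumptions, not the glibc code. The names
sketch_image_from_bits, sketch_rn_from_image and SKETCH_MASK_RN are
hypothetical, and the FPSCR image is passed in as a parameter instead
of being read with 'mffs' or 'mffsl', so that the sketch runs on any
host. The 0x3 mask is the one the patch applies in __fegetround.

```c
/* Hypothetical names throughout; not part of the patch.  */

/* The low two bits of the FPSCR image hold the rounding mode; this is
   the mask the patch applies in __fegetround.  */
#define SKETCH_MASK_RN 0x3ULL

/* Build the double that a floating-point register would hold for a
   given 64-bit pattern, standing in for the result of mffs/mffsl.  */
static inline double
sketch_image_from_bits (unsigned long long bits)
{
  union { double d; unsigned long long ll; } u;
  u.ll = bits;
  return u.d;
}

/* Reinterpret the register image as an integer and keep only the
   rounding-mode field, as the patched macros do with their union.  */
static inline unsigned int
sketch_rn_from_image (double image)
{
  union { double d; unsigned long long ll; } u;
  u.d = image;
  return (unsigned int) (u.ll & SKETCH_MASK_RN);
}
```

Since only the low bits are kept, the caller gets the same answer from
either instruction as long as both report that field, which is the
property the patch relies on when it substitutes the cheaper 'mffsl'.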

v2: This incorporates the suggestion from Adhemerval in the earlier
version of the changes posted with subject
"[powerpc] fegetmode: utilize faster method to get rounding mode".
Fixed some serious issues with a default build.  Added
'.machine "power9"' where 'mffsl' can be generated when not compiled
for power9.
---
 sysdeps/powerpc/bits/fenvinline.h | 22 ++++++++++++++------
 sysdeps/powerpc/fpu/fegetexcept.c |  2 +-
 sysdeps/powerpc/fpu/fegetmode.c   |  2 +-
 sysdeps/powerpc/fpu/fenv_libc.h   | 19 +++++++++++++++++
 sysdeps/powerpc/fpu_control.h     | 43 +++++++++++++++++++++------------------
 5 files changed, 60 insertions(+), 28 deletions(-)

diff --git a/sysdeps/powerpc/bits/fenvinline.h b/sysdeps/powerpc/bits/fenvinline.h
index 7079d1a..72cd263 100644
--- a/sysdeps/powerpc/bits/fenvinline.h
+++ b/sysdeps/powerpc/bits/fenvinline.h
@@ -19,12 +19,22 @@
 #if defined __GNUC__ && !defined _SOFT_FLOAT && !defined __NO_FPRS__
 
 /* Inline definition for fegetround. */
-# define __fegetround() \
-  (__extension__ ({ int __fegetround_result; \
-                    __asm__ __volatile__ \
-                      ("mcrfs 7,7 ; mfcr %0" \
-                       : "=r"(__fegetround_result) : : "cr7"); \
-                    __fegetround_result & 3; }))
+#ifdef _ARCH_PWR9
+# define __fegetround() \
+  __extension__ ({ \
+    union { double __d; unsigned long long __ll; } __u; \
+    __asm__ ("mffsl %0" : "=f" (__u.__d)); \
+    __u.__ll & 0x0000000000000003LL; \
+  })
+#else
+# define __fegetround() \
+  __extension__ ({ \
+    int __fegetround_result; \
+    __asm__ __volatile__ ("mcrfs 7,7 ; mfcr %0" \
+                          : "=r"(__fegetround_result) : : "cr7"); \
+    __fegetround_result & 3; \
+  })
+#endif
 # define fegetround() __fegetround ()
 
 # ifndef __NO_MATH_INLINES
diff --git a/sysdeps/powerpc/fpu/fegetexcept.c b/sysdeps/powerpc/fpu/fegetexcept.c
index 2173d77..10a37f0 100644
--- a/sysdeps/powerpc/fpu/fegetexcept.c
+++ b/sysdeps/powerpc/fpu/fegetexcept.c
@@ -24,7 +24,7 @@ __fegetexcept (void)
 {
   fenv_union_t fe;
 
-  fe.fenv = fegetenv_register ();
+  fe.fenv = fegetenv_status ();
 
   return fenv_reg_to_exceptions (fe.l);
 }
diff --git a/sysdeps/powerpc/fpu/fegetmode.c b/sysdeps/powerpc/fpu/fegetmode.c
index f43ab60..466f5b7 100644
--- a/sysdeps/powerpc/fpu/fegetmode.c
+++ b/sysdeps/powerpc/fpu/fegetmode.c
@@ -21,6 +21,6 @@
 int
 fegetmode (femode_t *modep)
 {
-  *modep = fegetenv_register ();
+  *modep = fegetenv_status ();
   return 0;
 }
diff --git a/sysdeps/powerpc/fpu/fenv_libc.h b/sysdeps/powerpc/fpu/fenv_libc.h
index f66bf24..79e70df 100644
--- a/sysdeps/powerpc/fpu/fenv_libc.h
+++ b/sysdeps/powerpc/fpu/fenv_libc.h
@@ -33,6 +33,25 @@ extern const fenv_t *__fe_mask_env (void) attribute_hidden;
 /* Equivalent to fegetenv, but returns a fenv_t instead of taking a
    pointer. */
 #define fegetenv_register() __builtin_mffs()
+
+#ifdef _ARCH_PWR9
+#define IS_ISA300() 1
+#else
+#define IS_ISA300() __builtin_cpu_supports ("arch_3_00")
+#endif
+
+/* Equivalent to fegetenv_register, but only returns bits for
+   status, exception enables, and mode. */
+#define fegetenv_status() \
+  ({register double __fr; \
+    if (__glibc_likely(IS_ISA300())) \
+      __asm__ __volatile__ ( \
+        ".machine push; .machine \"power9\"; mffsl %0; .machine pop" \
+        : "=f" (__fr)); \
+    else \
+      __fr = __builtin_mffs(); \
+    __fr; \
+  })
 
 /* Equivalent to fesetenv, but takes a fenv_t instead of a pointer. */
 #define fesetenv_register(env) \
diff --git a/sysdeps/powerpc/fpu_control.h b/sysdeps/powerpc/fpu_control.h
index 0ab9331..4c151e9 100644
--- a/sysdeps/powerpc/fpu_control.h
+++ b/sysdeps/powerpc/fpu_control.h
@@ -23,6 +23,12 @@
 # error "SPE/e500 is no longer supported"
 #endif
 
+#ifdef _ARCH_PWR9
+#define IS_ISA300() 1
+#else
+#define IS_ISA300() __builtin_cpu_supports ("arch_3_00")
+#endif
+
 #ifdef _SOFT_FLOAT
 
 # define _FPU_RESERVED 0xffffffff
@@ -65,34 +71,31 @@ extern fpu_control_t __fpu_control;
 typedef unsigned int fpu_control_t;
 
 /* Macros for accessing the hardware control word. */
-# define __FPU_MFFS() \
-  ({register double __fr; \
-    __asm__ __volatile__("mffs %0" : "=f" (__fr)); \
-    __fr; \
-  })
-
 # define _FPU_GETCW(cw) \
   ({union { double __d; unsigned long long __ll; } __u; \
-    __u.__d = __FPU_MFFS(); \
+    register double __fr; \
+    __asm__ __volatile__("mffs %0" : "=f" (__fr)); \
+    __u.__d = __fr; \
     (cw) = (fpu_control_t) __u.__ll; \
     (fpu_control_t) __u.__ll; \
   })
 
-#ifdef _ARCH_PWR9
-# define __FPU_MFFSL() \
-  ({register double __fr; \
-    __asm__ __volatile__("mffsl %0" : "=f" (__fr)); \
-    __fr; \
+# define _FPU_GET_RC_FAST() \
+  ({union { double __d; unsigned long long __ll; } __u; \
+    register double __fr; \
+    __asm__ __volatile__( \
+      ".machine push; .machine \"power9\"; mffsl %0; .machine pop" \
+      : "=f" (__fr)); \
+    __u.__d = __fr; \
+    __u.__ll &= _FPU_MASK_RC; \
+    (fpu_control_t) __u.__ll; \
   })
-#else
-# define __FPU_MFFSL() __FPU_MFFS()
-#endif
-
+
 # define _FPU_GET_RC() \
-  ({union { double __d; unsigned long long __ll; } __u; \
-    __u.__d = __FPU_MFFSL(); \
-    __u.__ll &= _FPU_MASK_RC; \
-    (fpu_control_t) __u.__ll; \
+  ({fpu_control_t rc = __glibc_likely(IS_ISA300()) \
+    ? _FPU_GET_RC_FAST() \
+    : _FPU_GETCW(rc); \
+    rc; \
   })
 
 # define _FPU_SETCW(cw) \