From patchwork Tue Aug 23 18:22:45 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Siddhesh Poyarekar <siddhesh@sourceware.org>
X-Patchwork-Id: 661998
Return-Path: 
 <libc-alpha-return-72811-incoming=patchwork.ozlabs.org@sourceware.org>
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from sourceware.org (server1.sourceware.org [209.132.180.131])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256
	bits)) (No client certificate requested)
	by ozlabs.org (Postfix) with ESMTPS id 3sJf386q8qz9sRZ
	for <incoming@patchwork.ozlabs.org>;
	Wed, 24 Aug 2016 04:24:04 +1000 (AEST)
Authentication-Results: ozlabs.org; dkim=pass (1024-bit key;
	secure) header.d=sourceware.org header.i=@sourceware.org
	header.b=Mws3o1IS; dkim-atps=neutral
DomainKey-Signature: a=rsa-sha1; c=nofws; d=sourceware.org; h=list-id
	:list-unsubscribe:list-subscribe:list-archive:list-post
	:list-help:sender:from:to:subject:date:message-id:in-reply-to
	:references; q=dns; s=default; b=YVgt9zXyFM0hD5R8q45eCeYDvAIEIDI
	5wE1F18ArB9QhOK6eXiZ5aWe78O0w6NFzsUjNEZqMT8lp5DggfVLhKDSHzJVmP5k
	n1f+DUTQOFUNdsW2/qdXmqxhazjLPZtG+efWN/hTk/IXwIIt67pzjocre4H6ntwP
	yeyxIPQwQJK8=
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=sourceware.org; h=list-id
	:list-unsubscribe:list-subscribe:list-archive:list-post
	:list-help:sender:from:to:subject:date:message-id:in-reply-to
	:references; s=default; bh=cv99puCwHJG3wlb93s6M0Nu8MmE=; b=Mws3o
	1ISw+OF0PscYlv5tpOJnrhf5szv8ChKQ3s6LnhAskr5rCuH/Mq1AIbu+gm9pY+rA
	y1tXEDKFnAzbSecIRyRJXcazhf0/uj7LuaoyNZWsH1p0ijdFiQQMBV48cLuGq5B0
	Jsa896YEI3lLCLURKvyH8H/fh7rc9SWW3v6W1Q=
Received: (qmail 97384 invoked by alias); 23 Aug 2016 18:23:22 -0000
Mailing-List: contact libc-alpha-help@sourceware.org; run by ezmlm
Precedence: bulk
List-Id: <libc-alpha.sourceware.org>
List-Unsubscribe: 
 <mailto:libc-alpha-unsubscribe-incoming=patchwork.ozlabs.org@sourceware.org>
List-Subscribe: <mailto:libc-alpha-subscribe@sourceware.org>
List-Archive: <http://sourceware.org/ml/libc-alpha/>
List-Post: <mailto:libc-alpha@sourceware.org>
List-Help: <mailto:libc-alpha-help@sourceware.org>,
	<http://sourceware.org/ml/#faqs>
Sender: libc-alpha-owner@sourceware.org
Delivered-To: mailing list libc-alpha@sourceware.org
Received: (qmail 97276 invoked by uid 89); 23 Aug 2016 18:23:21 -0000
Authentication-Results: sourceware.org; auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-0.1 required=5.0 tests=BAYES_00, FUZZY_DR_OZ,
	RCVD_IN_DNSWL_NONE,
	SPF_NEUTRAL autolearn=no version=3.3.2 spammy=EPS, eps,
	***************************************************************************,
	2297
X-HELO: homiemail-a59.g.dreamhost.com
From: Siddhesh Poyarekar <siddhesh@sourceware.org>
To: libc-alpha@sourceware.org
Subject: [PATCH 5/5] Inline all support functions for sin and cos
Date: Tue, 23 Aug 2016 23:52:45 +0530
Message-Id: <1471976565-3576-6-git-send-email-siddhesh@sourceware.org>
In-Reply-To: <1471976565-3576-1-git-send-email-siddhesh@sourceware.org>
References: <1471976565-3576-1-git-send-email-siddhesh@sourceware.org>

The support functions for sin and cos have a lot of identical
functionality, so inlining them gives a decent jump in
performance: ~19% in the sincos function.  On SPEC2006 this
translates to about 2.1% in the tonto test.

	* sysdeps/ieee754/dbl-64/s_sin.c (do_cos): Mark as inline.
	(do_cos_slow): Likewise.
	(do_sin): Likewise.
	(do_sin_slow): Likewise.
	(slow): Mark as inline.  Remove SECTION.
	(slow1): Likewise.
	(slow2): Likewise.
	(sloww): Likewise.
	(sloww1): Likewise.
	(sloww2): Likewise.
	(bsloww): Likewise.
	(bsloww1): Likewise.
	(bsloww2): Likewise.
	(cslow2): Likewise.
---
 sysdeps/ieee754/dbl-64/s_sin.c | 52 +++++++++++++++++++++++-------------------
 1 file changed, 28 insertions(+), 24 deletions(-)
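
For readers unfamiliar with the pattern, here is a minimal sketch of
what forcing static helpers inline looks like.  It is not code from
s_sin.c: ALWAYS_INLINE is a hypothetical stand-in for glibc's
__always_inline macro, and square, approx_sin, approx_cos and
approx_sincos are made-up helpers using short Taylor polynomials
purely for illustration.

```c
/* Hypothetical stand-in for glibc's __always_inline; it expands to
   the GCC attribute when available and to nothing otherwise.  */
#if defined __GNUC__
# define ALWAYS_INLINE __attribute__ ((__always_inline__))
#else
# define ALWAYS_INLINE
#endif

/* Work shared by both helpers below.  */
static inline double
ALWAYS_INLINE
square (double x)
{
  return x * x;
}

/* Truncated Taylor series for sin: x - x^3/6 + x^5/120.  */
static inline double
ALWAYS_INLINE
approx_sin (double x)
{
  double x2 = square (x);
  return x * (1.0 - x2 / 6.0 * (1.0 - x2 / 20.0));
}

/* Truncated Taylor series for cos: 1 - x^2/2 + x^4/24.  */
static inline double
ALWAYS_INLINE
approx_cos (double x)
{
  double x2 = square (x);
  return 1.0 - x2 / 2.0 * (1.0 - x2 / 12.0);
}

/* Both helpers are expanded into this one body, so the compiler is
   free to merge work they have in common, such as the duplicate
   computation of x * x.  */
void
approx_sincos (double x, double *s, double *c)
{
  *s = approx_sin (x);
  *c = approx_cos (x);
}
```

Whether the compiler actually merges the shared work depends on the
optimization level and target, so any gain should be measured rather
than assumed.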

diff --git a/sysdeps/ieee754/dbl-64/s_sin.c b/sysdeps/ieee754/dbl-64/s_sin.c
index 82f9345..c20ef4d 100644
--- a/sysdeps/ieee754/dbl-64/s_sin.c
+++ b/sysdeps/ieee754/dbl-64/s_sin.c
@@ -145,7 +145,8 @@ static double cslow2 (double x);
    of the number by combining the sin and cos of X (as computed by a variation
    of the Taylor series) with the values looked up from the sin/cos table to
    get the result in RES and a correction value in COR.  */
-static double
+static inline double
+__always_inline
 do_cos (double x, double dx, double *corp)
 {
   mynumber u;
@@ -170,7 +171,8 @@ do_cos (double x, double dx, double *corp)
 
 /* A more precise variant of DO_COS.  EPS is the adjustment to the correction
    COR.  */
-static double
+static inline double
+__always_inline
 do_cos_slow (double x, double dx, double eps, double *corp)
 {
   mynumber u;
@@ -205,7 +207,8 @@ do_cos_slow (double x, double dx, double eps, double *corp)
    the number by combining the sin and cos of X (as computed by a variation of
    the Taylor series) with the values looked up from the sin/cos table to get
    the result in RES and a correction value in COR.  */
-static double
+static inline double
+__always_inline
 do_sin (double x, double dx, double *corp)
 {
   mynumber u;
@@ -229,7 +232,8 @@ do_sin (double x, double dx, double *corp)
 
 /* A more precise variant of DO_SIN.  EPS is the adjustment to the correction
    COR.  */
-static double
+static inline double
+__always_inline
 do_sin_slow (double x, double dx, double eps, double *corp)
 {
   mynumber u;
@@ -615,8 +619,8 @@ __cos (double x)
 /* precision  and if still doesn't accurate enough by mpsin   or dubsin */
 /************************************************************************/
 
-static double
-SECTION
+static inline double
+__always_inline
 slow (double x)
 {
   double res, cor, w[2];
@@ -636,8 +640,8 @@ slow (double x)
 /* and if result still doesn't accurate enough by mpsin   or dubsin            */
 /*******************************************************************************/
 
-static double
-SECTION
+static inline double
+__always_inline
 slow1 (double x)
 {
   double w[2], cor, res;
@@ -657,8 +661,8 @@ slow1 (double x)
 /*  Routine compute sin(x) for   0.855469  <|x|<2.426265  by  __sincostab.tbl  */
 /* and if result still doesn't accurate enough by mpsin   or dubsin       */
 /**************************************************************************/
-static double
-SECTION
+static inline double
+__always_inline
 slow2 (double x)
 {
   double w[2], y, y1, y2, cor, res;
@@ -686,8 +690,8 @@ slow2 (double x)
 /* result.And if result not accurate enough routine calls mpsin1 or dubsin */
 /***************************************************************************/
 
-static double
-SECTION
+static inline double
+__always_inline
 sloww (double x, double dx, double orig, int k)
 {
   double y, t, res, cor, w[2], a, da, xn;
@@ -747,8 +751,8 @@ sloww (double x, double dx, double orig, int k)
 /* accurate enough routine calls  mpsin1   or dubsin                       */
 /***************************************************************************/
 
-static double
-SECTION
+static inline double
+__always_inline
 sloww1 (double x, double dx, double orig, int k)
 {
   double w[2], cor, res;
@@ -777,8 +781,8 @@ sloww1 (double x, double dx, double orig, int k)
 /* accurate enough routine calls  mpsin1   or dubsin                       */
 /***************************************************************************/
 
-static double
-SECTION
+static inline double
+__always_inline
 sloww2 (double x, double dx, double orig, int n)
 {
   double w[2], cor, res;
@@ -808,8 +812,8 @@ sloww2 (double x, double dx, double orig, int n)
 /* result.And if result not accurate enough routine calls other routines    */
 /***************************************************************************/
 
-static double
-SECTION
+static inline double
+__always_inline
 bsloww (double x, double dx, double orig, int n)
 {
   double res, cor, w[2], a, da;
@@ -837,8 +841,8 @@ bsloww (double x, double dx, double orig, int n)
 /* And if result not  accurate enough routine calls  other routines         */
 /***************************************************************************/
 
-static double
-SECTION
+static inline double
+__always_inline
 bsloww1 (double x, double dx, double orig, int n)
 {
   double w[2], cor, res;
@@ -865,8 +869,8 @@ bsloww1 (double x, double dx, double orig, int n)
 /* And if result not accurate enough routine calls  other routines          */
 /***************************************************************************/
 
-static double
-SECTION
+static inline double
+__always_inline
 bsloww2 (double x, double dx, double orig, int n)
 {
   double w[2], cor, res;
@@ -891,8 +895,8 @@ bsloww2 (double x, double dx, double orig, int n)
 /* precision  and if still doesn't accurate enough by mpcos   or docos  */
 /************************************************************************/
 
-static double
-SECTION
+static inline double
+__always_inline
 cslow2 (double x)
 {
   double w[2], cor, res;