From patchwork Tue Aug 23 18:22:45 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Siddhesh Poyarekar
X-Patchwork-Id: 661998
From: Siddhesh Poyarekar
To: libc-alpha@sourceware.org
Subject: [PATCH 5/5] Inline all support functions for sin and cos
Date: Tue, 23 Aug 2016 23:52:45 +0530
Message-Id: <1471976565-3576-6-git-send-email-siddhesh@sourceware.org>
In-Reply-To: <1471976565-3576-1-git-send-email-siddhesh@sourceware.org>
References: <1471976565-3576-1-git-send-email-siddhesh@sourceware.org>

The support functions for sin and cos have a lot of identical
functionality, so inlining them gives a pretty decent jump in
performance: ~19% in the sincos function.  On SPEC2006 this translates
to about 2.1% in the tonto test.

	* sysdeps/ieee754/dbl-64/s_sin.c (do_cos): Mark as inline.
	(do_cos_slow): Likewise.
	(do_sin): Likewise.
	(do_sin_slow): Likewise.
	(slow): Likewise.
	(slow1): Likewise.
	(slow2): Likewise.
	(sloww): Likewise.
	(sloww1): Likewise.
	(sloww2): Likewise.
	(bsloww): Likewise.
	(bsloww1): Likewise.
	(bsloww2): Likewise.
	(cslow2): Likewise.
---
 sysdeps/ieee754/dbl-64/s_sin.c | 52 +++++++++++++++++++++++-------------------
 1 file changed, 28 insertions(+), 24 deletions(-)

diff --git a/sysdeps/ieee754/dbl-64/s_sin.c b/sysdeps/ieee754/dbl-64/s_sin.c
index 82f9345..c20ef4d 100644
--- a/sysdeps/ieee754/dbl-64/s_sin.c
+++ b/sysdeps/ieee754/dbl-64/s_sin.c
@@ -145,7 +145,8 @@ static double cslow2 (double x);
    of the number by combining the sin and cos of X (as computed by a variation
    of the Taylor series) with the values looked up from the sin/cos table to
    get the result in RES and a correction value in COR.  */
-static double
+static inline double
+__always_inline
 do_cos (double x, double dx, double *corp)
 {
   mynumber u;
@@ -170,7 +171,8 @@ do_cos (double x, double dx, double *corp)
 
 /* A more precise variant of DO_COS.  EPS is the adjustment to the correction
    COR.  */
-static double
+static inline double
+__always_inline
 do_cos_slow (double x, double dx, double eps, double *corp)
 {
   mynumber u;
@@ -205,7 +207,8 @@ do_cos_slow (double x, double dx, double eps, double *corp)
    the number by combining the sin and cos of X (as computed by a variation of
    the Taylor series) with the values looked up from the sin/cos table to get
    the result in RES and a correction value in COR.  */
-static double
+static inline double
+__always_inline
 do_sin (double x, double dx, double *corp)
 {
   mynumber u;
@@ -229,7 +232,8 @@ do_sin (double x, double dx, double *corp)
 
 /* A more precise variant of DO_SIN.  EPS is the adjustment to the correction
    COR.  */
-static double
+static inline double
+__always_inline
 do_sin_slow (double x, double dx, double eps, double *corp)
 {
   mynumber u;
@@ -615,8 +619,8 @@ __cos (double x)
 /* precision and if still doesn't accurate enough by mpsin or dubsin */
 /************************************************************************/
 
-static double
-SECTION
+static inline double
+__always_inline
 slow (double x)
 {
   double res, cor, w[2];
@@ -636,8 +640,8 @@ slow (double x)
 /* and if result still doesn't accurate enough by mpsin or dubsin */
 /*******************************************************************************/
 
-static double
-SECTION
+static inline double
+__always_inline
 slow1 (double x)
 {
   double w[2], cor, res;
@@ -657,8 +661,8 @@ slow1 (double x)
 /* Routine compute sin(x) for 0.855469 <|x|<2.426265 by __sincostab.tbl */
 /* and if result still doesn't accurate enough by mpsin or dubsin */
 /**************************************************************************/
-static double
-SECTION
+static inline double
+__always_inline
 slow2 (double x)
 {
   double w[2], y, y1, y2, cor, res;
@@ -686,8 +690,8 @@ slow2 (double x)
 /* result.And if result not accurate enough routine calls mpsin1 or dubsin */
 /***************************************************************************/
 
-static double
-SECTION
+static inline double
+__always_inline
 sloww (double x, double dx, double orig, int k)
 {
   double y, t, res, cor, w[2], a, da, xn;
@@ -747,8 +751,8 @@ sloww (double x, double dx, double orig, int k)
 /* accurate enough routine calls mpsin1 or dubsin */
 /***************************************************************************/
 
-static double
-SECTION
+static inline double
+__always_inline
 sloww1 (double x, double dx, double orig, int k)
 {
   double w[2], cor, res;
@@ -777,8 +781,8 @@ sloww1 (double x, double dx, double orig, int k)
 /* accurate enough routine calls mpsin1 or dubsin */
 /***************************************************************************/
 
-static double
-SECTION
+static inline double
+__always_inline
 sloww2 (double x, double dx, double orig, int n)
 {
   double w[2], cor, res;
@@ -808,8 +812,8 @@ sloww2 (double x, double dx, double orig, int n)
 /* result.And if result not accurate enough routine calls other routines */
 /***************************************************************************/
 
-static double
-SECTION
+static inline double
+__always_inline
 bsloww (double x, double dx, double orig, int n)
 {
   double res, cor, w[2], a, da;
@@ -837,8 +841,8 @@ bsloww (double x, double dx, double orig, int n)
 /* And if result not accurate enough routine calls other routines */
 /***************************************************************************/
 
-static double
-SECTION
+static inline double
+__always_inline
 bsloww1 (double x, double dx, double orig, int n)
 {
   double w[2], cor, res;
@@ -865,8 +869,8 @@ bsloww1 (double x, double dx, double orig, int n)
 /* And if result not accurate enough routine calls other routines */
 /***************************************************************************/
 
-static double
-SECTION
+static inline double
+__always_inline
 bsloww2 (double x, double dx, double orig, int n)
 {
   double w[2], cor, res;
@@ -891,8 +895,8 @@ bsloww2 (double x, double dx, double orig, int n)
 /* precision and if still doesn't accurate enough by mpcos or docos */
 /************************************************************************/
 
-static double
-SECTION
+static inline double
+__always_inline
 cslow2 (double x)
 {
   double w[2], cor, res;
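
For readers unfamiliar with the pattern this patch applies, here is a
minimal standalone sketch of a forced-inline static helper that returns a
result and passes a correction term back through a pointer, in the style of
do_sin and do_cos.  It is an illustration only, not code from s_sin.c:
FORCE_INLINE, poly_helper and approx_sin are hypothetical names, the
polynomial is a plain truncated Taylor series rather than glibc's
table-driven computation, and the attribute spelling assumes GCC or Clang.
The sketch defines its own macro instead of reusing the __always_inline
name from the patch, so it does not depend on how the C library defines it.

```c
#include <assert.h>

/* Hypothetical stand-in for the __always_inline used in the patch.
   Assumes a compiler that understands GCC-style attributes.  */
#define FORCE_INLINE __attribute__ ((always_inline))

/* Hypothetical helper (not from s_sin.c): returns the leading terms of the
   Taylor series for sin(x) and stores the next term in *CORP, mimicking
   the "result plus correction" shape of do_sin/do_cos.  Only meaningful
   for small |x|, hence the assertion.  */
static inline double
FORCE_INLINE
poly_helper (double x, double *corp)
{
  assert (x >= -1.0 && x <= 1.0);
  double xx = x * x;
  double res = x - x * xx / 6.0;      /* x - x^3/3! */
  *corp = x * xx * xx / 120.0;        /* x^5/5! */
  return res;
}

/* Caller: with the helper forced inline, the compiler can keep RES and COR
   in registers instead of making a call and going through memory.  */
double
approx_sin (double x)
{
  double cor;
  double res = poly_helper (x, &cor);
  return res + cor;
}
```

Whether forcing the inline pays off depends on the compiler and target; the
figures quoted in the commit message are for the real s_sin.c helpers, not
for this sketch.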