From patchwork Tue Nov 11 21:03:12 2014
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Adhemerval Zanella
X-Patchwork-Id: 409664
Delivered-To: mailing list libc-alpha@sourceware.org
Message-ID: <54627990.1060805@linux.vnet.ibm.com>
Date: Tue, 11 Nov 2014 19:03:12 -0200
From: Adhemerval Zanella
To: "GNU C. Library"
Subject: [PATCH 2/3] powerpc: POWER8 fmod optimization

This patch adds a POWER8 fmod optimization.  The implementation uses
floating-point operations instead of the default one, which uses integer
operations.  Inputs are handled for finite mode only (since the symbol is
meant to be used with -ffinite-math-only), and inexact exceptions are
disabled during the FP calculation.

Using the GLIBC fmod benchmark:

* Default fmod:

  "fmod": {
   "": {
    "duration": 5.1043e+09, "iterations": 2.67654e+08,
    "max": 106.892, "min": 5.939, "mean": 19.0705
   }

* Patched version:

  "fmod": {
   "": {
    "duration": 5.08466e+09, "iterations": 6.35088e+08,
    "max": 92.284, "min": 6.849, "mean": 8.00623
   }

Tested on powerpc64 and powerpc64le.

---

	* sysdeps/powerpc/powerpc64/power8/fpu/e_fmod.S: New file: POWER8
	fmod optimization.
	* sysdeps/powerpc/powerpc64/power8/fpu/e_fmodf.S: New file: POWER8
	fmodf optimization.

---

diff --git a/sysdeps/powerpc/powerpc64/power8/fpu/e_fmod.S b/sysdeps/powerpc/powerpc64/power8/fpu/e_fmod.S
new file mode 100644
index 0000000..e9b35a2
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/power8/fpu/e_fmod.S
@@ -0,0 +1,46 @@
+/* Finite fmod optimization - PowerPC64/POWER8 version.
+   Copyright (C) 2014 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#include
+#include
+
+#define MFVSRD_R3_V2  .long 0x7c430066     /* mfvsrd  r3,vs2  */
+
+/* double [fp1] __ieee754_fmod (double [fp1] x, double [fp2] y)  */
+
+	.machine power8
+ENTRY(__ieee754_fmod)
+	/* First check if 'y' is Inf and return 'x' if it is the case.  */
+	MFVSRD_R3_V2
+	lis	r9,0x7ff0	/* r9 = 0x7ff00000  */
+	rldicl	r10,r3,0,1	/* r10 = r3 & 0x7fffffffffffffff  */
+	sldi	r9,r9,32	/* r9 = r9 << 32 = 0x7ff0000000000000  */
+	cmpd	cr7,r10,r9	/* |y| == Inf ?  */
+	beqlr	cr7
+
+	mtfsb0	4*cr7+lt	/* Disable FE_INEXACT exception  */
+	fdiv	fp0,fp1,fp2	/* fp0 = -trunc (fp1 / fp2)  */
+	fneg	fp0,fp0
+	friz	fp0,fp0
+	fmadd	fp2,fp0,fp2,fp1	/* fp2 = x + fp0 * y  */
+	fcpsgn	fp1,fp1,fp2
+	mtfsb0	4*cr1+eq	/* Clear any FE_INEXACT exception  */
+	blr
+END (__ieee754_fmod)
+
+strong_alias (__ieee754_fmod, __fmod_finite)
diff --git a/sysdeps/powerpc/powerpc64/power8/fpu/e_fmodf.S b/sysdeps/powerpc/powerpc64/power8/fpu/e_fmodf.S
new file mode 100644
index 0000000..7e16374
--- /dev/null
+++ b/sysdeps/powerpc/powerpc64/power8/fpu/e_fmodf.S
@@ -0,0 +1,46 @@
+/* Finite fmodf optimization - PowerPC64/POWER8 version.
+   Copyright (C) 2014 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   .  */
+
+#include
+#include
+
+#define MFVSRD_R3_V2  .long 0x7c430066     /* mfvsrd  r3,vs2  */
+
+/* float [fp1] __ieee754_fmodf (float [fp1] x, float [fp2] y)  */
+
+	.machine power8
+ENTRY(__ieee754_fmodf)
+	/* First check if 'y' is Inf and return 'x' if it is the case.  */
+	MFVSRD_R3_V2
+	lis	r9,0x7ff0	/* r9 = 0x7ff00000  */
+	rldicl	r10,r3,0,1	/* r10 = r3 & 0x7fffffffffffffff  */
+	sldi	r9,r9,32	/* r9 = r9 << 32 = 0x7ff0000000000000  */
+	cmpd	cr7,r10,r9	/* |y| == Inf ?  */
+	beqlr	cr7
+
+	mtfsb0	4*cr7+lt	/* Disable FE_INEXACT exception  */
+	fdivs	fp0,fp1,fp2	/* fp0 = -trunc (fp1 / fp2)  */
+	fneg	fp0,fp0
+	friz	fp0,fp0
+	fmadds	fp2,fp0,fp2,fp1	/* fp2 = x + fp0 * y  */
+	fcpsgn	fp1,fp1,fp2
+	mtfsb0	4*cr1+eq	/* Clear any FE_INEXACT exception  */
+	blr
+END (__ieee754_fmodf)
+
+strong_alias (__ieee754_fmodf, __fmodf_finite)