From patchwork Thu Dec 9 05:04:37 2021
X-Patchwork-Submitter: Sunil Pandey
X-Patchwork-Id: 1565606
To: libc-alpha@sourceware.org
Subject: [PATCH v2 11/42] x86-64: Add vector atan2/atan2f implementation to
 libmvec
Date: Wed, 8 Dec 2021 21:04:37 -0800
Message-Id: <20211209050508.2614536-12-skpgkp2@gmail.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20211209050508.2614536-1-skpgkp2@gmail.com>
References: <20211209050508.2614536-1-skpgkp2@gmail.com>
MIME-Version: 1.0
From: Sunil Pandey
Reply-To: Sunil K Pandey
Cc: andrey.kolesov@intel.com

Implement vectorized atan2/atan2f containing SSE, AVX, AVX2, and AVX512
versions for libmvec, as required by the vector ABI.  The patch also
adds accuracy and ABI tests for vector atan2/atan2f, with regenerated
ulps.
---
 bits/libm-simd-decl-stubs.h                   |   11 +
 math/bits/mathcalls.h                         |    2 +-
 .../unix/sysv/linux/x86_64/libmvec.abilist    |    8 +
 sysdeps/x86/fpu/bits/math-vector.h            |    4 +
 .../x86/fpu/finclude/math-vector-fortran.h    |    4 +
 sysdeps/x86_64/fpu/Makeconfig                 |    1 +
 sysdeps/x86_64/fpu/Versions                   |    2 +
 sysdeps/x86_64/fpu/libm-test-ulps             |   20 +
 .../fpu/multiarch/svml_d_atan22_core-sse2.S   |   20 +
 .../x86_64/fpu/multiarch/svml_d_atan22_core.c |   28 +
 .../fpu/multiarch/svml_d_atan22_core_sse4.S   | 3629 +++++++++++++++++
 .../fpu/multiarch/svml_d_atan24_core-sse.S    |   20 +
 .../x86_64/fpu/multiarch/svml_d_atan24_core.c |   28 +
 .../fpu/multiarch/svml_d_atan24_core_avx2.S   | 3161 ++++++++++++++
 .../fpu/multiarch/svml_d_atan28_core-avx2.S   |   20 +
 .../x86_64/fpu/multiarch/svml_d_atan28_core.c |   28 +
 .../fpu/multiarch/svml_d_atan28_core_avx512.S | 2311 +++++++++++
 .../fpu/multiarch/svml_s_atan2f16_core-avx2.S |   20 +
 .../fpu/multiarch/svml_s_atan2f16_core.c      |   28 +
 .../multiarch/svml_s_atan2f16_core_avx512.S   | 1998 +++++++++
 .../fpu/multiarch/svml_s_atan2f4_core-sse2.S  |   20 +
 .../fpu/multiarch/svml_s_atan2f4_core.c       |   28 +
 .../fpu/multiarch/svml_s_atan2f4_core_sse4.S  | 2668 ++++++++++++
 .../fpu/multiarch/svml_s_atan2f8_core-sse.S   |   20 +
 .../fpu/multiarch/svml_s_atan2f8_core.c       |   28 +
 .../fpu/multiarch/svml_s_atan2f8_core_avx2.S  | 2413 +++++++++++
 sysdeps/x86_64/fpu/svml_d_atan22_core.S       |   29 +
 sysdeps/x86_64/fpu/svml_d_atan24_core.S       |   29 +
 sysdeps/x86_64/fpu/svml_d_atan24_core_avx.S   |   25 +
 sysdeps/x86_64/fpu/svml_d_atan28_core.S       |   25 +
 sysdeps/x86_64/fpu/svml_s_atan2f16_core.S     |   25 +
 sysdeps/x86_64/fpu/svml_s_atan2f4_core.S      |   29 +
 sysdeps/x86_64/fpu/svml_s_atan2f8_core.S      |   29 +
 sysdeps/x86_64/fpu/svml_s_atan2f8_core_avx.S  |   25 +
 .../fpu/test-double-libmvec-atan2-avx.c       |    1 +
 .../fpu/test-double-libmvec-atan2-avx2.c      |    1 +
 .../fpu/test-double-libmvec-atan2-avx512f.c   |    1 +
 .../x86_64/fpu/test-double-libmvec-atan2.c    |    3 +
 .../x86_64/fpu/test-double-vlen2-wrappers.c   |    1 +
 .../fpu/test-double-vlen4-avx2-wrappers.c     |    1 +
 .../x86_64/fpu/test-double-vlen4-wrappers.c   |    1 +
 .../x86_64/fpu/test-double-vlen8-wrappers.c   |    1 +
 .../fpu/test-float-libmvec-atan2f-avx.c       |    1 +
 .../fpu/test-float-libmvec-atan2f-avx2.c      |    1 +
 .../fpu/test-float-libmvec-atan2f-avx512f.c   |    1 +
 .../x86_64/fpu/test-float-libmvec-atan2f.c    |    3 +
 .../x86_64/fpu/test-float-vlen16-wrappers.c   |    1 +
 .../x86_64/fpu/test-float-vlen4-wrappers.c    |    1 +
 .../fpu/test-float-vlen8-avx2-wrappers.c      |    1 +
 .../x86_64/fpu/test-float-vlen8-wrappers.c    |    1 +
 50 files changed, 16755 insertions(+), 1 deletion(-)
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_atan22_core-sse2.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_atan22_core.c
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_atan22_core_sse4.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_atan24_core-sse.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_atan24_core.c
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_atan24_core_avx2.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_atan28_core-avx2.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_atan28_core.c
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_d_atan28_core_avx512.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_atan2f16_core-avx2.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_atan2f16_core.c
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_atan2f16_core_avx512.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_atan2f4_core-sse2.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_atan2f4_core.c
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_atan2f4_core_sse4.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_atan2f8_core-sse.S
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_atan2f8_core.c
 create mode 100644 sysdeps/x86_64/fpu/multiarch/svml_s_atan2f8_core_avx2.S
 create mode 100644 sysdeps/x86_64/fpu/svml_d_atan22_core.S
 create mode 100644 sysdeps/x86_64/fpu/svml_d_atan24_core.S
 create mode 100644 sysdeps/x86_64/fpu/svml_d_atan24_core_avx.S
 create mode 100644 sysdeps/x86_64/fpu/svml_d_atan28_core.S
 create mode 100644 sysdeps/x86_64/fpu/svml_s_atan2f16_core.S
 create mode 100644 sysdeps/x86_64/fpu/svml_s_atan2f4_core.S
 create mode 100644 sysdeps/x86_64/fpu/svml_s_atan2f8_core.S
 create mode 100644 sysdeps/x86_64/fpu/svml_s_atan2f8_core_avx.S
 create mode 100644 sysdeps/x86_64/fpu/test-double-libmvec-atan2-avx.c
 create mode 100644 sysdeps/x86_64/fpu/test-double-libmvec-atan2-avx2.c
 create mode 100644 sysdeps/x86_64/fpu/test-double-libmvec-atan2-avx512f.c
 create mode 100644 sysdeps/x86_64/fpu/test-double-libmvec-atan2.c
 create mode 100644 sysdeps/x86_64/fpu/test-float-libmvec-atan2f-avx.c
 create mode 100644 sysdeps/x86_64/fpu/test-float-libmvec-atan2f-avx2.c
 create mode 100644 sysdeps/x86_64/fpu/test-float-libmvec-atan2f-avx512f.c
 create mode 100644 sysdeps/x86_64/fpu/test-float-libmvec-atan2f.c
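For context, and not part of the patch itself: once these symbols exist,
the compiler can turn scalar atan2 calls in a vectorizable loop into
libmvec calls such as _ZGVdN4vv_atan2.  A minimal usage sketch, assuming
GCC with -ffast-math (so <math.h> declares the SIMD variants) and
-fopenmp-simd for the pragma; on a libmvec-enabled glibc, plain -lm
links in libmvec through the libm linker script:

/* atan2-vec.c: build with, e.g.,
   gcc -O3 -ffast-math -fopenmp-simd atan2-vec.c -lm  */
#include <math.h>
#include <stdio.h>

#define N 1024
static double y[N], x[N], r[N];

int
main (void)
{
  for (int i = 0; i < N; i++)
    {
      y[i] = i * 0.001;
      x[i] = 1.0 + i * 0.002;
    }

  /* With the SIMD declaration in effect, GCC may vectorize this loop
     into calls to the vector atan2 added by this patch.  */
#pragma omp simd
  for (int i = 0; i < N; i++)
    r[i] = atan2 (y[i], x[i]);

  printf ("r[%d] = %.17g\n", N - 1, r[N - 1]);
  return 0;
}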
diff --git a/bits/libm-simd-decl-stubs.h b/bits/libm-simd-decl-stubs.h
index 3e0aa043b4..bd8019839c 100644
--- a/bits/libm-simd-decl-stubs.h
+++ b/bits/libm-simd-decl-stubs.h
@@ -153,4 +153,15 @@
 #define __DECL_SIMD_atanf32x
 #define __DECL_SIMD_atanf64x
 #define __DECL_SIMD_atanf128x
+
+#define __DECL_SIMD_atan2
+#define __DECL_SIMD_atan2f
+#define __DECL_SIMD_atan2l
+#define __DECL_SIMD_atan2f16
+#define __DECL_SIMD_atan2f32
+#define __DECL_SIMD_atan2f64
+#define __DECL_SIMD_atan2f128
+#define __DECL_SIMD_atan2f32x
+#define __DECL_SIMD_atan2f64x
+#define __DECL_SIMD_atan2f128x
 #endif
diff --git a/math/bits/mathcalls.h b/math/bits/mathcalls.h
index f37dbeebfb..b1b11b74ee 100644
--- a/math/bits/mathcalls.h
+++ b/math/bits/mathcalls.h
@@ -56,7 +56,7 @@ __MATHCALL_VEC (asin,, (_Mdouble_ __x));
 /* Arc tangent of X.  */
 __MATHCALL_VEC (atan,, (_Mdouble_ __x));
 /* Arc tangent of Y/X.  */
-__MATHCALL (atan2,, (_Mdouble_ __y, _Mdouble_ __x));
+__MATHCALL_VEC (atan2,, (_Mdouble_ __y, _Mdouble_ __x));
 
 /* Cosine of X.  */
 __MATHCALL_VEC (cos,, (_Mdouble_ __x));
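The stubs above keep __DECL_SIMD_atan2 empty on targets without vector
math; switching atan2 from __MATHCALL to __MATHCALL_VEC is what attaches
that macro to the declaration.  On x86_64 the math-vector.h hunk below
redefines it, so with -ffast-math the <math.h> prototype effectively
carries a SIMD attribute roughly like this (a simplified illustration of
the GCC >= 6 path, not the literal expansion):

__attribute__ ((__simd__ ("notinbranch")))
extern double atan2 (double __y, double __x);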
diff --git a/sysdeps/unix/sysv/linux/x86_64/libmvec.abilist b/sysdeps/unix/sysv/linux/x86_64/libmvec.abilist
index 2ead94d87e..9b47e83aec 100644
--- a/sysdeps/unix/sysv/linux/x86_64/libmvec.abilist
+++ b/sysdeps/unix/sysv/linux/x86_64/libmvec.abilist
@@ -51,38 +51,46 @@ GLIBC_2.35 _ZGVbN2v_acosh F
 GLIBC_2.35 _ZGVbN2v_asin F
 GLIBC_2.35 _ZGVbN2v_asinh F
 GLIBC_2.35 _ZGVbN2v_atan F
+GLIBC_2.35 _ZGVbN2vv_atan2 F
 GLIBC_2.35 _ZGVbN4v_acosf F
 GLIBC_2.35 _ZGVbN4v_acoshf F
 GLIBC_2.35 _ZGVbN4v_asinf F
 GLIBC_2.35 _ZGVbN4v_asinhf F
 GLIBC_2.35 _ZGVbN4v_atanf F
+GLIBC_2.35 _ZGVbN4vv_atan2f F
 GLIBC_2.35 _ZGVcN4v_acos F
 GLIBC_2.35 _ZGVcN4v_acosh F
 GLIBC_2.35 _ZGVcN4v_asin F
 GLIBC_2.35 _ZGVcN4v_asinh F
 GLIBC_2.35 _ZGVcN4v_atan F
+GLIBC_2.35 _ZGVcN4vv_atan2 F
 GLIBC_2.35 _ZGVcN8v_acosf F
 GLIBC_2.35 _ZGVcN8v_acoshf F
 GLIBC_2.35 _ZGVcN8v_asinf F
 GLIBC_2.35 _ZGVcN8v_asinhf F
 GLIBC_2.35 _ZGVcN8v_atanf F
+GLIBC_2.35 _ZGVcN8vv_atan2f F
 GLIBC_2.35 _ZGVdN4v_acos F
 GLIBC_2.35 _ZGVdN4v_acosh F
 GLIBC_2.35 _ZGVdN4v_asin F
 GLIBC_2.35 _ZGVdN4v_asinh F
 GLIBC_2.35 _ZGVdN4v_atan F
+GLIBC_2.35 _ZGVdN4vv_atan2 F
 GLIBC_2.35 _ZGVdN8v_acosf F
 GLIBC_2.35 _ZGVdN8v_acoshf F
 GLIBC_2.35 _ZGVdN8v_asinf F
 GLIBC_2.35 _ZGVdN8v_asinhf F
 GLIBC_2.35 _ZGVdN8v_atanf F
+GLIBC_2.35 _ZGVdN8vv_atan2f F
 GLIBC_2.35 _ZGVeN16v_acosf F
 GLIBC_2.35 _ZGVeN16v_acoshf F
 GLIBC_2.35 _ZGVeN16v_asinf F
 GLIBC_2.35 _ZGVeN16v_asinhf F
 GLIBC_2.35 _ZGVeN16v_atanf F
+GLIBC_2.35 _ZGVeN16vv_atan2f F
 GLIBC_2.35 _ZGVeN8v_acos F
 GLIBC_2.35 _ZGVeN8v_acosh F
 GLIBC_2.35 _ZGVeN8v_asin F
 GLIBC_2.35 _ZGVeN8v_asinh F
 GLIBC_2.35 _ZGVeN8v_atan F
+GLIBC_2.35 _ZGVeN8vv_atan2 F
diff --git a/sysdeps/x86/fpu/bits/math-vector.h b/sysdeps/x86/fpu/bits/math-vector.h
index ef0a3fb7ed..67a326566c 100644
--- a/sysdeps/x86/fpu/bits/math-vector.h
+++ b/sysdeps/x86/fpu/bits/math-vector.h
@@ -78,6 +78,10 @@
 #  define __DECL_SIMD_atan __DECL_SIMD_x86_64
 #  undef __DECL_SIMD_atanf
 #  define __DECL_SIMD_atanf __DECL_SIMD_x86_64
+#  undef __DECL_SIMD_atan2
+#  define __DECL_SIMD_atan2 __DECL_SIMD_x86_64
+#  undef __DECL_SIMD_atan2f
+#  define __DECL_SIMD_atan2f __DECL_SIMD_x86_64
 
 # endif
 #endif
diff --git a/sysdeps/x86/fpu/finclude/math-vector-fortran.h b/sysdeps/x86/fpu/finclude/math-vector-fortran.h
index 224285b69e..3e0fb32762 100644
--- a/sysdeps/x86/fpu/finclude/math-vector-fortran.h
+++ b/sysdeps/x86/fpu/finclude/math-vector-fortran.h
@@ -38,6 +38,8 @@
 !GCC$ builtin (asinhf) attributes simd (notinbranch) if('x86_64')
 !GCC$ builtin (atan) attributes simd (notinbranch) if('x86_64')
 !GCC$ builtin (atanf) attributes simd (notinbranch) if('x86_64')
+!GCC$ builtin (atan2) attributes simd (notinbranch) if('x86_64')
+!GCC$ builtin (atan2f) attributes simd (notinbranch) if('x86_64')
 
 !GCC$ builtin (cos) attributes simd (notinbranch) if('x32')
 !GCC$ builtin (cosf) attributes simd (notinbranch) if('x32')
@@ -61,3 +63,5 @@
 !GCC$ builtin (asinhf) attributes simd (notinbranch) if('x32')
 !GCC$ builtin (atan) attributes simd (notinbranch) if('x32')
 !GCC$ builtin (atanf) attributes simd (notinbranch) if('x32')
+!GCC$ builtin (atan2) attributes simd (notinbranch) if('x32')
+!GCC$ builtin (atan2f) attributes simd (notinbranch) if('x32')
diff --git a/sysdeps/x86_64/fpu/Makeconfig b/sysdeps/x86_64/fpu/Makeconfig
index 1c06066bee..c9a1e0da93 100644
--- a/sysdeps/x86_64/fpu/Makeconfig
+++ b/sysdeps/x86_64/fpu/Makeconfig
@@ -27,6 +27,7 @@ libmvec-funcs = \
   asin \
   asinh \
   atan \
+  atan2 \
   cos \
   exp \
   log \
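A note on the names above: libmvec follows the x86_64 vector ABI
mangling _ZGV<isa>N<len><args>_<base>, where 'b', 'c', 'd', and 'e'
denote the SSE4, AVX, AVX2, and AVX512 variants, N<len> is the vector
length, and 'vv' marks the two vector arguments of atan2.  Written out
as plain prototypes (an illustration only; the in-tree tests declare
these through wrapper macros):

#include <immintrin.h>

/* Illustrative prototypes, assumed from the vector ABI mangling.  */
__m128d _ZGVbN2vv_atan2 (__m128d y, __m128d x);   /* SSE4: 2 doubles.  */
__m256d _ZGVdN4vv_atan2 (__m256d y, __m256d x);   /* AVX2: 4 doubles.  */
__m512d _ZGVeN8vv_atan2 (__m512d y, __m512d x);   /* AVX512: 8 doubles.  */
__m128 _ZGVbN4vv_atan2f (__m128 y, __m128 x);     /* SSE4: 4 floats.  */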
diff --git a/sysdeps/x86_64/fpu/Versions b/sysdeps/x86_64/fpu/Versions
index f7ce07574f..57de41e864 100644
--- a/sysdeps/x86_64/fpu/Versions
+++ b/sysdeps/x86_64/fpu/Versions
@@ -19,10 +19,12 @@ libmvec {
     _ZGVbN2v_asin; _ZGVcN4v_asin; _ZGVdN4v_asin; _ZGVeN8v_asin;
     _ZGVbN2v_asinh; _ZGVcN4v_asinh; _ZGVdN4v_asinh; _ZGVeN8v_asinh;
     _ZGVbN2v_atan; _ZGVcN4v_atan; _ZGVdN4v_atan; _ZGVeN8v_atan;
+    _ZGVbN2vv_atan2; _ZGVcN4vv_atan2; _ZGVdN4vv_atan2; _ZGVeN8vv_atan2;
     _ZGVbN4v_acosf; _ZGVcN8v_acosf; _ZGVdN8v_acosf; _ZGVeN16v_acosf;
     _ZGVbN4v_acoshf; _ZGVcN8v_acoshf; _ZGVdN8v_acoshf; _ZGVeN16v_acoshf;
     _ZGVbN4v_asinf; _ZGVcN8v_asinf; _ZGVdN8v_asinf; _ZGVeN16v_asinf;
     _ZGVbN4v_asinhf; _ZGVcN8v_asinhf; _ZGVdN8v_asinhf; _ZGVeN16v_asinhf;
     _ZGVbN4v_atanf; _ZGVcN8v_atanf; _ZGVdN8v_atanf; _ZGVeN16v_atanf;
+    _ZGVbN4vv_atan2f; _ZGVcN8vv_atan2f; _ZGVdN8vv_atan2f; _ZGVeN16vv_atan2f;
   }
 }
diff --git a/sysdeps/x86_64/fpu/libm-test-ulps b/sysdeps/x86_64/fpu/libm-test-ulps
index de345e2bf1..329e7f58a2 100644
--- a/sysdeps/x86_64/fpu/libm-test-ulps
+++ b/sysdeps/x86_64/fpu/libm-test-ulps
@@ -203,6 +203,26 @@ float: 2
 float128: 2
 ldouble: 1
 
+Function: "atan2_vlen16":
+float: 2
+
+Function: "atan2_vlen2":
+double: 1
+
+Function: "atan2_vlen4":
+double: 1
+float: 2
+
+Function: "atan2_vlen4_avx2":
+double: 1
+
+Function: "atan2_vlen8":
+double: 1
+float: 2
+
+Function: "atan2_vlen8_avx2":
+float: 2
+
 Function: "atan_downward":
 double: 1
 float: 2
diff --git a/sysdeps/x86_64/fpu/multiarch/svml_d_atan22_core-sse2.S b/sysdeps/x86_64/fpu/multiarch/svml_d_atan22_core-sse2.S
new file mode 100644
index 0000000000..6c3ad05a6c
--- /dev/null
+++ b/sysdeps/x86_64/fpu/multiarch/svml_d_atan22_core-sse2.S
@@ -0,0 +1,20 @@
+/* SSE2 version of vectorized atan2.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#define _ZGVbN2vv_atan2 _ZGVbN2vv_atan2_sse2
+#include "../svml_d_atan22_core.S"
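The -sse2.S file above just renames the baseline kernel built from
svml_d_atan22_core.S; the svml_d_atan22_core.c below then resolves the
public _ZGVbN2vv_atan2 through an ifunc that prefers the SSE4.1 kernel.
A standalone sketch of the same dispatch idea (the in-tree selector in
ifunc-mathvec-sse4_1.h uses glibc's internal CPU-features interface
rather than the GCC builtin shown here):

#include <stdio.h>

/* Pick a kernel name the way the libmvec ifunc does, but with the
   GCC builtin instead of glibc's internal cpu_features data.  */
static const char *
select_atan2_kernel (void)
{
  if (__builtin_cpu_supports ("sse4.1"))
    return "_ZGVbN2vv_atan2_sse4";
  return "_ZGVbN2vv_atan2_sse2";
}

int
main (void)
{
  printf ("would resolve to %s\n", select_atan2_kernel ());
  return 0;
}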
diff --git a/sysdeps/x86_64/fpu/multiarch/svml_d_atan22_core.c b/sysdeps/x86_64/fpu/multiarch/svml_d_atan22_core.c
new file mode 100644
index 0000000000..43f1ee7f33
--- /dev/null
+++ b/sysdeps/x86_64/fpu/multiarch/svml_d_atan22_core.c
@@ -0,0 +1,28 @@
+/* Multiple versions of vectorized atan2, vector length is 2.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#define SYMBOL_NAME _ZGVbN2vv_atan2
+#include "ifunc-mathvec-sse4_1.h"
+
+libc_ifunc_redirected (REDIRECT_NAME, SYMBOL_NAME, IFUNC_SELECTOR ());
+
+#ifdef SHARED
+__hidden_ver1 (_ZGVbN2vv_atan2, __GI__ZGVbN2vv_atan2,
+	       __redirect__ZGVbN2vv_atan2)
+  __attribute__ ((visibility ("hidden")));
+#endif
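The SSE4 kernel below reduces atan2(y, x) to an atan of a nonnegative
ratio, then reduces that atan against the breakpoints c in {0.0, 0.5,
1.0, 1.5, inf} using atan(t) = atan(c) + atan((t-c)/(1+c*t)), as its
header comment spells out.  A quick standalone check of that identity
(illustration only):

#include <math.h>
#include <stdio.h>

int
main (void)
{
  /* 19.0/16.0 <= t <= 39.0/16.0, so the kernel would use c = 1.5.  */
  double c = 1.5, t = 2.0;
  double s = (t - c) / (1.0 + c * t);
  printf ("%.17g == %.17g\n", atan (t), atan (c) + atan (s));
  return 0;
}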
diff --git a/sysdeps/x86_64/fpu/multiarch/svml_d_atan22_core_sse4.S b/sysdeps/x86_64/fpu/multiarch/svml_d_atan22_core_sse4.S
new file mode 100644
index 0000000000..3924796722
--- /dev/null
+++ b/sysdeps/x86_64/fpu/multiarch/svml_d_atan22_core_sse4.S
@@ -0,0 +1,3629 @@
+/* Function atan2 vectorized with SSE4.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+/*
+ * ALGORITHM DESCRIPTION:
+ *    For 0.0 <= x <= 7.0/16.0:        atan(x) = atan(0.0) + atan(s), where s=(x-0.0)/(1.0+0.0*x)
+ *    For 7.0/16.0 <= x <= 11.0/16.0:  atan(x) = atan(0.5) + atan(s), where s=(x-0.5)/(1.0+0.5*x)
+ *    For 11.0/16.0 <= x <= 19.0/16.0: atan(x) = atan(1.0) + atan(s), where s=(x-1.0)/(1.0+1.0*x)
+ *    For 19.0/16.0 <= x <= 39.0/16.0: atan(x) = atan(1.5) + atan(s), where s=(x-1.5)/(1.0+1.5*x)
+ *    For 39.0/16.0 <= x <= inf:       atan(x) = atan(inf) + atan(s), where s=-1.0/x
+ *    Where atan(s) ~= s+s^3*Poly11(s^2) on interval |s|<7.0/16.0.
+ *
+ *
+ */
+
+#include <sysdep.h>
+
+	.text
+	.section .text.sse4,"ax",@progbits
+ENTRY(_ZGVbN2vv_atan2_sse4)
+	pushq	%rbp
+	cfi_def_cfa_offset(16)
+	movq	%rsp, %rbp
+	cfi_def_cfa(6, 16)
+	cfi_offset(6, -16)
+	andq	$-64, %rsp
+	subq	$256, %rsp
+	xorl	%edx, %edx
+	movups	%xmm8, 112(%rsp)
+	.cfi_escape 0x10, 0x19, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x70, 0xff, 0xff, 0xff, 0x22
+	movaps	%xmm0, %xmm8
+
+/*
+ * #define NO_VECTOR_ZERO_ATAN2_ARGS
+ * Declarations
+ * Variables
+ * Constants
+ * The end of declarations
+ * Implementation
+ * Get r0~=1/B
+ * Cannot be replaced by VQRCP(D, dR0, dB);
+ * Argument Absolute values
+ */
+	movups	1728+__svml_datan2_data_internal(%rip), %xmm4
+	movups	%xmm9, 96(%rsp)
+	.cfi_escape 0x10, 0x1a, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x60, 0xff, 0xff, 0xff, 0x22
+	movaps	%xmm1, %xmm9
+	movaps	%xmm4, %xmm1
+	andps	%xmm8, %xmm4
+	andps	%xmm9, %xmm1
+	movaps	%xmm4, %xmm2
+	cmpnltpd %xmm1, %xmm2
+
+/* Argument signs */
+	movups	1536+__svml_datan2_data_internal(%rip), %xmm3
+	movaps	%xmm2, %xmm0
+	movaps	%xmm3, %xmm7
+	movaps	%xmm3, %xmm6
+
+/*
+ * 1) If y<x then a=y, b=x, PIO2=0
+ * 2) If y>x then a=-x, b=y, PIO2=Pi/2
+ */
+	orps	%xmm1, %xmm3
+	andnps	%xmm4, %xmm0
+	andps	%xmm2, %xmm3
+	andps	%xmm9, %xmm7
+	movups	64+__svml_datan2_data_internal(%rip), %xmm5
+	orps	%xmm3, %xmm0
+	movaps	%xmm2, %xmm3
+	andps	%xmm2, %xmm5
+	andnps	%xmm1, %xmm3
+	andps	%xmm4, %xmm2
+	orps	%xmm2, %xmm3
+	andps	%xmm8, %xmm6
+	divpd	%xmm3, %xmm0
+	movups	%xmm10, 48(%rsp)
+	movq	1600+__svml_datan2_data_internal(%rip), %xmm2
+	.cfi_escape 0x10, 0x1b, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x30, 0xff, 0xff, 0xff, 0x22
+
+/* Check if y and x are on main path. */
+	pshufd	$221, %xmm1, %xmm10
+	psubd	%xmm2, %xmm10
+	movups	%xmm11, 80(%rsp)
+	movups	%xmm12, 32(%rsp)
+	movups	%xmm4, 16(%rsp)
+	.cfi_escape 0x10, 0x1c, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x50, 0xff, 0xff, 0xff, 0x22
+	.cfi_escape 0x10, 0x1d, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x20, 0xff, 0xff, 0xff, 0x22
+	movq	1664+__svml_datan2_data_internal(%rip), %xmm11
+	pshufd	$221, %xmm4, %xmm12
+	movdqa	%xmm10, %xmm4
+	pcmpgtd	%xmm11, %xmm4
+	pcmpeqd	%xmm11, %xmm10
+	por	%xmm10, %xmm4
+
+/* Polynomial.
*/ + movaps %xmm0, %xmm10 + mulpd %xmm0, %xmm10 + psubd %xmm2, %xmm12 + movups %xmm13, 144(%rsp) + .cfi_escape 0x10, 0x1e, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x90, 0xff, 0xff, 0xff, 0x22 + movdqa %xmm12, %xmm13 + pcmpgtd %xmm11, %xmm13 + pcmpeqd %xmm11, %xmm12 + por %xmm12, %xmm13 + movaps %xmm10, %xmm12 + mulpd %xmm10, %xmm12 + por %xmm13, %xmm4 + movaps %xmm12, %xmm13 + mulpd %xmm12, %xmm13 + movmskps %xmm4, %eax + movups %xmm15, 160(%rsp) + .cfi_escape 0x10, 0x20, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xa0, 0xff, 0xff, 0xff, 0x22 + movups 256+__svml_datan2_data_internal(%rip), %xmm15 + mulpd %xmm13, %xmm15 + movups 320+__svml_datan2_data_internal(%rip), %xmm11 + movups 384+__svml_datan2_data_internal(%rip), %xmm2 + addpd 512+__svml_datan2_data_internal(%rip), %xmm15 + mulpd %xmm13, %xmm11 + mulpd %xmm13, %xmm2 + mulpd %xmm13, %xmm15 + addpd 576+__svml_datan2_data_internal(%rip), %xmm11 + addpd 640+__svml_datan2_data_internal(%rip), %xmm2 + addpd 768+__svml_datan2_data_internal(%rip), %xmm15 + mulpd %xmm13, %xmm11 + mulpd %xmm13, %xmm2 + mulpd %xmm13, %xmm15 + addpd 832+__svml_datan2_data_internal(%rip), %xmm11 + addpd 896+__svml_datan2_data_internal(%rip), %xmm2 + addpd 1024+__svml_datan2_data_internal(%rip), %xmm15 + mulpd %xmm13, %xmm11 + mulpd %xmm13, %xmm2 + mulpd %xmm13, %xmm15 + addpd 1088+__svml_datan2_data_internal(%rip), %xmm11 + addpd 1152+__svml_datan2_data_internal(%rip), %xmm2 + addpd 1280+__svml_datan2_data_internal(%rip), %xmm15 + mulpd %xmm13, %xmm11 + mulpd %xmm13, %xmm2 + mulpd %xmm10, %xmm15 + addpd 1344+__svml_datan2_data_internal(%rip), %xmm11 + addpd 1408+__svml_datan2_data_internal(%rip), %xmm2 + addpd %xmm15, %xmm11 + mulpd %xmm2, %xmm10 + mulpd %xmm11, %xmm12 + movups %xmm14, 176(%rsp) + .cfi_escape 0x10, 0x1f, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xb0, 0xff, 0xff, 0xff, 0x22 + movups 448+__svml_datan2_data_internal(%rip), %xmm14 + mulpd %xmm13, %xmm14 + addpd 704+__svml_datan2_data_internal(%rip), %xmm14 + mulpd %xmm13, %xmm14 + addpd 960+__svml_datan2_data_internal(%rip), %xmm14 + mulpd %xmm13, %xmm14 + addpd 1216+__svml_datan2_data_internal(%rip), %xmm14 + +/* A00=1.0, account for it later VQFMA(D, dP4, dP4, dR8, dA00); */ + mulpd %xmm14, %xmm13 + addpd %xmm10, %xmm13 + addpd %xmm12, %xmm13 + +/* + * Reconstruction. 
+ * dP=(R+R*dP) + dPIO2 + */ + mulpd %xmm0, %xmm13 + addpd %xmm13, %xmm0 + movups %xmm3, (%rsp) + +/* if x<0, dPI = Pi, else dPI =0 */ + movaps %xmm9, %xmm3 + cmplepd 1792+__svml_datan2_data_internal(%rip), %xmm3 + addpd %xmm5, %xmm0 + andps __svml_datan2_data_internal(%rip), %xmm3 + orps %xmm7, %xmm0 + addpd %xmm3, %xmm0 + +/* Special branch for fast (vector) processing of zero arguments */ + movups 16(%rsp), %xmm11 + orps %xmm6, %xmm0 + testb $3, %al + jne L(7) + +L(1): +/* + * Special branch for fast (vector) processing of zero arguments + * The end of implementation + */ + testl %edx, %edx + jne L(3) + +L(2): + movups 112(%rsp), %xmm8 + cfi_restore(25) + movups 96(%rsp), %xmm9 + cfi_restore(26) + movups 48(%rsp), %xmm10 + cfi_restore(27) + movups 80(%rsp), %xmm11 + cfi_restore(28) + movups 32(%rsp), %xmm12 + cfi_restore(29) + movups 144(%rsp), %xmm13 + cfi_restore(30) + movups 176(%rsp), %xmm14 + cfi_restore(31) + movups 160(%rsp), %xmm15 + cfi_restore(32) + movq %rbp, %rsp + popq %rbp + cfi_def_cfa(7, 8) + cfi_restore(6) + ret + cfi_def_cfa(6, 16) + cfi_offset(6, -16) + .cfi_escape 0x10, 0x19, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x70, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x1a, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x60, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x1b, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x30, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x1c, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x50, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x1d, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x20, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x1e, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x90, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x1f, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xb0, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x20, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xa0, 0xff, 0xff, 0xff, 0x22 + +L(3): + movups %xmm8, 64(%rsp) + movups %xmm9, 128(%rsp) + movups %xmm0, 192(%rsp) + je L(2) + xorl %eax, %eax + movq %rsi, 8(%rsp) + movq %rdi, (%rsp) + movq %r12, 24(%rsp) + .cfi_escape 0x10, 0x04, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x08, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x05, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x00, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x0c, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x18, 0xff, 0xff, 0xff, 0x22 + movl %eax, %r12d + movq %r13, 16(%rsp) + .cfi_escape 0x10, 0x0d, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x10, 0xff, 0xff, 0xff, 0x22 + movl %edx, %r13d + +L(4): + btl %r12d, %r13d + jc L(6) + +L(5): + incl %r12d + cmpl $2, %r12d + jl L(4) + movq 8(%rsp), %rsi + cfi_restore(4) + movq (%rsp), %rdi + cfi_restore(5) + movq 24(%rsp), %r12 + cfi_restore(12) + movq 16(%rsp), %r13 + cfi_restore(13) + movups 192(%rsp), %xmm0 + jmp L(2) + .cfi_escape 0x10, 0x04, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x08, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x05, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x00, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x0c, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x18, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x0d, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x10, 0xff, 0xff, 0xff, 0x22 + +L(6): + lea 64(%rsp,%r12,8), %rdi + lea 128(%rsp,%r12,8), %rsi + lea 192(%rsp,%r12,8), %rdx + call 
__svml_datan2_cout_rare_internal + jmp L(5) + cfi_restore(4) + cfi_restore(5) + cfi_restore(12) + cfi_restore(13) + +L(7): +/* Check if at least on of Y or Y is zero: iAXAYZERO */ + movups 1792+__svml_datan2_data_internal(%rip), %xmm2 + +/* Check if both X & Y are not NaNs: iXYnotNAN */ + movaps %xmm9, %xmm12 + movaps %xmm8, %xmm10 + cmpordpd %xmm9, %xmm12 + cmpordpd %xmm8, %xmm10 + cmpeqpd %xmm2, %xmm1 + cmpeqpd %xmm2, %xmm11 + andps %xmm10, %xmm12 + orps %xmm11, %xmm1 + pshufd $221, %xmm1, %xmm1 + pshufd $221, %xmm12, %xmm11 + +/* Check if at least on of Y or Y is zero and not NaN: iAXAYZEROnotNAN */ + pand %xmm11, %xmm1 + +/* Exclude from previous callout mask zero (and not NaN) arguments */ + movdqa %xmm1, %xmm13 + pandn %xmm4, %xmm13 + +/* + * Path for zero arguments (at least one of both) + * Check if both args are zeros (den. is zero) + */ + movups (%rsp), %xmm4 + cmpeqpd %xmm2, %xmm4 + +/* Go to callout */ + movmskps %xmm13, %edx + +/* Set sPIO2 to zero if den. is zero */ + movaps %xmm4, %xmm15 + andps %xmm2, %xmm4 + andnps %xmm5, %xmm15 + andl $3, %edx + orps %xmm4, %xmm15 + pshufd $221, %xmm9, %xmm5 + orps %xmm7, %xmm15 + +/* Res = sign(Y)*(X<0)?(PIO2+PI):PIO2 */ + pshufd $221, %xmm2, %xmm7 + pcmpgtd %xmm5, %xmm7 + pshufd $80, %xmm7, %xmm14 + andps %xmm3, %xmm14 + addpd %xmm14, %xmm15 + +/* Merge results from main and spec path */ + pshufd $80, %xmm1, %xmm3 + orps %xmm6, %xmm15 + movdqa %xmm3, %xmm6 + andps %xmm3, %xmm15 + andnps %xmm0, %xmm6 + movaps %xmm6, %xmm0 + orps %xmm15, %xmm0 + jmp L(1) + +END(_ZGVbN2vv_atan2_sse4) + + .align 16,0x90 + +__svml_datan2_cout_rare_internal: + + cfi_startproc + + movq %rdx, %rcx + movsd 1888+__datan2_la_CoutTab(%rip), %xmm1 + movsd (%rdi), %xmm2 + movsd (%rsi), %xmm0 + mulsd %xmm1, %xmm2 + mulsd %xmm0, %xmm1 + movsd %xmm2, -48(%rsp) + movsd %xmm1, -40(%rsp) + movzwl -42(%rsp), %r9d + andl $32752, %r9d + movb -33(%rsp), %al + movzwl -34(%rsp), %r8d + andb $-128, %al + andl $32752, %r8d + shrl $4, %r9d + movb -41(%rsp), %dl + shrb $7, %dl + shrb $7, %al + shrl $4, %r8d + cmpl $2047, %r9d + je L(31) + cmpl $2047, %r8d + je L(25) + testl %r9d, %r9d + jne L(8) + testl $1048575, -44(%rsp) + jne L(8) + cmpl $0, -48(%rsp) + je L(19) + +L(8): + testl %r8d, %r8d + jne L(9) + testl $1048575, -36(%rsp) + jne L(9) + cmpl $0, -40(%rsp) + je L(18) + +L(9): + negl %r8d + movsd %xmm2, -48(%rsp) + addl %r9d, %r8d + movsd %xmm1, -40(%rsp) + movb -41(%rsp), %dil + movb -33(%rsp), %sil + andb $127, %dil + andb $127, %sil + cmpl $-54, %r8d + jle L(16) + cmpl $54, %r8d + jge L(15) + movb %sil, -33(%rsp) + movb %dil, -41(%rsp) + testb %al, %al + jne L(10) + movsd 1976+__datan2_la_CoutTab(%rip), %xmm1 + movaps %xmm1, %xmm0 + jmp L(11) + +L(10): + movsd 1936+__datan2_la_CoutTab(%rip), %xmm1 + movsd 1944+__datan2_la_CoutTab(%rip), %xmm0 + +L(11): + movsd -48(%rsp), %xmm4 + movsd -40(%rsp), %xmm2 + movaps %xmm4, %xmm5 + divsd %xmm2, %xmm5 + movzwl -42(%rsp), %esi + movsd %xmm5, -16(%rsp) + testl %r9d, %r9d + jle L(24) + cmpl $2046, %r9d + jge L(12) + andl $-32753, %esi + addl $-1023, %r9d + movsd %xmm4, -48(%rsp) + addl $16368, %esi + movw %si, -42(%rsp) + jmp L(13) + +L(12): + movsd 1992+__datan2_la_CoutTab(%rip), %xmm3 + movl $1022, %r9d + mulsd %xmm3, %xmm4 + movsd %xmm4, -48(%rsp) + +L(13): + negl %r9d + addl $1023, %r9d + andl $2047, %r9d + movzwl 1894+__datan2_la_CoutTab(%rip), %esi + movsd 1888+__datan2_la_CoutTab(%rip), %xmm3 + andl $-32753, %esi + shll $4, %r9d + movsd %xmm3, -40(%rsp) + orl %r9d, %esi + movw %si, -34(%rsp) + movsd -40(%rsp), %xmm4 + mulsd 
%xmm4, %xmm2 + comisd 1880+__datan2_la_CoutTab(%rip), %xmm5 + jb L(14) + movsd 2000+__datan2_la_CoutTab(%rip), %xmm12 + movaps %xmm2, %xmm3 + mulsd %xmm2, %xmm12 + movsd %xmm12, -72(%rsp) + movsd -72(%rsp), %xmm13 + movsd %xmm5, -24(%rsp) + subsd %xmm2, %xmm13 + movsd %xmm13, -64(%rsp) + movsd -72(%rsp), %xmm15 + movsd -64(%rsp), %xmm14 + movl -20(%rsp), %r8d + movl %r8d, %r9d + andl $-524288, %r8d + andl $-1048576, %r9d + addl $262144, %r8d + subsd %xmm14, %xmm15 + movsd %xmm15, -72(%rsp) + andl $1048575, %r8d + movsd -72(%rsp), %xmm4 + orl %r8d, %r9d + movl $0, -24(%rsp) + subsd %xmm4, %xmm3 + movl %r9d, -20(%rsp) + movsd %xmm3, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -24(%rsp), %xmm11 + movsd -64(%rsp), %xmm9 + mulsd %xmm11, %xmm5 + mulsd %xmm11, %xmm9 + movsd 1968+__datan2_la_CoutTab(%rip), %xmm8 + mulsd %xmm8, %xmm5 + mulsd %xmm8, %xmm9 + movaps %xmm5, %xmm7 + movzwl -10(%rsp), %edi + addsd %xmm9, %xmm7 + movsd %xmm7, -72(%rsp) + andl $32752, %edi + movsd -72(%rsp), %xmm6 + shrl $4, %edi + subsd %xmm6, %xmm5 + movl -12(%rsp), %esi + addsd %xmm5, %xmm9 + movsd %xmm9, -64(%rsp) + andl $1048575, %esi + movsd -48(%rsp), %xmm9 + movsd -72(%rsp), %xmm3 + movaps %xmm9, %xmm12 + movsd -64(%rsp), %xmm10 + movaps %xmm9, %xmm14 + movaps %xmm9, %xmm6 + addsd %xmm3, %xmm12 + movsd %xmm12, -72(%rsp) + movsd -72(%rsp), %xmm13 + shll $20, %edi + subsd %xmm13, %xmm14 + movsd %xmm14, -64(%rsp) + orl %esi, %edi + movsd -72(%rsp), %xmm4 + addl $-1069547520, %edi + movsd -64(%rsp), %xmm15 + movl $113, %esi + movsd 2000+__datan2_la_CoutTab(%rip), %xmm13 + addsd %xmm15, %xmm4 + movsd %xmm4, -56(%rsp) + movsd -64(%rsp), %xmm8 + sarl $19, %edi + addsd %xmm3, %xmm8 + movsd %xmm8, -64(%rsp) + cmpl $113, %edi + movsd -56(%rsp), %xmm7 + cmovl %edi, %esi + subsd %xmm7, %xmm6 + movsd %xmm6, -56(%rsp) + addl %esi, %esi + movsd -64(%rsp), %xmm12 + lea __datan2_la_CoutTab(%rip), %rdi + movsd -56(%rsp), %xmm5 + movslq %esi, %rsi + addsd %xmm5, %xmm12 + movsd %xmm12, -56(%rsp) + movsd -72(%rsp), %xmm7 + mulsd %xmm7, %xmm13 + movsd -56(%rsp), %xmm8 + movsd %xmm13, -72(%rsp) + addsd %xmm10, %xmm8 + movsd -72(%rsp), %xmm4 + movaps %xmm9, %xmm10 + mulsd 2000+__datan2_la_CoutTab(%rip), %xmm10 + subsd %xmm7, %xmm4 + movsd %xmm4, -64(%rsp) + movsd -72(%rsp), %xmm3 + movsd -64(%rsp), %xmm14 + subsd %xmm14, %xmm3 + movsd %xmm3, -72(%rsp) + movsd -72(%rsp), %xmm15 + subsd %xmm15, %xmm7 + movsd %xmm7, -64(%rsp) + movsd -72(%rsp), %xmm7 + movsd -64(%rsp), %xmm4 + movsd %xmm10, -72(%rsp) + movaps %xmm2, %xmm10 + addsd %xmm4, %xmm8 + movsd -72(%rsp), %xmm4 + subsd -48(%rsp), %xmm4 + movsd %xmm4, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm3 + subsd %xmm3, %xmm6 + movaps %xmm2, %xmm3 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm5 + subsd %xmm5, %xmm9 + movsd %xmm9, -64(%rsp) + movsd -72(%rsp), %xmm12 + movsd -64(%rsp), %xmm9 + mulsd %xmm11, %xmm12 + mulsd %xmm11, %xmm9 + movaps %xmm12, %xmm11 + addsd %xmm9, %xmm11 + movsd %xmm11, -72(%rsp) + movsd -72(%rsp), %xmm4 + subsd %xmm4, %xmm12 + addsd %xmm9, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -72(%rsp), %xmm15 + movsd -64(%rsp), %xmm6 + addsd %xmm15, %xmm3 + movsd %xmm3, -72(%rsp) + movsd -72(%rsp), %xmm5 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm3 + subsd %xmm5, %xmm10 + movsd %xmm10, -64(%rsp) + movsd -72(%rsp), %xmm13 + movsd -64(%rsp), %xmm11 + addsd %xmm11, %xmm13 + movsd %xmm13, -56(%rsp) + movsd -64(%rsp), %xmm14 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm13 + addsd %xmm14, %xmm15 + movsd %xmm15, -64(%rsp) + movsd -56(%rsp), %xmm4 + movsd 
1888+__datan2_la_CoutTab(%rip), %xmm14 + subsd %xmm4, %xmm2 + movsd %xmm2, -56(%rsp) + movsd -64(%rsp), %xmm4 + movsd -56(%rsp), %xmm2 + addsd %xmm2, %xmm4 + movsd %xmm4, -56(%rsp) + movsd -72(%rsp), %xmm12 + mulsd %xmm12, %xmm3 + movsd -56(%rsp), %xmm5 + movsd %xmm3, -72(%rsp) + addsd %xmm6, %xmm5 + movsd -72(%rsp), %xmm9 + subsd %xmm12, %xmm9 + movsd %xmm9, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm2 + subsd %xmm2, %xmm10 + movsd %xmm10, -72(%rsp) + movsd -72(%rsp), %xmm11 + subsd %xmm11, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -72(%rsp), %xmm9 + divsd %xmm9, %xmm14 + mulsd %xmm14, %xmm13 + movsd -64(%rsp), %xmm10 + movsd %xmm13, -64(%rsp) + addsd %xmm10, %xmm5 + movsd -64(%rsp), %xmm15 + movsd 1888+__datan2_la_CoutTab(%rip), %xmm12 + subsd %xmm14, %xmm15 + movsd %xmm15, -56(%rsp) + movsd -64(%rsp), %xmm2 + movsd -56(%rsp), %xmm4 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm13 + subsd %xmm4, %xmm2 + movsd %xmm2, -56(%rsp) + movsd -56(%rsp), %xmm3 + mulsd %xmm3, %xmm9 + movsd -56(%rsp), %xmm11 + subsd %xmm9, %xmm12 + mulsd %xmm11, %xmm5 + movsd %xmm5, -64(%rsp) + movsd -64(%rsp), %xmm5 + subsd %xmm5, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -64(%rsp), %xmm2 + movq -56(%rsp), %r10 + movsd -64(%rsp), %xmm6 + movsd -56(%rsp), %xmm4 + movq %r10, -40(%rsp) + movsd -40(%rsp), %xmm3 + movaps %xmm3, %xmm5 + addsd 1888+__datan2_la_CoutTab(%rip), %xmm2 + mulsd %xmm7, %xmm5 + mulsd %xmm6, %xmm2 + mulsd %xmm4, %xmm2 + mulsd %xmm2, %xmm7 + mulsd %xmm8, %xmm2 + mulsd %xmm3, %xmm8 + addsd %xmm2, %xmm7 + movsd 1872+__datan2_la_CoutTab(%rip), %xmm3 + addsd %xmm8, %xmm7 + movsd %xmm7, -72(%rsp) + movaps %xmm5, %xmm7 + movsd -72(%rsp), %xmm4 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm6 + addsd %xmm4, %xmm7 + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm8 + subsd %xmm8, %xmm5 + addsd %xmm4, %xmm5 + movsd %xmm5, -64(%rsp) + movsd -72(%rsp), %xmm11 + movaps %xmm11, %xmm2 + mulsd %xmm11, %xmm2 + mulsd %xmm11, %xmm6 + mulsd %xmm2, %xmm3 + movsd -64(%rsp), %xmm4 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm7 + addsd 1864+__datan2_la_CoutTab(%rip), %xmm3 + subsd %xmm11, %xmm7 + mulsd %xmm2, %xmm3 + movsd %xmm7, -64(%rsp) + movsd -72(%rsp), %xmm9 + movsd -64(%rsp), %xmm8 + addsd 1856+__datan2_la_CoutTab(%rip), %xmm3 + subsd %xmm8, %xmm9 + mulsd %xmm2, %xmm3 + movsd %xmm9, -72(%rsp) + movsd -72(%rsp), %xmm10 + addsd 1848+__datan2_la_CoutTab(%rip), %xmm3 + subsd %xmm10, %xmm11 + mulsd %xmm2, %xmm3 + movsd %xmm11, -64(%rsp) + addsd 1840+__datan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + addsd 1832+__datan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + addsd 1824+__datan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + mulsd %xmm3, %xmm13 + movsd -72(%rsp), %xmm2 + movsd -64(%rsp), %xmm12 + movsd %xmm13, -72(%rsp) + addsd %xmm12, %xmm4 + movsd -72(%rsp), %xmm14 + subsd %xmm3, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm15 + subsd %xmm15, %xmm5 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm6 + subsd %xmm6, %xmm3 + movsd %xmm3, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm5 + movaps %xmm6, %xmm12 + movaps %xmm5, %xmm3 + mulsd %xmm4, %xmm6 + mulsd %xmm4, %xmm3 + mulsd %xmm2, %xmm5 + mulsd %xmm2, %xmm12 + addsd %xmm3, %xmm6 + movaps %xmm12, %xmm7 + movaps %xmm12, %xmm8 + addsd %xmm5, %xmm6 + addsd %xmm2, %xmm7 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm5 + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm3 + subsd %xmm3, %xmm8 + movsd %xmm8, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm9 + addsd %xmm9, %xmm10 + movsd %xmm10, 
-56(%rsp) + movsd -64(%rsp), %xmm11 + addsd %xmm11, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -56(%rsp), %xmm2 + subsd %xmm2, %xmm12 + movsd %xmm12, -56(%rsp) + movsd -64(%rsp), %xmm14 + movsd -56(%rsp), %xmm13 + addsd %xmm13, %xmm14 + movsd %xmm14, -56(%rsp) + movq -72(%rsp), %r11 + movsd -56(%rsp), %xmm15 + movq %r11, -40(%rsp) + addsd %xmm15, %xmm4 + movsd -40(%rsp), %xmm8 + addsd %xmm5, %xmm4 + movsd %xmm4, -32(%rsp) + movaps %xmm8, %xmm4 + movaps %xmm8, %xmm2 + addsd (%rdi,%rsi,8), %xmm4 + movsd %xmm4, -72(%rsp) + movsd -72(%rsp), %xmm4 + subsd %xmm4, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm3 + addsd %xmm3, %xmm5 + movsd %xmm5, -56(%rsp) + movsd -64(%rsp), %xmm6 + addsd (%rdi,%rsi,8), %xmm6 + movsd %xmm6, -64(%rsp) + movsd -56(%rsp), %xmm7 + subsd %xmm7, %xmm8 + movsd %xmm8, -56(%rsp) + movsd -64(%rsp), %xmm10 + movsd -56(%rsp), %xmm9 + addsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movq -72(%rsp), %r8 + movq %r8, -40(%rsp) + movsd -56(%rsp), %xmm2 + movaps %xmm1, %xmm3 + shrq $56, %r8 + addsd -32(%rsp), %xmm2 + shlb $7, %dl + addsd 8(%rdi,%rsi,8), %xmm2 + movb %al, %sil + andb $127, %r8b + shlb $7, %sil + movsd %xmm2, -32(%rsp) + orb %sil, %r8b + movb %r8b, -33(%rsp) + movsd -40(%rsp), %xmm9 + movaps %xmm9, %xmm5 + addsd %xmm9, %xmm3 + movsd %xmm3, -72(%rsp) + movsd -72(%rsp), %xmm4 + movb -25(%rsp), %dil + movb %dil, %r9b + shrb $7, %dil + subsd %xmm4, %xmm5 + movsd %xmm5, -64(%rsp) + movsd -72(%rsp), %xmm7 + movsd -64(%rsp), %xmm6 + xorb %dil, %al + andb $127, %r9b + shlb $7, %al + addsd %xmm6, %xmm7 + movsd %xmm7, -56(%rsp) + movsd -64(%rsp), %xmm8 + addsd %xmm8, %xmm1 + movsd %xmm1, -64(%rsp) + orb %al, %r9b + movsd -56(%rsp), %xmm1 + movb %r9b, -25(%rsp) + subsd %xmm1, %xmm9 + movsd %xmm9, -56(%rsp) + movsd -64(%rsp), %xmm11 + movsd -56(%rsp), %xmm10 + addsd %xmm10, %xmm11 + movsd %xmm11, -56(%rsp) + movq -72(%rsp), %rax + movsd -56(%rsp), %xmm12 + movq %rax, -40(%rsp) + addsd %xmm12, %xmm0 + movsd -40(%rsp), %xmm13 + addsd -32(%rsp), %xmm0 + movsd %xmm0, -32(%rsp) + addsd %xmm0, %xmm13 + movsd %xmm13, -24(%rsp) + movb -17(%rsp), %r10b + andb $127, %r10b + orb %dl, %r10b + movb %r10b, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp L(23) + +L(14): + movsd -48(%rsp), %xmm12 + movb %al, %r8b + movaps %xmm12, %xmm7 + mulsd 2000+__datan2_la_CoutTab(%rip), %xmm7 + shlb $7, %r8b + shlb $7, %dl + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm8 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm13 + movsd 1888+__datan2_la_CoutTab(%rip), %xmm7 + mulsd %xmm2, %xmm13 + subsd -48(%rsp), %xmm8 + movsd %xmm8, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm9 + subsd %xmm9, %xmm10 + movsd %xmm10, -72(%rsp) + movsd -72(%rsp), %xmm11 + subsd %xmm11, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm5 + movsd %xmm13, -72(%rsp) + movsd -72(%rsp), %xmm14 + subsd %xmm2, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -72(%rsp), %xmm4 + movsd -64(%rsp), %xmm15 + subsd %xmm15, %xmm4 + movsd %xmm4, -72(%rsp) + movsd -72(%rsp), %xmm3 + movsd 1888+__datan2_la_CoutTab(%rip), %xmm4 + subsd %xmm3, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -72(%rsp), %xmm12 + divsd %xmm12, %xmm7 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm2 + mulsd %xmm7, %xmm2 + movsd -64(%rsp), %xmm14 + movsd %xmm2, -64(%rsp) + movsd -64(%rsp), %xmm8 + subsd %xmm7, %xmm8 + movsd %xmm8, -56(%rsp) + movsd -64(%rsp), %xmm10 + movsd -56(%rsp), %xmm9 + subsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movsd -56(%rsp), %xmm11 + mulsd %xmm11, %xmm12 + movsd -56(%rsp), %xmm13 + 
subsd %xmm12, %xmm4 + mulsd %xmm13, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -64(%rsp), %xmm15 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm13 + subsd %xmm15, %xmm4 + movsd %xmm4, -64(%rsp) + movsd -64(%rsp), %xmm7 + movq -56(%rsp), %rsi + movsd -64(%rsp), %xmm2 + movsd -56(%rsp), %xmm3 + movq %rsi, -40(%rsp) + movsd -40(%rsp), %xmm8 + movaps %xmm8, %xmm9 + addsd 1888+__datan2_la_CoutTab(%rip), %xmm7 + mulsd %xmm6, %xmm9 + mulsd %xmm5, %xmm8 + mulsd %xmm2, %xmm7 + movsd -16(%rsp), %xmm2 + mulsd %xmm2, %xmm2 + mulsd %xmm3, %xmm7 + movsd 1872+__datan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + mulsd %xmm7, %xmm6 + mulsd %xmm5, %xmm7 + addsd 1864+__datan2_la_CoutTab(%rip), %xmm3 + addsd %xmm7, %xmm6 + mulsd %xmm2, %xmm3 + addsd %xmm8, %xmm6 + addsd 1856+__datan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + movaps %xmm9, %xmm5 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm4 + addsd 1848+__datan2_la_CoutTab(%rip), %xmm3 + addsd %xmm4, %xmm5 + mulsd %xmm2, %xmm3 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm6 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm5 + subsd %xmm6, %xmm9 + addsd 1840+__datan2_la_CoutTab(%rip), %xmm3 + addsd %xmm4, %xmm9 + mulsd %xmm2, %xmm3 + movsd %xmm9, -64(%rsp) + movsd -72(%rsp), %xmm11 + mulsd %xmm11, %xmm5 + addsd 1832+__datan2_la_CoutTab(%rip), %xmm3 + movsd -64(%rsp), %xmm4 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm7 + mulsd %xmm2, %xmm3 + subsd %xmm11, %xmm7 + movsd %xmm7, -64(%rsp) + movsd -72(%rsp), %xmm8 + movsd -64(%rsp), %xmm6 + addsd 1824+__datan2_la_CoutTab(%rip), %xmm3 + subsd %xmm6, %xmm8 + mulsd %xmm2, %xmm3 + movsd %xmm8, -72(%rsp) + movsd -72(%rsp), %xmm10 + mulsd %xmm3, %xmm13 + subsd %xmm10, %xmm11 + movsd %xmm11, -64(%rsp) + movsd -72(%rsp), %xmm2 + movsd -64(%rsp), %xmm12 + movsd %xmm13, -72(%rsp) + addsd %xmm12, %xmm4 + movsd -72(%rsp), %xmm14 + subsd %xmm3, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm15 + subsd %xmm15, %xmm5 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm6 + subsd %xmm6, %xmm3 + movsd %xmm3, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm5 + movaps %xmm6, %xmm12 + movaps %xmm5, %xmm3 + mulsd %xmm4, %xmm6 + mulsd %xmm4, %xmm3 + mulsd %xmm2, %xmm5 + mulsd %xmm2, %xmm12 + addsd %xmm3, %xmm6 + movaps %xmm12, %xmm7 + movaps %xmm12, %xmm8 + addsd %xmm5, %xmm6 + addsd %xmm2, %xmm7 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm5 + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm3 + subsd %xmm3, %xmm8 + movsd %xmm8, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm9 + addsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movsd -64(%rsp), %xmm11 + addsd %xmm11, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -56(%rsp), %xmm2 + subsd %xmm2, %xmm12 + movsd %xmm12, -56(%rsp) + movsd -64(%rsp), %xmm14 + movsd -56(%rsp), %xmm13 + addsd %xmm13, %xmm14 + movsd %xmm14, -56(%rsp) + movq -72(%rsp), %rdi + movsd -56(%rsp), %xmm15 + movq %rdi, -40(%rsp) + addsd %xmm15, %xmm4 + shrq $56, %rdi + addsd %xmm5, %xmm4 + andb $127, %dil + orb %r8b, %dil + movb %dil, -33(%rsp) + movsd %xmm4, -32(%rsp) + movaps %xmm1, %xmm4 + movsd -40(%rsp), %xmm7 + movaps %xmm7, %xmm2 + addsd %xmm7, %xmm4 + movsd %xmm4, -72(%rsp) + movsd -72(%rsp), %xmm4 + movb -25(%rsp), %r9b + movb %r9b, %r10b + shrb $7, %r9b + subsd %xmm4, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm3 + xorb %r9b, %al + andb $127, %r10b + shlb $7, %al + addsd %xmm3, %xmm5 + movsd %xmm5, -56(%rsp) + movsd -64(%rsp), %xmm6 + addsd %xmm6, %xmm1 + movsd %xmm1, -64(%rsp) + orb %al, %r10b + movsd -56(%rsp), %xmm1 + movb 
%r10b, -25(%rsp) + subsd %xmm1, %xmm7 + movsd %xmm7, -56(%rsp) + movsd -64(%rsp), %xmm2 + movsd -56(%rsp), %xmm1 + addsd %xmm1, %xmm2 + movsd %xmm2, -56(%rsp) + movq -72(%rsp), %rax + movsd -56(%rsp), %xmm3 + movq %rax, -40(%rsp) + addsd %xmm3, %xmm0 + movsd -40(%rsp), %xmm4 + addsd -32(%rsp), %xmm0 + movsd %xmm0, -32(%rsp) + addsd %xmm0, %xmm4 + movsd %xmm4, -24(%rsp) + movb -17(%rsp), %r11b + andb $127, %r11b + orb %dl, %r11b + movb %r11b, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp L(23) + +L(15): + cmpl $74, %r8d + jge L(32) + movb %dil, -41(%rsp) + divsd -48(%rsp), %xmm1 + movsd 1928+__datan2_la_CoutTab(%rip), %xmm0 + shlb $7, %dl + subsd %xmm1, %xmm0 + addsd 1920+__datan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp L(23) + +L(16): + testb %al, %al + jne L(22) + movb %dil, -41(%rsp) + movb %sil, -33(%rsp) + movsd -48(%rsp), %xmm2 + divsd -40(%rsp), %xmm2 + movsd %xmm2, -24(%rsp) + movzwl -18(%rsp), %eax + testl $32752, %eax + je L(17) + movsd 1888+__datan2_la_CoutTab(%rip), %xmm0 + shlb $7, %dl + addsd %xmm2, %xmm0 + movsd %xmm0, -72(%rsp) + movsd -72(%rsp), %xmm1 + mulsd %xmm1, %xmm2 + movsd %xmm2, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp L(23) + +L(17): + mulsd %xmm2, %xmm2 + shlb $7, %dl + movsd %xmm2, -72(%rsp) + movsd -72(%rsp), %xmm0 + addsd -24(%rsp), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp L(23) + +L(18): + testl %r9d, %r9d + jne L(32) + testl $1048575, -44(%rsp) + jne L(32) + jmp L(33) + +L(19): + jne L(32) + +L(20): + testb %al, %al + jne L(22) + +L(21): + shlb $7, %dl + movq 1976+__datan2_la_CoutTab(%rip), %rax + movq %rax, -24(%rsp) + shrq $56, %rax + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp L(23) + +L(22): + movsd 1936+__datan2_la_CoutTab(%rip), %xmm0 + shlb $7, %dl + addsd 1944+__datan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + +L(23): + xorl %eax, %eax + ret + +L(24): + movsd 1984+__datan2_la_CoutTab(%rip), %xmm3 + movl $-1022, %r9d + mulsd %xmm3, %xmm4 + movsd %xmm4, -48(%rsp) + jmp L(13) + +L(25): + cmpl $2047, %r9d + je L(31) + +L(26): + testl $1048575, -36(%rsp) + jne L(27) + cmpl $0, -40(%rsp) + je L(28) + +L(27): + addsd %xmm1, %xmm2 + movsd %xmm2, (%rcx) + jmp L(23) + +L(28): + cmpl $2047, %r9d + je L(29) + testb %al, %al + je L(21) + jmp L(22) + +L(29): + testb %al, %al + jne L(30) + movsd 1904+__datan2_la_CoutTab(%rip), %xmm0 + shlb $7, %dl + addsd 1912+__datan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp L(23) + +L(30): + movsd 1952+__datan2_la_CoutTab(%rip), %xmm0 + shlb $7, %dl + addsd 1960+__datan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp L(23) + +L(31): + testl $1048575, -44(%rsp) + jne L(27) + cmpl $0, -48(%rsp) + jne L(27) + cmpl $2047, %r8d + je L(26) + +L(32): + movsd 1920+__datan2_la_CoutTab(%rip), %xmm0 + shlb $7, %dl + addsd 1928+__datan2_la_CoutTab(%rip), 
%xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp L(23) + +L(33): + cmpl $0, -48(%rsp) + jne L(32) + jmp L(20) + + cfi_endproc + + .type __svml_datan2_cout_rare_internal,@function + .size __svml_datan2_cout_rare_internal,.-__svml_datan2_cout_rare_internal + + .section .rodata, "a" + .align 64 + +__svml_datan2_data_internal: + .long 1413754136 + .long 1074340347 + .long 1413754136 + .long 1074340347 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1413754136 + .long 1073291771 + .long 1413754136 + .long 1073291771 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 17919630 + .long 3202334474 + .long 17919630 + .long 3202334474 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 350522012 + .long 1058555694 + .long 350522012 + .long 1058555694 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 934004643 + .long 3203726773 + .long 934004643 + .long 3203726773 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 912675337 + .long 1059908874 + .long 912675337 + .long 1059908874 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + 
.byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 2476035107 + .long 3209881212 + .long 2476035107 + .long 3209881212 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 2927800243 + .long 1064262173 + .long 2927800243 + .long 1064262173 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1636715437 + .long 3213013740 + .long 1636715437 + .long 3213013740 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1712395941 + .long 1066487628 + .long 1712395941 + .long 1066487628 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 2961307292 + .long 3214564995 + .long 2961307292 + .long 3214564995 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 213298511 + .long 1067542936 + .long 213298511 + .long 1067542936 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 
3848520124 + .long 3215257506 + .long 3848520124 + .long 3215257506 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 3159386171 + .long 1067969551 + .long 3159386171 + .long 1067969551 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 3936393556 + .long 3215643233 + .long 3936393556 + .long 3215643233 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 3177262543 + .long 1068373833 + .long 3177262543 + .long 1068373833 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 9713120 + .long 3216052356 + .long 9713120 + .long 3216052356 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1227445841 + .long 1068740906 + .long 1227445841 + .long 1068740906 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 163240596 + .long 3216459216 + .long 163240596 + .long 3216459216 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 
0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 133682613 + .long 1069314503 + .long 133682613 + .long 1069314503 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 2448315847 + .long 3217180964 + .long 2448315847 + .long 3217180964 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 2576870964 + .long 1070176665 + .long 2576870964 + .long 1070176665 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1431655365 + .long 3218429269 + .long 1431655365 + .long 3218429269 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 0 + .long 2147483648 + .long 0 + .long 2147483648 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + 
.byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 2150629376 + .long 2150629376 + .long 2150629376 + .long 2150629376 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 4258267136 + .long 4258267136 + .long 4258267136 + .long 4258267136 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 4294967295 + .long 2147483647 + .long 4294967295 + .long 2147483647 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 0 + .long 0 + .long 0 + .long 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 8388607 + .long 8388607 + .long 8388607 + .long 8388607 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + 
.byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 133169152 + .long 133169152 + .long 133169152 + .long 133169152 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 0 + .long 4294967295 + .long 0 + .long 4294967295 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .type __svml_datan2_data_internal,@object + .size __svml_datan2_data_internal,2304 + .align 32 + +__datan2_la_CoutTab: + .long 3892314112 + .long 1069799150 + .long 2332892550 + .long 1039715405 + .long 1342177280 + .long 1070305495 + .long 270726690 + .long 1041535749 + .long 939524096 + .long 1070817911 + .long 2253973841 + .long 3188654726 + .long 3221225472 + .long 1071277294 + .long 3853927037 + .long 1043226911 + .long 2818572288 + .long 1071767563 + .long 2677759107 + .long 1044314101 + .long 3355443200 + .long 1072103591 + .long 1636578514 + .long 3191094734 + .long 1476395008 + .long 1072475260 + .long 1864703685 + .long 3188646936 + .long 805306368 + .long 1072747407 + .long 192551812 + .long 3192726267 + .long 2013265920 + .long 1072892781 + .long 2240369452 + .long 1043768538 + .long 0 + .long 1072999953 + .long 3665168337 + .long 3192705970 + .long 402653184 + .long 1073084787 + .long 1227953434 + .long 3192313277 + .long 2013265920 + .long 1073142981 + .long 3853283127 + .long 
1045277487 + .long 805306368 + .long 1073187261 + .long 1676192264 + .long 3192868861 + .long 134217728 + .long 1073217000 + .long 4290763938 + .long 1042034855 + .long 671088640 + .long 1073239386 + .long 994303084 + .long 3189643768 + .long 402653184 + .long 1073254338 + .long 1878067156 + .long 1042652475 + .long 1610612736 + .long 1073265562 + .long 670314820 + .long 1045138554 + .long 3221225472 + .long 1073273048 + .long 691126919 + .long 3189987794 + .long 3489660928 + .long 1073278664 + .long 1618990832 + .long 3188194509 + .long 1207959552 + .long 1073282409 + .long 2198872939 + .long 1044806069 + .long 3489660928 + .long 1073285217 + .long 2633982383 + .long 1042307894 + .long 939524096 + .long 1073287090 + .long 1059367786 + .long 3189114230 + .long 2281701376 + .long 1073288494 + .long 3158525533 + .long 1044484961 + .long 3221225472 + .long 1073289430 + .long 286581777 + .long 1044893263 + .long 4026531840 + .long 1073290132 + .long 2000245215 + .long 3191647611 + .long 134217728 + .long 1073290601 + .long 4205071590 + .long 1045035927 + .long 536870912 + .long 1073290952 + .long 2334392229 + .long 1043447393 + .long 805306368 + .long 1073291186 + .long 2281458177 + .long 3188885569 + .long 3087007744 + .long 1073291361 + .long 691611507 + .long 1044733832 + .long 3221225472 + .long 1073291478 + .long 1816229550 + .long 1044363390 + .long 2281701376 + .long 1073291566 + .long 1993843750 + .long 3189837440 + .long 134217728 + .long 1073291625 + .long 3654754496 + .long 1044970837 + .long 4026531840 + .long 1073291668 + .long 3224300229 + .long 3191935390 + .long 805306368 + .long 1073291698 + .long 2988777976 + .long 3188950659 + .long 536870912 + .long 1073291720 + .long 1030371341 + .long 1043402665 + .long 3221225472 + .long 1073291734 + .long 1524463765 + .long 1044361356 + .long 3087007744 + .long 1073291745 + .long 2754295320 + .long 1044731036 + .long 134217728 + .long 1073291753 + .long 3099629057 + .long 1044970710 + .long 2281701376 + .long 1073291758 + .long 962914160 + .long 3189838838 + .long 805306368 + .long 1073291762 + .long 3543908206 + .long 3188950786 + .long 4026531840 + .long 1073291764 + .long 1849909620 + .long 3191935434 + .long 3221225472 + .long 1073291766 + .long 1641333636 + .long 1044361352 + .long 536870912 + .long 1073291768 + .long 1373968792 + .long 1043402654 + .long 134217728 + .long 1073291769 + .long 2033191599 + .long 1044970710 + .long 3087007744 + .long 1073291769 + .long 4117947437 + .long 1044731035 + .long 805306368 + .long 1073291770 + .long 315378368 + .long 3188950787 + .long 2281701376 + .long 1073291770 + .long 2428571750 + .long 3189838838 + .long 3221225472 + .long 1073291770 + .long 1608007466 + .long 1044361352 + .long 4026531840 + .long 1073291770 + .long 1895711420 + .long 3191935434 + .long 134217728 + .long 1073291771 + .long 2031108713 + .long 1044970710 + .long 536870912 + .long 1073291771 + .long 1362518342 + .long 1043402654 + .long 805306368 + .long 1073291771 + .long 317461253 + .long 3188950787 + .long 939524096 + .long 1073291771 + .long 4117231784 + .long 1044731035 + .long 1073741824 + .long 1073291771 + .long 1607942376 + .long 1044361352 + .long 1207959552 + .long 1073291771 + .long 2428929577 + .long 3189838838 + .long 1207959552 + .long 1073291771 + .long 2031104645 + .long 1044970710 + .long 1342177280 + .long 1073291771 + .long 1895722602 + .long 3191935434 + .long 1342177280 + .long 1073291771 + .long 317465322 + .long 3188950787 + .long 1342177280 + .long 1073291771 + .long 1362515546 + .long 1043402654 
+ .long 1342177280 + .long 1073291771 + .long 1607942248 + .long 1044361352 + .long 1342177280 + .long 1073291771 + .long 4117231610 + .long 1044731035 + .long 1342177280 + .long 1073291771 + .long 2031104637 + .long 1044970710 + .long 1342177280 + .long 1073291771 + .long 1540251232 + .long 1045150466 + .long 1342177280 + .long 1073291771 + .long 2644671394 + .long 1045270303 + .long 1342177280 + .long 1073291771 + .long 2399244691 + .long 1045360181 + .long 1342177280 + .long 1073291771 + .long 803971124 + .long 1045420100 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192879152 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192849193 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192826724 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192811744 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192800509 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192793019 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192787402 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192783657 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192780848 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192778976 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192777572 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192776635 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192775933 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192775465 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192775114 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774880 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192774704 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774587 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192774500 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774441 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192774397 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774368 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192774346 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774331 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192774320 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774313 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192774308 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774304 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192774301 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774299 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192774298 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774297 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192774296 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774296 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192774296 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 
3192774295 + .long 1476395008 + .long 1073291771 + .long 1466225875 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1343512524 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1251477510 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1190120835 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1144103328 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1113424990 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1090416237 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1075077068 + .long 3192774295 + .long 1431655765 + .long 3218429269 + .long 2576978363 + .long 1070176665 + .long 2453154343 + .long 3217180964 + .long 4189149139 + .long 1069314502 + .long 1775019125 + .long 3216459198 + .long 273199057 + .long 1068739452 + .long 874748308 + .long 3215993277 + .long 0 + .long 1069547520 + .long 0 + .long 1072693248 + .long 0 + .long 1073741824 + .long 1413754136 + .long 1072243195 + .long 856972295 + .long 1015129638 + .long 1413754136 + .long 1073291771 + .long 856972295 + .long 1016178214 + .long 1413754136 + .long 1074340347 + .long 856972295 + .long 1017226790 + .long 2134057426 + .long 1073928572 + .long 1285458442 + .long 1016756537 + .long 0 + .long 3220176896 + .long 0 + .long 0 + .long 0 + .long 2144337920 + .long 0 + .long 1048576 + .long 33554432 + .long 1101004800 + .type __datan2_la_CoutTab,@object + .size __datan2_la_CoutTab,2008 diff --git a/sysdeps/x86_64/fpu/multiarch/svml_d_atan24_core-sse.S b/sysdeps/x86_64/fpu/multiarch/svml_d_atan24_core-sse.S new file mode 100644 index 0000000000..0db843a088 --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/svml_d_atan24_core-sse.S @@ -0,0 +1,20 @@ +/* SSE version of vectorized atan2. + Copyright (C) 2021 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#define _ZGVdN4vv_atan2 _ZGVdN4vv_atan2_sse_wrapper +#include "../svml_d_atan24_core.S" diff --git a/sysdeps/x86_64/fpu/multiarch/svml_d_atan24_core.c b/sysdeps/x86_64/fpu/multiarch/svml_d_atan24_core.c new file mode 100644 index 0000000000..c2e2611584 --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/svml_d_atan24_core.c @@ -0,0 +1,28 @@ +/* Multiple versions of vectorized atan2, vector length is 4. + Copyright (C) 2021 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. 
+ + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <https://www.gnu.org/licenses/>. */ + +#define SYMBOL_NAME _ZGVdN4vv_atan2 +#include "ifunc-mathvec-avx2.h" + +libc_ifunc_redirected (REDIRECT_NAME, SYMBOL_NAME, IFUNC_SELECTOR ()); + +#ifdef SHARED +__hidden_ver1 (_ZGVdN4vv_atan2, __GI__ZGVdN4vv_atan2, + __redirect__ZGVdN4vv_atan2) + __attribute__ ((visibility ("hidden"))); +#endif diff --git a/sysdeps/x86_64/fpu/multiarch/svml_d_atan24_core_avx2.S b/sysdeps/x86_64/fpu/multiarch/svml_d_atan24_core_avx2.S new file mode 100644 index 0000000000..534913f559 --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/svml_d_atan24_core_avx2.S @@ -0,0 +1,3161 @@ +/* Function atan2 vectorized with AVX2. + Copyright (C) 2021 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <https://www.gnu.org/licenses/>. */ + +/* + * ALGORITHM DESCRIPTION: + * For 0.0 <= x <= 7.0/16.0: atan(x) = atan(0.0) + atan(s), where s=(x-0.0)/(1.0+0.0*x) + * For 7.0/16.0 <= x <= 11.0/16.0: atan(x) = atan(0.5) + atan(s), where s=(x-0.5)/(1.0+0.5*x) + * For 11.0/16.0 <= x <= 19.0/16.0: atan(x) = atan(1.0) + atan(s), where s=(x-1.0)/(1.0+1.0*x) + * For 19.0/16.0 <= x <= 39.0/16.0: atan(x) = atan(1.5) + atan(s), where s=(x-1.5)/(1.0+1.5*x) + * For 39.0/16.0 <= x <= inf : atan(x) = atan(inf) + atan(s), where s=-1.0/x + * Where atan(s) ~= s+s^3*Poly11(s^2) on interval |s|<7.0/16.0.
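+ *
+ * A minimal scalar C sketch of this table-driven reduction (hypothetical
+ * names: bound[], base[], atan_base[] and atan_poly() stand in for the
+ * constant tables and the polynomial below and are not identifiers from
+ * this patch; x >= 0 is assumed, and atan_base[4] == Pi/2):
+ *
+ *   static const double bound[4] = { 7.0/16, 11.0/16, 19.0/16, 39.0/16 };
+ *   static const double base[4] = { 0.0, 0.5, 1.0, 1.5 };
+ *   int i = 0;
+ *   while (i < 4 && x > bound[i])
+ *     i++;
+ *   double s = (i == 4) ? -1.0 / x
+ *                       : (x - base[i]) / (1.0 + base[i] * x);
+ *   return atan_base[i] + atan_poly (s);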
+ * + * + */ + +#include + + .text + .section .text.avx2,"ax",@progbits +ENTRY(_ZGVdN4vv_atan2_avx2) + pushq %rbp + cfi_def_cfa_offset(16) + movq %rsp, %rbp + cfi_def_cfa(6, 16) + cfi_offset(6, -16) + andq $-64, %rsp + subq $384, %rsp + xorl %edx, %edx + +/* + * #define NO_VECTOR_ZERO_ATAN2_ARGS + * Declarations + * Variables + * Constants + * The end of declarations + * Implementation + * Get r0~=1/B + * Cannot be replaced by VQRCP(D, dR0, dB); + * Argument Absolute values + */ + vmovupd 1728+__svml_datan2_data_internal(%rip), %ymm5 + +/* Argument signs */ + vmovupd 1536+__svml_datan2_data_internal(%rip), %ymm4 + vmovups %ymm8, 32(%rsp) + vmovups %ymm14, 320(%rsp) + vmovups %ymm10, 160(%rsp) + vmovups %ymm9, 96(%rsp) + vmovups %ymm13, 288(%rsp) + .cfi_escape 0x10, 0xdb, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xa0, 0xfe, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xdc, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xe0, 0xfe, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xdd, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x20, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xe0, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xa0, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xe1, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x22 + vmovups 1600+__svml_datan2_data_internal(%rip), %xmm13 + vmovups %ymm12, 256(%rsp) + vmovups %ymm11, 224(%rsp) + vmovupd %ymm0, (%rsp) + vmovups %ymm15, 352(%rsp) + .cfi_escape 0x10, 0xde, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x60, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xdf, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x80, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xe2, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xe0, 0xff, 0xff, 0xff, 0x22 + vmovapd %ymm1, %ymm8 + vandpd %ymm5, %ymm8, %ymm2 + vandpd %ymm5, %ymm0, %ymm1 + vcmpnlt_uqpd %ymm2, %ymm1, %ymm3 + +/* + * 1) If yx then a=-x, b=y, PIO2=Pi/2 + */ + vorpd %ymm4, %ymm2, %ymm6 + vblendvpd %ymm3, %ymm6, %ymm1, %ymm6 + vblendvpd %ymm3, %ymm1, %ymm2, %ymm14 + vmovupd %ymm14, 64(%rsp) + vdivpd %ymm14, %ymm6, %ymm14 + vandpd %ymm4, %ymm8, %ymm5 + vandpd %ymm4, %ymm0, %ymm7 + vandpd 64+__svml_datan2_data_internal(%rip), %ymm3, %ymm4 + vmovups 1664+__svml_datan2_data_internal(%rip), %xmm3 + +/* Check if y and x are on main path. */ + vextractf128 $1, %ymm2, %xmm9 + vextractf128 $1, %ymm1, %xmm10 + vshufps $221, %xmm9, %xmm2, %xmm11 + vshufps $221, %xmm10, %xmm1, %xmm12 + vpsubd %xmm13, %xmm11, %xmm0 + vpsubd %xmm13, %xmm12, %xmm9 + vpcmpgtd %xmm3, %xmm0, %xmm15 + vpcmpeqd %xmm3, %xmm0, %xmm6 + vpcmpgtd %xmm3, %xmm9, %xmm10 + vpcmpeqd %xmm3, %xmm9, %xmm3 + vpor %xmm6, %xmm15, %xmm11 + vpor %xmm3, %xmm10, %xmm12 + +/* Polynomial. 
*/ + vmulpd %ymm14, %ymm14, %ymm10 + vpor %xmm12, %xmm11, %xmm3 + vmovupd 320+__svml_datan2_data_internal(%rip), %ymm9 + vmovupd 384+__svml_datan2_data_internal(%rip), %ymm12 + vmovupd 448+__svml_datan2_data_internal(%rip), %ymm15 + vmulpd %ymm10, %ymm10, %ymm11 + +/* if x<0, dPI = Pi, else dPI =0 */ + vcmple_oqpd 1792+__svml_datan2_data_internal(%rip), %ymm8, %ymm13 + vmovmskps %xmm3, %eax + vmulpd %ymm11, %ymm11, %ymm0 + vandpd __svml_datan2_data_internal(%rip), %ymm13, %ymm6 + vmovupd 256+__svml_datan2_data_internal(%rip), %ymm13 + vfmadd213pd 576+__svml_datan2_data_internal(%rip), %ymm0, %ymm9 + vfmadd213pd 640+__svml_datan2_data_internal(%rip), %ymm0, %ymm12 + vfmadd213pd 704+__svml_datan2_data_internal(%rip), %ymm0, %ymm15 + vfmadd213pd 512+__svml_datan2_data_internal(%rip), %ymm0, %ymm13 + vfmadd213pd 832+__svml_datan2_data_internal(%rip), %ymm0, %ymm9 + vfmadd213pd 896+__svml_datan2_data_internal(%rip), %ymm0, %ymm12 + vfmadd213pd 960+__svml_datan2_data_internal(%rip), %ymm0, %ymm15 + vfmadd213pd 768+__svml_datan2_data_internal(%rip), %ymm0, %ymm13 + vfmadd213pd 1088+__svml_datan2_data_internal(%rip), %ymm0, %ymm9 + vfmadd213pd 1152+__svml_datan2_data_internal(%rip), %ymm0, %ymm12 + vfmadd213pd 1216+__svml_datan2_data_internal(%rip), %ymm0, %ymm15 + vfmadd213pd 1024+__svml_datan2_data_internal(%rip), %ymm0, %ymm13 + vfmadd213pd 1344+__svml_datan2_data_internal(%rip), %ymm0, %ymm9 + vfmadd213pd 1408+__svml_datan2_data_internal(%rip), %ymm0, %ymm12 + vfmadd213pd 1280+__svml_datan2_data_internal(%rip), %ymm0, %ymm13 + +/* A00=1.0, account for it later VQFMA(D, dP4, dP4, dR8, dA00); */ + vmulpd %ymm15, %ymm0, %ymm0 + vfmadd213pd %ymm9, %ymm10, %ymm13 + vfmadd213pd %ymm0, %ymm10, %ymm12 + vfmadd213pd %ymm12, %ymm11, %ymm13 + +/* + * Reconstruction. 
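+ * An illustrative C reading of the instruction sequence that follows
+ * (sign_x and sign_y are the argument-sign masks saved at entry, and
+ * or_sign ORs a sign bit in; descriptive names, not identifiers from
+ * this patch):
+ *
+ *   p = fma (p, r, r) + pio2;        vfmadd213pd, then vaddpd
+ *   p = or_sign (p, sign_x);         vorpd
+ *   p = p + (x <= 0.0 ? pi : 0.0);   vaddpd with the dPI mask
+ *   res = or_sign (p, sign_y);       vorpd
+ *
+ * In the code's shorthand the first step is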
+ * dP=(R+R*dP) + dPIO2 + */ + vfmadd213pd %ymm14, %ymm14, %ymm13 + vaddpd %ymm13, %ymm4, %ymm14 + vorpd %ymm5, %ymm14, %ymm0 + vaddpd %ymm0, %ymm6, %ymm9 + vorpd %ymm7, %ymm9, %ymm0 + +/* Special branch for fast (vector) processing of zero arguments */ + testl %eax, %eax + jne L(7) + +L(1): +/* + * Special branch for fast (vector) processing of zero arguments + * The end of implementation + */ + testl %edx, %edx + jne L(3) + +L(2): + vmovups 32(%rsp), %ymm8 + cfi_restore(91) + vmovups 96(%rsp), %ymm9 + cfi_restore(92) + vmovups 160(%rsp), %ymm10 + cfi_restore(93) + vmovups 224(%rsp), %ymm11 + cfi_restore(94) + vmovups 256(%rsp), %ymm12 + cfi_restore(95) + vmovups 288(%rsp), %ymm13 + cfi_restore(96) + vmovups 320(%rsp), %ymm14 + cfi_restore(97) + vmovups 352(%rsp), %ymm15 + cfi_restore(98) + movq %rbp, %rsp + popq %rbp + cfi_def_cfa(7, 8) + cfi_restore(6) + ret + cfi_def_cfa(6, 16) + cfi_offset(6, -16) + .cfi_escape 0x10, 0xdb, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xa0, 0xfe, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xdc, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xe0, 0xfe, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xdd, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x20, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xde, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x60, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xdf, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x80, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xe0, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xa0, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xe1, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xe2, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xe0, 0xff, 0xff, 0xff, 0x22 + +L(3): + vmovupd (%rsp), %ymm1 + vmovupd %ymm8, 128(%rsp) + vmovupd %ymm0, 192(%rsp) + vmovupd %ymm1, 64(%rsp) + je L(2) + xorl %eax, %eax + vzeroupper + movq %rsi, 8(%rsp) + movq %rdi, (%rsp) + movq %r12, 24(%rsp) + .cfi_escape 0x10, 0x04, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x88, 0xfe, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x05, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x80, 0xfe, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x0c, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x98, 0xfe, 0xff, 0xff, 0x22 + movl %eax, %r12d + movq %r13, 16(%rsp) + .cfi_escape 0x10, 0x0d, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x90, 0xfe, 0xff, 0xff, 0x22 + movl %edx, %r13d + +L(4): + btl %r12d, %r13d + jc L(6) + +L(5): + incl %r12d + cmpl $4, %r12d + jl L(4) + movq 8(%rsp), %rsi + cfi_restore(4) + movq (%rsp), %rdi + cfi_restore(5) + movq 24(%rsp), %r12 + cfi_restore(12) + movq 16(%rsp), %r13 + cfi_restore(13) + vmovupd 192(%rsp), %ymm0 + jmp L(2) + .cfi_escape 0x10, 0x04, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x88, 0xfe, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x05, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x80, 0xfe, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x0c, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x98, 0xfe, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x0d, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x90, 0xfe, 0xff, 0xff, 0x22 + +L(6): + lea 64(%rsp,%r12,8), %rdi + lea 128(%rsp,%r12,8), %rsi + lea 192(%rsp,%r12,8), %rdx + call __svml_datan2_cout_rare_internal + jmp L(5) + cfi_restore(4) + cfi_restore(5) + 
cfi_restore(12) + cfi_restore(13) + +L(7): + vmovupd (%rsp), %ymm11 + +/* Check if at least one of X or Y is zero: iAXAYZERO */ + vmovupd 1792+__svml_datan2_data_internal(%rip), %ymm10 + +/* Check if both X & Y are not NaNs: iXYnotNAN */ + vcmpordpd %ymm8, %ymm8, %ymm12 + vcmpordpd %ymm11, %ymm11, %ymm13 + vcmpeqpd %ymm10, %ymm2, %ymm2 + vcmpeqpd %ymm10, %ymm1, %ymm1 + vandpd %ymm13, %ymm12, %ymm14 + vorpd %ymm1, %ymm2, %ymm2 + vextractf128 $1, %ymm14, %xmm15 + vextractf128 $1, %ymm2, %xmm11 + vshufps $221, %xmm15, %xmm14, %xmm9 + vshufps $221, %xmm11, %xmm2, %xmm12 + +/* + * Path for zero arguments (at least one of the two) + * Check if both args are zeros (den. is zero) + */ + vcmpeqpd 64(%rsp), %ymm10, %ymm2 + +/* Check if at least one of X or Y is zero and not NaN: iAXAYZEROnotNAN */ + vpand %xmm9, %xmm12, %xmm1 + +/* Exclude from previous callout mask zero (and not NaN) arguments */ + vpandn %xmm3, %xmm1, %xmm3 + +/* Go to callout */ + vmovmskps %xmm3, %edx + +/* Set sPIO2 to zero if den. is zero */ + vblendvpd %ymm2, %ymm10, %ymm4, %ymm4 + vorpd %ymm5, %ymm4, %ymm5 + +/* Res = sign(Y)*(X<0)?(PIO2+PI):PIO2 */ + vextractf128 $1, %ymm10, %xmm2 + vextractf128 $1, %ymm8, %xmm3 + vshufps $221, %xmm2, %xmm10, %xmm4 + vshufps $221, %xmm3, %xmm8, %xmm9 + vpcmpgtd %xmm9, %xmm4, %xmm12 + vpshufd $80, %xmm12, %xmm11 + vpshufd $250, %xmm12, %xmm13 + vinsertf128 $1, %xmm13, %ymm11, %ymm14 + vandpd %ymm6, %ymm14, %ymm6 + vaddpd %ymm6, %ymm5, %ymm2 + vorpd %ymm7, %ymm2, %ymm2 + +/* Merge results from main and spec path */ + vpshufd $80, %xmm1, %xmm7 + vpshufd $250, %xmm1, %xmm1 + vinsertf128 $1, %xmm1, %ymm7, %ymm3 + vblendvpd %ymm3, %ymm2, %ymm0, %ymm0 + jmp L(1) + +END(_ZGVdN4vv_atan2_avx2) + + .align 16,0x90 + +__svml_datan2_cout_rare_internal: + + cfi_startproc + + movq %rdx, %rcx + movsd 1888+__datan2_la_CoutTab(%rip), %xmm1 + movsd (%rdi), %xmm2 + movsd (%rsi), %xmm0 + mulsd %xmm1, %xmm2 + mulsd %xmm0, %xmm1 + movsd %xmm2, -48(%rsp) + movsd %xmm1, -40(%rsp) + movzwl -42(%rsp), %r9d + andl $32752, %r9d + movb -33(%rsp), %al + movzwl -34(%rsp), %r8d + andb $-128, %al + andl $32752, %r8d + shrl $4, %r9d + movb -41(%rsp), %dl + shrb $7, %dl + shrb $7, %al + shrl $4, %r8d + cmpl $2047, %r9d + je L(31) + cmpl $2047, %r8d + je L(25) + testl %r9d, %r9d + jne L(8) + testl $1048575, -44(%rsp) + jne L(8) + cmpl $0, -48(%rsp) + je L(19) + +L(8): + testl %r8d, %r8d + jne L(9) + testl $1048575, -36(%rsp) + jne L(9) + cmpl $0, -40(%rsp) + je L(18) + +L(9): + negl %r8d + movsd %xmm2, -48(%rsp) + addl %r9d, %r8d + movsd %xmm1, -40(%rsp) + movb -41(%rsp), %dil + movb -33(%rsp), %sil + andb $127, %dil + andb $127, %sil + cmpl $-54, %r8d + jle L(16) + cmpl $54, %r8d + jge L(15) + movb %sil, -33(%rsp) + movb %dil, -41(%rsp) + testb %al, %al + jne L(10) + movsd 1976+__datan2_la_CoutTab(%rip), %xmm1 + movaps %xmm1, %xmm0 + jmp L(11) + +L(10): + movsd 1936+__datan2_la_CoutTab(%rip), %xmm1 + movsd 1944+__datan2_la_CoutTab(%rip), %xmm0 + +L(11): + movsd -48(%rsp), %xmm4 + movsd -40(%rsp), %xmm2 + movaps %xmm4, %xmm5 + divsd %xmm2, %xmm5 + movzwl -42(%rsp), %esi + movsd %xmm5, -16(%rsp) + testl %r9d, %r9d + jle L(24) + cmpl $2046, %r9d + jge L(12) + andl $-32753, %esi + addl $-1023, %r9d + movsd %xmm4, -48(%rsp) + addl $16368, %esi + movw %si, -42(%rsp) + jmp L(13) + +L(12): + movsd 1992+__datan2_la_CoutTab(%rip), %xmm3 + movl $1022, %r9d + mulsd %xmm3, %xmm4 + movsd %xmm4, -48(%rsp) + +L(13): + negl %r9d + addl $1023, %r9d + andl $2047, %r9d + movzwl 1894+__datan2_la_CoutTab(%rip), %esi + movsd
1888+__datan2_la_CoutTab(%rip), %xmm3 + andl $-32753, %esi + shll $4, %r9d + movsd %xmm3, -40(%rsp) + orl %r9d, %esi + movw %si, -34(%rsp) + movsd -40(%rsp), %xmm4 + mulsd %xmm4, %xmm2 + comisd 1880+__datan2_la_CoutTab(%rip), %xmm5 + jb L(14) + movsd 2000+__datan2_la_CoutTab(%rip), %xmm12 + movaps %xmm2, %xmm3 + mulsd %xmm2, %xmm12 + movsd %xmm12, -72(%rsp) + movsd -72(%rsp), %xmm13 + movsd %xmm5, -24(%rsp) + subsd %xmm2, %xmm13 + movsd %xmm13, -64(%rsp) + movsd -72(%rsp), %xmm15 + movsd -64(%rsp), %xmm14 + movl -20(%rsp), %r8d + movl %r8d, %r9d + andl $-524288, %r8d + andl $-1048576, %r9d + addl $262144, %r8d + subsd %xmm14, %xmm15 + movsd %xmm15, -72(%rsp) + andl $1048575, %r8d + movsd -72(%rsp), %xmm4 + orl %r8d, %r9d + movl $0, -24(%rsp) + subsd %xmm4, %xmm3 + movl %r9d, -20(%rsp) + movsd %xmm3, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -24(%rsp), %xmm11 + movsd -64(%rsp), %xmm9 + mulsd %xmm11, %xmm5 + mulsd %xmm11, %xmm9 + movsd 1968+__datan2_la_CoutTab(%rip), %xmm8 + mulsd %xmm8, %xmm5 + mulsd %xmm8, %xmm9 + movaps %xmm5, %xmm7 + movzwl -10(%rsp), %edi + addsd %xmm9, %xmm7 + movsd %xmm7, -72(%rsp) + andl $32752, %edi + movsd -72(%rsp), %xmm6 + shrl $4, %edi + subsd %xmm6, %xmm5 + movl -12(%rsp), %esi + addsd %xmm5, %xmm9 + movsd %xmm9, -64(%rsp) + andl $1048575, %esi + movsd -48(%rsp), %xmm9 + movsd -72(%rsp), %xmm3 + movaps %xmm9, %xmm12 + movsd -64(%rsp), %xmm10 + movaps %xmm9, %xmm14 + movaps %xmm9, %xmm6 + addsd %xmm3, %xmm12 + movsd %xmm12, -72(%rsp) + movsd -72(%rsp), %xmm13 + shll $20, %edi + subsd %xmm13, %xmm14 + movsd %xmm14, -64(%rsp) + orl %esi, %edi + movsd -72(%rsp), %xmm4 + addl $-1069547520, %edi + movsd -64(%rsp), %xmm15 + movl $113, %esi + movsd 2000+__datan2_la_CoutTab(%rip), %xmm13 + addsd %xmm15, %xmm4 + movsd %xmm4, -56(%rsp) + movsd -64(%rsp), %xmm8 + sarl $19, %edi + addsd %xmm3, %xmm8 + movsd %xmm8, -64(%rsp) + cmpl $113, %edi + movsd -56(%rsp), %xmm7 + cmovl %edi, %esi + subsd %xmm7, %xmm6 + movsd %xmm6, -56(%rsp) + addl %esi, %esi + movsd -64(%rsp), %xmm12 + lea __datan2_la_CoutTab(%rip), %rdi + movsd -56(%rsp), %xmm5 + movslq %esi, %rsi + addsd %xmm5, %xmm12 + movsd %xmm12, -56(%rsp) + movsd -72(%rsp), %xmm7 + mulsd %xmm7, %xmm13 + movsd -56(%rsp), %xmm8 + movsd %xmm13, -72(%rsp) + addsd %xmm10, %xmm8 + movsd -72(%rsp), %xmm4 + movaps %xmm9, %xmm10 + mulsd 2000+__datan2_la_CoutTab(%rip), %xmm10 + subsd %xmm7, %xmm4 + movsd %xmm4, -64(%rsp) + movsd -72(%rsp), %xmm3 + movsd -64(%rsp), %xmm14 + subsd %xmm14, %xmm3 + movsd %xmm3, -72(%rsp) + movsd -72(%rsp), %xmm15 + subsd %xmm15, %xmm7 + movsd %xmm7, -64(%rsp) + movsd -72(%rsp), %xmm7 + movsd -64(%rsp), %xmm4 + movsd %xmm10, -72(%rsp) + movaps %xmm2, %xmm10 + addsd %xmm4, %xmm8 + movsd -72(%rsp), %xmm4 + subsd -48(%rsp), %xmm4 + movsd %xmm4, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm3 + subsd %xmm3, %xmm6 + movaps %xmm2, %xmm3 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm5 + subsd %xmm5, %xmm9 + movsd %xmm9, -64(%rsp) + movsd -72(%rsp), %xmm12 + movsd -64(%rsp), %xmm9 + mulsd %xmm11, %xmm12 + mulsd %xmm11, %xmm9 + movaps %xmm12, %xmm11 + addsd %xmm9, %xmm11 + movsd %xmm11, -72(%rsp) + movsd -72(%rsp), %xmm4 + subsd %xmm4, %xmm12 + addsd %xmm9, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -72(%rsp), %xmm15 + movsd -64(%rsp), %xmm6 + addsd %xmm15, %xmm3 + movsd %xmm3, -72(%rsp) + movsd -72(%rsp), %xmm5 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm3 + subsd %xmm5, %xmm10 + movsd %xmm10, -64(%rsp) + movsd -72(%rsp), %xmm13 + movsd -64(%rsp), %xmm11 + addsd %xmm11, %xmm13 + movsd %xmm13, -56(%rsp) + 
movsd -64(%rsp), %xmm14 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm13 + addsd %xmm14, %xmm15 + movsd %xmm15, -64(%rsp) + movsd -56(%rsp), %xmm4 + movsd 1888+__datan2_la_CoutTab(%rip), %xmm14 + subsd %xmm4, %xmm2 + movsd %xmm2, -56(%rsp) + movsd -64(%rsp), %xmm4 + movsd -56(%rsp), %xmm2 + addsd %xmm2, %xmm4 + movsd %xmm4, -56(%rsp) + movsd -72(%rsp), %xmm12 + mulsd %xmm12, %xmm3 + movsd -56(%rsp), %xmm5 + movsd %xmm3, -72(%rsp) + addsd %xmm6, %xmm5 + movsd -72(%rsp), %xmm9 + subsd %xmm12, %xmm9 + movsd %xmm9, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm2 + subsd %xmm2, %xmm10 + movsd %xmm10, -72(%rsp) + movsd -72(%rsp), %xmm11 + subsd %xmm11, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -72(%rsp), %xmm9 + divsd %xmm9, %xmm14 + mulsd %xmm14, %xmm13 + movsd -64(%rsp), %xmm10 + movsd %xmm13, -64(%rsp) + addsd %xmm10, %xmm5 + movsd -64(%rsp), %xmm15 + movsd 1888+__datan2_la_CoutTab(%rip), %xmm12 + subsd %xmm14, %xmm15 + movsd %xmm15, -56(%rsp) + movsd -64(%rsp), %xmm2 + movsd -56(%rsp), %xmm4 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm13 + subsd %xmm4, %xmm2 + movsd %xmm2, -56(%rsp) + movsd -56(%rsp), %xmm3 + mulsd %xmm3, %xmm9 + movsd -56(%rsp), %xmm11 + subsd %xmm9, %xmm12 + mulsd %xmm11, %xmm5 + movsd %xmm5, -64(%rsp) + movsd -64(%rsp), %xmm5 + subsd %xmm5, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -64(%rsp), %xmm2 + movq -56(%rsp), %r10 + movsd -64(%rsp), %xmm6 + movsd -56(%rsp), %xmm4 + movq %r10, -40(%rsp) + movsd -40(%rsp), %xmm3 + movaps %xmm3, %xmm5 + addsd 1888+__datan2_la_CoutTab(%rip), %xmm2 + mulsd %xmm7, %xmm5 + mulsd %xmm6, %xmm2 + mulsd %xmm4, %xmm2 + mulsd %xmm2, %xmm7 + mulsd %xmm8, %xmm2 + mulsd %xmm3, %xmm8 + addsd %xmm2, %xmm7 + movsd 1872+__datan2_la_CoutTab(%rip), %xmm3 + addsd %xmm8, %xmm7 + movsd %xmm7, -72(%rsp) + movaps %xmm5, %xmm7 + movsd -72(%rsp), %xmm4 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm6 + addsd %xmm4, %xmm7 + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm8 + subsd %xmm8, %xmm5 + addsd %xmm4, %xmm5 + movsd %xmm5, -64(%rsp) + movsd -72(%rsp), %xmm11 + movaps %xmm11, %xmm2 + mulsd %xmm11, %xmm2 + mulsd %xmm11, %xmm6 + mulsd %xmm2, %xmm3 + movsd -64(%rsp), %xmm4 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm7 + addsd 1864+__datan2_la_CoutTab(%rip), %xmm3 + subsd %xmm11, %xmm7 + mulsd %xmm2, %xmm3 + movsd %xmm7, -64(%rsp) + movsd -72(%rsp), %xmm9 + movsd -64(%rsp), %xmm8 + addsd 1856+__datan2_la_CoutTab(%rip), %xmm3 + subsd %xmm8, %xmm9 + mulsd %xmm2, %xmm3 + movsd %xmm9, -72(%rsp) + movsd -72(%rsp), %xmm10 + addsd 1848+__datan2_la_CoutTab(%rip), %xmm3 + subsd %xmm10, %xmm11 + mulsd %xmm2, %xmm3 + movsd %xmm11, -64(%rsp) + addsd 1840+__datan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + addsd 1832+__datan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + addsd 1824+__datan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + mulsd %xmm3, %xmm13 + movsd -72(%rsp), %xmm2 + movsd -64(%rsp), %xmm12 + movsd %xmm13, -72(%rsp) + addsd %xmm12, %xmm4 + movsd -72(%rsp), %xmm14 + subsd %xmm3, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm15 + subsd %xmm15, %xmm5 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm6 + subsd %xmm6, %xmm3 + movsd %xmm3, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm5 + movaps %xmm6, %xmm12 + movaps %xmm5, %xmm3 + mulsd %xmm4, %xmm6 + mulsd %xmm4, %xmm3 + mulsd %xmm2, %xmm5 + mulsd %xmm2, %xmm12 + addsd %xmm3, %xmm6 + movaps %xmm12, %xmm7 + movaps %xmm12, %xmm8 + addsd %xmm5, %xmm6 + addsd %xmm2, %xmm7 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm5 + movsd %xmm7, -72(%rsp) + movsd 
-72(%rsp), %xmm3 + subsd %xmm3, %xmm8 + movsd %xmm8, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm9 + addsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movsd -64(%rsp), %xmm11 + addsd %xmm11, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -56(%rsp), %xmm2 + subsd %xmm2, %xmm12 + movsd %xmm12, -56(%rsp) + movsd -64(%rsp), %xmm14 + movsd -56(%rsp), %xmm13 + addsd %xmm13, %xmm14 + movsd %xmm14, -56(%rsp) + movq -72(%rsp), %r11 + movsd -56(%rsp), %xmm15 + movq %r11, -40(%rsp) + addsd %xmm15, %xmm4 + movsd -40(%rsp), %xmm8 + addsd %xmm5, %xmm4 + movsd %xmm4, -32(%rsp) + movaps %xmm8, %xmm4 + movaps %xmm8, %xmm2 + addsd (%rdi,%rsi,8), %xmm4 + movsd %xmm4, -72(%rsp) + movsd -72(%rsp), %xmm4 + subsd %xmm4, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm3 + addsd %xmm3, %xmm5 + movsd %xmm5, -56(%rsp) + movsd -64(%rsp), %xmm6 + addsd (%rdi,%rsi,8), %xmm6 + movsd %xmm6, -64(%rsp) + movsd -56(%rsp), %xmm7 + subsd %xmm7, %xmm8 + movsd %xmm8, -56(%rsp) + movsd -64(%rsp), %xmm10 + movsd -56(%rsp), %xmm9 + addsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movq -72(%rsp), %r8 + movq %r8, -40(%rsp) + movsd -56(%rsp), %xmm2 + movaps %xmm1, %xmm3 + shrq $56, %r8 + addsd -32(%rsp), %xmm2 + shlb $7, %dl + addsd 8(%rdi,%rsi,8), %xmm2 + movb %al, %sil + andb $127, %r8b + shlb $7, %sil + movsd %xmm2, -32(%rsp) + orb %sil, %r8b + movb %r8b, -33(%rsp) + movsd -40(%rsp), %xmm9 + movaps %xmm9, %xmm5 + addsd %xmm9, %xmm3 + movsd %xmm3, -72(%rsp) + movsd -72(%rsp), %xmm4 + movb -25(%rsp), %dil + movb %dil, %r9b + shrb $7, %dil + subsd %xmm4, %xmm5 + movsd %xmm5, -64(%rsp) + movsd -72(%rsp), %xmm7 + movsd -64(%rsp), %xmm6 + xorb %dil, %al + andb $127, %r9b + shlb $7, %al + addsd %xmm6, %xmm7 + movsd %xmm7, -56(%rsp) + movsd -64(%rsp), %xmm8 + addsd %xmm8, %xmm1 + movsd %xmm1, -64(%rsp) + orb %al, %r9b + movsd -56(%rsp), %xmm1 + movb %r9b, -25(%rsp) + subsd %xmm1, %xmm9 + movsd %xmm9, -56(%rsp) + movsd -64(%rsp), %xmm11 + movsd -56(%rsp), %xmm10 + addsd %xmm10, %xmm11 + movsd %xmm11, -56(%rsp) + movq -72(%rsp), %rax + movsd -56(%rsp), %xmm12 + movq %rax, -40(%rsp) + addsd %xmm12, %xmm0 + movsd -40(%rsp), %xmm13 + addsd -32(%rsp), %xmm0 + movsd %xmm0, -32(%rsp) + addsd %xmm0, %xmm13 + movsd %xmm13, -24(%rsp) + movb -17(%rsp), %r10b + andb $127, %r10b + orb %dl, %r10b + movb %r10b, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp L(23) + +L(14): + movsd -48(%rsp), %xmm12 + movb %al, %r8b + movaps %xmm12, %xmm7 + mulsd 2000+__datan2_la_CoutTab(%rip), %xmm7 + shlb $7, %r8b + shlb $7, %dl + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm8 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm13 + movsd 1888+__datan2_la_CoutTab(%rip), %xmm7 + mulsd %xmm2, %xmm13 + subsd -48(%rsp), %xmm8 + movsd %xmm8, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm9 + subsd %xmm9, %xmm10 + movsd %xmm10, -72(%rsp) + movsd -72(%rsp), %xmm11 + subsd %xmm11, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm5 + movsd %xmm13, -72(%rsp) + movsd -72(%rsp), %xmm14 + subsd %xmm2, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -72(%rsp), %xmm4 + movsd -64(%rsp), %xmm15 + subsd %xmm15, %xmm4 + movsd %xmm4, -72(%rsp) + movsd -72(%rsp), %xmm3 + movsd 1888+__datan2_la_CoutTab(%rip), %xmm4 + subsd %xmm3, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -72(%rsp), %xmm12 + divsd %xmm12, %xmm7 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm2 + mulsd %xmm7, %xmm2 + movsd -64(%rsp), %xmm14 + movsd %xmm2, -64(%rsp) + movsd -64(%rsp), %xmm8 + subsd %xmm7, %xmm8 + movsd %xmm8, -56(%rsp) + movsd -64(%rsp), 
%xmm10 + movsd -56(%rsp), %xmm9 + subsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movsd -56(%rsp), %xmm11 + mulsd %xmm11, %xmm12 + movsd -56(%rsp), %xmm13 + subsd %xmm12, %xmm4 + mulsd %xmm13, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -64(%rsp), %xmm15 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm13 + subsd %xmm15, %xmm4 + movsd %xmm4, -64(%rsp) + movsd -64(%rsp), %xmm7 + movq -56(%rsp), %rsi + movsd -64(%rsp), %xmm2 + movsd -56(%rsp), %xmm3 + movq %rsi, -40(%rsp) + movsd -40(%rsp), %xmm8 + movaps %xmm8, %xmm9 + addsd 1888+__datan2_la_CoutTab(%rip), %xmm7 + mulsd %xmm6, %xmm9 + mulsd %xmm5, %xmm8 + mulsd %xmm2, %xmm7 + movsd -16(%rsp), %xmm2 + mulsd %xmm2, %xmm2 + mulsd %xmm3, %xmm7 + movsd 1872+__datan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + mulsd %xmm7, %xmm6 + mulsd %xmm5, %xmm7 + addsd 1864+__datan2_la_CoutTab(%rip), %xmm3 + addsd %xmm7, %xmm6 + mulsd %xmm2, %xmm3 + addsd %xmm8, %xmm6 + addsd 1856+__datan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + movaps %xmm9, %xmm5 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm4 + addsd 1848+__datan2_la_CoutTab(%rip), %xmm3 + addsd %xmm4, %xmm5 + mulsd %xmm2, %xmm3 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm6 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm5 + subsd %xmm6, %xmm9 + addsd 1840+__datan2_la_CoutTab(%rip), %xmm3 + addsd %xmm4, %xmm9 + mulsd %xmm2, %xmm3 + movsd %xmm9, -64(%rsp) + movsd -72(%rsp), %xmm11 + mulsd %xmm11, %xmm5 + addsd 1832+__datan2_la_CoutTab(%rip), %xmm3 + movsd -64(%rsp), %xmm4 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm7 + mulsd %xmm2, %xmm3 + subsd %xmm11, %xmm7 + movsd %xmm7, -64(%rsp) + movsd -72(%rsp), %xmm8 + movsd -64(%rsp), %xmm6 + addsd 1824+__datan2_la_CoutTab(%rip), %xmm3 + subsd %xmm6, %xmm8 + mulsd %xmm2, %xmm3 + movsd %xmm8, -72(%rsp) + movsd -72(%rsp), %xmm10 + mulsd %xmm3, %xmm13 + subsd %xmm10, %xmm11 + movsd %xmm11, -64(%rsp) + movsd -72(%rsp), %xmm2 + movsd -64(%rsp), %xmm12 + movsd %xmm13, -72(%rsp) + addsd %xmm12, %xmm4 + movsd -72(%rsp), %xmm14 + subsd %xmm3, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm15 + subsd %xmm15, %xmm5 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm6 + subsd %xmm6, %xmm3 + movsd %xmm3, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm5 + movaps %xmm6, %xmm12 + movaps %xmm5, %xmm3 + mulsd %xmm4, %xmm6 + mulsd %xmm4, %xmm3 + mulsd %xmm2, %xmm5 + mulsd %xmm2, %xmm12 + addsd %xmm3, %xmm6 + movaps %xmm12, %xmm7 + movaps %xmm12, %xmm8 + addsd %xmm5, %xmm6 + addsd %xmm2, %xmm7 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm5 + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm3 + subsd %xmm3, %xmm8 + movsd %xmm8, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm9 + addsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movsd -64(%rsp), %xmm11 + addsd %xmm11, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -56(%rsp), %xmm2 + subsd %xmm2, %xmm12 + movsd %xmm12, -56(%rsp) + movsd -64(%rsp), %xmm14 + movsd -56(%rsp), %xmm13 + addsd %xmm13, %xmm14 + movsd %xmm14, -56(%rsp) + movq -72(%rsp), %rdi + movsd -56(%rsp), %xmm15 + movq %rdi, -40(%rsp) + addsd %xmm15, %xmm4 + shrq $56, %rdi + addsd %xmm5, %xmm4 + andb $127, %dil + orb %r8b, %dil + movb %dil, -33(%rsp) + movsd %xmm4, -32(%rsp) + movaps %xmm1, %xmm4 + movsd -40(%rsp), %xmm7 + movaps %xmm7, %xmm2 + addsd %xmm7, %xmm4 + movsd %xmm4, -72(%rsp) + movsd -72(%rsp), %xmm4 + movb -25(%rsp), %r9b + movb %r9b, %r10b + shrb $7, %r9b + subsd %xmm4, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm3 + xorb %r9b, %al + andb $127, %r10b + shlb $7, %al + addsd 
%xmm3, %xmm5 + movsd %xmm5, -56(%rsp) + movsd -64(%rsp), %xmm6 + addsd %xmm6, %xmm1 + movsd %xmm1, -64(%rsp) + orb %al, %r10b + movsd -56(%rsp), %xmm1 + movb %r10b, -25(%rsp) + subsd %xmm1, %xmm7 + movsd %xmm7, -56(%rsp) + movsd -64(%rsp), %xmm2 + movsd -56(%rsp), %xmm1 + addsd %xmm1, %xmm2 + movsd %xmm2, -56(%rsp) + movq -72(%rsp), %rax + movsd -56(%rsp), %xmm3 + movq %rax, -40(%rsp) + addsd %xmm3, %xmm0 + movsd -40(%rsp), %xmm4 + addsd -32(%rsp), %xmm0 + movsd %xmm0, -32(%rsp) + addsd %xmm0, %xmm4 + movsd %xmm4, -24(%rsp) + movb -17(%rsp), %r11b + andb $127, %r11b + orb %dl, %r11b + movb %r11b, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp L(23) + +L(15): + cmpl $74, %r8d + jge L(32) + movb %dil, -41(%rsp) + divsd -48(%rsp), %xmm1 + movsd 1928+__datan2_la_CoutTab(%rip), %xmm0 + shlb $7, %dl + subsd %xmm1, %xmm0 + addsd 1920+__datan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp L(23) + +L(16): + testb %al, %al + jne L(22) + movb %dil, -41(%rsp) + movb %sil, -33(%rsp) + movsd -48(%rsp), %xmm2 + divsd -40(%rsp), %xmm2 + movsd %xmm2, -24(%rsp) + movzwl -18(%rsp), %eax + testl $32752, %eax + je L(17) + movsd 1888+__datan2_la_CoutTab(%rip), %xmm0 + shlb $7, %dl + addsd %xmm2, %xmm0 + movsd %xmm0, -72(%rsp) + movsd -72(%rsp), %xmm1 + mulsd %xmm1, %xmm2 + movsd %xmm2, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp L(23) + +L(17): + mulsd %xmm2, %xmm2 + shlb $7, %dl + movsd %xmm2, -72(%rsp) + movsd -72(%rsp), %xmm0 + addsd -24(%rsp), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp L(23) + +L(18): + testl %r9d, %r9d + jne L(32) + testl $1048575, -44(%rsp) + jne L(32) + jmp L(33) + +L(19): + jne L(32) + +L(20): + testb %al, %al + jne L(22) + +L(21): + shlb $7, %dl + movq 1976+__datan2_la_CoutTab(%rip), %rax + movq %rax, -24(%rsp) + shrq $56, %rax + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp L(23) + +L(22): + movsd 1936+__datan2_la_CoutTab(%rip), %xmm0 + shlb $7, %dl + addsd 1944+__datan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + +L(23): + xorl %eax, %eax + ret + +L(24): + movsd 1984+__datan2_la_CoutTab(%rip), %xmm3 + movl $-1022, %r9d + mulsd %xmm3, %xmm4 + movsd %xmm4, -48(%rsp) + jmp L(13) + +L(25): + cmpl $2047, %r9d + je L(31) + +L(26): + testl $1048575, -36(%rsp) + jne L(27) + cmpl $0, -40(%rsp) + je L(28) + +L(27): + addsd %xmm1, %xmm2 + movsd %xmm2, (%rcx) + jmp L(23) + +L(28): + cmpl $2047, %r9d + je L(29) + testb %al, %al + je L(21) + jmp L(22) + +L(29): + testb %al, %al + jne L(30) + movsd 1904+__datan2_la_CoutTab(%rip), %xmm0 + shlb $7, %dl + addsd 1912+__datan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp L(23) + +L(30): + movsd 1952+__datan2_la_CoutTab(%rip), %xmm0 + shlb $7, %dl + addsd 1960+__datan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp L(23) + +L(31): + testl $1048575, -44(%rsp) + jne L(27) + cmpl $0, 
-48(%rsp) + jne L(27) + cmpl $2047, %r8d + je L(26) + +L(32): + movsd 1920+__datan2_la_CoutTab(%rip), %xmm0 + shlb $7, %dl + addsd 1928+__datan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp L(23) + +L(33): + cmpl $0, -48(%rsp) + jne L(32) + jmp L(20) + + cfi_endproc + + .type __svml_datan2_cout_rare_internal,@function + .size __svml_datan2_cout_rare_internal,.-__svml_datan2_cout_rare_internal + + .section .rodata, "a" + .align 64 + +__svml_datan2_data_internal: + .long 1413754136 + .long 1074340347 + .long 1413754136 + .long 1074340347 + .long 1413754136 + .long 1074340347 + .long 1413754136 + .long 1074340347 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1413754136 + .long 1073291771 + .long 1413754136 + .long 1073291771 + .long 1413754136 + .long 1073291771 + .long 1413754136 + .long 1073291771 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 17919630 + .long 3202334474 + .long 17919630 + .long 3202334474 + .long 17919630 + .long 3202334474 + .long 17919630 + .long 3202334474 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 350522012 + .long 1058555694 + .long 350522012 + .long 1058555694 + .long 350522012 + .long 1058555694 + .long 350522012 + .long 1058555694 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 934004643 + .long 3203726773 + .long 934004643 + .long 3203726773 + .long 934004643 + .long 3203726773 + .long 934004643 + .long 3203726773 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 912675337 + .long 1059908874 + .long 912675337 + .long 1059908874 + .long 912675337 + .long 1059908874 + .long 912675337 + .long 1059908874 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 2476035107 + .long 3209881212 + .long 2476035107 + .long 3209881212 + .long 2476035107 + .long 3209881212 + .long 2476035107 + .long 
3209881212 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 2927800243 + .long 1064262173 + .long 2927800243 + .long 1064262173 + .long 2927800243 + .long 1064262173 + .long 2927800243 + .long 1064262173 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1636715437 + .long 3213013740 + .long 1636715437 + .long 3213013740 + .long 1636715437 + .long 3213013740 + .long 1636715437 + .long 3213013740 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1712395941 + .long 1066487628 + .long 1712395941 + .long 1066487628 + .long 1712395941 + .long 1066487628 + .long 1712395941 + .long 1066487628 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 2961307292 + .long 3214564995 + .long 2961307292 + .long 3214564995 + .long 2961307292 + .long 3214564995 + .long 2961307292 + .long 3214564995 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 213298511 + .long 1067542936 + .long 213298511 + .long 1067542936 + .long 213298511 + .long 1067542936 + .long 213298511 + .long 1067542936 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 3848520124 + .long 3215257506 + .long 3848520124 + .long 3215257506 + .long 3848520124 + .long 3215257506 + .long 3848520124 + .long 3215257506 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 3159386171 + .long 1067969551 + .long 3159386171 + .long 1067969551 + .long 3159386171 + .long 1067969551 + .long 3159386171 + .long 1067969551 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + 
.byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 3936393556 + .long 3215643233 + .long 3936393556 + .long 3215643233 + .long 3936393556 + .long 3215643233 + .long 3936393556 + .long 3215643233 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 3177262543 + .long 1068373833 + .long 3177262543 + .long 1068373833 + .long 3177262543 + .long 1068373833 + .long 3177262543 + .long 1068373833 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 9713120 + .long 3216052356 + .long 9713120 + .long 3216052356 + .long 9713120 + .long 3216052356 + .long 9713120 + .long 3216052356 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1227445841 + .long 1068740906 + .long 1227445841 + .long 1068740906 + .long 1227445841 + .long 1068740906 + .long 1227445841 + .long 1068740906 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 163240596 + .long 3216459216 + .long 163240596 + .long 3216459216 + .long 163240596 + .long 3216459216 + .long 163240596 + .long 3216459216 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 133682613 + .long 1069314503 + .long 133682613 + .long 1069314503 + .long 133682613 + .long 1069314503 + .long 133682613 + .long 1069314503 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 2448315847 + .long 3217180964 + .long 2448315847 + .long 3217180964 + .long 2448315847 + .long 3217180964 + .long 2448315847 + .long 3217180964 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 2576870964 + .long 1070176665 + .long 2576870964 + .long 1070176665 + .long 2576870964 + .long 1070176665 + .long 2576870964 + .long 1070176665 + .byte 0 + .byte 0 + .byte 0 + .byte 0 
+ .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1431655365 + .long 3218429269 + .long 1431655365 + .long 3218429269 + .long 1431655365 + .long 3218429269 + .long 1431655365 + .long 3218429269 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 0 + .long 2147483648 + .long 0 + .long 2147483648 + .long 0 + .long 2147483648 + .long 0 + .long 2147483648 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 2150629376 + .long 2150629376 + .long 2150629376 + .long 2150629376 + .long 2150629376 + .long 2150629376 + .long 2150629376 + .long 2150629376 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 4258267136 + .long 4258267136 + .long 4258267136 + .long 4258267136 + .long 4258267136 + .long 4258267136 + .long 4258267136 + .long 4258267136 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 4294967295 + .long 2147483647 + .long 4294967295 + .long 2147483647 + .long 4294967295 + .long 2147483647 + .long 4294967295 + .long 2147483647 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .long 4293918720 
+ .long 4293918720 + .long 4293918720 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 8388607 + .long 8388607 + .long 8388607 + .long 8388607 + .long 8388607 + .long 8388607 + .long 8388607 + .long 8388607 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 133169152 + .long 133169152 + .long 133169152 + .long 133169152 + .long 133169152 + .long 133169152 + .long 133169152 + .long 133169152 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 0 + .long 4294967295 + .long 0 + .long 4294967295 + .long 0 + .long 4294967295 + .long 0 + .long 4294967295 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .type __svml_datan2_data_internal,@object + .size __svml_datan2_data_internal,2304 + .align 32 + +__datan2_la_CoutTab: + .long 3892314112 + .long 1069799150 + .long 2332892550 + .long 1039715405 + .long 1342177280 + .long 1070305495 + .long 270726690 + .long 1041535749 + .long 939524096 + .long 1070817911 + .long 2253973841 + .long 3188654726 + .long 3221225472 + .long 1071277294 + .long 3853927037 + .long 1043226911 + .long 2818572288 + .long 1071767563 + .long 
2677759107 + .long 1044314101 + .long 3355443200 + .long 1072103591 + .long 1636578514 + .long 3191094734 + .long 1476395008 + .long 1072475260 + .long 1864703685 + .long 3188646936 + .long 805306368 + .long 1072747407 + .long 192551812 + .long 3192726267 + .long 2013265920 + .long 1072892781 + .long 2240369452 + .long 1043768538 + .long 0 + .long 1072999953 + .long 3665168337 + .long 3192705970 + .long 402653184 + .long 1073084787 + .long 1227953434 + .long 3192313277 + .long 2013265920 + .long 1073142981 + .long 3853283127 + .long 1045277487 + .long 805306368 + .long 1073187261 + .long 1676192264 + .long 3192868861 + .long 134217728 + .long 1073217000 + .long 4290763938 + .long 1042034855 + .long 671088640 + .long 1073239386 + .long 994303084 + .long 3189643768 + .long 402653184 + .long 1073254338 + .long 1878067156 + .long 1042652475 + .long 1610612736 + .long 1073265562 + .long 670314820 + .long 1045138554 + .long 3221225472 + .long 1073273048 + .long 691126919 + .long 3189987794 + .long 3489660928 + .long 1073278664 + .long 1618990832 + .long 3188194509 + .long 1207959552 + .long 1073282409 + .long 2198872939 + .long 1044806069 + .long 3489660928 + .long 1073285217 + .long 2633982383 + .long 1042307894 + .long 939524096 + .long 1073287090 + .long 1059367786 + .long 3189114230 + .long 2281701376 + .long 1073288494 + .long 3158525533 + .long 1044484961 + .long 3221225472 + .long 1073289430 + .long 286581777 + .long 1044893263 + .long 4026531840 + .long 1073290132 + .long 2000245215 + .long 3191647611 + .long 134217728 + .long 1073290601 + .long 4205071590 + .long 1045035927 + .long 536870912 + .long 1073290952 + .long 2334392229 + .long 1043447393 + .long 805306368 + .long 1073291186 + .long 2281458177 + .long 3188885569 + .long 3087007744 + .long 1073291361 + .long 691611507 + .long 1044733832 + .long 3221225472 + .long 1073291478 + .long 1816229550 + .long 1044363390 + .long 2281701376 + .long 1073291566 + .long 1993843750 + .long 3189837440 + .long 134217728 + .long 1073291625 + .long 3654754496 + .long 1044970837 + .long 4026531840 + .long 1073291668 + .long 3224300229 + .long 3191935390 + .long 805306368 + .long 1073291698 + .long 2988777976 + .long 3188950659 + .long 536870912 + .long 1073291720 + .long 1030371341 + .long 1043402665 + .long 3221225472 + .long 1073291734 + .long 1524463765 + .long 1044361356 + .long 3087007744 + .long 1073291745 + .long 2754295320 + .long 1044731036 + .long 134217728 + .long 1073291753 + .long 3099629057 + .long 1044970710 + .long 2281701376 + .long 1073291758 + .long 962914160 + .long 3189838838 + .long 805306368 + .long 1073291762 + .long 3543908206 + .long 3188950786 + .long 4026531840 + .long 1073291764 + .long 1849909620 + .long 3191935434 + .long 3221225472 + .long 1073291766 + .long 1641333636 + .long 1044361352 + .long 536870912 + .long 1073291768 + .long 1373968792 + .long 1043402654 + .long 134217728 + .long 1073291769 + .long 2033191599 + .long 1044970710 + .long 3087007744 + .long 1073291769 + .long 4117947437 + .long 1044731035 + .long 805306368 + .long 1073291770 + .long 315378368 + .long 3188950787 + .long 2281701376 + .long 1073291770 + .long 2428571750 + .long 3189838838 + .long 3221225472 + .long 1073291770 + .long 1608007466 + .long 1044361352 + .long 4026531840 + .long 1073291770 + .long 1895711420 + .long 3191935434 + .long 134217728 + .long 1073291771 + .long 2031108713 + .long 1044970710 + .long 536870912 + .long 1073291771 + .long 1362518342 + .long 1043402654 + .long 805306368 + .long 1073291771 + .long 317461253 + .long 
3188950787 + .long 939524096 + .long 1073291771 + .long 4117231784 + .long 1044731035 + .long 1073741824 + .long 1073291771 + .long 1607942376 + .long 1044361352 + .long 1207959552 + .long 1073291771 + .long 2428929577 + .long 3189838838 + .long 1207959552 + .long 1073291771 + .long 2031104645 + .long 1044970710 + .long 1342177280 + .long 1073291771 + .long 1895722602 + .long 3191935434 + .long 1342177280 + .long 1073291771 + .long 317465322 + .long 3188950787 + .long 1342177280 + .long 1073291771 + .long 1362515546 + .long 1043402654 + .long 1342177280 + .long 1073291771 + .long 1607942248 + .long 1044361352 + .long 1342177280 + .long 1073291771 + .long 4117231610 + .long 1044731035 + .long 1342177280 + .long 1073291771 + .long 2031104637 + .long 1044970710 + .long 1342177280 + .long 1073291771 + .long 1540251232 + .long 1045150466 + .long 1342177280 + .long 1073291771 + .long 2644671394 + .long 1045270303 + .long 1342177280 + .long 1073291771 + .long 2399244691 + .long 1045360181 + .long 1342177280 + .long 1073291771 + .long 803971124 + .long 1045420100 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192879152 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192849193 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192826724 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192811744 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192800509 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192793019 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192787402 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192783657 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192780848 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192778976 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192777572 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192776635 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192775933 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192775465 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192775114 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774880 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192774704 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774587 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192774500 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774441 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192774397 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774368 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192774346 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774331 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192774320 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774313 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192774308 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774304 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192774301 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774299 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192774298 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774297 + .long 1476395008 + .long 1073291771 + .long 
3613709523 + .long 3192774296 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774296 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192774296 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1466225875 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1343512524 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1251477510 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1190120835 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1144103328 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1113424990 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1090416237 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1075077068 + .long 3192774295 + .long 1431655765 + .long 3218429269 + .long 2576978363 + .long 1070176665 + .long 2453154343 + .long 3217180964 + .long 4189149139 + .long 1069314502 + .long 1775019125 + .long 3216459198 + .long 273199057 + .long 1068739452 + .long 874748308 + .long 3215993277 + .long 0 + .long 1069547520 + .long 0 + .long 1072693248 + .long 0 + .long 1073741824 + .long 1413754136 + .long 1072243195 + .long 856972295 + .long 1015129638 + .long 1413754136 + .long 1073291771 + .long 856972295 + .long 1016178214 + .long 1413754136 + .long 1074340347 + .long 856972295 + .long 1017226790 + .long 2134057426 + .long 1073928572 + .long 1285458442 + .long 1016756537 + .long 0 + .long 3220176896 + .long 0 + .long 0 + .long 0 + .long 2144337920 + .long 0 + .long 1048576 + .long 33554432 + .long 1101004800 + .type __datan2_la_CoutTab,@object + .size __datan2_la_CoutTab,2008 diff --git a/sysdeps/x86_64/fpu/multiarch/svml_d_atan28_core-avx2.S b/sysdeps/x86_64/fpu/multiarch/svml_d_atan28_core-avx2.S new file mode 100644 index 0000000000..a8d34a6143 --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/svml_d_atan28_core-avx2.S @@ -0,0 +1,20 @@ +/* AVX2 version of vectorized atan2. + Copyright (C) 2021 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + . */ + +#define _ZGVeN8vv_atan2 _ZGVeN8vv_atan2_avx2_wrapper +#include "../svml_d_atan28_core.S" diff --git a/sysdeps/x86_64/fpu/multiarch/svml_d_atan28_core.c b/sysdeps/x86_64/fpu/multiarch/svml_d_atan28_core.c new file mode 100644 index 0000000000..a0897e9cf0 --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/svml_d_atan28_core.c @@ -0,0 +1,28 @@ +/* Multiple versions of vectorized atan2, vector length is 8. + Copyright (C) 2021 Free Software Foundation, Inc. 
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#define SYMBOL_NAME _ZGVeN8vv_atan2
+#include "ifunc-mathvec-avx512-skx.h"
+
+libc_ifunc_redirected (REDIRECT_NAME, SYMBOL_NAME, IFUNC_SELECTOR ());
+
+#ifdef SHARED
+__hidden_ver1 (_ZGVeN8vv_atan2, __GI__ZGVeN8vv_atan2,
+	       __redirect__ZGVeN8vv_atan2)
+  __attribute__ ((visibility ("hidden")));
+#endif
diff --git a/sysdeps/x86_64/fpu/multiarch/svml_d_atan28_core_avx512.S b/sysdeps/x86_64/fpu/multiarch/svml_d_atan28_core_avx512.S
new file mode 100644
index 0000000000..e14a5ab255
--- /dev/null
+++ b/sysdeps/x86_64/fpu/multiarch/svml_d_atan28_core_avx512.S
@@ -0,0 +1,2311 @@
+/* Function atan2 vectorized with AVX-512.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+/*
+ * ALGORITHM DESCRIPTION:
+ *      For    0.0    <= x <=  7.0/16.0: atan(x) = atan(0.0) + atan(s), where s=(x-0.0)/(1.0+0.0*x)
+ *      For  7.0/16.0 <= x <= 11.0/16.0: atan(x) = atan(0.5) + atan(s), where s=(x-0.5)/(1.0+0.5*x)
+ *      For 11.0/16.0 <= x <= 19.0/16.0: atan(x) = atan(1.0) + atan(s), where s=(x-1.0)/(1.0+1.0*x)
+ *      For 19.0/16.0 <= x <= 39.0/16.0: atan(x) = atan(1.5) + atan(s), where s=(x-1.5)/(1.0+1.5*x)
+ *      For 39.0/16.0 <= x <=    inf   : atan(x) = atan(inf) + atan(s), where s=-1.0/x
+ *      Where atan(s) ~= s+s^3*Poly11(s^2) on interval |s|<7.0/16.0.
+ */
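The table-driven reduction just described is easier to see in scalar form. Below is a minimal C model of it, assuming only what the comment above states; the breakpoint atan values are standard constants, which the data tables store as pairs of 32-bit words (for instance, the pair 1413754136/1074340347 that opens __svml_datan2_data_internal is the bit pattern 0x400921FB54442D18, i.e. Pi). The polynomial step is elided.

    /* Scalar sketch of the range reduction; illustrative only, not part
       of the patch.  */
    #include <math.h>

    static double
    atan_by_reduction (double x)
    {
      double base, s;
      if (x < 7.0 / 16.0)
        {
          base = 0.0;			/* atan(0.0) */
          s = x;
        }
      else if (x < 11.0 / 16.0)
        {
          base = 0.4636476090008061;	/* atan(0.5) */
          s = (x - 0.5) / (1.0 + 0.5 * x);
        }
      else if (x < 19.0 / 16.0)
        {
          base = M_PI_4;		/* atan(1.0) */
          s = (x - 1.0) / (1.0 + x);
        }
      else if (x < 39.0 / 16.0)
        {
          base = 0.9827937232473291;	/* atan(1.5) */
          s = (x - 1.5) / (1.0 + 1.5 * x);
        }
      else
        {
          base = M_PI_2;		/* atan(inf) */
          s = -1.0 / x;
        }
      /* atan(s) ~= s + s^3 * Poly11(s^2); polynomial omitted here.  */
      return base + s;
    }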
+ * + * + */ + +#include + + .text + .section .text.evex512,"ax",@progbits +ENTRY(_ZGVeN8vv_atan2_skx) + pushq %rbp + cfi_def_cfa_offset(16) + movq %rsp, %rbp + cfi_def_cfa(6, 16) + cfi_offset(6, -16) + andq $-64, %rsp + subq $256, %rsp + xorl %edx, %edx + +/* + * #define NO_VECTOR_ZERO_ATAN2_ARGS + * Declarations + * Variables + * Constants + * The end of declarations + * Implementation + * Get r0~=1/B + * Cannot be replaced by VQRCP(D, dR0, dB); + * Argument Absolute values + */ + vmovups 1728+__svml_datan2_data_internal(%rip), %zmm4 + +/* Argument signs */ + vmovups 1536+__svml_datan2_data_internal(%rip), %zmm6 + +/* + * 1) If yx then a=-x, b=y, PIO2=Pi/2 + */ + vmovups 64+__svml_datan2_data_internal(%rip), %zmm3 + vandpd %zmm4, %zmm0, %zmm11 + vmovaps %zmm1, %zmm7 + vandpd %zmm4, %zmm7, %zmm2 + vandpd %zmm6, %zmm7, %zmm5 + vandpd %zmm6, %zmm0, %zmm4 + vorpd %zmm6, %zmm2, %zmm12 + vcmppd $17, {sae}, %zmm2, %zmm11, %k1 + vmovdqu 1664+__svml_datan2_data_internal(%rip), %ymm6 + vmovups %zmm11, 64(%rsp) + +/* Check if y and x are on main path. */ + vpsrlq $32, %zmm2, %zmm9 + vblendmpd %zmm11, %zmm12, %zmm13{%k1} + vblendmpd %zmm2, %zmm11, %zmm15{%k1} + vpsrlq $32, %zmm11, %zmm8 + vmovdqu 1600+__svml_datan2_data_internal(%rip), %ymm12 + vdivpd {rn-sae}, %zmm15, %zmm13, %zmm1 + vmovups %zmm15, (%rsp) + vpmovqd %zmm9, %ymm14 + vpmovqd %zmm8, %ymm10 + vxorpd %zmm3, %zmm3, %zmm3{%k1} + vpsubd %ymm12, %ymm14, %ymm13 + vpsubd %ymm12, %ymm10, %ymm9 + +/* Polynomial. */ + vmulpd {rn-sae}, %zmm1, %zmm1, %zmm12 + vpcmpgtd %ymm6, %ymm13, %ymm15 + vpcmpeqd %ymm6, %ymm13, %ymm11 + vmulpd {rn-sae}, %zmm12, %zmm12, %zmm13 + vpor %ymm11, %ymm15, %ymm8 + vmovups 256+__svml_datan2_data_internal(%rip), %zmm11 + vmovups 512+__svml_datan2_data_internal(%rip), %zmm15 + vpcmpgtd %ymm6, %ymm9, %ymm14 + vpcmpeqd %ymm6, %ymm9, %ymm6 + vpor %ymm6, %ymm14, %ymm10 + vmulpd {rn-sae}, %zmm13, %zmm13, %zmm14 + vmovups 320+__svml_datan2_data_internal(%rip), %zmm9 + vpor %ymm10, %ymm8, %ymm6 + vmovups 384+__svml_datan2_data_internal(%rip), %zmm10 + vfmadd231pd {rn-sae}, %zmm14, %zmm11, %zmm15 + vmovups 576+__svml_datan2_data_internal(%rip), %zmm11 + vmovups 704+__svml_datan2_data_internal(%rip), %zmm8 + vfmadd231pd {rn-sae}, %zmm14, %zmm9, %zmm11 + vmovups 640+__svml_datan2_data_internal(%rip), %zmm9 + vfmadd231pd {rn-sae}, %zmm14, %zmm10, %zmm9 + vmovups 448+__svml_datan2_data_internal(%rip), %zmm10 + vfmadd231pd {rn-sae}, %zmm14, %zmm10, %zmm8 + vmovups 768+__svml_datan2_data_internal(%rip), %zmm10 + vfmadd213pd {rn-sae}, %zmm10, %zmm14, %zmm15 + vmovups 832+__svml_datan2_data_internal(%rip), %zmm10 + vfmadd213pd {rn-sae}, %zmm10, %zmm14, %zmm11 + vmovups 896+__svml_datan2_data_internal(%rip), %zmm10 + vfmadd213pd {rn-sae}, %zmm10, %zmm14, %zmm9 + vmovups 960+__svml_datan2_data_internal(%rip), %zmm10 + vfmadd213pd {rn-sae}, %zmm10, %zmm14, %zmm8 + vmovups 1024+__svml_datan2_data_internal(%rip), %zmm10 + vfmadd213pd {rn-sae}, %zmm10, %zmm14, %zmm15 + vmovups 1088+__svml_datan2_data_internal(%rip), %zmm10 + vfmadd213pd {rn-sae}, %zmm10, %zmm14, %zmm11 + vmovups 1152+__svml_datan2_data_internal(%rip), %zmm10 + vfmadd213pd {rn-sae}, %zmm10, %zmm14, %zmm9 + vmovups 1216+__svml_datan2_data_internal(%rip), %zmm10 + vfmadd213pd {rn-sae}, %zmm10, %zmm14, %zmm8 + vmovups 1280+__svml_datan2_data_internal(%rip), %zmm10 + +/* A00=1.0, account for it later VQFMA(D, dP4, dP4, dR8, dA00); */ + vmulpd {rn-sae}, %zmm14, %zmm8, %zmm8 + vfmadd213pd {rn-sae}, %zmm10, %zmm14, %zmm15 + vmovups 1344+__svml_datan2_data_internal(%rip), %zmm10 + 
+	vfmadd213pd {rn-sae}, %zmm10, %zmm14, %zmm11
+	vmovups	1408+__svml_datan2_data_internal(%rip), %zmm10
+	vfmadd213pd {rn-sae}, %zmm11, %zmm12, %zmm15
+	vfmadd213pd {rn-sae}, %zmm10, %zmm14, %zmm9
+	vfmadd213pd {rn-sae}, %zmm8, %zmm12, %zmm9
+	vmovups	__svml_datan2_data_internal(%rip), %zmm8
+	vfmadd213pd {rn-sae}, %zmm9, %zmm13, %zmm15
+
+/*
+ * Reconstruction.
+ * dP=(R+R*dP) + dPIO2
+ */
+	vfmadd213pd {rn-sae}, %zmm1, %zmm1, %zmm15
+	vaddpd	{rn-sae}, %zmm3, %zmm15, %zmm1
+	vorpd	%zmm5, %zmm1, %zmm9
+
+/* if x<0, dPI = Pi, else dPI =0 */
+	vmovups	1792+__svml_datan2_data_internal(%rip), %zmm1
+	vcmppd	$18, {sae}, %zmm1, %zmm7, %k2
+	vaddpd	{rn-sae}, %zmm8, %zmm9, %zmm9{%k2}
+	vmovmskps %ymm6, %eax
+	vorpd	%zmm4, %zmm9, %zmm11
+
+/* Special branch for fast (vector) processing of zero arguments */
+	vmovups	64(%rsp), %zmm9
+	testl	%eax, %eax
+	jne	L(7)
+
+L(1):
+/*
+ * Special branch for fast (vector) processing of zero arguments
+ * The end of implementation
+ */
+	testl	%edx, %edx
+	jne	L(3)
+
+L(2):
+	vmovaps	%zmm11, %zmm0
+	movq	%rbp, %rsp
+	popq	%rbp
+	cfi_def_cfa(7, 8)
+	cfi_restore(6)
+	ret
+	cfi_def_cfa(6, 16)
+	cfi_offset(6, -16)
+
+L(3):
+	vmovups	%zmm0, 64(%rsp)
+	vmovups	%zmm7, 128(%rsp)
+	vmovups	%zmm11, 192(%rsp)
+	je	L(2)
+	xorl	%eax, %eax
+	vzeroupper
+	kmovw	%k4, 24(%rsp)
+	kmovw	%k5, 16(%rsp)
+	kmovw	%k6, 8(%rsp)
+	kmovw	%k7, (%rsp)
+	movq	%rsi, 40(%rsp)
+	movq	%rdi, 32(%rsp)
+	movq	%r12, 56(%rsp)
+	.cfi_escape 0x10, 0x04, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x28, 0xff, 0xff, 0xff, 0x22
+	.cfi_escape 0x10, 0x05, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x20, 0xff, 0xff, 0xff, 0x22
+	.cfi_escape 0x10, 0x0c, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x38, 0xff, 0xff, 0xff, 0x22
+	movl	%eax, %r12d
+	movq	%r13, 48(%rsp)
+	.cfi_escape 0x10, 0x0d, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x30, 0xff, 0xff, 0xff, 0x22
+	movl	%edx, %r13d
+	.cfi_escape 0x10, 0xfa, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x18, 0xff, 0xff, 0xff, 0x22
+	.cfi_escape 0x10, 0xfb, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x10, 0xff, 0xff, 0xff, 0x22
+	.cfi_escape 0x10, 0xfc, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x08, 0xff, 0xff, 0xff, 0x22
+	.cfi_escape 0x10, 0xfd, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x00, 0xff, 0xff, 0xff, 0x22
+
+L(4):
+	btl	%r12d, %r13d
+	jc	L(6)
+
+L(5):
+	incl	%r12d
+	cmpl	$8, %r12d
+	jl	L(4)
+	kmovw	24(%rsp), %k4
+	cfi_restore(122)
+	kmovw	16(%rsp), %k5
+	cfi_restore(123)
+	kmovw	8(%rsp), %k6
+	cfi_restore(124)
+	kmovw	(%rsp), %k7
+	cfi_restore(125)
+	vmovups	192(%rsp), %zmm11
+	movq	40(%rsp), %rsi
+	cfi_restore(4)
+	movq	32(%rsp), %rdi
+	cfi_restore(5)
+	movq	56(%rsp), %r12
+	cfi_restore(12)
+	movq	48(%rsp), %r13
+	cfi_restore(13)
+	jmp	L(2)
+	.cfi_escape 0x10, 0x04, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x28, 0xff, 0xff, 0xff, 0x22
+	.cfi_escape 0x10, 0x05, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x20, 0xff, 0xff, 0xff, 0x22
+	.cfi_escape 0x10, 0x0c, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x38, 0xff, 0xff, 0xff, 0x22
+	.cfi_escape 0x10, 0x0d, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x30, 0xff, 0xff, 0xff, 0x22
+	.cfi_escape 0x10, 0xfa, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x18, 0xff, 0xff, 0xff, 0x22
+	.cfi_escape 0x10, 0xfb, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x10, 0xff, 0xff, 0xff, 0x22
+	.cfi_escape 0x10, 0xfc, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x08, 0xff, 0xff, 0xff, 0x22
+	.cfi_escape 0x10, 0xfd, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x00, 0xff, 0xff, 0xff, 0x22
+
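If either vmovmskps produced a nonzero mask, the inputs and the vector result have been spilled to the stack, and the L(4)/L(5) loop above walks the mask bit by bit, handing each flagged lane to the scalar routine called at L(6) below. In C terms the pattern is the following (the helper's pointer-triple signature is inferred from the three lea arguments and is an assumption, not a documented interface):

    /* Per-lane scalar fallback, as driven by the btl/jc loop.  */
    extern int __svml_datan2_cout_rare_internal (const double *,
                                                 const double *, double *);

    static void
    fixup_special_lanes (const double y[8], const double x[8], double r[8],
                         unsigned int mask)
    {
      for (unsigned int lane = 0; lane < 8; lane++)
        if (mask & (1u << lane))	/* lane needs the rare path */
          __svml_datan2_cout_rare_internal (&y[lane], &x[lane], &r[lane]);
    }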
+L(6):
+	lea	64(%rsp,%r12,8), %rdi
+	lea	128(%rsp,%r12,8), %rsi
+	lea	192(%rsp,%r12,8), %rdx
+	call	__svml_datan2_cout_rare_internal
+	jmp	L(5)
+	cfi_restore(4)
+	cfi_restore(5)
+	cfi_restore(12)
+	cfi_restore(13)
+	cfi_restore(122)
+	cfi_restore(123)
+	cfi_restore(124)
+	cfi_restore(125)
+
+L(7):
+/* Check if both X & Y are not NaNs: iXYnotNAN */
+	vcmppd	$3, {sae}, %zmm7, %zmm7, %k1
+	vcmppd	$3, {sae}, %zmm0, %zmm0, %k2
+
+/* Check if at least one of X or Y is zero: iAXAYZERO */
+	vmovups	1792+__svml_datan2_data_internal(%rip), %zmm8
+	vpbroadcastq .FLT_31(%rip), %zmm10
+	vcmppd	$4, {sae}, %zmm8, %zmm2, %k3
+	vmovaps	%zmm10, %zmm12
+	vmovaps	%zmm10, %zmm15
+	vmovaps	%zmm10, %zmm13
+	vpandnq	%zmm7, %zmm7, %zmm12{%k1}
+	vcmppd	$4, {sae}, %zmm8, %zmm9, %k1
+	vpandnq	%zmm2, %zmm2, %zmm15{%k3}
+	vmovaps	%zmm10, %zmm2
+
+/* Res = sign(Y)*(X<0)?(PIO2+PI):PIO2 */
+	vpcmpgtq %zmm7, %zmm8, %k3
+	vpandnq	%zmm0, %zmm0, %zmm13{%k2}
+	vpandnq	%zmm9, %zmm9, %zmm2{%k1}
+	vandpd	%zmm13, %zmm12, %zmm14
+	vorpd	%zmm2, %zmm15, %zmm9
+	vpsrlq	$32, %zmm14, %zmm1
+	vpsrlq	$32, %zmm9, %zmm2
+	vpmovqd	%zmm1, %ymm1
+	vpmovqd	%zmm2, %ymm9
+
+/* Check if at least one of X or Y is zero and not NaN: iAXAYZEROnotNAN */
+	vpand	%ymm1, %ymm9, %ymm2
+
+/*
+ * Path for zero arguments (at least one of both)
+ * Check if both args are zeros (den. is zero)
+ */
+	vmovups	(%rsp), %zmm1
+
+/* Exclude from previous callout mask zero (and not NaN) arguments */
+	vpandn	%ymm6, %ymm2, %ymm6
+	vcmppd	$4, {sae}, %zmm8, %zmm1, %k2
+
+/* Go to callout */
+	vmovmskps %ymm6, %edx
+	vpandnq	%zmm1, %zmm1, %zmm10{%k2}
+
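The L(7) block above computes the C99 atan2 special-case results entirely with masks: the self-compares (vcmppd $3, i.e. CMP_UNORD_Q) detect NaNs, the compares against zero build the zero-argument mask, and the saved sign words then select 0, Pi/2 or Pi. A branching scalar model of the zero-argument cases it has to reproduce:

    /* Zero-argument results required by C99 Annex F; the kernel gets the
       same answers branch-free.  Sketch only.  */
    #include <math.h>

    static double
    atan2_zero_args (double y, double x)
    {
      /* Reached only when y == +-0.0 or x == +-0.0.  */
      if (y == 0.0)
        /* atan2(+-0, x >= +0) = +-0;  atan2(+-0, x <= -0) = +-Pi.  */
        return signbit (x) ? copysign (M_PI, y) : copysign (0.0, y);
      /* x == +-0 and y != 0: result is +-Pi/2 by the sign of y.  */
      return copysign (M_PI_2, y);
    }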
+/* Set sPIO2 to zero if den. is zero */
+	vpandnq	%zmm3, %zmm10, %zmm3
+	vpandq	%zmm10, %zmm8, %zmm1
+	vporq	%zmm1, %zmm3, %zmm3
+	vorpd	%zmm5, %zmm3, %zmm1
+	vmovups	__svml_datan2_data_internal(%rip), %zmm5
+	vaddpd	{rn-sae}, %zmm5, %zmm1, %zmm1{%k3}
+	vorpd	%zmm4, %zmm1, %zmm1
+
+/* Merge results from main and spec path */
+	vpmovzxdq %ymm2, %zmm4
+	vpsllq	$32, %zmm4, %zmm2
+	vpord	%zmm2, %zmm4, %zmm3
+	vpandnq	%zmm11, %zmm3, %zmm11
+	vpandq	%zmm3, %zmm1, %zmm1
+	vporq	%zmm1, %zmm11, %zmm11
+	jmp	L(1)
+
+END(_ZGVeN8vv_atan2_skx)
+
+	.align 16,0x90
+
+__svml_datan2_cout_rare_internal:
+
+	cfi_startproc
+
+	movq	%rdx, %rcx
+	movsd	1888+__datan2_la_CoutTab(%rip), %xmm1
+	movsd	(%rdi), %xmm2
+	movsd	(%rsi), %xmm0
+	mulsd	%xmm1, %xmm2
+	mulsd	%xmm0, %xmm1
+	movsd	%xmm2, -48(%rsp)
+	movsd	%xmm1, -40(%rsp)
+	movzwl	-42(%rsp), %r9d
+	andl	$32752, %r9d
+	movb	-33(%rsp), %al
+	movzwl	-34(%rsp), %r8d
+	andb	$-128, %al
+	andl	$32752, %r8d
+	shrl	$4, %r9d
+	movb	-41(%rsp), %dl
+	shrb	$7, %dl
+	shrb	$7, %al
+	shrl	$4, %r8d
+	cmpl	$2047, %r9d
+	je	L(31)
+	cmpl	$2047, %r8d
+	je	L(25)
+	testl	%r9d, %r9d
+	jne	L(8)
+	testl	$1048575, -44(%rsp)
+	jne	L(8)
+	cmpl	$0, -48(%rsp)
+	je	L(19)
+
+L(8):
+	testl	%r8d, %r8d
+	jne	L(9)
+	testl	$1048575, -36(%rsp)
+	jne	L(9)
+	cmpl	$0, -40(%rsp)
+	je	L(18)
+
+L(9):
+	negl	%r8d
+	movsd	%xmm2, -48(%rsp)
+	addl	%r9d, %r8d
+	movsd	%xmm1, -40(%rsp)
+	movb	-41(%rsp), %dil
+	movb	-33(%rsp), %sil
+	andb	$127, %dil
+	andb	$127, %sil
+	cmpl	$-54, %r8d
+	jle	L(16)
+	cmpl	$54, %r8d
+	jge	L(15)
+	movb	%sil, -33(%rsp)
+	movb	%dil, -41(%rsp)
+	testb	%al, %al
+	jne	L(10)
+	movsd	1976+__datan2_la_CoutTab(%rip), %xmm1
+	movaps	%xmm1, %xmm0
+	jmp	L(11)
+
+L(10):
+	movsd	1936+__datan2_la_CoutTab(%rip), %xmm1
+	movsd	1944+__datan2_la_CoutTab(%rip), %xmm0
+
+L(11):
+	movsd	-48(%rsp), %xmm4
+	movsd	-40(%rsp), %xmm2
+	movaps	%xmm4, %xmm5
+	divsd	%xmm2, %xmm5
+	movzwl	-42(%rsp), %esi
+	movsd	%xmm5, -16(%rsp)
+	testl	%r9d, %r9d
+	jle	L(24)
+	cmpl	$2046, %r9d
+	jge	L(12)
+	andl	$-32753, %esi
+	addl	$-1023, %r9d
+	movsd	%xmm4, -48(%rsp)
+	addl	$16368, %esi
+	movw	%si, -42(%rsp)
+	jmp	L(13)
+
+L(12):
+	movsd	1992+__datan2_la_CoutTab(%rip), %xmm3
+	movl	$1022, %r9d
+	mulsd	%xmm3, %xmm4
+	movsd	%xmm4, -48(%rsp)
+
+L(13):
+	negl	%r9d
+	addl	$1023, %r9d
+	andl	$2047, %r9d
+	movzwl	1894+__datan2_la_CoutTab(%rip), %esi
+	movsd	1888+__datan2_la_CoutTab(%rip), %xmm3
+	andl	$-32753, %esi
+	shll	$4, %r9d
+	movsd	%xmm3, -40(%rsp)
+	orl	%r9d, %esi
+	movw	%si, -34(%rsp)
+	movsd	-40(%rsp), %xmm4
+	mulsd	%xmm4, %xmm2
+	comisd	1880+__datan2_la_CoutTab(%rip), %xmm5
+	jb	L(14)
+	movsd	2000+__datan2_la_CoutTab(%rip), %xmm12
+	movaps	%xmm2, %xmm3
+	mulsd	%xmm2, %xmm12
+	movsd	%xmm12, -72(%rsp)
+	movsd	-72(%rsp), %xmm13
+	movsd	%xmm5, -24(%rsp)
+	subsd	%xmm2, %xmm13
+	movsd	%xmm13, -64(%rsp)
+	movsd	-72(%rsp), %xmm15
+	movsd	-64(%rsp), %xmm14
+	movl	-20(%rsp), %r8d
+	movl	%r8d, %r9d
+	andl	$-524288, %r8d
+	andl	$-1048576, %r9d
+	addl	$262144, %r8d
+	subsd	%xmm14, %xmm15
+	movsd	%xmm15, -72(%rsp)
+	andl	$1048575, %r8d
+	movsd	-72(%rsp), %xmm4
+	orl	%r8d, %r9d
+	movl	$0, -24(%rsp)
+	subsd	%xmm4, %xmm3
+	movl	%r9d, -20(%rsp)
+	movsd	%xmm3, -64(%rsp)
+	movsd	-72(%rsp), %xmm5
+	movsd	-24(%rsp), %xmm11
+	movsd	-64(%rsp), %xmm9
+	mulsd	%xmm11, %xmm5
+	mulsd	%xmm11, %xmm9
+	movsd	1968+__datan2_la_CoutTab(%rip), %xmm8
+	mulsd	%xmm8, %xmm5
+	mulsd	%xmm8, %xmm9
+	movaps	%xmm5, %xmm7
+	movzwl	-10(%rsp), %edi
+	addsd	%xmm9, %xmm7
+	movsd	%xmm7, -72(%rsp)
+	andl	$32752, %edi
+	movsd	-72(%rsp), %xmm6
+	shrl	$4, %edi
+	subsd	%xmm6, %xmm5
+	movl	-12(%rsp), %esi
addsd %xmm5, %xmm9 + movsd %xmm9, -64(%rsp) + andl $1048575, %esi + movsd -48(%rsp), %xmm9 + movsd -72(%rsp), %xmm3 + movaps %xmm9, %xmm12 + movsd -64(%rsp), %xmm10 + movaps %xmm9, %xmm14 + movaps %xmm9, %xmm6 + addsd %xmm3, %xmm12 + movsd %xmm12, -72(%rsp) + movsd -72(%rsp), %xmm13 + shll $20, %edi + subsd %xmm13, %xmm14 + movsd %xmm14, -64(%rsp) + orl %esi, %edi + movsd -72(%rsp), %xmm4 + addl $-1069547520, %edi + movsd -64(%rsp), %xmm15 + movl $113, %esi + movsd 2000+__datan2_la_CoutTab(%rip), %xmm13 + addsd %xmm15, %xmm4 + movsd %xmm4, -56(%rsp) + movsd -64(%rsp), %xmm8 + sarl $19, %edi + addsd %xmm3, %xmm8 + movsd %xmm8, -64(%rsp) + cmpl $113, %edi + movsd -56(%rsp), %xmm7 + cmovl %edi, %esi + subsd %xmm7, %xmm6 + movsd %xmm6, -56(%rsp) + addl %esi, %esi + movsd -64(%rsp), %xmm12 + lea __datan2_la_CoutTab(%rip), %rdi + movsd -56(%rsp), %xmm5 + movslq %esi, %rsi + addsd %xmm5, %xmm12 + movsd %xmm12, -56(%rsp) + movsd -72(%rsp), %xmm7 + mulsd %xmm7, %xmm13 + movsd -56(%rsp), %xmm8 + movsd %xmm13, -72(%rsp) + addsd %xmm10, %xmm8 + movsd -72(%rsp), %xmm4 + movaps %xmm9, %xmm10 + mulsd 2000+__datan2_la_CoutTab(%rip), %xmm10 + subsd %xmm7, %xmm4 + movsd %xmm4, -64(%rsp) + movsd -72(%rsp), %xmm3 + movsd -64(%rsp), %xmm14 + subsd %xmm14, %xmm3 + movsd %xmm3, -72(%rsp) + movsd -72(%rsp), %xmm15 + subsd %xmm15, %xmm7 + movsd %xmm7, -64(%rsp) + movsd -72(%rsp), %xmm7 + movsd -64(%rsp), %xmm4 + movsd %xmm10, -72(%rsp) + movaps %xmm2, %xmm10 + addsd %xmm4, %xmm8 + movsd -72(%rsp), %xmm4 + subsd -48(%rsp), %xmm4 + movsd %xmm4, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm3 + subsd %xmm3, %xmm6 + movaps %xmm2, %xmm3 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm5 + subsd %xmm5, %xmm9 + movsd %xmm9, -64(%rsp) + movsd -72(%rsp), %xmm12 + movsd -64(%rsp), %xmm9 + mulsd %xmm11, %xmm12 + mulsd %xmm11, %xmm9 + movaps %xmm12, %xmm11 + addsd %xmm9, %xmm11 + movsd %xmm11, -72(%rsp) + movsd -72(%rsp), %xmm4 + subsd %xmm4, %xmm12 + addsd %xmm9, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -72(%rsp), %xmm15 + movsd -64(%rsp), %xmm6 + addsd %xmm15, %xmm3 + movsd %xmm3, -72(%rsp) + movsd -72(%rsp), %xmm5 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm3 + subsd %xmm5, %xmm10 + movsd %xmm10, -64(%rsp) + movsd -72(%rsp), %xmm13 + movsd -64(%rsp), %xmm11 + addsd %xmm11, %xmm13 + movsd %xmm13, -56(%rsp) + movsd -64(%rsp), %xmm14 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm13 + addsd %xmm14, %xmm15 + movsd %xmm15, -64(%rsp) + movsd -56(%rsp), %xmm4 + movsd 1888+__datan2_la_CoutTab(%rip), %xmm14 + subsd %xmm4, %xmm2 + movsd %xmm2, -56(%rsp) + movsd -64(%rsp), %xmm4 + movsd -56(%rsp), %xmm2 + addsd %xmm2, %xmm4 + movsd %xmm4, -56(%rsp) + movsd -72(%rsp), %xmm12 + mulsd %xmm12, %xmm3 + movsd -56(%rsp), %xmm5 + movsd %xmm3, -72(%rsp) + addsd %xmm6, %xmm5 + movsd -72(%rsp), %xmm9 + subsd %xmm12, %xmm9 + movsd %xmm9, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm2 + subsd %xmm2, %xmm10 + movsd %xmm10, -72(%rsp) + movsd -72(%rsp), %xmm11 + subsd %xmm11, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -72(%rsp), %xmm9 + divsd %xmm9, %xmm14 + mulsd %xmm14, %xmm13 + movsd -64(%rsp), %xmm10 + movsd %xmm13, -64(%rsp) + addsd %xmm10, %xmm5 + movsd -64(%rsp), %xmm15 + movsd 1888+__datan2_la_CoutTab(%rip), %xmm12 + subsd %xmm14, %xmm15 + movsd %xmm15, -56(%rsp) + movsd -64(%rsp), %xmm2 + movsd -56(%rsp), %xmm4 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm13 + subsd %xmm4, %xmm2 + movsd %xmm2, -56(%rsp) + movsd -56(%rsp), %xmm3 + mulsd %xmm3, %xmm9 + movsd -56(%rsp), %xmm11 + subsd %xmm9, %xmm12 + mulsd %xmm11, %xmm5 + 
movsd %xmm5, -64(%rsp) + movsd -64(%rsp), %xmm5 + subsd %xmm5, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -64(%rsp), %xmm2 + movq -56(%rsp), %r10 + movsd -64(%rsp), %xmm6 + movsd -56(%rsp), %xmm4 + movq %r10, -40(%rsp) + movsd -40(%rsp), %xmm3 + movaps %xmm3, %xmm5 + addsd 1888+__datan2_la_CoutTab(%rip), %xmm2 + mulsd %xmm7, %xmm5 + mulsd %xmm6, %xmm2 + mulsd %xmm4, %xmm2 + mulsd %xmm2, %xmm7 + mulsd %xmm8, %xmm2 + mulsd %xmm3, %xmm8 + addsd %xmm2, %xmm7 + movsd 1872+__datan2_la_CoutTab(%rip), %xmm3 + addsd %xmm8, %xmm7 + movsd %xmm7, -72(%rsp) + movaps %xmm5, %xmm7 + movsd -72(%rsp), %xmm4 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm6 + addsd %xmm4, %xmm7 + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm8 + subsd %xmm8, %xmm5 + addsd %xmm4, %xmm5 + movsd %xmm5, -64(%rsp) + movsd -72(%rsp), %xmm11 + movaps %xmm11, %xmm2 + mulsd %xmm11, %xmm2 + mulsd %xmm11, %xmm6 + mulsd %xmm2, %xmm3 + movsd -64(%rsp), %xmm4 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm7 + addsd 1864+__datan2_la_CoutTab(%rip), %xmm3 + subsd %xmm11, %xmm7 + mulsd %xmm2, %xmm3 + movsd %xmm7, -64(%rsp) + movsd -72(%rsp), %xmm9 + movsd -64(%rsp), %xmm8 + addsd 1856+__datan2_la_CoutTab(%rip), %xmm3 + subsd %xmm8, %xmm9 + mulsd %xmm2, %xmm3 + movsd %xmm9, -72(%rsp) + movsd -72(%rsp), %xmm10 + addsd 1848+__datan2_la_CoutTab(%rip), %xmm3 + subsd %xmm10, %xmm11 + mulsd %xmm2, %xmm3 + movsd %xmm11, -64(%rsp) + addsd 1840+__datan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + addsd 1832+__datan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + addsd 1824+__datan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + mulsd %xmm3, %xmm13 + movsd -72(%rsp), %xmm2 + movsd -64(%rsp), %xmm12 + movsd %xmm13, -72(%rsp) + addsd %xmm12, %xmm4 + movsd -72(%rsp), %xmm14 + subsd %xmm3, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm15 + subsd %xmm15, %xmm5 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm6 + subsd %xmm6, %xmm3 + movsd %xmm3, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm5 + movaps %xmm6, %xmm12 + movaps %xmm5, %xmm3 + mulsd %xmm4, %xmm6 + mulsd %xmm4, %xmm3 + mulsd %xmm2, %xmm5 + mulsd %xmm2, %xmm12 + addsd %xmm3, %xmm6 + movaps %xmm12, %xmm7 + movaps %xmm12, %xmm8 + addsd %xmm5, %xmm6 + addsd %xmm2, %xmm7 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm5 + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm3 + subsd %xmm3, %xmm8 + movsd %xmm8, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm9 + addsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movsd -64(%rsp), %xmm11 + addsd %xmm11, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -56(%rsp), %xmm2 + subsd %xmm2, %xmm12 + movsd %xmm12, -56(%rsp) + movsd -64(%rsp), %xmm14 + movsd -56(%rsp), %xmm13 + addsd %xmm13, %xmm14 + movsd %xmm14, -56(%rsp) + movq -72(%rsp), %r11 + movsd -56(%rsp), %xmm15 + movq %r11, -40(%rsp) + addsd %xmm15, %xmm4 + movsd -40(%rsp), %xmm8 + addsd %xmm5, %xmm4 + movsd %xmm4, -32(%rsp) + movaps %xmm8, %xmm4 + movaps %xmm8, %xmm2 + addsd (%rdi,%rsi,8), %xmm4 + movsd %xmm4, -72(%rsp) + movsd -72(%rsp), %xmm4 + subsd %xmm4, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm3 + addsd %xmm3, %xmm5 + movsd %xmm5, -56(%rsp) + movsd -64(%rsp), %xmm6 + addsd (%rdi,%rsi,8), %xmm6 + movsd %xmm6, -64(%rsp) + movsd -56(%rsp), %xmm7 + subsd %xmm7, %xmm8 + movsd %xmm8, -56(%rsp) + movsd -64(%rsp), %xmm10 + movsd -56(%rsp), %xmm9 + addsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movq -72(%rsp), %r8 + movq %r8, -40(%rsp) + movsd -56(%rsp), %xmm2 + movaps %xmm1, %xmm3 + shrq $56, %r8 + addsd -32(%rsp), %xmm2 + shlb 
$7, %dl + addsd 8(%rdi,%rsi,8), %xmm2 + movb %al, %sil + andb $127, %r8b + shlb $7, %sil + movsd %xmm2, -32(%rsp) + orb %sil, %r8b + movb %r8b, -33(%rsp) + movsd -40(%rsp), %xmm9 + movaps %xmm9, %xmm5 + addsd %xmm9, %xmm3 + movsd %xmm3, -72(%rsp) + movsd -72(%rsp), %xmm4 + movb -25(%rsp), %dil + movb %dil, %r9b + shrb $7, %dil + subsd %xmm4, %xmm5 + movsd %xmm5, -64(%rsp) + movsd -72(%rsp), %xmm7 + movsd -64(%rsp), %xmm6 + xorb %dil, %al + andb $127, %r9b + shlb $7, %al + addsd %xmm6, %xmm7 + movsd %xmm7, -56(%rsp) + movsd -64(%rsp), %xmm8 + addsd %xmm8, %xmm1 + movsd %xmm1, -64(%rsp) + orb %al, %r9b + movsd -56(%rsp), %xmm1 + movb %r9b, -25(%rsp) + subsd %xmm1, %xmm9 + movsd %xmm9, -56(%rsp) + movsd -64(%rsp), %xmm11 + movsd -56(%rsp), %xmm10 + addsd %xmm10, %xmm11 + movsd %xmm11, -56(%rsp) + movq -72(%rsp), %rax + movsd -56(%rsp), %xmm12 + movq %rax, -40(%rsp) + addsd %xmm12, %xmm0 + movsd -40(%rsp), %xmm13 + addsd -32(%rsp), %xmm0 + movsd %xmm0, -32(%rsp) + addsd %xmm0, %xmm13 + movsd %xmm13, -24(%rsp) + movb -17(%rsp), %r10b + andb $127, %r10b + orb %dl, %r10b + movb %r10b, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp L(23) + +L(14): + movsd -48(%rsp), %xmm12 + movb %al, %r8b + movaps %xmm12, %xmm7 + mulsd 2000+__datan2_la_CoutTab(%rip), %xmm7 + shlb $7, %r8b + shlb $7, %dl + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm8 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm13 + movsd 1888+__datan2_la_CoutTab(%rip), %xmm7 + mulsd %xmm2, %xmm13 + subsd -48(%rsp), %xmm8 + movsd %xmm8, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm9 + subsd %xmm9, %xmm10 + movsd %xmm10, -72(%rsp) + movsd -72(%rsp), %xmm11 + subsd %xmm11, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm5 + movsd %xmm13, -72(%rsp) + movsd -72(%rsp), %xmm14 + subsd %xmm2, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -72(%rsp), %xmm4 + movsd -64(%rsp), %xmm15 + subsd %xmm15, %xmm4 + movsd %xmm4, -72(%rsp) + movsd -72(%rsp), %xmm3 + movsd 1888+__datan2_la_CoutTab(%rip), %xmm4 + subsd %xmm3, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -72(%rsp), %xmm12 + divsd %xmm12, %xmm7 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm2 + mulsd %xmm7, %xmm2 + movsd -64(%rsp), %xmm14 + movsd %xmm2, -64(%rsp) + movsd -64(%rsp), %xmm8 + subsd %xmm7, %xmm8 + movsd %xmm8, -56(%rsp) + movsd -64(%rsp), %xmm10 + movsd -56(%rsp), %xmm9 + subsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movsd -56(%rsp), %xmm11 + mulsd %xmm11, %xmm12 + movsd -56(%rsp), %xmm13 + subsd %xmm12, %xmm4 + mulsd %xmm13, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -64(%rsp), %xmm15 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm13 + subsd %xmm15, %xmm4 + movsd %xmm4, -64(%rsp) + movsd -64(%rsp), %xmm7 + movq -56(%rsp), %rsi + movsd -64(%rsp), %xmm2 + movsd -56(%rsp), %xmm3 + movq %rsi, -40(%rsp) + movsd -40(%rsp), %xmm8 + movaps %xmm8, %xmm9 + addsd 1888+__datan2_la_CoutTab(%rip), %xmm7 + mulsd %xmm6, %xmm9 + mulsd %xmm5, %xmm8 + mulsd %xmm2, %xmm7 + movsd -16(%rsp), %xmm2 + mulsd %xmm2, %xmm2 + mulsd %xmm3, %xmm7 + movsd 1872+__datan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + mulsd %xmm7, %xmm6 + mulsd %xmm5, %xmm7 + addsd 1864+__datan2_la_CoutTab(%rip), %xmm3 + addsd %xmm7, %xmm6 + mulsd %xmm2, %xmm3 + addsd %xmm8, %xmm6 + addsd 1856+__datan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + movaps %xmm9, %xmm5 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm4 + addsd 1848+__datan2_la_CoutTab(%rip), %xmm3 + addsd %xmm4, %xmm5 + mulsd %xmm2, %xmm3 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm6 + movsd 2000+__datan2_la_CoutTab(%rip), %xmm5 
+ subsd %xmm6, %xmm9 + addsd 1840+__datan2_la_CoutTab(%rip), %xmm3 + addsd %xmm4, %xmm9 + mulsd %xmm2, %xmm3 + movsd %xmm9, -64(%rsp) + movsd -72(%rsp), %xmm11 + mulsd %xmm11, %xmm5 + addsd 1832+__datan2_la_CoutTab(%rip), %xmm3 + movsd -64(%rsp), %xmm4 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm7 + mulsd %xmm2, %xmm3 + subsd %xmm11, %xmm7 + movsd %xmm7, -64(%rsp) + movsd -72(%rsp), %xmm8 + movsd -64(%rsp), %xmm6 + addsd 1824+__datan2_la_CoutTab(%rip), %xmm3 + subsd %xmm6, %xmm8 + mulsd %xmm2, %xmm3 + movsd %xmm8, -72(%rsp) + movsd -72(%rsp), %xmm10 + mulsd %xmm3, %xmm13 + subsd %xmm10, %xmm11 + movsd %xmm11, -64(%rsp) + movsd -72(%rsp), %xmm2 + movsd -64(%rsp), %xmm12 + movsd %xmm13, -72(%rsp) + addsd %xmm12, %xmm4 + movsd -72(%rsp), %xmm14 + subsd %xmm3, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm15 + subsd %xmm15, %xmm5 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm6 + subsd %xmm6, %xmm3 + movsd %xmm3, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm5 + movaps %xmm6, %xmm12 + movaps %xmm5, %xmm3 + mulsd %xmm4, %xmm6 + mulsd %xmm4, %xmm3 + mulsd %xmm2, %xmm5 + mulsd %xmm2, %xmm12 + addsd %xmm3, %xmm6 + movaps %xmm12, %xmm7 + movaps %xmm12, %xmm8 + addsd %xmm5, %xmm6 + addsd %xmm2, %xmm7 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm5 + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm3 + subsd %xmm3, %xmm8 + movsd %xmm8, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm9 + addsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movsd -64(%rsp), %xmm11 + addsd %xmm11, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -56(%rsp), %xmm2 + subsd %xmm2, %xmm12 + movsd %xmm12, -56(%rsp) + movsd -64(%rsp), %xmm14 + movsd -56(%rsp), %xmm13 + addsd %xmm13, %xmm14 + movsd %xmm14, -56(%rsp) + movq -72(%rsp), %rdi + movsd -56(%rsp), %xmm15 + movq %rdi, -40(%rsp) + addsd %xmm15, %xmm4 + shrq $56, %rdi + addsd %xmm5, %xmm4 + andb $127, %dil + orb %r8b, %dil + movb %dil, -33(%rsp) + movsd %xmm4, -32(%rsp) + movaps %xmm1, %xmm4 + movsd -40(%rsp), %xmm7 + movaps %xmm7, %xmm2 + addsd %xmm7, %xmm4 + movsd %xmm4, -72(%rsp) + movsd -72(%rsp), %xmm4 + movb -25(%rsp), %r9b + movb %r9b, %r10b + shrb $7, %r9b + subsd %xmm4, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm3 + xorb %r9b, %al + andb $127, %r10b + shlb $7, %al + addsd %xmm3, %xmm5 + movsd %xmm5, -56(%rsp) + movsd -64(%rsp), %xmm6 + addsd %xmm6, %xmm1 + movsd %xmm1, -64(%rsp) + orb %al, %r10b + movsd -56(%rsp), %xmm1 + movb %r10b, -25(%rsp) + subsd %xmm1, %xmm7 + movsd %xmm7, -56(%rsp) + movsd -64(%rsp), %xmm2 + movsd -56(%rsp), %xmm1 + addsd %xmm1, %xmm2 + movsd %xmm2, -56(%rsp) + movq -72(%rsp), %rax + movsd -56(%rsp), %xmm3 + movq %rax, -40(%rsp) + addsd %xmm3, %xmm0 + movsd -40(%rsp), %xmm4 + addsd -32(%rsp), %xmm0 + movsd %xmm0, -32(%rsp) + addsd %xmm0, %xmm4 + movsd %xmm4, -24(%rsp) + movb -17(%rsp), %r11b + andb $127, %r11b + orb %dl, %r11b + movb %r11b, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp L(23) + +L(15): + cmpl $74, %r8d + jge L(32) + movb %dil, -41(%rsp) + divsd -48(%rsp), %xmm1 + movsd 1928+__datan2_la_CoutTab(%rip), %xmm0 + shlb $7, %dl + subsd %xmm1, %xmm0 + addsd 1920+__datan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp L(23) + +L(16): + testb %al, %al + jne L(22) + movb %dil, -41(%rsp) + movb %sil, -33(%rsp) + movsd -48(%rsp), %xmm2 + divsd -40(%rsp), %xmm2 + movsd %xmm2, -24(%rsp) + movzwl -18(%rsp), %eax + testl 
$32752, %eax + je L(17) + movsd 1888+__datan2_la_CoutTab(%rip), %xmm0 + shlb $7, %dl + addsd %xmm2, %xmm0 + movsd %xmm0, -72(%rsp) + movsd -72(%rsp), %xmm1 + mulsd %xmm1, %xmm2 + movsd %xmm2, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp L(23) + +L(17): + mulsd %xmm2, %xmm2 + shlb $7, %dl + movsd %xmm2, -72(%rsp) + movsd -72(%rsp), %xmm0 + addsd -24(%rsp), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp L(23) + +L(18): + testl %r9d, %r9d + jne L(32) + testl $1048575, -44(%rsp) + jne L(32) + jmp L(33) + +L(19): + jne L(32) + +L(20): + testb %al, %al + jne L(22) + +L(21): + shlb $7, %dl + movq 1976+__datan2_la_CoutTab(%rip), %rax + movq %rax, -24(%rsp) + shrq $56, %rax + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp L(23) + +L(22): + movsd 1936+__datan2_la_CoutTab(%rip), %xmm0 + shlb $7, %dl + addsd 1944+__datan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + +L(23): + xorl %eax, %eax + ret + +L(24): + movsd 1984+__datan2_la_CoutTab(%rip), %xmm3 + movl $-1022, %r9d + mulsd %xmm3, %xmm4 + movsd %xmm4, -48(%rsp) + jmp L(13) + +L(25): + cmpl $2047, %r9d + je L(31) + +L(26): + testl $1048575, -36(%rsp) + jne L(27) + cmpl $0, -40(%rsp) + je L(28) + +L(27): + addsd %xmm1, %xmm2 + movsd %xmm2, (%rcx) + jmp L(23) + +L(28): + cmpl $2047, %r9d + je L(29) + testb %al, %al + je L(21) + jmp L(22) + +L(29): + testb %al, %al + jne L(30) + movsd 1904+__datan2_la_CoutTab(%rip), %xmm0 + shlb $7, %dl + addsd 1912+__datan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp L(23) + +L(30): + movsd 1952+__datan2_la_CoutTab(%rip), %xmm0 + shlb $7, %dl + addsd 1960+__datan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp L(23) + +L(31): + testl $1048575, -44(%rsp) + jne L(27) + cmpl $0, -48(%rsp) + jne L(27) + cmpl $2047, %r8d + je L(26) + +L(32): + movsd 1920+__datan2_la_CoutTab(%rip), %xmm0 + shlb $7, %dl + addsd 1928+__datan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %dl, %al + movb %al, -17(%rsp) + movq -24(%rsp), %rdx + movq %rdx, (%rcx) + jmp L(23) + +L(33): + cmpl $0, -48(%rsp) + jne L(32) + jmp L(20) + + cfi_endproc + + .type __svml_datan2_cout_rare_internal,@function + .size __svml_datan2_cout_rare_internal,.-__svml_datan2_cout_rare_internal + + .section .rodata, "a" + .align 64 + +__svml_datan2_data_internal: + .long 1413754136 + .long 1074340347 + .long 1413754136 + .long 1074340347 + .long 1413754136 + .long 1074340347 + .long 1413754136 + .long 1074340347 + .long 1413754136 + .long 1074340347 + .long 1413754136 + .long 1074340347 + .long 1413754136 + .long 1074340347 + .long 1413754136 + .long 1074340347 + .long 1413754136 + .long 1073291771 + .long 1413754136 + .long 1073291771 + .long 1413754136 + .long 1073291771 + .long 1413754136 + .long 1073291771 + .long 1413754136 + .long 1073291771 + .long 1413754136 + .long 1073291771 + .long 1413754136 + .long 1073291771 + .long 1413754136 + .long 1073291771 + .long 17919630 + .long 3202334474 + 
.long 17919630 + .long 3202334474 + .long 17919630 + .long 3202334474 + .long 17919630 + .long 3202334474 + .long 17919630 + .long 3202334474 + .long 17919630 + .long 3202334474 + .long 17919630 + .long 3202334474 + .long 17919630 + .long 3202334474 + .long 350522012 + .long 1058555694 + .long 350522012 + .long 1058555694 + .long 350522012 + .long 1058555694 + .long 350522012 + .long 1058555694 + .long 350522012 + .long 1058555694 + .long 350522012 + .long 1058555694 + .long 350522012 + .long 1058555694 + .long 350522012 + .long 1058555694 + .long 934004643 + .long 3203726773 + .long 934004643 + .long 3203726773 + .long 934004643 + .long 3203726773 + .long 934004643 + .long 3203726773 + .long 934004643 + .long 3203726773 + .long 934004643 + .long 3203726773 + .long 934004643 + .long 3203726773 + .long 934004643 + .long 3203726773 + .long 912675337 + .long 1059908874 + .long 912675337 + .long 1059908874 + .long 912675337 + .long 1059908874 + .long 912675337 + .long 1059908874 + .long 912675337 + .long 1059908874 + .long 912675337 + .long 1059908874 + .long 912675337 + .long 1059908874 + .long 912675337 + .long 1059908874 + .long 2476035107 + .long 3209881212 + .long 2476035107 + .long 3209881212 + .long 2476035107 + .long 3209881212 + .long 2476035107 + .long 3209881212 + .long 2476035107 + .long 3209881212 + .long 2476035107 + .long 3209881212 + .long 2476035107 + .long 3209881212 + .long 2476035107 + .long 3209881212 + .long 2927800243 + .long 1064262173 + .long 2927800243 + .long 1064262173 + .long 2927800243 + .long 1064262173 + .long 2927800243 + .long 1064262173 + .long 2927800243 + .long 1064262173 + .long 2927800243 + .long 1064262173 + .long 2927800243 + .long 1064262173 + .long 2927800243 + .long 1064262173 + .long 1636715437 + .long 3213013740 + .long 1636715437 + .long 3213013740 + .long 1636715437 + .long 3213013740 + .long 1636715437 + .long 3213013740 + .long 1636715437 + .long 3213013740 + .long 1636715437 + .long 3213013740 + .long 1636715437 + .long 3213013740 + .long 1636715437 + .long 3213013740 + .long 1712395941 + .long 1066487628 + .long 1712395941 + .long 1066487628 + .long 1712395941 + .long 1066487628 + .long 1712395941 + .long 1066487628 + .long 1712395941 + .long 1066487628 + .long 1712395941 + .long 1066487628 + .long 1712395941 + .long 1066487628 + .long 1712395941 + .long 1066487628 + .long 2961307292 + .long 3214564995 + .long 2961307292 + .long 3214564995 + .long 2961307292 + .long 3214564995 + .long 2961307292 + .long 3214564995 + .long 2961307292 + .long 3214564995 + .long 2961307292 + .long 3214564995 + .long 2961307292 + .long 3214564995 + .long 2961307292 + .long 3214564995 + .long 213298511 + .long 1067542936 + .long 213298511 + .long 1067542936 + .long 213298511 + .long 1067542936 + .long 213298511 + .long 1067542936 + .long 213298511 + .long 1067542936 + .long 213298511 + .long 1067542936 + .long 213298511 + .long 1067542936 + .long 213298511 + .long 1067542936 + .long 3848520124 + .long 3215257506 + .long 3848520124 + .long 3215257506 + .long 3848520124 + .long 3215257506 + .long 3848520124 + .long 3215257506 + .long 3848520124 + .long 3215257506 + .long 3848520124 + .long 3215257506 + .long 3848520124 + .long 3215257506 + .long 3848520124 + .long 3215257506 + .long 3159386171 + .long 1067969551 + .long 3159386171 + .long 1067969551 + .long 3159386171 + .long 1067969551 + .long 3159386171 + .long 1067969551 + .long 3159386171 + .long 1067969551 + .long 3159386171 + .long 1067969551 + .long 3159386171 + .long 1067969551 + .long 3159386171 + .long 
1067969551 + .long 3936393556 + .long 3215643233 + .long 3936393556 + .long 3215643233 + .long 3936393556 + .long 3215643233 + .long 3936393556 + .long 3215643233 + .long 3936393556 + .long 3215643233 + .long 3936393556 + .long 3215643233 + .long 3936393556 + .long 3215643233 + .long 3936393556 + .long 3215643233 + .long 3177262543 + .long 1068373833 + .long 3177262543 + .long 1068373833 + .long 3177262543 + .long 1068373833 + .long 3177262543 + .long 1068373833 + .long 3177262543 + .long 1068373833 + .long 3177262543 + .long 1068373833 + .long 3177262543 + .long 1068373833 + .long 3177262543 + .long 1068373833 + .long 9713120 + .long 3216052356 + .long 9713120 + .long 3216052356 + .long 9713120 + .long 3216052356 + .long 9713120 + .long 3216052356 + .long 9713120 + .long 3216052356 + .long 9713120 + .long 3216052356 + .long 9713120 + .long 3216052356 + .long 9713120 + .long 3216052356 + .long 1227445841 + .long 1068740906 + .long 1227445841 + .long 1068740906 + .long 1227445841 + .long 1068740906 + .long 1227445841 + .long 1068740906 + .long 1227445841 + .long 1068740906 + .long 1227445841 + .long 1068740906 + .long 1227445841 + .long 1068740906 + .long 1227445841 + .long 1068740906 + .long 163240596 + .long 3216459216 + .long 163240596 + .long 3216459216 + .long 163240596 + .long 3216459216 + .long 163240596 + .long 3216459216 + .long 163240596 + .long 3216459216 + .long 163240596 + .long 3216459216 + .long 163240596 + .long 3216459216 + .long 163240596 + .long 3216459216 + .long 133682613 + .long 1069314503 + .long 133682613 + .long 1069314503 + .long 133682613 + .long 1069314503 + .long 133682613 + .long 1069314503 + .long 133682613 + .long 1069314503 + .long 133682613 + .long 1069314503 + .long 133682613 + .long 1069314503 + .long 133682613 + .long 1069314503 + .long 2448315847 + .long 3217180964 + .long 2448315847 + .long 3217180964 + .long 2448315847 + .long 3217180964 + .long 2448315847 + .long 3217180964 + .long 2448315847 + .long 3217180964 + .long 2448315847 + .long 3217180964 + .long 2448315847 + .long 3217180964 + .long 2448315847 + .long 3217180964 + .long 2576870964 + .long 1070176665 + .long 2576870964 + .long 1070176665 + .long 2576870964 + .long 1070176665 + .long 2576870964 + .long 1070176665 + .long 2576870964 + .long 1070176665 + .long 2576870964 + .long 1070176665 + .long 2576870964 + .long 1070176665 + .long 2576870964 + .long 1070176665 + .long 1431655365 + .long 3218429269 + .long 1431655365 + .long 3218429269 + .long 1431655365 + .long 3218429269 + .long 1431655365 + .long 3218429269 + .long 1431655365 + .long 3218429269 + .long 1431655365 + .long 3218429269 + .long 1431655365 + .long 3218429269 + .long 1431655365 + .long 3218429269 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .long 0 + .long 2147483648 + .long 0 + .long 2147483648 + .long 0 + .long 2147483648 + .long 0 + .long 2147483648 + .long 0 + .long 2147483648 + .long 0 + .long 2147483648 + .long 0 + .long 2147483648 + .long 0 + .long 2147483648 + .long 2150629376 + .long 2150629376 + .long 2150629376 + .long 2150629376 + .long 2150629376 + .long 2150629376 + .long 2150629376 + .long 2150629376 + .long 2150629376 + .long 2150629376 + .long 2150629376 + .long 2150629376 + .long 2150629376 + .long 2150629376 + .long 2150629376 + .long 2150629376 + .long 4258267136 + .long 4258267136 + .long 4258267136 + .long 4258267136 + 
.long 4258267136 + .long 4258267136 + .long 4258267136 + .long 4258267136 + .long 4258267136 + .long 4258267136 + .long 4258267136 + .long 4258267136 + .long 4258267136 + .long 4258267136 + .long 4258267136 + .long 4258267136 + .long 4294967295 + .long 2147483647 + .long 4294967295 + .long 2147483647 + .long 4294967295 + .long 2147483647 + .long 4294967295 + .long 2147483647 + .long 4294967295 + .long 2147483647 + .long 4294967295 + .long 2147483647 + .long 4294967295 + .long 2147483647 + .long 4294967295 + .long 2147483647 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .long 4293918720 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .long 2145386496 + .long 8388607 + .long 8388607 + .long 8388607 + .long 8388607 + .long 8388607 + .long 8388607 + .long 8388607 + .long 8388607 + .long 8388607 + .long 8388607 + .long 8388607 + .long 8388607 + .long 8388607 + .long 8388607 + .long 8388607 + .long 8388607 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 133169152 + .long 133169152 + .long 133169152 + .long 133169152 + .long 133169152 + .long 133169152 + .long 133169152 + .long 133169152 + .long 133169152 + .long 133169152 + .long 133169152 + .long 133169152 + .long 133169152 + .long 133169152 + .long 133169152 + .long 133169152 + .long 0 + .long 4294967295 + .long 0 + .long 4294967295 + .long 0 + .long 4294967295 + .long 0 + .long 4294967295 + .long 0 + .long 4294967295 + .long 0 + .long 4294967295 + .long 0 + .long 4294967295 + .long 0 + .long 4294967295 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .long 0 + .long 1072693248 + .type __svml_datan2_data_internal,@object + .size __svml_datan2_data_internal,2304 + .align 32 + +__datan2_la_CoutTab: + .long 3892314112 + .long 1069799150 + .long 2332892550 + .long 1039715405 + .long 1342177280 + .long 1070305495 + .long 270726690 + .long 1041535749 + .long 939524096 + .long 1070817911 + .long 2253973841 + .long 3188654726 + .long 3221225472 + .long 1071277294 + .long 3853927037 + .long 1043226911 + .long 2818572288 + .long 1071767563 + .long 2677759107 + .long 1044314101 + .long 3355443200 + .long 1072103591 + .long 1636578514 + .long 3191094734 + .long 1476395008 + .long 1072475260 + .long 1864703685 + .long 3188646936 + .long 805306368 + .long 1072747407 + .long 192551812 + .long 3192726267 + .long 2013265920 + .long 1072892781 + .long 2240369452 + .long 1043768538 + .long 0 + .long 1072999953 + .long 3665168337 + .long 3192705970 + .long 402653184 + .long 1073084787 + .long 1227953434 + .long 3192313277 + 
.long 2013265920 + .long 1073142981 + .long 3853283127 + .long 1045277487 + .long 805306368 + .long 1073187261 + .long 1676192264 + .long 3192868861 + .long 134217728 + .long 1073217000 + .long 4290763938 + .long 1042034855 + .long 671088640 + .long 1073239386 + .long 994303084 + .long 3189643768 + .long 402653184 + .long 1073254338 + .long 1878067156 + .long 1042652475 + .long 1610612736 + .long 1073265562 + .long 670314820 + .long 1045138554 + .long 3221225472 + .long 1073273048 + .long 691126919 + .long 3189987794 + .long 3489660928 + .long 1073278664 + .long 1618990832 + .long 3188194509 + .long 1207959552 + .long 1073282409 + .long 2198872939 + .long 1044806069 + .long 3489660928 + .long 1073285217 + .long 2633982383 + .long 1042307894 + .long 939524096 + .long 1073287090 + .long 1059367786 + .long 3189114230 + .long 2281701376 + .long 1073288494 + .long 3158525533 + .long 1044484961 + .long 3221225472 + .long 1073289430 + .long 286581777 + .long 1044893263 + .long 4026531840 + .long 1073290132 + .long 2000245215 + .long 3191647611 + .long 134217728 + .long 1073290601 + .long 4205071590 + .long 1045035927 + .long 536870912 + .long 1073290952 + .long 2334392229 + .long 1043447393 + .long 805306368 + .long 1073291186 + .long 2281458177 + .long 3188885569 + .long 3087007744 + .long 1073291361 + .long 691611507 + .long 1044733832 + .long 3221225472 + .long 1073291478 + .long 1816229550 + .long 1044363390 + .long 2281701376 + .long 1073291566 + .long 1993843750 + .long 3189837440 + .long 134217728 + .long 1073291625 + .long 3654754496 + .long 1044970837 + .long 4026531840 + .long 1073291668 + .long 3224300229 + .long 3191935390 + .long 805306368 + .long 1073291698 + .long 2988777976 + .long 3188950659 + .long 536870912 + .long 1073291720 + .long 1030371341 + .long 1043402665 + .long 3221225472 + .long 1073291734 + .long 1524463765 + .long 1044361356 + .long 3087007744 + .long 1073291745 + .long 2754295320 + .long 1044731036 + .long 134217728 + .long 1073291753 + .long 3099629057 + .long 1044970710 + .long 2281701376 + .long 1073291758 + .long 962914160 + .long 3189838838 + .long 805306368 + .long 1073291762 + .long 3543908206 + .long 3188950786 + .long 4026531840 + .long 1073291764 + .long 1849909620 + .long 3191935434 + .long 3221225472 + .long 1073291766 + .long 1641333636 + .long 1044361352 + .long 536870912 + .long 1073291768 + .long 1373968792 + .long 1043402654 + .long 134217728 + .long 1073291769 + .long 2033191599 + .long 1044970710 + .long 3087007744 + .long 1073291769 + .long 4117947437 + .long 1044731035 + .long 805306368 + .long 1073291770 + .long 315378368 + .long 3188950787 + .long 2281701376 + .long 1073291770 + .long 2428571750 + .long 3189838838 + .long 3221225472 + .long 1073291770 + .long 1608007466 + .long 1044361352 + .long 4026531840 + .long 1073291770 + .long 1895711420 + .long 3191935434 + .long 134217728 + .long 1073291771 + .long 2031108713 + .long 1044970710 + .long 536870912 + .long 1073291771 + .long 1362518342 + .long 1043402654 + .long 805306368 + .long 1073291771 + .long 317461253 + .long 3188950787 + .long 939524096 + .long 1073291771 + .long 4117231784 + .long 1044731035 + .long 1073741824 + .long 1073291771 + .long 1607942376 + .long 1044361352 + .long 1207959552 + .long 1073291771 + .long 2428929577 + .long 3189838838 + .long 1207959552 + .long 1073291771 + .long 2031104645 + .long 1044970710 + .long 1342177280 + .long 1073291771 + .long 1895722602 + .long 3191935434 + .long 1342177280 + .long 1073291771 + .long 317465322 + .long 3188950787 + .long 
1342177280 + .long 1073291771 + .long 1362515546 + .long 1043402654 + .long 1342177280 + .long 1073291771 + .long 1607942248 + .long 1044361352 + .long 1342177280 + .long 1073291771 + .long 4117231610 + .long 1044731035 + .long 1342177280 + .long 1073291771 + .long 2031104637 + .long 1044970710 + .long 1342177280 + .long 1073291771 + .long 1540251232 + .long 1045150466 + .long 1342177280 + .long 1073291771 + .long 2644671394 + .long 1045270303 + .long 1342177280 + .long 1073291771 + .long 2399244691 + .long 1045360181 + .long 1342177280 + .long 1073291771 + .long 803971124 + .long 1045420100 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192879152 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192849193 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192826724 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192811744 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192800509 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192793019 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192787402 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192783657 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192780848 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192778976 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192777572 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192776635 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192775933 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192775465 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192775114 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774880 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192774704 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774587 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192774500 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774441 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192774397 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774368 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192774346 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774331 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192774320 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774313 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192774308 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774304 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192774301 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774299 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192774298 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774297 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192774296 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774296 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192774296 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 
3192774295 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1466225875 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1343512524 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1251477510 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1190120835 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1144103328 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1113424990 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1090416237 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1075077068 + .long 3192774295 + .long 1431655765 + .long 3218429269 + .long 2576978363 + .long 1070176665 + .long 2453154343 + .long 3217180964 + .long 4189149139 + .long 1069314502 + .long 1775019125 + .long 3216459198 + .long 273199057 + .long 1068739452 + .long 874748308 + .long 3215993277 + .long 0 + .long 1069547520 + .long 0 + .long 1072693248 + .long 0 + .long 1073741824 + .long 1413754136 + .long 1072243195 + .long 856972295 + .long 1015129638 + .long 1413754136 + .long 1073291771 + .long 856972295 + .long 1016178214 + .long 1413754136 + .long 1074340347 + .long 856972295 + .long 1017226790 + .long 2134057426 + .long 1073928572 + .long 1285458442 + .long 1016756537 + .long 0 + .long 3220176896 + .long 0 + .long 0 + .long 0 + .long 2144337920 + .long 0 + .long 1048576 + .long 33554432 + .long 1101004800 + .type __datan2_la_CoutTab,@object + .size __datan2_la_CoutTab,2008 + .align 8 + +.FLT_31: + .long 0xffffffff,0xffffffff + .type .FLT_31,@object + .size .FLT_31,8 diff --git a/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f16_core-avx2.S b/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f16_core-avx2.S new file mode 100644 index 0000000000..a2a76e8bfd --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f16_core-avx2.S @@ -0,0 +1,20 @@ +/* AVX2 version of vectorized atan2f. + Copyright (C) 2021 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <https://www.gnu.org/licenses/>. */ + +#define _ZGVeN16vv_atan2f _ZGVeN16vv_atan2f_avx2_wrapper +#include "../svml_s_atan2f16_core.S"
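The _avx2_wrapper alias above is the standard libmvec fallback pattern: the 16-lane entry point is rebuilt from the narrower kernel under a new name, and the core.c file that follows picks between that wrapper and the native EVEX kernel at load time with libc_ifunc_redirected. A conceptual C sketch of the selection (illustrative only; the real logic is the IFUNC_SELECTOR pulled in from ifunc-mathvec-avx512-skx.h, and ifunc_selector_sketch is a made-up name):

    #include <init-arch.h>

    static inline void *
    ifunc_selector_sketch (void)
    {
      const struct cpu_features *cpu = __get_cpu_features ();
      /* Use the EVEX kernel only when the AVX-512 features it needs are
         usable; otherwise fall back to the AVX2 wrapper.  */
      if (CPU_FEATURE_USABLE_P (cpu, AVX512F)
          && CPU_FEATURE_USABLE_P (cpu, AVX512DQ))
        return _ZGVeN16vv_atan2f_skx;
      return _ZGVeN16vv_atan2f_avx2_wrapper;
    }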
diff --git a/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f16_core.c b/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f16_core.c new file mode 100644 index 0000000000..6fa806414d --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f16_core.c @@ -0,0 +1,28 @@ +/* Multiple versions of vectorized atan2f, vector length is 16. + Copyright (C) 2021 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <https://www.gnu.org/licenses/>. */ + +#define SYMBOL_NAME _ZGVeN16vv_atan2f +#include "ifunc-mathvec-avx512-skx.h" + +libc_ifunc_redirected (REDIRECT_NAME, SYMBOL_NAME, IFUNC_SELECTOR ()); + +#ifdef SHARED +__hidden_ver1 (_ZGVeN16vv_atan2f, __GI__ZGVeN16vv_atan2f, + __redirect__ZGVeN16vv_atan2f) + __attribute__ ((visibility ("hidden"))); +#endif diff --git a/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f16_core_avx512.S b/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f16_core_avx512.S new file mode 100644 index 0000000000..d0ae280b48 --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f16_core_avx512.S @@ -0,0 +1,1998 @@ +/* Function atan2f vectorized with AVX-512. + Copyright (C) 2021 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <https://www.gnu.org/licenses/>. */ + +/* + * ALGORITHM DESCRIPTION: + * For 0.0 <= x <= 7.0/16.0: atan(x) = atan(0.0) + atan(s), where s=(x-0.0)/(1.0+0.0*x) + * For 7.0/16.0 <= x <= 11.0/16.0: atan(x) = atan(0.5) + atan(s), where s=(x-0.5)/(1.0+0.5*x) + * For 11.0/16.0 <= x <= 19.0/16.0: atan(x) = atan(1.0) + atan(s), where s=(x-1.0)/(1.0+1.0*x) + * For 19.0/16.0 <= x <= 39.0/16.0: atan(x) = atan(1.5) + atan(s), where s=(x-1.5)/(1.0+1.5*x) + * For 39.0/16.0 <= x <= inf : atan(x) = atan(inf) + atan(s), where s=-1.0/x + * Where atan(s) ~= s+s^3*Poly11(s^2) on interval |s|<7.0/16.0. + */ + +#include <sysdep.h> + + .text + .section .text.evex512,"ax",@progbits +ENTRY(_ZGVeN16vv_atan2f_skx) + pushq %rbp + cfi_def_cfa_offset(16) + movq %rsp, %rbp + cfi_def_cfa(6, 16) + cfi_offset(6, -16) + andq $-64, %rsp + subq $256, %rsp + xorl %edx, %edx + +/* + * #define NO_VECTOR_ZERO_ATAN2_ARGS + * Declarations + * Variables + * Constants + * The end of declarations + * Implementation + * Argument signs + */ + vmovups 256+__svml_satan2_data_internal(%rip), %zmm6 + vmovups 64+__svml_satan2_data_internal(%rip), %zmm3 +
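/*
 * A scalar model of the vector path below (illustrative only; this is not
 * part of the build, and atanf_poly is a hypothetical stand-in for the
 * Poly11 evaluation described above; needs <math.h>):
 *
 *   float
 *   atan2f_model (float y, float x)
 *   {
 *     float ay = fabsf (y), ax = fabsf (x);
 *     // 1) |y| <  |x|: a =  |y|, b = |x|, pio2 = 0
 *     // 2) |y| >= |x|: a = -|x|, b = |y|, pio2 = pi/2
 *     float a    = ay < ax ?  ay : -ax;
 *     float b    = ay < ax ?  ax :  ay;
 *     float pio2 = ay < ax ? 0.0f : (float) M_PI_2;
 *     float r = pio2 + atanf_poly (a / b);  // |a/b| <= 1
 *     if (x < 0.0f)                         // quadrant fix-up (sPI below)
 *       r = (float) M_PI - r;
 *     return copysignf (r, y);              // restore the sign of y
 *   }
 */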
+/* Testing on working interval. */ + vmovups 1024+__svml_satan2_data_internal(%rip), %zmm9 + vmovups 1088+__svml_satan2_data_internal(%rip), %zmm14 + +/* + * 1) If y < x then a = y, b = x, PIO2 = 0 + * 2) If y > x then a = -x, b = y, PIO2 = Pi/2 + */ + vmovups 320+__svml_satan2_data_internal(%rip), %zmm4 + vpternlogd $255, %zmm13, %zmm13, %zmm13 + vmovaps %zmm1, %zmm8 + vandps %zmm6, %zmm8, %zmm2 + vandps %zmm6, %zmm0, %zmm1 + vorps 192+__svml_satan2_data_internal(%rip), %zmm2, %zmm5 + vpsubd %zmm9, %zmm2, %zmm10 + vpsubd %zmm9, %zmm1, %zmm12 + vxorps %zmm2, %zmm8, %zmm7 + vxorps %zmm1, %zmm0, %zmm6 + vcmpps $17, {sae}, %zmm2, %zmm1, %k1 + vpcmpgtd %zmm10, %zmm14, %k2 + vpcmpgtd %zmm12, %zmm14, %k3 + vmovups 576+__svml_satan2_data_internal(%rip), %zmm14 + vblendmps %zmm1, %zmm5, %zmm11{%k1} + vblendmps %zmm2, %zmm1, %zmm5{%k1} + vxorps %zmm4, %zmm4, %zmm4{%k1} + +/* + * Division a/b. + * Enabled when FMA is available and + * performance is better with NR iteration + */ + vrcp14ps %zmm5, %zmm15 + vfnmadd231ps {rn-sae}, %zmm5, %zmm15, %zmm3 + vfmadd213ps {rn-sae}, %zmm15, %zmm3, %zmm15 + vmulps {rn-sae}, %zmm15, %zmm11, %zmm3 + vfnmadd231ps {rn-sae}, %zmm5, %zmm3, %zmm11 + vfmadd213ps {rn-sae}, %zmm3, %zmm11, %zmm15 + vmovups 448+__svml_satan2_data_internal(%rip), %zmm11 + vpternlogd $255, %zmm3, %zmm3, %zmm3 + +/* Polynomial. */ + vmulps {rn-sae}, %zmm15, %zmm15, %zmm9 + vpandnd %zmm10, %zmm10, %zmm13{%k2} + vmulps {rn-sae}, %zmm9, %zmm9, %zmm10 + vfmadd231ps {rn-sae}, %zmm10, %zmm11, %zmm14 + vmovups 640+__svml_satan2_data_internal(%rip), %zmm11 + vpandnd %zmm12, %zmm12, %zmm3{%k3} + vpord %zmm3, %zmm13, %zmm3 + vmovups 704+__svml_satan2_data_internal(%rip), %zmm13 + vmovups 512+__svml_satan2_data_internal(%rip), %zmm12 + vptestmd %zmm3, %zmm3, %k0 + vfmadd213ps {rn-sae}, %zmm13, %zmm10, %zmm14 + vfmadd231ps {rn-sae}, %zmm10, %zmm12, %zmm11 + vmovups 768+__svml_satan2_data_internal(%rip), %zmm12 + vmovups 832+__svml_satan2_data_internal(%rip), %zmm13 + +/* Special branch for fast (vector) processing of zero arguments */ + kortestw %k0, %k0 + vfmadd213ps {rn-sae}, %zmm12, %zmm10, %zmm11 + vmovups 896+__svml_satan2_data_internal(%rip), %zmm12 + vfmadd213ps {rn-sae}, %zmm13, %zmm10, %zmm14 + vmovups 960+__svml_satan2_data_internal(%rip), %zmm13 + vfmadd213ps {rn-sae}, %zmm12, %zmm10, %zmm11 + vfmadd213ps {rn-sae}, %zmm13, %zmm10, %zmm14 + vfmadd213ps {rn-sae}, %zmm14, %zmm9, %zmm11 +
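/*
 * The vrcp14ps/vfnmadd/vfmadd sequence above computes a/b without a
 * divide: vrcp14ps gives a ~14-bit reciprocal estimate, one Newton step
 * refines it, and an FMA-exact residual corrects the quotient.  A scalar
 * C model (illustrative only; approx_rcp14 is a hypothetical stand-in
 * for vrcp14ps, and fmaf is from <math.h>):
 *
 *   float r0  = approx_rcp14 (b);
 *   float e   = fmaf (-b, r0, 1.0f);  // e = 1 - b*r0
 *   float r   = fmaf (r0, e, r0);     // Newton step: r ~ 1/b
 *   float q   = a * r;                // first quotient estimate
 *   float rem = fmaf (-b, q, a);      // residual a - b*q, exact with FMA
 *   q = fmaf (rem, r, q);             // corrected quotient
 *
 * The polynomial is then evaluated as two Horner chains in q^2 (zmm9) and
 * q^4 (zmm10), merged by the final vfmadd213ps just above.
 */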
+/* Reconstruction. */ + vfmadd213ps {rn-sae}, %zmm4, %zmm15, %zmm11 + +/* if x<0, sPI = Pi, else sPI = 0 */ + vmovups __svml_satan2_data_internal(%rip), %zmm15 + vorps %zmm7, %zmm11, %zmm9 + vcmpps $18, {sae}, %zmm15, %zmm8, %k1 + vmovups 384+__svml_satan2_data_internal(%rip), %zmm11 + vaddps {rn-sae}, %zmm11, %zmm9, %zmm9{%k1} + vorps %zmm6, %zmm9, %zmm10 + jne L(7) + +L(1): +/* + * Special branch for fast (vector) processing of zero arguments + * The end of implementation + */ + testl %edx, %edx + jne L(3) + +L(2): + vmovaps %zmm10, %zmm0 + movq %rbp, %rsp + popq %rbp + cfi_def_cfa(7, 8) + cfi_restore(6) + ret + cfi_def_cfa(6, 16) + cfi_offset(6, -16) + +L(3): + vmovups %zmm0, 64(%rsp) + vmovups %zmm8, 128(%rsp) + vmovups %zmm10, 192(%rsp) + je L(2) + xorl %eax, %eax + vzeroupper + kmovw %k4, 24(%rsp) + kmovw %k5, 16(%rsp) + kmovw %k6, 8(%rsp) + kmovw %k7, (%rsp) + movq %rsi, 40(%rsp) + movq %rdi, 32(%rsp) + movq %r12, 56(%rsp) + .cfi_escape 0x10, 0x04, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x28, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x05, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x20, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x0c, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x38, 0xff, 0xff, 0xff, 0x22 + movl %eax, %r12d + movq %r13, 48(%rsp) + .cfi_escape 0x10, 0x0d, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x30, 0xff, 0xff, 0xff, 0x22 + movl %edx, %r13d + .cfi_escape 0x10, 0xfa, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x18, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xfb, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x10, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xfc, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x08, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xfd, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x00, 0xff, 0xff, 0xff, 0x22 + +L(4): + btl %r12d, %r13d + jc L(6) + +L(5): + incl %r12d + cmpl $16, %r12d + jl L(4) + kmovw 24(%rsp), %k4 + cfi_restore(122) + kmovw 16(%rsp), %k5 + cfi_restore(123) + kmovw 8(%rsp), %k6 + cfi_restore(124) + kmovw (%rsp), %k7 + cfi_restore(125) + vmovups 192(%rsp), %zmm10 + movq 40(%rsp), %rsi + cfi_restore(4) + movq 32(%rsp), %rdi + cfi_restore(5) + movq 56(%rsp), %r12 + cfi_restore(12) + movq 48(%rsp), %r13 + cfi_restore(13) + jmp L(2) + .cfi_escape 0x10, 0x04, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x28, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x05, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x20, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x0c, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x38, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x0d, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x30, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xfa, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x18, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xfb, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x10, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xfc, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x08, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xfd, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x00, 0xff, 0xff, 0xff, 0x22 + +L(6): + lea 64(%rsp,%r12,4), %rdi + lea 128(%rsp,%r12,4), %rsi + lea 192(%rsp,%r12,4), %rdx + call __svml_satan2_cout_rare_internal + jmp L(5) + cfi_restore(4) + cfi_restore(5) + cfi_restore(12) + cfi_restore(13) + cfi_restore(122) + cfi_restore(123) + cfi_restore(124) + cfi_restore(125) +
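/*
 * The L(4)..L(6) loop above drains the callout mask: %r13d holds one bit
 * per lane that needs the slow scalar path, and each set bit dispatches a
 * call on the operands spilled at L(3).  Conceptually (illustrative C
 * only; ys, xs and res name the three 64-byte stack slots at 64(%rsp),
 * 128(%rsp) and 192(%rsp)):
 *
 *   for (int lane = 0; lane < 16; lane++)
 *     if (mask & (1u << lane))
 *       __svml_satan2_cout_rare_internal (&ys[lane], &xs[lane],
 *                                         &res[lane]);
 */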
+L(7): +/* Check if at least one of X or Y is zero: iAXAYZERO */ + vmovups __svml_satan2_data_internal(%rip), %zmm9 + +/* Check if both X & Y are not NaNs: iXYnotNAN */ + vcmpps $3, {sae}, %zmm8, %zmm8, %k1 + vcmpps $3, {sae}, %zmm0, %zmm0, %k2 + vpcmpd $4, %zmm9, %zmm2, %k3 + vpternlogd $255, %zmm12, %zmm12, %zmm12 + vpternlogd $255, %zmm13, %zmm13, %zmm13 + vpternlogd $255, %zmm14, %zmm14, %zmm14 + vpandnd %zmm8, %zmm8, %zmm12{%k1} + vpcmpd $4, %zmm9, %zmm1, %k1 + vpandnd %zmm0, %zmm0, %zmm13{%k2} + +/* + * Path for zero arguments (at least one of the two) + * Check if both args are zeros (den. is zero) + */ + vcmpps $4, {sae}, %zmm9, %zmm5, %k2 + vandps %zmm13, %zmm12, %zmm12 + vpandnd %zmm2, %zmm2, %zmm14{%k3} + vpternlogd $255, %zmm2, %zmm2, %zmm2 + +/* Res = sign(Y)*(X<0)?(PIO2+PI):PIO2 */ + vpcmpgtd %zmm8, %zmm9, %k3 + vpandnd %zmm1, %zmm1, %zmm2{%k1} + vpord %zmm2, %zmm14, %zmm15 + vpternlogd $255, %zmm2, %zmm2, %zmm2 + vpandnd %zmm5, %zmm5, %zmm2{%k2} + +/* Set sPIO2 to zero if den. is zero */ + vpandnd %zmm4, %zmm2, %zmm4 + vpandd %zmm2, %zmm9, %zmm5 + vpord %zmm5, %zmm4, %zmm2 + vorps %zmm7, %zmm2, %zmm7 + vaddps {rn-sae}, %zmm11, %zmm7, %zmm7{%k3} + vorps %zmm6, %zmm7, %zmm6 + +/* Check if at least one of X or Y is zero and not NaN: iAXAYZEROnotNAN */ + vpandd %zmm12, %zmm15, %zmm1 + +/* Exclude from previous callout mask zero (and not NaN) arguments */ + vpandnd %zmm3, %zmm1, %zmm3 + +/* Go to callout */ + vptestmd %zmm3, %zmm3, %k0 + kmovw %k0, %edx + +/* Merge results from main and special path */ + vpandnd %zmm10, %zmm1, %zmm10 + vpandd %zmm1, %zmm6, %zmm11 + vpord %zmm11, %zmm10, %zmm10 + jmp L(1) + +END(_ZGVeN16vv_atan2f_skx) + + .align 16,0x90 + +__svml_satan2_cout_rare_internal: + + cfi_startproc + + pxor %xmm0, %xmm0 + movss (%rdi), %xmm3 + pxor %xmm1, %xmm1 + movss (%rsi), %xmm2 + movq %rdx, %r8 + cvtss2sd %xmm3, %xmm0 + cvtss2sd %xmm2, %xmm1 + movss %xmm3, -32(%rsp) + movss %xmm2, -28(%rsp) + movsd %xmm0, -48(%rsp) + movsd %xmm1, -40(%rsp) + movzwl -30(%rsp), %edi + andl $32640, %edi + movb -25(%rsp), %dl + movzwl -42(%rsp), %eax + andb $-128, %dl + movzwl -34(%rsp), %r9d + andl $32752, %eax + andl $32752, %r9d + shrl $7, %edi + movb -29(%rsp), %cl + shrb $7, %cl + shrb $7, %dl + shrl $4, %eax + shrl $4, %r9d + cmpl $255, %edi + je L(25) + movzwl -26(%rsp), %esi + andl $32640, %esi + cmpl $32640, %esi + je L(25) + testl %eax, %eax + jne L(8) + testl $8388607, -32(%rsp) + je L(20) + +L(8): + testl %r9d, %r9d + jne L(9) + testl $8388607, -28(%rsp) + je L(19) + +L(9): + negl %r9d + movsd %xmm0, -48(%rsp) + addl %eax, %r9d + movsd %xmm1, -40(%rsp) + movb -41(%rsp), %dil + movb -33(%rsp), %sil + andb $127, %dil + andb $127, %sil + cmpl $-54, %r9d + jle L(17) + cmpl $54, %r9d + jge L(15) + movb %sil, -33(%rsp) + movb %dil, -41(%rsp) + testb %dl, %dl + jne L(10) + movsd 1976+__satan2_la_CoutTab(%rip), %xmm1 + movaps %xmm1, %xmm0 + jmp L(11) + +L(10): + movsd 1936+__satan2_la_CoutTab(%rip), %xmm1 + movsd 1944+__satan2_la_CoutTab(%rip), %xmm0 + +L(11): + movsd -48(%rsp), %xmm4 + movsd -40(%rsp), %xmm2 + movaps %xmm4, %xmm5 + divsd %xmm2, %xmm5 + movzwl -42(%rsp), %esi + movsd %xmm5, -16(%rsp) + testl %eax, %eax + jle L(24) + cmpl $2046, %eax + jge L(12) + andl $-32753, %esi + addl $-1023, %eax + movsd %xmm4, -48(%rsp) + addl $16368, %esi + movw %si, -42(%rsp) + jmp L(13) + +L(12): + movsd 1992+__satan2_la_CoutTab(%rip), %xmm3 + movl $1022, %eax + mulsd %xmm3, %xmm4 + movsd %xmm4, -48(%rsp) + +L(13): + negl %eax + movq
1888+__satan2_la_CoutTab(%rip), %rsi + addl $1023, %eax + movq %rsi, -40(%rsp) + andl $2047, %eax + shrq $48, %rsi + shll $4, %eax + andl $-32753, %esi + orl %eax, %esi + movw %si, -34(%rsp) + movsd -40(%rsp), %xmm3 + mulsd %xmm3, %xmm2 + comisd 1880+__satan2_la_CoutTab(%rip), %xmm5 + jb L(14) + movsd 2000+__satan2_la_CoutTab(%rip), %xmm12 + movaps %xmm2, %xmm3 + mulsd %xmm2, %xmm12 + movsd %xmm12, -72(%rsp) + movsd -72(%rsp), %xmm13 + movsd %xmm5, -24(%rsp) + subsd %xmm2, %xmm13 + movsd %xmm13, -64(%rsp) + movsd -72(%rsp), %xmm15 + movsd -64(%rsp), %xmm14 + movl -20(%rsp), %edi + movl %edi, %r9d + andl $-524288, %edi + andl $-1048576, %r9d + addl $262144, %edi + subsd %xmm14, %xmm15 + movsd %xmm15, -72(%rsp) + andl $1048575, %edi + movsd -72(%rsp), %xmm4 + orl %edi, %r9d + movl $0, -24(%rsp) + subsd %xmm4, %xmm3 + movl %r9d, -20(%rsp) + movsd %xmm3, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -24(%rsp), %xmm11 + movsd -64(%rsp), %xmm9 + mulsd %xmm11, %xmm5 + mulsd %xmm11, %xmm9 + movsd 1968+__satan2_la_CoutTab(%rip), %xmm8 + mulsd %xmm8, %xmm5 + mulsd %xmm8, %xmm9 + movaps %xmm5, %xmm7 + movzwl -10(%rsp), %esi + addsd %xmm9, %xmm7 + movsd %xmm7, -72(%rsp) + andl $32752, %esi + movsd -72(%rsp), %xmm6 + shrl $4, %esi + subsd %xmm6, %xmm5 + movl -12(%rsp), %eax + addsd %xmm5, %xmm9 + movsd %xmm9, -64(%rsp) + andl $1048575, %eax + movsd -48(%rsp), %xmm9 + movsd -72(%rsp), %xmm3 + movaps %xmm9, %xmm12 + movsd -64(%rsp), %xmm10 + movaps %xmm9, %xmm14 + movaps %xmm9, %xmm6 + addsd %xmm3, %xmm12 + movsd %xmm12, -72(%rsp) + movsd -72(%rsp), %xmm13 + shll $20, %esi + subsd %xmm13, %xmm14 + movsd %xmm14, -64(%rsp) + orl %eax, %esi + movsd -72(%rsp), %xmm4 + addl $-1069547520, %esi + movsd -64(%rsp), %xmm15 + movl $113, %eax + movsd 2000+__satan2_la_CoutTab(%rip), %xmm13 + addsd %xmm15, %xmm4 + movsd %xmm4, -56(%rsp) + movsd -64(%rsp), %xmm8 + sarl $19, %esi + addsd %xmm3, %xmm8 + movsd %xmm8, -64(%rsp) + cmpl $113, %esi + movsd -56(%rsp), %xmm7 + cmovl %esi, %eax + subsd %xmm7, %xmm6 + movsd %xmm6, -56(%rsp) + addl %eax, %eax + movsd -64(%rsp), %xmm12 + lea __satan2_la_CoutTab(%rip), %rsi + movsd -56(%rsp), %xmm5 + movslq %eax, %rax + addsd %xmm5, %xmm12 + movsd %xmm12, -56(%rsp) + movsd -72(%rsp), %xmm7 + mulsd %xmm7, %xmm13 + movsd -56(%rsp), %xmm8 + movsd %xmm13, -72(%rsp) + addsd %xmm10, %xmm8 + movsd -72(%rsp), %xmm4 + movaps %xmm9, %xmm10 + mulsd 2000+__satan2_la_CoutTab(%rip), %xmm10 + subsd %xmm7, %xmm4 + movsd %xmm4, -64(%rsp) + movsd -72(%rsp), %xmm3 + movsd -64(%rsp), %xmm14 + subsd %xmm14, %xmm3 + movsd %xmm3, -72(%rsp) + movsd -72(%rsp), %xmm15 + subsd %xmm15, %xmm7 + movsd %xmm7, -64(%rsp) + movsd -72(%rsp), %xmm7 + movsd -64(%rsp), %xmm4 + movsd %xmm10, -72(%rsp) + movaps %xmm2, %xmm10 + addsd %xmm4, %xmm8 + movsd -72(%rsp), %xmm4 + subsd -48(%rsp), %xmm4 + movsd %xmm4, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm3 + subsd %xmm3, %xmm6 + movaps %xmm2, %xmm3 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm5 + subsd %xmm5, %xmm9 + movsd %xmm9, -64(%rsp) + movsd -72(%rsp), %xmm12 + movsd -64(%rsp), %xmm9 + mulsd %xmm11, %xmm12 + mulsd %xmm11, %xmm9 + movaps %xmm12, %xmm11 + addsd %xmm9, %xmm11 + movsd %xmm11, -72(%rsp) + movsd -72(%rsp), %xmm4 + subsd %xmm4, %xmm12 + addsd %xmm9, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -72(%rsp), %xmm15 + movsd -64(%rsp), %xmm6 + addsd %xmm15, %xmm3 + movsd %xmm3, -72(%rsp) + movsd -72(%rsp), %xmm5 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm3 + subsd %xmm5, %xmm10 + movsd %xmm10, -64(%rsp) + movsd -72(%rsp), %xmm13 + movsd -64(%rsp), 
%xmm11 + addsd %xmm11, %xmm13 + movsd %xmm13, -56(%rsp) + movsd -64(%rsp), %xmm14 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm13 + addsd %xmm14, %xmm15 + movsd %xmm15, -64(%rsp) + movsd -56(%rsp), %xmm4 + movsd 1888+__satan2_la_CoutTab(%rip), %xmm14 + subsd %xmm4, %xmm2 + movsd %xmm2, -56(%rsp) + movsd -64(%rsp), %xmm4 + movsd -56(%rsp), %xmm2 + addsd %xmm2, %xmm4 + movsd %xmm4, -56(%rsp) + movsd -72(%rsp), %xmm12 + mulsd %xmm12, %xmm3 + movsd -56(%rsp), %xmm5 + movsd %xmm3, -72(%rsp) + addsd %xmm6, %xmm5 + movsd -72(%rsp), %xmm9 + subsd %xmm12, %xmm9 + movsd %xmm9, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm2 + subsd %xmm2, %xmm10 + movsd %xmm10, -72(%rsp) + movsd -72(%rsp), %xmm11 + subsd %xmm11, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -72(%rsp), %xmm9 + divsd %xmm9, %xmm14 + mulsd %xmm14, %xmm13 + movsd -64(%rsp), %xmm10 + movsd %xmm13, -64(%rsp) + addsd %xmm10, %xmm5 + movsd -64(%rsp), %xmm15 + movsd 1888+__satan2_la_CoutTab(%rip), %xmm12 + subsd %xmm14, %xmm15 + movsd %xmm15, -56(%rsp) + movsd -64(%rsp), %xmm2 + movsd -56(%rsp), %xmm4 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm13 + subsd %xmm4, %xmm2 + movsd %xmm2, -56(%rsp) + movsd -56(%rsp), %xmm3 + mulsd %xmm3, %xmm9 + movsd -56(%rsp), %xmm11 + subsd %xmm9, %xmm12 + mulsd %xmm11, %xmm5 + movsd %xmm5, -64(%rsp) + movsd -64(%rsp), %xmm5 + subsd %xmm5, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -64(%rsp), %xmm2 + movq -56(%rsp), %r10 + movsd -64(%rsp), %xmm6 + movsd -56(%rsp), %xmm4 + movq %r10, -40(%rsp) + movsd -40(%rsp), %xmm3 + movaps %xmm3, %xmm5 + addsd 1888+__satan2_la_CoutTab(%rip), %xmm2 + mulsd %xmm7, %xmm5 + mulsd %xmm6, %xmm2 + mulsd %xmm4, %xmm2 + mulsd %xmm2, %xmm7 + mulsd %xmm8, %xmm2 + mulsd %xmm3, %xmm8 + addsd %xmm2, %xmm7 + movsd 1872+__satan2_la_CoutTab(%rip), %xmm3 + addsd %xmm8, %xmm7 + movsd %xmm7, -72(%rsp) + movaps %xmm5, %xmm7 + movsd -72(%rsp), %xmm4 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm6 + addsd %xmm4, %xmm7 + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm8 + subsd %xmm8, %xmm5 + addsd %xmm4, %xmm5 + movsd %xmm5, -64(%rsp) + movsd -72(%rsp), %xmm11 + movaps %xmm11, %xmm2 + mulsd %xmm11, %xmm2 + mulsd %xmm11, %xmm6 + mulsd %xmm2, %xmm3 + movsd -64(%rsp), %xmm4 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm7 + addsd 1864+__satan2_la_CoutTab(%rip), %xmm3 + subsd %xmm11, %xmm7 + mulsd %xmm2, %xmm3 + movsd %xmm7, -64(%rsp) + movsd -72(%rsp), %xmm9 + movsd -64(%rsp), %xmm8 + addsd 1856+__satan2_la_CoutTab(%rip), %xmm3 + subsd %xmm8, %xmm9 + mulsd %xmm2, %xmm3 + movsd %xmm9, -72(%rsp) + movsd -72(%rsp), %xmm10 + addsd 1848+__satan2_la_CoutTab(%rip), %xmm3 + subsd %xmm10, %xmm11 + mulsd %xmm2, %xmm3 + movsd %xmm11, -64(%rsp) + addsd 1840+__satan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + addsd 1832+__satan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + addsd 1824+__satan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + mulsd %xmm3, %xmm13 + movsd -72(%rsp), %xmm2 + movsd -64(%rsp), %xmm12 + movsd %xmm13, -72(%rsp) + addsd %xmm12, %xmm4 + movsd -72(%rsp), %xmm14 + subsd %xmm3, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm15 + subsd %xmm15, %xmm5 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm6 + subsd %xmm6, %xmm3 + movsd %xmm3, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm5 + movaps %xmm6, %xmm12 + movaps %xmm5, %xmm3 + mulsd %xmm4, %xmm6 + mulsd %xmm4, %xmm3 + mulsd %xmm2, %xmm5 + mulsd %xmm2, %xmm12 + addsd %xmm3, %xmm6 + movaps %xmm12, %xmm7 + movaps %xmm12, %xmm8 + addsd %xmm5, %xmm6 + addsd %xmm2, %xmm7 + movsd %xmm6, -72(%rsp) + 
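/*
 * The long store/reload chains through -72(%rsp) and -64(%rsp) in this
 * routine are error-free transformations on doubles: Veltkamp splits
 * (multiplications by 2^27+1, the constant at 2000+__satan2_la_CoutTab)
 * and two-sum steps that keep a rounded head plus an exact tail.  A C
 * model of one two-sum (illustrative only):
 *
 *   double s  = a + b;
 *   double bb = s - a;                      // part of b absorbed into s
 *   double e  = (a - (s - bb)) + (b - bb);  // exact rounding error
 *
 * so that a + b == s + e exactly.
 */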
movsd -72(%rsp), %xmm5 + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm3 + subsd %xmm3, %xmm8 + movsd %xmm8, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm9 + addsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movsd -64(%rsp), %xmm11 + addsd %xmm11, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -56(%rsp), %xmm2 + subsd %xmm2, %xmm12 + movsd %xmm12, -56(%rsp) + movsd -64(%rsp), %xmm14 + movsd -56(%rsp), %xmm13 + addsd %xmm13, %xmm14 + movsd %xmm14, -56(%rsp) + movq -72(%rsp), %r11 + movsd -56(%rsp), %xmm15 + movq %r11, -40(%rsp) + addsd %xmm15, %xmm4 + movsd -40(%rsp), %xmm8 + addsd %xmm5, %xmm4 + movsd %xmm4, -32(%rsp) + movaps %xmm8, %xmm4 + movaps %xmm8, %xmm2 + addsd (%rsi,%rax,8), %xmm4 + movsd %xmm4, -72(%rsp) + movsd -72(%rsp), %xmm4 + subsd %xmm4, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm3 + addsd %xmm3, %xmm5 + movsd %xmm5, -56(%rsp) + movsd -64(%rsp), %xmm6 + addsd (%rsi,%rax,8), %xmm6 + movsd %xmm6, -64(%rsp) + movsd -56(%rsp), %xmm7 + subsd %xmm7, %xmm8 + movsd %xmm8, -56(%rsp) + movsd -64(%rsp), %xmm10 + movsd -56(%rsp), %xmm9 + addsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movq -72(%rsp), %rdi + movq %rdi, -40(%rsp) + movsd -56(%rsp), %xmm2 + movaps %xmm1, %xmm3 + shrq $56, %rdi + addsd -32(%rsp), %xmm2 + shlb $7, %cl + addsd 8(%rsi,%rax,8), %xmm2 + movb %dl, %al + andb $127, %dil + shlb $7, %al + movsd %xmm2, -32(%rsp) + orb %al, %dil + movb %dil, -33(%rsp) + movsd -40(%rsp), %xmm9 + movaps %xmm9, %xmm5 + addsd %xmm9, %xmm3 + movsd %xmm3, -72(%rsp) + movsd -72(%rsp), %xmm4 + movb -25(%rsp), %sil + movb %sil, %r9b + shrb $7, %sil + subsd %xmm4, %xmm5 + movsd %xmm5, -64(%rsp) + movsd -72(%rsp), %xmm7 + movsd -64(%rsp), %xmm6 + xorb %sil, %dl + andb $127, %r9b + shlb $7, %dl + addsd %xmm6, %xmm7 + movsd %xmm7, -56(%rsp) + movsd -64(%rsp), %xmm8 + addsd %xmm8, %xmm1 + movsd %xmm1, -64(%rsp) + orb %dl, %r9b + movsd -56(%rsp), %xmm1 + movb %r9b, -25(%rsp) + subsd %xmm1, %xmm9 + movsd %xmm9, -56(%rsp) + movsd -64(%rsp), %xmm11 + movsd -56(%rsp), %xmm10 + addsd %xmm10, %xmm11 + movsd %xmm11, -56(%rsp) + movq -72(%rsp), %rdx + movsd -56(%rsp), %xmm12 + movq %rdx, -40(%rsp) + addsd %xmm12, %xmm0 + movsd -40(%rsp), %xmm13 + addsd -32(%rsp), %xmm0 + movsd %xmm0, -32(%rsp) + addsd %xmm0, %xmm13 + movsd %xmm13, -24(%rsp) + movb -17(%rsp), %r10b + andb $127, %r10b + orb %cl, %r10b + movb %r10b, -17(%rsp) + movsd -24(%rsp), %xmm0 + cvtsd2ss %xmm0, %xmm0 + movss %xmm0, (%r8) + jmp L(23) + +L(14): + movsd -48(%rsp), %xmm12 + movb %dl, %dil + movaps %xmm12, %xmm7 + mulsd 2000+__satan2_la_CoutTab(%rip), %xmm7 + shlb $7, %dil + shlb $7, %cl + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm8 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm13 + movsd 1888+__satan2_la_CoutTab(%rip), %xmm7 + mulsd %xmm2, %xmm13 + subsd -48(%rsp), %xmm8 + movsd %xmm8, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm9 + subsd %xmm9, %xmm10 + movsd %xmm10, -72(%rsp) + movsd -72(%rsp), %xmm11 + subsd %xmm11, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm5 + movsd %xmm13, -72(%rsp) + movsd -72(%rsp), %xmm14 + subsd %xmm2, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -72(%rsp), %xmm4 + movsd -64(%rsp), %xmm15 + subsd %xmm15, %xmm4 + movsd %xmm4, -72(%rsp) + movsd -72(%rsp), %xmm3 + movsd 1888+__satan2_la_CoutTab(%rip), %xmm4 + subsd %xmm3, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -72(%rsp), %xmm12 + divsd %xmm12, %xmm7 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm2 + mulsd %xmm7, %xmm2 + movsd -64(%rsp), %xmm14 + movsd %xmm2, -64(%rsp) + movsd 
-64(%rsp), %xmm8 + subsd %xmm7, %xmm8 + movsd %xmm8, -56(%rsp) + movsd -64(%rsp), %xmm10 + movsd -56(%rsp), %xmm9 + subsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movsd -56(%rsp), %xmm11 + mulsd %xmm11, %xmm12 + movsd -56(%rsp), %xmm13 + subsd %xmm12, %xmm4 + mulsd %xmm13, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -64(%rsp), %xmm15 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm13 + subsd %xmm15, %xmm4 + movsd %xmm4, -64(%rsp) + movsd -64(%rsp), %xmm7 + movq -56(%rsp), %rax + movsd -64(%rsp), %xmm2 + movsd -56(%rsp), %xmm3 + movq %rax, -40(%rsp) + movsd -40(%rsp), %xmm8 + movaps %xmm8, %xmm9 + addsd 1888+__satan2_la_CoutTab(%rip), %xmm7 + mulsd %xmm6, %xmm9 + mulsd %xmm5, %xmm8 + mulsd %xmm2, %xmm7 + movsd -16(%rsp), %xmm2 + mulsd %xmm2, %xmm2 + mulsd %xmm3, %xmm7 + movsd 1872+__satan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + mulsd %xmm7, %xmm6 + mulsd %xmm5, %xmm7 + addsd 1864+__satan2_la_CoutTab(%rip), %xmm3 + addsd %xmm7, %xmm6 + mulsd %xmm2, %xmm3 + addsd %xmm8, %xmm6 + addsd 1856+__satan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + movaps %xmm9, %xmm5 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm4 + addsd 1848+__satan2_la_CoutTab(%rip), %xmm3 + addsd %xmm4, %xmm5 + mulsd %xmm2, %xmm3 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm6 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm5 + subsd %xmm6, %xmm9 + addsd 1840+__satan2_la_CoutTab(%rip), %xmm3 + addsd %xmm4, %xmm9 + mulsd %xmm2, %xmm3 + movsd %xmm9, -64(%rsp) + movsd -72(%rsp), %xmm11 + mulsd %xmm11, %xmm5 + addsd 1832+__satan2_la_CoutTab(%rip), %xmm3 + movsd -64(%rsp), %xmm4 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm7 + mulsd %xmm2, %xmm3 + subsd %xmm11, %xmm7 + movsd %xmm7, -64(%rsp) + movsd -72(%rsp), %xmm8 + movsd -64(%rsp), %xmm6 + addsd 1824+__satan2_la_CoutTab(%rip), %xmm3 + subsd %xmm6, %xmm8 + mulsd %xmm2, %xmm3 + movsd %xmm8, -72(%rsp) + movsd -72(%rsp), %xmm10 + mulsd %xmm3, %xmm13 + subsd %xmm10, %xmm11 + movsd %xmm11, -64(%rsp) + movsd -72(%rsp), %xmm2 + movsd -64(%rsp), %xmm12 + movsd %xmm13, -72(%rsp) + addsd %xmm12, %xmm4 + movsd -72(%rsp), %xmm14 + subsd %xmm3, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm15 + subsd %xmm15, %xmm5 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm6 + subsd %xmm6, %xmm3 + movsd %xmm3, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm5 + movaps %xmm6, %xmm12 + movaps %xmm5, %xmm3 + mulsd %xmm4, %xmm6 + mulsd %xmm4, %xmm3 + mulsd %xmm2, %xmm5 + mulsd %xmm2, %xmm12 + addsd %xmm3, %xmm6 + movaps %xmm12, %xmm7 + movaps %xmm12, %xmm8 + addsd %xmm5, %xmm6 + addsd %xmm2, %xmm7 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm5 + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm3 + subsd %xmm3, %xmm8 + movsd %xmm8, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm9 + addsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movsd -64(%rsp), %xmm11 + addsd %xmm11, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -56(%rsp), %xmm2 + subsd %xmm2, %xmm12 + movsd %xmm12, -56(%rsp) + movsd -64(%rsp), %xmm14 + movsd -56(%rsp), %xmm13 + addsd %xmm13, %xmm14 + movsd %xmm14, -56(%rsp) + movq -72(%rsp), %rsi + movsd -56(%rsp), %xmm15 + movq %rsi, -40(%rsp) + addsd %xmm15, %xmm4 + shrq $56, %rsi + addsd %xmm5, %xmm4 + andb $127, %sil + orb %dil, %sil + movb %sil, -33(%rsp) + movsd %xmm4, -32(%rsp) + movaps %xmm1, %xmm4 + movsd -40(%rsp), %xmm7 + movaps %xmm7, %xmm2 + addsd %xmm7, %xmm4 + movsd %xmm4, -72(%rsp) + movsd -72(%rsp), %xmm4 + movb -25(%rsp), %r9b + movb %r9b, %r10b + shrb $7, %r9b + subsd %xmm4, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -72(%rsp), %xmm5 + 
movsd -64(%rsp), %xmm3 + xorb %r9b, %dl + andb $127, %r10b + shlb $7, %dl + addsd %xmm3, %xmm5 + movsd %xmm5, -56(%rsp) + movsd -64(%rsp), %xmm6 + addsd %xmm6, %xmm1 + movsd %xmm1, -64(%rsp) + orb %dl, %r10b + movsd -56(%rsp), %xmm1 + movb %r10b, -25(%rsp) + subsd %xmm1, %xmm7 + movsd %xmm7, -56(%rsp) + movsd -64(%rsp), %xmm2 + movsd -56(%rsp), %xmm1 + addsd %xmm1, %xmm2 + movsd %xmm2, -56(%rsp) + movq -72(%rsp), %rdx + movsd -56(%rsp), %xmm3 + movq %rdx, -40(%rsp) + addsd %xmm3, %xmm0 + movsd -40(%rsp), %xmm4 + addsd -32(%rsp), %xmm0 + movsd %xmm0, -32(%rsp) + addsd %xmm0, %xmm4 + movsd %xmm4, -24(%rsp) + movb -17(%rsp), %r11b + andb $127, %r11b + orb %cl, %r11b + movb %r11b, -17(%rsp) + movsd -24(%rsp), %xmm0 + cvtsd2ss %xmm0, %xmm0 + movss %xmm0, (%r8) + jmp L(23) + +L(15): + cmpl $74, %r9d + jge L(16) + movb %dil, -41(%rsp) + divsd -48(%rsp), %xmm1 + movsd 1928+__satan2_la_CoutTab(%rip), %xmm0 + shlb $7, %cl + subsd %xmm1, %xmm0 + addsd 1920+__satan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm1 + cvtsd2ss %xmm1, %xmm1 + movss %xmm1, (%r8) + jmp L(23) + +L(16): + movsd 1920+__satan2_la_CoutTab(%rip), %xmm0 + shlb $7, %cl + addsd 1928+__satan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm1 + cvtsd2ss %xmm1, %xmm1 + movss %xmm1, (%r8) + jmp L(23) + +L(17): + testb %dl, %dl + jne L(22) + movb %dil, -41(%rsp) + pxor %xmm0, %xmm0 + movb %sil, -33(%rsp) + movsd -48(%rsp), %xmm2 + divsd -40(%rsp), %xmm2 + cvtsd2ss %xmm2, %xmm0 + movss %xmm0, -8(%rsp) + movzwl -6(%rsp), %eax + movsd %xmm2, -24(%rsp) + testl $32640, %eax + je L(18) + movsd 1888+__satan2_la_CoutTab(%rip), %xmm0 + shlb $7, %cl + addsd %xmm2, %xmm0 + movsd %xmm0, -72(%rsp) + movsd -72(%rsp), %xmm1 + mulsd %xmm1, %xmm2 + movsd %xmm2, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm2 + cvtsd2ss %xmm2, %xmm2 + movss %xmm2, (%r8) + jmp L(23) + +L(18): + movsd -24(%rsp), %xmm0 + cvtsd2ss %xmm0, %xmm0 + shlb $7, %cl + movss %xmm0, -8(%rsp) + movss -8(%rsp), %xmm2 + movss -8(%rsp), %xmm1 + mulss %xmm1, %xmm2 + movss %xmm2, -8(%rsp) + movss -8(%rsp), %xmm3 + cvtss2sd %xmm3, %xmm3 + addsd -24(%rsp), %xmm3 + movsd %xmm3, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm4 + cvtsd2ss %xmm4, %xmm4 + movss %xmm4, (%r8) + jmp L(23) + +L(19): + testl %eax, %eax + jne L(16) + testl $8388607, -32(%rsp) + jne L(16) + +L(20): + testb %dl, %dl + jne L(22) + +L(21): + shlb $7, %cl + movq 1976+__satan2_la_CoutTab(%rip), %rax + movq %rax, -24(%rsp) + shrq $56, %rax + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm0 + cvtsd2ss %xmm0, %xmm0 + movss %xmm0, (%r8) + jmp L(23) + +L(22): + movsd 1936+__satan2_la_CoutTab(%rip), %xmm0 + shlb $7, %cl + addsd 1944+__satan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm1 + cvtsd2ss %xmm1, %xmm1 + movss %xmm1, (%r8) + +L(23): + xorl %eax, %eax + ret + +L(24): + movsd 1984+__satan2_la_CoutTab(%rip), %xmm3 + movl $-1022, %eax + mulsd %xmm3, %xmm4 + movsd %xmm4, -48(%rsp) + jmp L(13) + +L(25): + cmpl $2047, %eax + je L(32) + +L(26): + cmpl $2047, %r9d + je L(30) + +L(27): + movzwl -26(%rsp), %eax + andl $32640, %eax + cmpl $32640, %eax + jne L(16) + cmpl $255, %edi + je 
L(28) + testb %dl, %dl + je L(21) + jmp L(22) + +L(28): + testb %dl, %dl + jne L(29) + movsd 1904+__satan2_la_CoutTab(%rip), %xmm0 + shlb $7, %cl + addsd 1912+__satan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm1 + cvtsd2ss %xmm1, %xmm1 + movss %xmm1, (%r8) + jmp L(23) + +L(29): + movsd 1952+__satan2_la_CoutTab(%rip), %xmm0 + shlb $7, %cl + addsd 1960+__satan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm1 + cvtsd2ss %xmm1, %xmm1 + movss %xmm1, (%r8) + jmp L(23) + +L(30): + testl $8388607, -28(%rsp) + je L(27) + +L(31): + addss %xmm2, %xmm3 + movss %xmm3, (%r8) + jmp L(23) + +L(32): + testl $8388607, -32(%rsp) + jne L(31) + jmp L(26) + + cfi_endproc + + .type __svml_satan2_cout_rare_internal,@function + .size __svml_satan2_cout_rare_internal,.-__svml_satan2_cout_rare_internal + + .section .rodata, "a" + .align 64 + +__svml_satan2_data_internal: + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .long 993144000 + .long 993144000 + .long 993144000 + .long 993144000 + .long 993144000 + .long 993144000 + .long 993144000 + .long 993144000 + .long 993144000 + .long 993144000 + .long 993144000 + .long 993144000 + .long 993144000 + .long 993144000 + .long 993144000 + .long 993144000 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .long 
3162449457 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .type __svml_satan2_data_internal,@object + .size __svml_satan2_data_internal,1152 + .align 32 + +__satan2_la_CoutTab: + .long 3892314112 + .long 1069799150 + .long 2332892550 + .long 1039715405 + .long 1342177280 + .long 1070305495 + .long 270726690 + .long 1041535749 + .long 939524096 + .long 1070817911 + .long 2253973841 + .long 3188654726 + .long 3221225472 + .long 1071277294 + .long 3853927037 + .long 1043226911 + .long 2818572288 + .long 1071767563 + .long 2677759107 + .long 1044314101 + .long 3355443200 + .long 1072103591 + .long 1636578514 + .long 3191094734 + .long 1476395008 + .long 1072475260 + .long 1864703685 + .long 3188646936 + .long 805306368 + .long 1072747407 + .long 192551812 + .long 3192726267 + 
.long 2013265920 + .long 1072892781 + .long 2240369452 + .long 1043768538 + .long 0 + .long 1072999953 + .long 3665168337 + .long 3192705970 + .long 402653184 + .long 1073084787 + .long 1227953434 + .long 3192313277 + .long 2013265920 + .long 1073142981 + .long 3853283127 + .long 1045277487 + .long 805306368 + .long 1073187261 + .long 1676192264 + .long 3192868861 + .long 134217728 + .long 1073217000 + .long 4290763938 + .long 1042034855 + .long 671088640 + .long 1073239386 + .long 994303084 + .long 3189643768 + .long 402653184 + .long 1073254338 + .long 1878067156 + .long 1042652475 + .long 1610612736 + .long 1073265562 + .long 670314820 + .long 1045138554 + .long 3221225472 + .long 1073273048 + .long 691126919 + .long 3189987794 + .long 3489660928 + .long 1073278664 + .long 1618990832 + .long 3188194509 + .long 1207959552 + .long 1073282409 + .long 2198872939 + .long 1044806069 + .long 3489660928 + .long 1073285217 + .long 2633982383 + .long 1042307894 + .long 939524096 + .long 1073287090 + .long 1059367786 + .long 3189114230 + .long 2281701376 + .long 1073288494 + .long 3158525533 + .long 1044484961 + .long 3221225472 + .long 1073289430 + .long 286581777 + .long 1044893263 + .long 4026531840 + .long 1073290132 + .long 2000245215 + .long 3191647611 + .long 134217728 + .long 1073290601 + .long 4205071590 + .long 1045035927 + .long 536870912 + .long 1073290952 + .long 2334392229 + .long 1043447393 + .long 805306368 + .long 1073291186 + .long 2281458177 + .long 3188885569 + .long 3087007744 + .long 1073291361 + .long 691611507 + .long 1044733832 + .long 3221225472 + .long 1073291478 + .long 1816229550 + .long 1044363390 + .long 2281701376 + .long 1073291566 + .long 1993843750 + .long 3189837440 + .long 134217728 + .long 1073291625 + .long 3654754496 + .long 1044970837 + .long 4026531840 + .long 1073291668 + .long 3224300229 + .long 3191935390 + .long 805306368 + .long 1073291698 + .long 2988777976 + .long 3188950659 + .long 536870912 + .long 1073291720 + .long 1030371341 + .long 1043402665 + .long 3221225472 + .long 1073291734 + .long 1524463765 + .long 1044361356 + .long 3087007744 + .long 1073291745 + .long 2754295320 + .long 1044731036 + .long 134217728 + .long 1073291753 + .long 3099629057 + .long 1044970710 + .long 2281701376 + .long 1073291758 + .long 962914160 + .long 3189838838 + .long 805306368 + .long 1073291762 + .long 3543908206 + .long 3188950786 + .long 4026531840 + .long 1073291764 + .long 1849909620 + .long 3191935434 + .long 3221225472 + .long 1073291766 + .long 1641333636 + .long 1044361352 + .long 536870912 + .long 1073291768 + .long 1373968792 + .long 1043402654 + .long 134217728 + .long 1073291769 + .long 2033191599 + .long 1044970710 + .long 3087007744 + .long 1073291769 + .long 4117947437 + .long 1044731035 + .long 805306368 + .long 1073291770 + .long 315378368 + .long 3188950787 + .long 2281701376 + .long 1073291770 + .long 2428571750 + .long 3189838838 + .long 3221225472 + .long 1073291770 + .long 1608007466 + .long 1044361352 + .long 4026531840 + .long 1073291770 + .long 1895711420 + .long 3191935434 + .long 134217728 + .long 1073291771 + .long 2031108713 + .long 1044970710 + .long 536870912 + .long 1073291771 + .long 1362518342 + .long 1043402654 + .long 805306368 + .long 1073291771 + .long 317461253 + .long 3188950787 + .long 939524096 + .long 1073291771 + .long 4117231784 + .long 1044731035 + .long 1073741824 + .long 1073291771 + .long 1607942376 + .long 1044361352 + .long 1207959552 + .long 1073291771 + .long 2428929577 + .long 3189838838 + .long 1207959552 + 
.long 1073291771 + .long 2031104645 + .long 1044970710 + .long 1342177280 + .long 1073291771 + .long 1895722602 + .long 3191935434 + .long 1342177280 + .long 1073291771 + .long 317465322 + .long 3188950787 + .long 1342177280 + .long 1073291771 + .long 1362515546 + .long 1043402654 + .long 1342177280 + .long 1073291771 + .long 1607942248 + .long 1044361352 + .long 1342177280 + .long 1073291771 + .long 4117231610 + .long 1044731035 + .long 1342177280 + .long 1073291771 + .long 2031104637 + .long 1044970710 + .long 1342177280 + .long 1073291771 + .long 1540251232 + .long 1045150466 + .long 1342177280 + .long 1073291771 + .long 2644671394 + .long 1045270303 + .long 1342177280 + .long 1073291771 + .long 2399244691 + .long 1045360181 + .long 1342177280 + .long 1073291771 + .long 803971124 + .long 1045420100 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192879152 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192849193 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192826724 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192811744 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192800509 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192793019 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192787402 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192783657 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192780848 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192778976 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192777572 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192776635 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192775933 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192775465 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192775114 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774880 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192774704 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774587 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192774500 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774441 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192774397 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774368 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192774346 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774331 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192774320 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774313 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192774308 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774304 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192774301 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774299 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192774298 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774297 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192774296 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774296 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192774296 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774295 + .long 
1476395008 + .long 1073291771 + .long 2754716064 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1466225875 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1343512524 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1251477510 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1190120835 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1144103328 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1113424990 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1090416237 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1075077068 + .long 3192774295 + .long 1431655765 + .long 3218429269 + .long 2576978363 + .long 1070176665 + .long 2453154343 + .long 3217180964 + .long 4189149139 + .long 1069314502 + .long 1775019125 + .long 3216459198 + .long 273199057 + .long 1068739452 + .long 874748308 + .long 3215993277 + .long 0 + .long 1069547520 + .long 0 + .long 1072693248 + .long 0 + .long 1073741824 + .long 1413754136 + .long 1072243195 + .long 856972295 + .long 1015129638 + .long 1413754136 + .long 1073291771 + .long 856972295 + .long 1016178214 + .long 1413754136 + .long 1074340347 + .long 856972295 + .long 1017226790 + .long 2134057426 + .long 1073928572 + .long 1285458442 + .long 1016756537 + .long 0 + .long 3220176896 + .long 0 + .long 0 + .long 0 + .long 2144337920 + .long 0 + .long 1048576 + .long 33554432 + .long 1101004800 + .type __satan2_la_CoutTab,@object + .size __satan2_la_CoutTab,2008 diff --git a/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f4_core-sse2.S b/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f4_core-sse2.S new file mode 100644 index 0000000000..d1a67facf1 --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f4_core-sse2.S @@ -0,0 +1,20 @@ +/* SSE2 version of vectorized atan2f. + Copyright (C) 2021 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <https://www.gnu.org/licenses/>. */ + +#define _ZGVbN4vv_atan2f _ZGVbN4vv_atan2f_sse2 +#include "../svml_s_atan2f4_core.S" diff --git a/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f4_core.c b/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f4_core.c new file mode 100644 index 0000000000..ee882b0557 --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f4_core.c @@ -0,0 +1,28 @@ +/* Multiple versions of vectorized atan2f, vector length is 4. + Copyright (C) 2021 Free Software Foundation, Inc. + This file is part of the GNU C Library.
+ + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <https://www.gnu.org/licenses/>. */ + +#define SYMBOL_NAME _ZGVbN4vv_atan2f +#include "ifunc-mathvec-sse4_1.h" + +libc_ifunc_redirected (REDIRECT_NAME, SYMBOL_NAME, IFUNC_SELECTOR ()); + +#ifdef SHARED +__hidden_ver1 (_ZGVbN4vv_atan2f, __GI__ZGVbN4vv_atan2f, + __redirect__ZGVbN4vv_atan2f) + __attribute__ ((visibility ("hidden"))); +#endif diff --git a/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f4_core_sse4.S b/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f4_core_sse4.S new file mode 100644 index 0000000000..bfcf6628c3 --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f4_core_sse4.S @@ -0,0 +1,2668 @@ +/* Function atan2f vectorized with SSE4. + Copyright (C) 2021 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + https://www.gnu.org/licenses/. */ + +/* + * ALGORITHM DESCRIPTION: + * For 0.0 <= x <= 7.0/16.0: atan(x) = atan(0.0) + atan(s), where s=(x-0.0)/(1.0+0.0*x) + * For 7.0/16.0 <= x <= 11.0/16.0: atan(x) = atan(0.5) + atan(s), where s=(x-0.5)/(1.0+0.5*x) + * For 11.0/16.0 <= x <= 19.0/16.0: atan(x) = atan(1.0) + atan(s), where s=(x-1.0)/(1.0+1.0*x) + * For 19.0/16.0 <= x <= 39.0/16.0: atan(x) = atan(1.5) + atan(s), where s=(x-1.5)/(1.0+1.5*x) + * For 39.0/16.0 <= x <= inf : atan(x) = atan(inf) + atan(s), where s=-1.0/x + * Where atan(s) ~= s+s^3*Poly11(s^2) on interval |s|<7.0/16.0.
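+ *
+ * A rough scalar C model of the same reduction, for readers of this
+ * patch (an illustrative sketch only; poly11, PI and PI_OVER_2 are
+ * placeholder names, not symbols defined in this file):
+ *
+ *   float ay = fabsf (y), ax = fabsf (x);
+ *   float s = ay > ax ? ax / ay : ay / ax;    s in [0, 1]
+ *   float r = s + s * s * s * poly11 (s * s);
+ *   if (ay > ax) r = PI_OVER_2 - r;           operands were swapped
+ *   if (x < 0.0f) r = PI - r;                 left half-plane
+ *   return y < 0.0f ? -r : r;                 restore the sign of y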
+ * + * + */ + +#include <sysdep.h> + + .text + .section .text.sse4,"ax",@progbits +ENTRY(_ZGVbN4vv_atan2f_sse4) + pushq %rbp + cfi_def_cfa_offset(16) + movq %rsp, %rbp + cfi_def_cfa(6, 16) + cfi_offset(6, -16) + andq $-64, %rsp + subq $256, %rsp + xorl %edx, %edx + movups %xmm9, 176(%rsp) + movups %xmm11, 112(%rsp) + .cfi_escape 0x10, 0x1a, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xb0, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x1c, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x70, 0xff, 0xff, 0xff, 0x22 + movaps %xmm0, %xmm11 + +/* + * #define NO_VECTOR_ZERO_ATAN2_ARGS + * Declarations + * Variables + * Constants + * The end of declarations + * Implementation + * Arguments signs + */ + movups 256+__svml_satan2_data_internal(%rip), %xmm9 + movups %xmm12, 96(%rsp) + .cfi_escape 0x10, 0x1d, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x60, 0xff, 0xff, 0xff, 0x22 + movaps %xmm1, %xmm12 + movups %xmm10, 144(%rsp) + .cfi_escape 0x10, 0x1b, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x90, 0xff, 0xff, 0xff, 0x22 + movaps %xmm9, %xmm10 + andps %xmm11, %xmm9 + andps %xmm12, %xmm10 + movaps %xmm9, %xmm6 + movaps %xmm9, %xmm4 + cmpltps %xmm10, %xmm6 + +/* + * 1) If y<x then a=y, b=x, PIO2=0 + * 2) If y>x then a=-x, b=y, PIO2=Pi/2 + */ + movups 192+__svml_satan2_data_internal(%rip), %xmm5 + movaps %xmm6, %xmm0 + orps %xmm10, %xmm5 + movaps %xmm10, %xmm1 + andnps %xmm5, %xmm0 + movaps %xmm6, %xmm5 + andps %xmm6, %xmm4 + andnps %xmm9, %xmm5 + andps %xmm6, %xmm1 + orps %xmm4, %xmm0 + orps %xmm1, %xmm5 + movaps %xmm9, %xmm3 + +/* Division a/b. */ + divps %xmm5, %xmm0 + movups %xmm13, 80(%rsp) + +/* if x<0, sPI = Pi, else sPI =0 */ + movaps %xmm12, %xmm4 + movups %xmm14, 48(%rsp) + .cfi_escape 0x10, 0x1e, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x50, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x1f, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x30, 0xff, 0xff, 0xff, 0x22 + movaps %xmm10, %xmm14 + +/* Testing on working interval. */ + movdqu 1024+__svml_satan2_data_internal(%rip), %xmm13 + movaps %xmm9, %xmm7 + psubd %xmm13, %xmm14 + psubd %xmm13, %xmm3 + movdqu 1088+__svml_satan2_data_internal(%rip), %xmm2 + movdqa %xmm14, %xmm1 + movdqa %xmm3, %xmm13 + pcmpgtd %xmm2, %xmm1 + pcmpeqd %xmm2, %xmm14 + pcmpgtd %xmm2, %xmm13 + pcmpeqd %xmm2, %xmm3 + por %xmm14, %xmm1 + por %xmm3, %xmm13 + pxor %xmm11, %xmm7 + por %xmm13, %xmm1 + +/* Polynomial. */ + movaps %xmm0, %xmm13 + mulps %xmm0, %xmm13 + cmpleps __svml_satan2_data_internal(%rip), %xmm4 + movmskps %xmm1, %eax + movaps %xmm13, %xmm14 + mulps %xmm13, %xmm14 + movups 448+__svml_satan2_data_internal(%rip), %xmm2 + mulps %xmm14, %xmm2 + movups 512+__svml_satan2_data_internal(%rip), %xmm3 + mulps %xmm14, %xmm3 + addps 576+__svml_satan2_data_internal(%rip), %xmm2 + mulps %xmm14, %xmm2 + addps 640+__svml_satan2_data_internal(%rip), %xmm3 + mulps %xmm14, %xmm3 + addps 704+__svml_satan2_data_internal(%rip), %xmm2 + mulps %xmm14, %xmm2 + addps 768+__svml_satan2_data_internal(%rip), %xmm3 + mulps %xmm14, %xmm3 + addps 832+__svml_satan2_data_internal(%rip), %xmm2 + mulps %xmm2, %xmm14 + addps 896+__svml_satan2_data_internal(%rip), %xmm3 + mulps %xmm3, %xmm13 + addps 960+__svml_satan2_data_internal(%rip), %xmm14 + andnps 320+__svml_satan2_data_internal(%rip), %xmm6 + addps %xmm13, %xmm14 + +/* Reconstruction. 
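+ * The partial results are recombined branchlessly: r = s*Poly plus
+ * the PIO2 term selected above; the sign of x is OR-ed in and sPI
+ * added, giving PI - r when x < 0; the sign of y is OR-ed in last.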
*/ + mulps %xmm14, %xmm0 + movups %xmm8, 160(%rsp) + .cfi_escape 0x10, 0x19, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xa0, 0xff, 0xff, 0xff, 0x22 + movaps %xmm10, %xmm8 + pxor %xmm12, %xmm8 + addps %xmm6, %xmm0 + andps 384+__svml_satan2_data_internal(%rip), %xmm4 + orps %xmm8, %xmm0 + addps %xmm4, %xmm0 + orps %xmm7, %xmm0 + +/* Special branch for fast (vector) processing of zero arguments */ + testl %eax, %eax + jne L(7) + +L(1): +/* + * Special branch for fast (vector) processing of zero arguments + * The end of implementation + */ + testl %edx, %edx + jne L(3) + +L(2): + movups 160(%rsp), %xmm8 + cfi_restore(25) + movups 176(%rsp), %xmm9 + cfi_restore(26) + movups 144(%rsp), %xmm10 + cfi_restore(27) + movups 112(%rsp), %xmm11 + cfi_restore(28) + movups 96(%rsp), %xmm12 + cfi_restore(29) + movups 80(%rsp), %xmm13 + cfi_restore(30) + movups 48(%rsp), %xmm14 + cfi_restore(31) + movq %rbp, %rsp + popq %rbp + cfi_def_cfa(7, 8) + cfi_restore(6) + ret + cfi_def_cfa(6, 16) + cfi_offset(6, -16) + .cfi_escape 0x10, 0x19, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xa0, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x1a, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xb0, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x1b, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x90, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x1c, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x70, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x1d, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x60, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x1e, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x50, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x1f, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x30, 0xff, 0xff, 0xff, 0x22 + +L(3): + movups %xmm11, 64(%rsp) + movups %xmm12, 128(%rsp) + movups %xmm0, 192(%rsp) + je L(2) + xorl %eax, %eax + movups %xmm15, (%rsp) + movq %rsi, 24(%rsp) + movq %rdi, 16(%rsp) + movq %r12, 40(%rsp) + .cfi_escape 0x10, 0x04, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x18, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x05, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x10, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x0c, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x28, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x20, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x00, 0xff, 0xff, 0xff, 0x22 + movl %eax, %r12d + movq %r13, 32(%rsp) + .cfi_escape 0x10, 0x0d, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x20, 0xff, 0xff, 0xff, 0x22 + movl %edx, %r13d + +L(4): + btl %r12d, %r13d + jc L(6) + +L(5): + incl %r12d + cmpl $4, %r12d + jl L(4) + movups (%rsp), %xmm15 + cfi_restore(32) + movq 24(%rsp), %rsi + cfi_restore(4) + movq 16(%rsp), %rdi + cfi_restore(5) + movq 40(%rsp), %r12 + cfi_restore(12) + movq 32(%rsp), %r13 + cfi_restore(13) + movups 192(%rsp), %xmm0 + jmp L(2) + .cfi_escape 0x10, 0x04, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x18, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x05, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x10, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x0c, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x28, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x0d, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x20, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x20, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x00, 0xff, 
0xff, 0xff, 0x22 + +L(6): + lea 64(%rsp,%r12,4), %rdi + lea 128(%rsp,%r12,4), %rsi + lea 192(%rsp,%r12,4), %rdx + call __svml_satan2_cout_rare_internal + jmp L(5) + cfi_restore(4) + cfi_restore(5) + cfi_restore(12) + cfi_restore(13) + cfi_restore(32) + +L(7): +/* Check if both X & Y are not NaNs: iXYnotNAN */ + movaps %xmm12, %xmm3 + movaps %xmm11, %xmm2 + cmpordps %xmm12, %xmm3 + cmpordps %xmm11, %xmm2 + +/* Check if at least one of X or Y is zero: iAXAYZERO */ + movups __svml_satan2_data_internal(%rip), %xmm13 + andps %xmm2, %xmm3 + +/* + * Path for zero arguments (at least one of the two) + * Check if both args are zeros (den. is zero) + */ + cmpeqps %xmm13, %xmm5 + pcmpeqd %xmm13, %xmm10 + pcmpeqd %xmm13, %xmm9 + por %xmm9, %xmm10 + +/* Check if at least one of X or Y is zero and not NaN: iAXAYZEROnotNAN */ + andps %xmm3, %xmm10 + +/* Exclude from previous callout mask zero (and not NaN) arguments */ + movaps %xmm10, %xmm9 + pandn %xmm1, %xmm9 + +/* Set sPIO2 to zero if den. is zero */ + movaps %xmm5, %xmm1 + andnps %xmm6, %xmm1 + andps %xmm13, %xmm5 + orps %xmm5, %xmm1 + +/* Res = sign(Y)*(X<0)?(PIO2+PI):PIO2 */ + pcmpgtd %xmm12, %xmm13 + orps %xmm8, %xmm1 + andps %xmm4, %xmm13 + +/* Merge results from main and spec path */ + movaps %xmm10, %xmm4 + addps %xmm13, %xmm1 + +/* Go to callout */ + movmskps %xmm9, %edx + orps %xmm7, %xmm1 + andnps %xmm0, %xmm4 + andps %xmm10, %xmm1 + movaps %xmm4, %xmm0 + orps %xmm1, %xmm0 + jmp L(1) + +END(_ZGVbN4vv_atan2f_sse4) + + .align 16,0x90 + +__svml_satan2_cout_rare_internal: + + cfi_startproc + + pxor %xmm0, %xmm0 + movss (%rdi), %xmm3 + pxor %xmm1, %xmm1 + movss (%rsi), %xmm2 + movq %rdx, %r8 + cvtss2sd %xmm3, %xmm0 + cvtss2sd %xmm2, %xmm1 + movss %xmm3, -32(%rsp) + movss %xmm2, -28(%rsp) + movsd %xmm0, -48(%rsp) + movsd %xmm1, -40(%rsp) + movzwl -30(%rsp), %edi + andl $32640, %edi + movb -25(%rsp), %dl + movzwl -42(%rsp), %eax + andb $-128, %dl + movzwl -34(%rsp), %r9d + andl $32752, %eax + andl $32752, %r9d + shrl $7, %edi + movb -29(%rsp), %cl + shrb $7, %cl + shrb $7, %dl + shrl $4, %eax + shrl $4, %r9d + cmpl $255, %edi + je L(25) + movzwl -26(%rsp), %esi + andl $32640, %esi + cmpl $32640, %esi + je L(25) + testl %eax, %eax + jne L(8) + testl $8388607, -32(%rsp) + je L(20) + +L(8): + testl %r9d, %r9d + jne L(9) + testl $8388607, -28(%rsp) + je L(19) + +L(9): + negl %r9d + movsd %xmm0, -48(%rsp) + addl %eax, %r9d + movsd %xmm1, -40(%rsp) + movb -41(%rsp), %dil + movb -33(%rsp), %sil + andb $127, %dil + andb $127, %sil + cmpl $-54, %r9d + jle L(17) + cmpl $54, %r9d + jge L(15) + movb %sil, -33(%rsp) + movb %dil, -41(%rsp) + testb %dl, %dl + jne L(10) + movsd 1976+__satan2_la_CoutTab(%rip), %xmm1 + movaps %xmm1, %xmm0 + jmp L(11) + +L(10): + movsd 1936+__satan2_la_CoutTab(%rip), %xmm1 + movsd 1944+__satan2_la_CoutTab(%rip), %xmm0 + +L(11): + movsd -48(%rsp), %xmm4 + movsd -40(%rsp), %xmm2 + movaps %xmm4, %xmm5 + divsd %xmm2, %xmm5 + movzwl -42(%rsp), %esi + movsd %xmm5, -16(%rsp) + testl %eax, %eax + jle L(24) + cmpl $2046, %eax + jge L(12) + andl $-32753, %esi + addl $-1023, %eax + movsd %xmm4, -48(%rsp) + addl $16368, %esi + movw %si, -42(%rsp) + jmp L(13) + +L(12): + movsd 1992+__satan2_la_CoutTab(%rip), %xmm3 + movl $1022, %eax + mulsd %xmm3, %xmm4 + movsd %xmm4, -48(%rsp) + +L(13): + negl %eax + movq 1888+__satan2_la_CoutTab(%rip), %rsi + addl $1023, %eax + movq %rsi, -40(%rsp) + andl $2047, %eax + shrq $48, %rsi + shll $4, %eax + andl $-32753, %esi + orl %eax, %esi + movw %si, -34(%rsp) + movsd -40(%rsp), %xmm3 + mulsd %xmm3, %xmm2 + comisd 
1880+__satan2_la_CoutTab(%rip), %xmm5 + jb L(14) + movsd 2000+__satan2_la_CoutTab(%rip), %xmm12 + movaps %xmm2, %xmm3 + mulsd %xmm2, %xmm12 + movsd %xmm12, -72(%rsp) + movsd -72(%rsp), %xmm13 + movsd %xmm5, -24(%rsp) + subsd %xmm2, %xmm13 + movsd %xmm13, -64(%rsp) + movsd -72(%rsp), %xmm15 + movsd -64(%rsp), %xmm14 + movl -20(%rsp), %edi + movl %edi, %r9d + andl $-524288, %edi + andl $-1048576, %r9d + addl $262144, %edi + subsd %xmm14, %xmm15 + movsd %xmm15, -72(%rsp) + andl $1048575, %edi + movsd -72(%rsp), %xmm4 + orl %edi, %r9d + movl $0, -24(%rsp) + subsd %xmm4, %xmm3 + movl %r9d, -20(%rsp) + movsd %xmm3, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -24(%rsp), %xmm11 + movsd -64(%rsp), %xmm9 + mulsd %xmm11, %xmm5 + mulsd %xmm11, %xmm9 + movsd 1968+__satan2_la_CoutTab(%rip), %xmm8 + mulsd %xmm8, %xmm5 + mulsd %xmm8, %xmm9 + movaps %xmm5, %xmm7 + movzwl -10(%rsp), %esi + addsd %xmm9, %xmm7 + movsd %xmm7, -72(%rsp) + andl $32752, %esi + movsd -72(%rsp), %xmm6 + shrl $4, %esi + subsd %xmm6, %xmm5 + movl -12(%rsp), %eax + addsd %xmm5, %xmm9 + movsd %xmm9, -64(%rsp) + andl $1048575, %eax + movsd -48(%rsp), %xmm9 + movsd -72(%rsp), %xmm3 + movaps %xmm9, %xmm12 + movsd -64(%rsp), %xmm10 + movaps %xmm9, %xmm14 + movaps %xmm9, %xmm6 + addsd %xmm3, %xmm12 + movsd %xmm12, -72(%rsp) + movsd -72(%rsp), %xmm13 + shll $20, %esi + subsd %xmm13, %xmm14 + movsd %xmm14, -64(%rsp) + orl %eax, %esi + movsd -72(%rsp), %xmm4 + addl $-1069547520, %esi + movsd -64(%rsp), %xmm15 + movl $113, %eax + movsd 2000+__satan2_la_CoutTab(%rip), %xmm13 + addsd %xmm15, %xmm4 + movsd %xmm4, -56(%rsp) + movsd -64(%rsp), %xmm8 + sarl $19, %esi + addsd %xmm3, %xmm8 + movsd %xmm8, -64(%rsp) + cmpl $113, %esi + movsd -56(%rsp), %xmm7 + cmovl %esi, %eax + subsd %xmm7, %xmm6 + movsd %xmm6, -56(%rsp) + addl %eax, %eax + movsd -64(%rsp), %xmm12 + lea __satan2_la_CoutTab(%rip), %rsi + movsd -56(%rsp), %xmm5 + movslq %eax, %rax + addsd %xmm5, %xmm12 + movsd %xmm12, -56(%rsp) + movsd -72(%rsp), %xmm7 + mulsd %xmm7, %xmm13 + movsd -56(%rsp), %xmm8 + movsd %xmm13, -72(%rsp) + addsd %xmm10, %xmm8 + movsd -72(%rsp), %xmm4 + movaps %xmm9, %xmm10 + mulsd 2000+__satan2_la_CoutTab(%rip), %xmm10 + subsd %xmm7, %xmm4 + movsd %xmm4, -64(%rsp) + movsd -72(%rsp), %xmm3 + movsd -64(%rsp), %xmm14 + subsd %xmm14, %xmm3 + movsd %xmm3, -72(%rsp) + movsd -72(%rsp), %xmm15 + subsd %xmm15, %xmm7 + movsd %xmm7, -64(%rsp) + movsd -72(%rsp), %xmm7 + movsd -64(%rsp), %xmm4 + movsd %xmm10, -72(%rsp) + movaps %xmm2, %xmm10 + addsd %xmm4, %xmm8 + movsd -72(%rsp), %xmm4 + subsd -48(%rsp), %xmm4 + movsd %xmm4, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm3 + subsd %xmm3, %xmm6 + movaps %xmm2, %xmm3 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm5 + subsd %xmm5, %xmm9 + movsd %xmm9, -64(%rsp) + movsd -72(%rsp), %xmm12 + movsd -64(%rsp), %xmm9 + mulsd %xmm11, %xmm12 + mulsd %xmm11, %xmm9 + movaps %xmm12, %xmm11 + addsd %xmm9, %xmm11 + movsd %xmm11, -72(%rsp) + movsd -72(%rsp), %xmm4 + subsd %xmm4, %xmm12 + addsd %xmm9, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -72(%rsp), %xmm15 + movsd -64(%rsp), %xmm6 + addsd %xmm15, %xmm3 + movsd %xmm3, -72(%rsp) + movsd -72(%rsp), %xmm5 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm3 + subsd %xmm5, %xmm10 + movsd %xmm10, -64(%rsp) + movsd -72(%rsp), %xmm13 + movsd -64(%rsp), %xmm11 + addsd %xmm11, %xmm13 + movsd %xmm13, -56(%rsp) + movsd -64(%rsp), %xmm14 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm13 + addsd %xmm14, %xmm15 + movsd %xmm15, -64(%rsp) + movsd -56(%rsp), %xmm4 + movsd 1888+__satan2_la_CoutTab(%rip), %xmm14 + 
subsd %xmm4, %xmm2 + movsd %xmm2, -56(%rsp) + movsd -64(%rsp), %xmm4 + movsd -56(%rsp), %xmm2 + addsd %xmm2, %xmm4 + movsd %xmm4, -56(%rsp) + movsd -72(%rsp), %xmm12 + mulsd %xmm12, %xmm3 + movsd -56(%rsp), %xmm5 + movsd %xmm3, -72(%rsp) + addsd %xmm6, %xmm5 + movsd -72(%rsp), %xmm9 + subsd %xmm12, %xmm9 + movsd %xmm9, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm2 + subsd %xmm2, %xmm10 + movsd %xmm10, -72(%rsp) + movsd -72(%rsp), %xmm11 + subsd %xmm11, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -72(%rsp), %xmm9 + divsd %xmm9, %xmm14 + mulsd %xmm14, %xmm13 + movsd -64(%rsp), %xmm10 + movsd %xmm13, -64(%rsp) + addsd %xmm10, %xmm5 + movsd -64(%rsp), %xmm15 + movsd 1888+__satan2_la_CoutTab(%rip), %xmm12 + subsd %xmm14, %xmm15 + movsd %xmm15, -56(%rsp) + movsd -64(%rsp), %xmm2 + movsd -56(%rsp), %xmm4 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm13 + subsd %xmm4, %xmm2 + movsd %xmm2, -56(%rsp) + movsd -56(%rsp), %xmm3 + mulsd %xmm3, %xmm9 + movsd -56(%rsp), %xmm11 + subsd %xmm9, %xmm12 + mulsd %xmm11, %xmm5 + movsd %xmm5, -64(%rsp) + movsd -64(%rsp), %xmm5 + subsd %xmm5, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -64(%rsp), %xmm2 + movq -56(%rsp), %r10 + movsd -64(%rsp), %xmm6 + movsd -56(%rsp), %xmm4 + movq %r10, -40(%rsp) + movsd -40(%rsp), %xmm3 + movaps %xmm3, %xmm5 + addsd 1888+__satan2_la_CoutTab(%rip), %xmm2 + mulsd %xmm7, %xmm5 + mulsd %xmm6, %xmm2 + mulsd %xmm4, %xmm2 + mulsd %xmm2, %xmm7 + mulsd %xmm8, %xmm2 + mulsd %xmm3, %xmm8 + addsd %xmm2, %xmm7 + movsd 1872+__satan2_la_CoutTab(%rip), %xmm3 + addsd %xmm8, %xmm7 + movsd %xmm7, -72(%rsp) + movaps %xmm5, %xmm7 + movsd -72(%rsp), %xmm4 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm6 + addsd %xmm4, %xmm7 + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm8 + subsd %xmm8, %xmm5 + addsd %xmm4, %xmm5 + movsd %xmm5, -64(%rsp) + movsd -72(%rsp), %xmm11 + movaps %xmm11, %xmm2 + mulsd %xmm11, %xmm2 + mulsd %xmm11, %xmm6 + mulsd %xmm2, %xmm3 + movsd -64(%rsp), %xmm4 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm7 + addsd 1864+__satan2_la_CoutTab(%rip), %xmm3 + subsd %xmm11, %xmm7 + mulsd %xmm2, %xmm3 + movsd %xmm7, -64(%rsp) + movsd -72(%rsp), %xmm9 + movsd -64(%rsp), %xmm8 + addsd 1856+__satan2_la_CoutTab(%rip), %xmm3 + subsd %xmm8, %xmm9 + mulsd %xmm2, %xmm3 + movsd %xmm9, -72(%rsp) + movsd -72(%rsp), %xmm10 + addsd 1848+__satan2_la_CoutTab(%rip), %xmm3 + subsd %xmm10, %xmm11 + mulsd %xmm2, %xmm3 + movsd %xmm11, -64(%rsp) + addsd 1840+__satan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + addsd 1832+__satan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + addsd 1824+__satan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + mulsd %xmm3, %xmm13 + movsd -72(%rsp), %xmm2 + movsd -64(%rsp), %xmm12 + movsd %xmm13, -72(%rsp) + addsd %xmm12, %xmm4 + movsd -72(%rsp), %xmm14 + subsd %xmm3, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm15 + subsd %xmm15, %xmm5 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm6 + subsd %xmm6, %xmm3 + movsd %xmm3, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm5 + movaps %xmm6, %xmm12 + movaps %xmm5, %xmm3 + mulsd %xmm4, %xmm6 + mulsd %xmm4, %xmm3 + mulsd %xmm2, %xmm5 + mulsd %xmm2, %xmm12 + addsd %xmm3, %xmm6 + movaps %xmm12, %xmm7 + movaps %xmm12, %xmm8 + addsd %xmm5, %xmm6 + addsd %xmm2, %xmm7 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm5 + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm3 + subsd %xmm3, %xmm8 + movsd %xmm8, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm9 + addsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movsd -64(%rsp), %xmm11 + addsd 
%xmm11, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -56(%rsp), %xmm2 + subsd %xmm2, %xmm12 + movsd %xmm12, -56(%rsp) + movsd -64(%rsp), %xmm14 + movsd -56(%rsp), %xmm13 + addsd %xmm13, %xmm14 + movsd %xmm14, -56(%rsp) + movq -72(%rsp), %r11 + movsd -56(%rsp), %xmm15 + movq %r11, -40(%rsp) + addsd %xmm15, %xmm4 + movsd -40(%rsp), %xmm8 + addsd %xmm5, %xmm4 + movsd %xmm4, -32(%rsp) + movaps %xmm8, %xmm4 + movaps %xmm8, %xmm2 + addsd (%rsi,%rax,8), %xmm4 + movsd %xmm4, -72(%rsp) + movsd -72(%rsp), %xmm4 + subsd %xmm4, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm3 + addsd %xmm3, %xmm5 + movsd %xmm5, -56(%rsp) + movsd -64(%rsp), %xmm6 + addsd (%rsi,%rax,8), %xmm6 + movsd %xmm6, -64(%rsp) + movsd -56(%rsp), %xmm7 + subsd %xmm7, %xmm8 + movsd %xmm8, -56(%rsp) + movsd -64(%rsp), %xmm10 + movsd -56(%rsp), %xmm9 + addsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movq -72(%rsp), %rdi + movq %rdi, -40(%rsp) + movsd -56(%rsp), %xmm2 + movaps %xmm1, %xmm3 + shrq $56, %rdi + addsd -32(%rsp), %xmm2 + shlb $7, %cl + addsd 8(%rsi,%rax,8), %xmm2 + movb %dl, %al + andb $127, %dil + shlb $7, %al + movsd %xmm2, -32(%rsp) + orb %al, %dil + movb %dil, -33(%rsp) + movsd -40(%rsp), %xmm9 + movaps %xmm9, %xmm5 + addsd %xmm9, %xmm3 + movsd %xmm3, -72(%rsp) + movsd -72(%rsp), %xmm4 + movb -25(%rsp), %sil + movb %sil, %r9b + shrb $7, %sil + subsd %xmm4, %xmm5 + movsd %xmm5, -64(%rsp) + movsd -72(%rsp), %xmm7 + movsd -64(%rsp), %xmm6 + xorb %sil, %dl + andb $127, %r9b + shlb $7, %dl + addsd %xmm6, %xmm7 + movsd %xmm7, -56(%rsp) + movsd -64(%rsp), %xmm8 + addsd %xmm8, %xmm1 + movsd %xmm1, -64(%rsp) + orb %dl, %r9b + movsd -56(%rsp), %xmm1 + movb %r9b, -25(%rsp) + subsd %xmm1, %xmm9 + movsd %xmm9, -56(%rsp) + movsd -64(%rsp), %xmm11 + movsd -56(%rsp), %xmm10 + addsd %xmm10, %xmm11 + movsd %xmm11, -56(%rsp) + movq -72(%rsp), %rdx + movsd -56(%rsp), %xmm12 + movq %rdx, -40(%rsp) + addsd %xmm12, %xmm0 + movsd -40(%rsp), %xmm13 + addsd -32(%rsp), %xmm0 + movsd %xmm0, -32(%rsp) + addsd %xmm0, %xmm13 + movsd %xmm13, -24(%rsp) + movb -17(%rsp), %r10b + andb $127, %r10b + orb %cl, %r10b + movb %r10b, -17(%rsp) + movsd -24(%rsp), %xmm0 + cvtsd2ss %xmm0, %xmm0 + movss %xmm0, (%r8) + jmp L(23) + +L(14): + movsd -48(%rsp), %xmm12 + movb %dl, %dil + movaps %xmm12, %xmm7 + mulsd 2000+__satan2_la_CoutTab(%rip), %xmm7 + shlb $7, %dil + shlb $7, %cl + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm8 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm13 + movsd 1888+__satan2_la_CoutTab(%rip), %xmm7 + mulsd %xmm2, %xmm13 + subsd -48(%rsp), %xmm8 + movsd %xmm8, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm9 + subsd %xmm9, %xmm10 + movsd %xmm10, -72(%rsp) + movsd -72(%rsp), %xmm11 + subsd %xmm11, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm5 + movsd %xmm13, -72(%rsp) + movsd -72(%rsp), %xmm14 + subsd %xmm2, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -72(%rsp), %xmm4 + movsd -64(%rsp), %xmm15 + subsd %xmm15, %xmm4 + movsd %xmm4, -72(%rsp) + movsd -72(%rsp), %xmm3 + movsd 1888+__satan2_la_CoutTab(%rip), %xmm4 + subsd %xmm3, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -72(%rsp), %xmm12 + divsd %xmm12, %xmm7 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm2 + mulsd %xmm7, %xmm2 + movsd -64(%rsp), %xmm14 + movsd %xmm2, -64(%rsp) + movsd -64(%rsp), %xmm8 + subsd %xmm7, %xmm8 + movsd %xmm8, -56(%rsp) + movsd -64(%rsp), %xmm10 + movsd -56(%rsp), %xmm9 + subsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movsd -56(%rsp), %xmm11 + mulsd %xmm11, %xmm12 + movsd -56(%rsp), %xmm13 + subsd %xmm12, 
%xmm4 + mulsd %xmm13, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -64(%rsp), %xmm15 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm13 + subsd %xmm15, %xmm4 + movsd %xmm4, -64(%rsp) + movsd -64(%rsp), %xmm7 + movq -56(%rsp), %rax + movsd -64(%rsp), %xmm2 + movsd -56(%rsp), %xmm3 + movq %rax, -40(%rsp) + movsd -40(%rsp), %xmm8 + movaps %xmm8, %xmm9 + addsd 1888+__satan2_la_CoutTab(%rip), %xmm7 + mulsd %xmm6, %xmm9 + mulsd %xmm5, %xmm8 + mulsd %xmm2, %xmm7 + movsd -16(%rsp), %xmm2 + mulsd %xmm2, %xmm2 + mulsd %xmm3, %xmm7 + movsd 1872+__satan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + mulsd %xmm7, %xmm6 + mulsd %xmm5, %xmm7 + addsd 1864+__satan2_la_CoutTab(%rip), %xmm3 + addsd %xmm7, %xmm6 + mulsd %xmm2, %xmm3 + addsd %xmm8, %xmm6 + addsd 1856+__satan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + movaps %xmm9, %xmm5 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm4 + addsd 1848+__satan2_la_CoutTab(%rip), %xmm3 + addsd %xmm4, %xmm5 + mulsd %xmm2, %xmm3 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm6 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm5 + subsd %xmm6, %xmm9 + addsd 1840+__satan2_la_CoutTab(%rip), %xmm3 + addsd %xmm4, %xmm9 + mulsd %xmm2, %xmm3 + movsd %xmm9, -64(%rsp) + movsd -72(%rsp), %xmm11 + mulsd %xmm11, %xmm5 + addsd 1832+__satan2_la_CoutTab(%rip), %xmm3 + movsd -64(%rsp), %xmm4 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm7 + mulsd %xmm2, %xmm3 + subsd %xmm11, %xmm7 + movsd %xmm7, -64(%rsp) + movsd -72(%rsp), %xmm8 + movsd -64(%rsp), %xmm6 + addsd 1824+__satan2_la_CoutTab(%rip), %xmm3 + subsd %xmm6, %xmm8 + mulsd %xmm2, %xmm3 + movsd %xmm8, -72(%rsp) + movsd -72(%rsp), %xmm10 + mulsd %xmm3, %xmm13 + subsd %xmm10, %xmm11 + movsd %xmm11, -64(%rsp) + movsd -72(%rsp), %xmm2 + movsd -64(%rsp), %xmm12 + movsd %xmm13, -72(%rsp) + addsd %xmm12, %xmm4 + movsd -72(%rsp), %xmm14 + subsd %xmm3, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm15 + subsd %xmm15, %xmm5 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm6 + subsd %xmm6, %xmm3 + movsd %xmm3, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm5 + movaps %xmm6, %xmm12 + movaps %xmm5, %xmm3 + mulsd %xmm4, %xmm6 + mulsd %xmm4, %xmm3 + mulsd %xmm2, %xmm5 + mulsd %xmm2, %xmm12 + addsd %xmm3, %xmm6 + movaps %xmm12, %xmm7 + movaps %xmm12, %xmm8 + addsd %xmm5, %xmm6 + addsd %xmm2, %xmm7 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm5 + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm3 + subsd %xmm3, %xmm8 + movsd %xmm8, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm9 + addsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movsd -64(%rsp), %xmm11 + addsd %xmm11, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -56(%rsp), %xmm2 + subsd %xmm2, %xmm12 + movsd %xmm12, -56(%rsp) + movsd -64(%rsp), %xmm14 + movsd -56(%rsp), %xmm13 + addsd %xmm13, %xmm14 + movsd %xmm14, -56(%rsp) + movq -72(%rsp), %rsi + movsd -56(%rsp), %xmm15 + movq %rsi, -40(%rsp) + addsd %xmm15, %xmm4 + shrq $56, %rsi + addsd %xmm5, %xmm4 + andb $127, %sil + orb %dil, %sil + movb %sil, -33(%rsp) + movsd %xmm4, -32(%rsp) + movaps %xmm1, %xmm4 + movsd -40(%rsp), %xmm7 + movaps %xmm7, %xmm2 + addsd %xmm7, %xmm4 + movsd %xmm4, -72(%rsp) + movsd -72(%rsp), %xmm4 + movb -25(%rsp), %r9b + movb %r9b, %r10b + shrb $7, %r9b + subsd %xmm4, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm3 + xorb %r9b, %dl + andb $127, %r10b + shlb $7, %dl + addsd %xmm3, %xmm5 + movsd %xmm5, -56(%rsp) + movsd -64(%rsp), %xmm6 + addsd %xmm6, %xmm1 + movsd %xmm1, -64(%rsp) + orb %dl, %r10b + movsd -56(%rsp), %xmm1 + movb %r10b, 
-25(%rsp) + subsd %xmm1, %xmm7 + movsd %xmm7, -56(%rsp) + movsd -64(%rsp), %xmm2 + movsd -56(%rsp), %xmm1 + addsd %xmm1, %xmm2 + movsd %xmm2, -56(%rsp) + movq -72(%rsp), %rdx + movsd -56(%rsp), %xmm3 + movq %rdx, -40(%rsp) + addsd %xmm3, %xmm0 + movsd -40(%rsp), %xmm4 + addsd -32(%rsp), %xmm0 + movsd %xmm0, -32(%rsp) + addsd %xmm0, %xmm4 + movsd %xmm4, -24(%rsp) + movb -17(%rsp), %r11b + andb $127, %r11b + orb %cl, %r11b + movb %r11b, -17(%rsp) + movsd -24(%rsp), %xmm0 + cvtsd2ss %xmm0, %xmm0 + movss %xmm0, (%r8) + jmp L(23) + +L(15): + cmpl $74, %r9d + jge L(16) + movb %dil, -41(%rsp) + divsd -48(%rsp), %xmm1 + movsd 1928+__satan2_la_CoutTab(%rip), %xmm0 + shlb $7, %cl + subsd %xmm1, %xmm0 + addsd 1920+__satan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm1 + cvtsd2ss %xmm1, %xmm1 + movss %xmm1, (%r8) + jmp L(23) + +L(16): + movsd 1920+__satan2_la_CoutTab(%rip), %xmm0 + shlb $7, %cl + addsd 1928+__satan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm1 + cvtsd2ss %xmm1, %xmm1 + movss %xmm1, (%r8) + jmp L(23) + +L(17): + testb %dl, %dl + jne L(22) + movb %dil, -41(%rsp) + pxor %xmm0, %xmm0 + movb %sil, -33(%rsp) + movsd -48(%rsp), %xmm2 + divsd -40(%rsp), %xmm2 + cvtsd2ss %xmm2, %xmm0 + movss %xmm0, -8(%rsp) + movzwl -6(%rsp), %eax + movsd %xmm2, -24(%rsp) + testl $32640, %eax + je L(18) + movsd 1888+__satan2_la_CoutTab(%rip), %xmm0 + shlb $7, %cl + addsd %xmm2, %xmm0 + movsd %xmm0, -72(%rsp) + movsd -72(%rsp), %xmm1 + mulsd %xmm1, %xmm2 + movsd %xmm2, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm2 + cvtsd2ss %xmm2, %xmm2 + movss %xmm2, (%r8) + jmp L(23) + +L(18): + movsd -24(%rsp), %xmm0 + cvtsd2ss %xmm0, %xmm0 + shlb $7, %cl + movss %xmm0, -8(%rsp) + movss -8(%rsp), %xmm2 + movss -8(%rsp), %xmm1 + mulss %xmm1, %xmm2 + movss %xmm2, -8(%rsp) + movss -8(%rsp), %xmm3 + cvtss2sd %xmm3, %xmm3 + addsd -24(%rsp), %xmm3 + movsd %xmm3, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm4 + cvtsd2ss %xmm4, %xmm4 + movss %xmm4, (%r8) + jmp L(23) + +L(19): + testl %eax, %eax + jne L(16) + testl $8388607, -32(%rsp) + jne L(16) + +L(20): + testb %dl, %dl + jne L(22) + +L(21): + shlb $7, %cl + movq 1976+__satan2_la_CoutTab(%rip), %rax + movq %rax, -24(%rsp) + shrq $56, %rax + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm0 + cvtsd2ss %xmm0, %xmm0 + movss %xmm0, (%r8) + jmp L(23) + +L(22): + movsd 1936+__satan2_la_CoutTab(%rip), %xmm0 + shlb $7, %cl + addsd 1944+__satan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm1 + cvtsd2ss %xmm1, %xmm1 + movss %xmm1, (%r8) + +L(23): + xorl %eax, %eax + ret + +L(24): + movsd 1984+__satan2_la_CoutTab(%rip), %xmm3 + movl $-1022, %eax + mulsd %xmm3, %xmm4 + movsd %xmm4, -48(%rsp) + jmp L(13) + +L(25): + cmpl $2047, %eax + je L(32) + +L(26): + cmpl $2047, %r9d + je L(30) + +L(27): + movzwl -26(%rsp), %eax + andl $32640, %eax + cmpl $32640, %eax + jne L(16) + cmpl $255, %edi + je L(28) + testb %dl, %dl + je L(21) + jmp L(22) + +L(28): + testb %dl, %dl + jne L(29) + movsd 1904+__satan2_la_CoutTab(%rip), %xmm0 + shlb $7, %cl + addsd 1912+__satan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb 
$127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm1 + cvtsd2ss %xmm1, %xmm1 + movss %xmm1, (%r8) + jmp L(23) + +L(29): + movsd 1952+__satan2_la_CoutTab(%rip), %xmm0 + shlb $7, %cl + addsd 1960+__satan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm1 + cvtsd2ss %xmm1, %xmm1 + movss %xmm1, (%r8) + jmp L(23) + +L(30): + testl $8388607, -28(%rsp) + je L(27) + +L(31): + addss %xmm2, %xmm3 + movss %xmm3, (%r8) + jmp L(23) + +L(32): + testl $8388607, -32(%rsp) + jne L(31) + jmp L(26) + + cfi_endproc + + .type __svml_satan2_cout_rare_internal,@function + .size __svml_satan2_cout_rare_internal,.-__svml_satan2_cout_rare_internal + + .section .rodata, "a" + .align 64 + +__svml_satan2_data_internal: + .long 0 + .long 0 + .long 0 + .long 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + 
.byte 0 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 993144000 + .long 993144000 + .long 993144000 + .long 993144000 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 
0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + 
.byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .type __svml_satan2_data_internal,@object + .size __svml_satan2_data_internal,1152 + .align 32 + +__satan2_la_CoutTab: + .long 3892314112 + .long 1069799150 + .long 2332892550 + .long 1039715405 + .long 1342177280 + .long 1070305495 + .long 270726690 + .long 1041535749 + .long 939524096 + .long 1070817911 + .long 2253973841 + .long 3188654726 + .long 3221225472 + .long 1071277294 + .long 3853927037 + .long 1043226911 + .long 2818572288 + .long 1071767563 + .long 2677759107 + .long 1044314101 + .long 3355443200 + .long 1072103591 + .long 1636578514 + .long 3191094734 + .long 1476395008 + .long 1072475260 + .long 1864703685 + .long 3188646936 + .long 805306368 + .long 1072747407 + .long 192551812 + .long 3192726267 + .long 2013265920 + .long 1072892781 + .long 2240369452 + .long 1043768538 + .long 0 + .long 1072999953 + .long 3665168337 + .long 3192705970 + .long 402653184 + .long 1073084787 + .long 1227953434 + .long 3192313277 + .long 2013265920 + .long 1073142981 + .long 3853283127 + .long 1045277487 + .long 805306368 + .long 1073187261 + .long 1676192264 + .long 3192868861 + .long 134217728 + .long 1073217000 + .long 4290763938 + .long 1042034855 + .long 671088640 + .long 1073239386 + .long 994303084 + .long 3189643768 + .long 402653184 + .long 1073254338 + .long 1878067156 + .long 1042652475 + .long 1610612736 + .long 1073265562 + .long 670314820 + .long 1045138554 + .long 3221225472 + .long 1073273048 + .long 691126919 + .long 3189987794 + .long 3489660928 + .long 1073278664 + .long 1618990832 + .long 3188194509 + .long 1207959552 + .long 1073282409 + .long 2198872939 + .long 1044806069 + .long 3489660928 + .long 1073285217 + .long 2633982383 + .long 1042307894 + .long 939524096 + .long 1073287090 + .long 1059367786 + .long 3189114230 + .long 2281701376 + .long 1073288494 + .long 3158525533 + .long 1044484961 + .long 3221225472 + .long 1073289430 + .long 286581777 + .long 1044893263 + .long 4026531840 + .long 1073290132 + .long 2000245215 + .long 3191647611 + .long 134217728 + .long 1073290601 + .long 4205071590 + .long 1045035927 + .long 536870912 + .long 1073290952 + .long 2334392229 + .long 1043447393 + .long 805306368 + .long 1073291186 + .long 2281458177 + .long 3188885569 + .long 3087007744 + .long 1073291361 + .long 691611507 + .long 1044733832 + .long 3221225472 + .long 1073291478 + .long 1816229550 + .long 1044363390 + .long 2281701376 + .long 1073291566 + .long 1993843750 + .long 3189837440 + .long 134217728 + .long 1073291625 + .long 3654754496 + .long 1044970837 + .long 4026531840 + .long 1073291668 + .long 3224300229 + .long 3191935390 + .long 805306368 + .long 1073291698 + .long 2988777976 + .long 3188950659 + .long 536870912 + .long 1073291720 + .long 1030371341 + .long 1043402665 + .long 3221225472 + .long 1073291734 + .long 1524463765 + .long 1044361356 + .long 3087007744 + .long 1073291745 + .long 2754295320 + .long 1044731036 + .long 134217728 + .long 1073291753 + .long 3099629057 + .long 1044970710 + .long 2281701376 + .long 1073291758 + .long 962914160 + .long 3189838838 + .long 805306368 + .long 1073291762 + .long 3543908206 + .long 3188950786 + .long 4026531840 + .long 1073291764 + .long 1849909620 + .long 3191935434 + .long 3221225472 + .long 1073291766 + .long 1641333636 + .long 1044361352 + .long 536870912 + .long 1073291768 + .long 1373968792 + .long 1043402654 + .long 134217728 + .long 1073291769 + .long 2033191599 + .long 
1044970710 + .long 3087007744 + .long 1073291769 + .long 4117947437 + .long 1044731035 + .long 805306368 + .long 1073291770 + .long 315378368 + .long 3188950787 + .long 2281701376 + .long 1073291770 + .long 2428571750 + .long 3189838838 + .long 3221225472 + .long 1073291770 + .long 1608007466 + .long 1044361352 + .long 4026531840 + .long 1073291770 + .long 1895711420 + .long 3191935434 + .long 134217728 + .long 1073291771 + .long 2031108713 + .long 1044970710 + .long 536870912 + .long 1073291771 + .long 1362518342 + .long 1043402654 + .long 805306368 + .long 1073291771 + .long 317461253 + .long 3188950787 + .long 939524096 + .long 1073291771 + .long 4117231784 + .long 1044731035 + .long 1073741824 + .long 1073291771 + .long 1607942376 + .long 1044361352 + .long 1207959552 + .long 1073291771 + .long 2428929577 + .long 3189838838 + .long 1207959552 + .long 1073291771 + .long 2031104645 + .long 1044970710 + .long 1342177280 + .long 1073291771 + .long 1895722602 + .long 3191935434 + .long 1342177280 + .long 1073291771 + .long 317465322 + .long 3188950787 + .long 1342177280 + .long 1073291771 + .long 1362515546 + .long 1043402654 + .long 1342177280 + .long 1073291771 + .long 1607942248 + .long 1044361352 + .long 1342177280 + .long 1073291771 + .long 4117231610 + .long 1044731035 + .long 1342177280 + .long 1073291771 + .long 2031104637 + .long 1044970710 + .long 1342177280 + .long 1073291771 + .long 1540251232 + .long 1045150466 + .long 1342177280 + .long 1073291771 + .long 2644671394 + .long 1045270303 + .long 1342177280 + .long 1073291771 + .long 2399244691 + .long 1045360181 + .long 1342177280 + .long 1073291771 + .long 803971124 + .long 1045420100 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192879152 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192849193 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192826724 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192811744 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192800509 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192793019 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192787402 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192783657 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192780848 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192778976 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192777572 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192776635 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192775933 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192775465 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192775114 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774880 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192774704 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774587 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192774500 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774441 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192774397 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774368 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192774346 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774331 + .long 1476395008 + .long 1073291771 + .long 3613709523 + 
.long 3192774320 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774313 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192774308 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774304 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192774301 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774299 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192774298 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774297 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192774296 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774296 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192774296 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1466225875 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1343512524 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1251477510 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1190120835 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1144103328 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1113424990 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1090416237 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1075077068 + .long 3192774295 + .long 1431655765 + .long 3218429269 + .long 2576978363 + .long 1070176665 + .long 2453154343 + .long 3217180964 + .long 4189149139 + .long 1069314502 + .long 1775019125 + .long 3216459198 + .long 273199057 + .long 1068739452 + .long 874748308 + .long 3215993277 + .long 0 + .long 1069547520 + .long 0 + .long 1072693248 + .long 0 + .long 1073741824 + .long 1413754136 + .long 1072243195 + .long 856972295 + .long 1015129638 + .long 1413754136 + .long 1073291771 + .long 856972295 + .long 1016178214 + .long 1413754136 + .long 1074340347 + .long 856972295 + .long 1017226790 + .long 2134057426 + .long 1073928572 + .long 1285458442 + .long 1016756537 + .long 0 + .long 3220176896 + .long 0 + .long 0 + .long 0 + .long 2144337920 + .long 0 + .long 1048576 + .long 33554432 + .long 1101004800 + .type __satan2_la_CoutTab,@object + .size __satan2_la_CoutTab,2008 diff --git a/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f8_core-sse.S b/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f8_core-sse.S new file mode 100644 index 0000000000..21b1d3ff63 --- /dev/null +++ b/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f8_core-sse.S @@ -0,0 +1,20 @@ +/* SSE version of vectorized atan2f. + Copyright (C) 2021 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. 
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#define _ZGVdN8vv_atan2f _ZGVdN8vv_atan2f_sse_wrapper
+#include "../svml_s_atan2f8_core.S"
diff --git a/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f8_core.c b/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f8_core.c
new file mode 100644
index 0000000000..7e02050983
--- /dev/null
+++ b/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f8_core.c
@@ -0,0 +1,28 @@
+/* Multiple versions of vectorized atan2f, vector length is 8.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#define SYMBOL_NAME _ZGVdN8vv_atan2f
+#include "ifunc-mathvec-avx2.h"
+
+libc_ifunc_redirected (REDIRECT_NAME, SYMBOL_NAME, IFUNC_SELECTOR ());
+
+#ifdef SHARED
+__hidden_ver1 (_ZGVdN8vv_atan2f, __GI__ZGVdN8vv_atan2f,
+	       __redirect__ZGVdN8vv_atan2f)
+  __attribute__ ((visibility ("hidden")));
+#endif
diff --git a/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f8_core_avx2.S b/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f8_core_avx2.S
new file mode 100644
index 0000000000..8071e84be9
--- /dev/null
+++ b/sysdeps/x86_64/fpu/multiarch/svml_s_atan2f8_core_avx2.S
@@ -0,0 +1,2413 @@
+/* Function atan2f vectorized with AVX2.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   https://www.gnu.org/licenses/. */
+
+/*
+ * ALGORITHM DESCRIPTION:
+ *      For    0.0    <= x <=  7.0/16.0: atan(x) = atan(0.0) + atan(s), where s=(x-0.0)/(1.0+0.0*x)
+ *      For  7.0/16.0 <= x <= 11.0/16.0: atan(x) = atan(0.5) + atan(s), where s=(x-0.5)/(1.0+0.5*x)
+ *      For 11.0/16.0 <= x <= 19.0/16.0: atan(x) = atan(1.0) + atan(s), where s=(x-1.0)/(1.0+1.0*x)
+ *      For 19.0/16.0 <= x <= 39.0/16.0: atan(x) = atan(1.5) + atan(s), where s=(x-1.5)/(1.0+1.5*x)
+ *      For 39.0/16.0 <= x <=    inf   : atan(x) = atan(inf) + atan(s), where s=-1.0/x
+ *      Where atan(s) ~= s+s^3*Poly11(s^2) on interval |s|<7.0/16.0.
+ *
+ *
+ */
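To make the reduction above concrete, here is a minimal scalar C model of the same scheme. This is an illustration only: poly() uses plain Taylor coefficients as stand-ins for the tuned Poly11 minimax table in __svml_satan2_data_internal, atan2f_model is a hypothetical name, and the vector kernel below replaces the branches with masked blends and routes zeros and NaNs to a scalar callout instead.

#include <math.h>

/* Stand-in for Poly11: leading Taylor terms of (atan(s) - s)/s^3 as a
   polynomial in u = s^2.  Accuracy is far below the tuned table.  */
static float
poly (float u)
{
  return -1.0f / 3.0f + u * (1.0f / 5.0f + u * (-1.0f / 7.0f + u / 9.0f));
}

static float
atan2f_model (float y, float x)
{
  float ay = fabsf (y), ax = fabsf (x);
  /* 1) If |y| < |x| then a = |y|, b = |x|, pio2 = 0.
     2) If |y| > |x| then a = -|x|, b = |y|, pio2 = Pi/2.  */
  int swap = ay > ax;
  float a = swap ? -ax : ay;
  float b = swap ? ay : ax;
  float pio2 = swap ? (float) M_PI_2 : 0.0f;
  float s = a / b;			/* |s| <= 1 after the swap.  */
  float u = s * s;
  float r = pio2 + s + s * u * poly (u);  /* atan(|y|/|x|), in [0, Pi/2].  */
  if (x < 0.0f)				/* Quadrant fix-up: the sPI term.  */
    r = (float) M_PI - r;
  return copysignf (r, y);		/* Result carries the sign of y.  */
}

In the AVX2 body that follows, the polynomial is evaluated as two interleaved FMA chains in s^4 (accumulating in %ymm14 and %ymm15) that the final vfmadd213ps merges, which roughly halves the latency of a single Horner recurrence.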
+
+#include <sysdep.h>
+
+	.text
+	.section .text.avx2,"ax",@progbits
+ENTRY(_ZGVdN8vv_atan2f_avx2)
+	pushq	%rbp
+	cfi_def_cfa_offset(16)
+	movq	%rsp, %rbp
+	cfi_def_cfa(6, 16)
+	cfi_offset(6, -16)
+	andq	$-64, %rsp
+	subq	$384, %rsp
+	xorl	%edx, %edx
+
+/*
+ * #define NO_VECTOR_ZERO_ATAN2_ARGS
+ * Declarations
+ * Variables
+ * Constants
+ * The end of declarations
+ * Implementation
+ * Arguments signs
+ */
+	vmovups	256+__svml_satan2_data_internal(%rip), %ymm2
+	vmovups	%ymm13, 288(%rsp)
+	vmovups	%ymm12, 256(%rsp)
+	vmovups	%ymm15, 352(%rsp)
+	vmovups	%ymm14, 320(%rsp)
+	.cfi_escape 0x10, 0xdf, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x80, 0xff, 0xff, 0xff, 0x22
+	.cfi_escape 0x10, 0xe0, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xa0, 0xff, 0xff, 0xff, 0x22
+	.cfi_escape 0x10, 0xe1, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x22
+	.cfi_escape 0x10, 0xe2, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xe0, 0xff, 0xff, 0xff, 0x22
+
+/* Testing on working interval. */
+	vmovups	1024+__svml_satan2_data_internal(%rip), %ymm15
+	vmovups	%ymm11, 224(%rsp)
+	vmovups	%ymm9, 96(%rsp)
+	.cfi_escape 0x10, 0xdc, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xe0, 0xfe, 0xff, 0xff, 0x22
+	.cfi_escape 0x10, 0xde, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x60, 0xff, 0xff, 0xff, 0x22
+	vmovups	1088+__svml_satan2_data_internal(%rip), %ymm9
+	vmovups	%ymm10, 160(%rsp)
+	vmovups	%ymm8, 32(%rsp)
+
+/* if x<0, sPI = Pi, else sPI =0 */
+	vmovups	__svml_satan2_data_internal(%rip), %ymm5
+	vmovaps	%ymm1, %ymm7
+	.cfi_escape 0x10, 0xdb, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xa0, 0xfe, 0xff, 0xff, 0x22
+	.cfi_escape 0x10, 0xdd, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x20, 0xff, 0xff, 0xff, 0x22
+	vandps	%ymm2, %ymm7, %ymm13
+	vandps	%ymm2, %ymm0, %ymm12
+	vcmplt_oqps %ymm13, %ymm12, %ymm4
+	vcmple_oqps %ymm5, %ymm7, %ymm6
+	vpsubd	%ymm15, %ymm13, %ymm10
+	vpsubd	%ymm15, %ymm12, %ymm8
+
+/*
+ * 1) If y<x then a=y, b=x, PIO2=0
+ * 2) If y>x then a=-x, b=y, PIO2=Pi/2
+ */
+	vorps	192+__svml_satan2_data_internal(%rip), %ymm13, %ymm3
+	vblendvps %ymm4, %ymm12, %ymm3, %ymm14
+	vblendvps %ymm4, %ymm13, %ymm12, %ymm3
+
+/* Division a/b. */
+	vdivps	%ymm3, %ymm14, %ymm11
+	vpcmpgtd %ymm9, %ymm10, %ymm14
+	vpcmpeqd %ymm9, %ymm10, %ymm15
+	vpor	%ymm15, %ymm14, %ymm10
+	vmovups	512+__svml_satan2_data_internal(%rip), %ymm15
+	vpcmpgtd %ymm9, %ymm8, %ymm14
+	vpcmpeqd %ymm9, %ymm8, %ymm8
+	vpor	%ymm8, %ymm14, %ymm9
+	vmovups	448+__svml_satan2_data_internal(%rip), %ymm14
+	vpor	%ymm9, %ymm10, %ymm10
+
+/* Polynomial. */
+	vmulps	%ymm11, %ymm11, %ymm9
+	vmulps	%ymm9, %ymm9, %ymm8
+	vfmadd213ps 576+__svml_satan2_data_internal(%rip), %ymm8, %ymm14
+	vfmadd213ps 640+__svml_satan2_data_internal(%rip), %ymm8, %ymm15
+	vfmadd213ps 704+__svml_satan2_data_internal(%rip), %ymm8, %ymm14
+	vfmadd213ps 768+__svml_satan2_data_internal(%rip), %ymm8, %ymm15
+	vfmadd213ps 832+__svml_satan2_data_internal(%rip), %ymm8, %ymm14
+	vfmadd213ps 896+__svml_satan2_data_internal(%rip), %ymm8, %ymm15
+	vfmadd213ps 960+__svml_satan2_data_internal(%rip), %ymm8, %ymm14
+	vfmadd213ps %ymm14, %ymm9, %ymm15
+	vandnps	320+__svml_satan2_data_internal(%rip), %ymm4, %ymm4
+
+/* Reconstruction.
*/ + vfmadd213ps %ymm4, %ymm11, %ymm15 + vxorps %ymm13, %ymm7, %ymm1 + vandps 384+__svml_satan2_data_internal(%rip), %ymm6, %ymm6 + vorps %ymm1, %ymm15, %ymm11 + vaddps %ymm11, %ymm6, %ymm8 + vmovmskps %ymm10, %eax + vxorps %ymm12, %ymm0, %ymm2 + vorps %ymm2, %ymm8, %ymm9 + +/* Special branch for fast (vector) processing of zero arguments */ + testl %eax, %eax + jne L(7) + +L(1): +/* + * Special branch for fast (vector) processing of zero arguments + * The end of implementation + */ + testl %edx, %edx + jne L(3) + +L(2): + vmovaps %ymm9, %ymm0 + vmovups 32(%rsp), %ymm8 + cfi_restore(91) + vmovups 96(%rsp), %ymm9 + cfi_restore(92) + vmovups 160(%rsp), %ymm10 + cfi_restore(93) + vmovups 224(%rsp), %ymm11 + cfi_restore(94) + vmovups 256(%rsp), %ymm12 + cfi_restore(95) + vmovups 288(%rsp), %ymm13 + cfi_restore(96) + vmovups 320(%rsp), %ymm14 + cfi_restore(97) + vmovups 352(%rsp), %ymm15 + cfi_restore(98) + movq %rbp, %rsp + popq %rbp + cfi_def_cfa(7, 8) + cfi_restore(6) + ret + cfi_def_cfa(6, 16) + cfi_offset(6, -16) + .cfi_escape 0x10, 0xdb, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xa0, 0xfe, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xdc, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xe0, 0xfe, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xdd, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x20, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xde, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x60, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xdf, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x80, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xe0, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xa0, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xe1, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0xe2, 0x00, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0xe0, 0xff, 0xff, 0xff, 0x22 + +L(3): + vmovups %ymm0, 64(%rsp) + vmovups %ymm7, 128(%rsp) + vmovups %ymm9, 192(%rsp) + je L(2) + xorl %eax, %eax + vzeroupper + movq %rsi, 8(%rsp) + movq %rdi, (%rsp) + movq %r12, 24(%rsp) + .cfi_escape 0x10, 0x04, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x88, 0xfe, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x05, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x80, 0xfe, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x0c, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x98, 0xfe, 0xff, 0xff, 0x22 + movl %eax, %r12d + movq %r13, 16(%rsp) + .cfi_escape 0x10, 0x0d, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x90, 0xfe, 0xff, 0xff, 0x22 + movl %edx, %r13d + +L(4): + btl %r12d, %r13d + jc L(6) + +L(5): + incl %r12d + cmpl $8, %r12d + jl L(4) + movq 8(%rsp), %rsi + cfi_restore(4) + movq (%rsp), %rdi + cfi_restore(5) + movq 24(%rsp), %r12 + cfi_restore(12) + movq 16(%rsp), %r13 + cfi_restore(13) + vmovups 192(%rsp), %ymm9 + jmp L(2) + .cfi_escape 0x10, 0x04, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x88, 0xfe, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x05, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x80, 0xfe, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x0c, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x98, 0xfe, 0xff, 0xff, 0x22 + .cfi_escape 0x10, 0x0d, 0x0e, 0x38, 0x1c, 0x0d, 0xc0, 0xff, 0xff, 0xff, 0x1a, 0x0d, 0x90, 0xfe, 0xff, 0xff, 0x22 + +L(6): + lea 64(%rsp,%r12,4), %rdi + lea 128(%rsp,%r12,4), %rsi + lea 192(%rsp,%r12,4), %rdx + call 
__svml_satan2_cout_rare_internal
+	jmp	L(5)
+	cfi_restore(4)
+	cfi_restore(5)
+	cfi_restore(12)
+	cfi_restore(13)
+
+L(7):
+/* Check if at least one of X or Y is zero: iAXAYZERO */
+	vpcmpeqd %ymm5, %ymm13, %ymm13
+	vpcmpeqd %ymm5, %ymm12, %ymm12
+
+/* Check if both X & Y are not NaNs: iXYnotNAN */
+	vcmpordps %ymm7, %ymm7, %ymm11
+	vcmpordps %ymm0, %ymm0, %ymm14
+
+/*
+ * Path for zero arguments (at least one of both)
+ * Check if both args are zeros (den. is zero)
+ */
+	vcmpeqps %ymm5, %ymm3, %ymm3
+	vpor	%ymm12, %ymm13, %ymm15
+
+/* Set sPIO2 to zero if den. is zero */
+	vblendvps %ymm3, %ymm5, %ymm4, %ymm4
+	vandps	%ymm14, %ymm11, %ymm8
+
+/* Check if at least one of X or Y is zero and not NaN: iAXAYZEROnotNAN */
+	vpand	%ymm8, %ymm15, %ymm8
+
+/* Res = sign(Y)*(X<0)?(PIO2+PI):PIO2 */
+	vpcmpgtd %ymm7, %ymm5, %ymm5
+	vorps	%ymm1, %ymm4, %ymm1
+	vandps	%ymm6, %ymm5, %ymm6
+	vaddps	%ymm6, %ymm1, %ymm1
+
+/* Exclude from previous callout mask zero (and not NaN) arguments */
+	vpandn	%ymm10, %ymm8, %ymm10
+	vorps	%ymm2, %ymm1, %ymm2
+
+/* Go to callout */
+	vmovmskps %ymm10, %edx
+
+/* Merge results from main and spec path */
+	vblendvps %ymm8, %ymm2, %ymm9, %ymm9
+	jmp	L(1)
+
+END(_ZGVdN8vv_atan2f_avx2)
+
+	.align	16,0x90
+
+__svml_satan2_cout_rare_internal:
+
+	cfi_startproc
+
+	pxor	%xmm0, %xmm0
+	movss	(%rdi), %xmm3
+	pxor	%xmm1, %xmm1
+	movss	(%rsi), %xmm2
+	movq	%rdx, %r8
+	cvtss2sd %xmm3, %xmm0
+	cvtss2sd %xmm2, %xmm1
+	movss	%xmm3, -32(%rsp)
+	movss	%xmm2, -28(%rsp)
+	movsd	%xmm0, -48(%rsp)
+	movsd	%xmm1, -40(%rsp)
+	movzwl	-30(%rsp), %edi
+	andl	$32640, %edi
+	movb	-25(%rsp), %dl
+	movzwl	-42(%rsp), %eax
+	andb	$-128, %dl
+	movzwl	-34(%rsp), %r9d
+	andl	$32752, %eax
+	andl	$32752, %r9d
+	shrl	$7, %edi
+	movb	-29(%rsp), %cl
+	shrb	$7, %cl
+	shrb	$7, %dl
+	shrl	$4, %eax
+	shrl	$4, %r9d
+	cmpl	$255, %edi
+	je	L(25)
+	movzwl	-26(%rsp), %esi
+	andl	$32640, %esi
+	cmpl	$32640, %esi
+	je	L(25)
+	testl	%eax, %eax
+	jne	L(8)
+	testl	$8388607, -32(%rsp)
+	je	L(20)
+
+L(8):
+	testl	%r9d, %r9d
+	jne	L(9)
+	testl	$8388607, -28(%rsp)
+	je	L(19)
+
+L(9):
+	negl	%r9d
+	movsd	%xmm0, -48(%rsp)
+	addl	%eax, %r9d
+	movsd	%xmm1, -40(%rsp)
+	movb	-41(%rsp), %dil
+	movb	-33(%rsp), %sil
+	andb	$127, %dil
+	andb	$127, %sil
+	cmpl	$-54, %r9d
+	jle	L(17)
+	cmpl	$54, %r9d
+	jge	L(15)
+	movb	%sil, -33(%rsp)
+	movb	%dil, -41(%rsp)
+	testb	%dl, %dl
+	jne	L(10)
+	movsd	1976+__satan2_la_CoutTab(%rip), %xmm1
+	movaps	%xmm1, %xmm0
+	jmp	L(11)
+
+L(10):
+	movsd	1936+__satan2_la_CoutTab(%rip), %xmm1
+	movsd	1944+__satan2_la_CoutTab(%rip), %xmm0
+
+L(11):
+	movsd	-48(%rsp), %xmm4
+	movsd	-40(%rsp), %xmm2
+	movaps	%xmm4, %xmm5
+	divsd	%xmm2, %xmm5
+	movzwl	-42(%rsp), %esi
+	movsd	%xmm5, -16(%rsp)
+	testl	%eax, %eax
+	jle	L(24)
+	cmpl	$2046, %eax
+	jge	L(12)
+	andl	$-32753, %esi
+	addl	$-1023, %eax
+	movsd	%xmm4, -48(%rsp)
+	addl	$16368, %esi
+	movw	%si, -42(%rsp)
+	jmp	L(13)
+
+L(12):
+	movsd	1992+__satan2_la_CoutTab(%rip), %xmm3
+	movl	$1022, %eax
+	mulsd	%xmm3, %xmm4
+	movsd	%xmm4, -48(%rsp)
+
+L(13):
+	negl	%eax
+	movq	1888+__satan2_la_CoutTab(%rip), %rsi
+	addl	$1023, %eax
+	movq	%rsi, -40(%rsp)
+	andl	$2047, %eax
+	shrq	$48, %rsi
+	shll	$4, %eax
+	andl	$-32753, %esi
+	orl	%eax, %esi
+	movw	%si, -34(%rsp)
+	movsd	-40(%rsp), %xmm3
+	mulsd	%xmm3, %xmm2
+	comisd	1880+__satan2_la_CoutTab(%rip), %xmm5
+	jb	L(14)
+	movsd	2000+__satan2_la_CoutTab(%rip), %xmm12
+	movaps	%xmm2, %xmm3
+	mulsd	%xmm2, %xmm12
+	movsd	%xmm12, -72(%rsp)
+	movsd	-72(%rsp), %xmm13
+	movsd	%xmm5, -24(%rsp)
+	subsd	%xmm2, %xmm13
+	movsd	%xmm13,
-64(%rsp) + movsd -72(%rsp), %xmm15 + movsd -64(%rsp), %xmm14 + movl -20(%rsp), %edi + movl %edi, %r9d + andl $-524288, %edi + andl $-1048576, %r9d + addl $262144, %edi + subsd %xmm14, %xmm15 + movsd %xmm15, -72(%rsp) + andl $1048575, %edi + movsd -72(%rsp), %xmm4 + orl %edi, %r9d + movl $0, -24(%rsp) + subsd %xmm4, %xmm3 + movl %r9d, -20(%rsp) + movsd %xmm3, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -24(%rsp), %xmm11 + movsd -64(%rsp), %xmm9 + mulsd %xmm11, %xmm5 + mulsd %xmm11, %xmm9 + movsd 1968+__satan2_la_CoutTab(%rip), %xmm8 + mulsd %xmm8, %xmm5 + mulsd %xmm8, %xmm9 + movaps %xmm5, %xmm7 + movzwl -10(%rsp), %esi + addsd %xmm9, %xmm7 + movsd %xmm7, -72(%rsp) + andl $32752, %esi + movsd -72(%rsp), %xmm6 + shrl $4, %esi + subsd %xmm6, %xmm5 + movl -12(%rsp), %eax + addsd %xmm5, %xmm9 + movsd %xmm9, -64(%rsp) + andl $1048575, %eax + movsd -48(%rsp), %xmm9 + movsd -72(%rsp), %xmm3 + movaps %xmm9, %xmm12 + movsd -64(%rsp), %xmm10 + movaps %xmm9, %xmm14 + movaps %xmm9, %xmm6 + addsd %xmm3, %xmm12 + movsd %xmm12, -72(%rsp) + movsd -72(%rsp), %xmm13 + shll $20, %esi + subsd %xmm13, %xmm14 + movsd %xmm14, -64(%rsp) + orl %eax, %esi + movsd -72(%rsp), %xmm4 + addl $-1069547520, %esi + movsd -64(%rsp), %xmm15 + movl $113, %eax + movsd 2000+__satan2_la_CoutTab(%rip), %xmm13 + addsd %xmm15, %xmm4 + movsd %xmm4, -56(%rsp) + movsd -64(%rsp), %xmm8 + sarl $19, %esi + addsd %xmm3, %xmm8 + movsd %xmm8, -64(%rsp) + cmpl $113, %esi + movsd -56(%rsp), %xmm7 + cmovl %esi, %eax + subsd %xmm7, %xmm6 + movsd %xmm6, -56(%rsp) + addl %eax, %eax + movsd -64(%rsp), %xmm12 + lea __satan2_la_CoutTab(%rip), %rsi + movsd -56(%rsp), %xmm5 + movslq %eax, %rax + addsd %xmm5, %xmm12 + movsd %xmm12, -56(%rsp) + movsd -72(%rsp), %xmm7 + mulsd %xmm7, %xmm13 + movsd -56(%rsp), %xmm8 + movsd %xmm13, -72(%rsp) + addsd %xmm10, %xmm8 + movsd -72(%rsp), %xmm4 + movaps %xmm9, %xmm10 + mulsd 2000+__satan2_la_CoutTab(%rip), %xmm10 + subsd %xmm7, %xmm4 + movsd %xmm4, -64(%rsp) + movsd -72(%rsp), %xmm3 + movsd -64(%rsp), %xmm14 + subsd %xmm14, %xmm3 + movsd %xmm3, -72(%rsp) + movsd -72(%rsp), %xmm15 + subsd %xmm15, %xmm7 + movsd %xmm7, -64(%rsp) + movsd -72(%rsp), %xmm7 + movsd -64(%rsp), %xmm4 + movsd %xmm10, -72(%rsp) + movaps %xmm2, %xmm10 + addsd %xmm4, %xmm8 + movsd -72(%rsp), %xmm4 + subsd -48(%rsp), %xmm4 + movsd %xmm4, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm3 + subsd %xmm3, %xmm6 + movaps %xmm2, %xmm3 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm5 + subsd %xmm5, %xmm9 + movsd %xmm9, -64(%rsp) + movsd -72(%rsp), %xmm12 + movsd -64(%rsp), %xmm9 + mulsd %xmm11, %xmm12 + mulsd %xmm11, %xmm9 + movaps %xmm12, %xmm11 + addsd %xmm9, %xmm11 + movsd %xmm11, -72(%rsp) + movsd -72(%rsp), %xmm4 + subsd %xmm4, %xmm12 + addsd %xmm9, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -72(%rsp), %xmm15 + movsd -64(%rsp), %xmm6 + addsd %xmm15, %xmm3 + movsd %xmm3, -72(%rsp) + movsd -72(%rsp), %xmm5 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm3 + subsd %xmm5, %xmm10 + movsd %xmm10, -64(%rsp) + movsd -72(%rsp), %xmm13 + movsd -64(%rsp), %xmm11 + addsd %xmm11, %xmm13 + movsd %xmm13, -56(%rsp) + movsd -64(%rsp), %xmm14 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm13 + addsd %xmm14, %xmm15 + movsd %xmm15, -64(%rsp) + movsd -56(%rsp), %xmm4 + movsd 1888+__satan2_la_CoutTab(%rip), %xmm14 + subsd %xmm4, %xmm2 + movsd %xmm2, -56(%rsp) + movsd -64(%rsp), %xmm4 + movsd -56(%rsp), %xmm2 + addsd %xmm2, %xmm4 + movsd %xmm4, -56(%rsp) + movsd -72(%rsp), %xmm12 + mulsd %xmm12, %xmm3 + movsd -56(%rsp), %xmm5 + movsd %xmm3, -72(%rsp) + addsd %xmm6, 
%xmm5 + movsd -72(%rsp), %xmm9 + subsd %xmm12, %xmm9 + movsd %xmm9, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm2 + subsd %xmm2, %xmm10 + movsd %xmm10, -72(%rsp) + movsd -72(%rsp), %xmm11 + subsd %xmm11, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -72(%rsp), %xmm9 + divsd %xmm9, %xmm14 + mulsd %xmm14, %xmm13 + movsd -64(%rsp), %xmm10 + movsd %xmm13, -64(%rsp) + addsd %xmm10, %xmm5 + movsd -64(%rsp), %xmm15 + movsd 1888+__satan2_la_CoutTab(%rip), %xmm12 + subsd %xmm14, %xmm15 + movsd %xmm15, -56(%rsp) + movsd -64(%rsp), %xmm2 + movsd -56(%rsp), %xmm4 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm13 + subsd %xmm4, %xmm2 + movsd %xmm2, -56(%rsp) + movsd -56(%rsp), %xmm3 + mulsd %xmm3, %xmm9 + movsd -56(%rsp), %xmm11 + subsd %xmm9, %xmm12 + mulsd %xmm11, %xmm5 + movsd %xmm5, -64(%rsp) + movsd -64(%rsp), %xmm5 + subsd %xmm5, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -64(%rsp), %xmm2 + movq -56(%rsp), %r10 + movsd -64(%rsp), %xmm6 + movsd -56(%rsp), %xmm4 + movq %r10, -40(%rsp) + movsd -40(%rsp), %xmm3 + movaps %xmm3, %xmm5 + addsd 1888+__satan2_la_CoutTab(%rip), %xmm2 + mulsd %xmm7, %xmm5 + mulsd %xmm6, %xmm2 + mulsd %xmm4, %xmm2 + mulsd %xmm2, %xmm7 + mulsd %xmm8, %xmm2 + mulsd %xmm3, %xmm8 + addsd %xmm2, %xmm7 + movsd 1872+__satan2_la_CoutTab(%rip), %xmm3 + addsd %xmm8, %xmm7 + movsd %xmm7, -72(%rsp) + movaps %xmm5, %xmm7 + movsd -72(%rsp), %xmm4 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm6 + addsd %xmm4, %xmm7 + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm8 + subsd %xmm8, %xmm5 + addsd %xmm4, %xmm5 + movsd %xmm5, -64(%rsp) + movsd -72(%rsp), %xmm11 + movaps %xmm11, %xmm2 + mulsd %xmm11, %xmm2 + mulsd %xmm11, %xmm6 + mulsd %xmm2, %xmm3 + movsd -64(%rsp), %xmm4 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm7 + addsd 1864+__satan2_la_CoutTab(%rip), %xmm3 + subsd %xmm11, %xmm7 + mulsd %xmm2, %xmm3 + movsd %xmm7, -64(%rsp) + movsd -72(%rsp), %xmm9 + movsd -64(%rsp), %xmm8 + addsd 1856+__satan2_la_CoutTab(%rip), %xmm3 + subsd %xmm8, %xmm9 + mulsd %xmm2, %xmm3 + movsd %xmm9, -72(%rsp) + movsd -72(%rsp), %xmm10 + addsd 1848+__satan2_la_CoutTab(%rip), %xmm3 + subsd %xmm10, %xmm11 + mulsd %xmm2, %xmm3 + movsd %xmm11, -64(%rsp) + addsd 1840+__satan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + addsd 1832+__satan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + addsd 1824+__satan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + mulsd %xmm3, %xmm13 + movsd -72(%rsp), %xmm2 + movsd -64(%rsp), %xmm12 + movsd %xmm13, -72(%rsp) + addsd %xmm12, %xmm4 + movsd -72(%rsp), %xmm14 + subsd %xmm3, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm15 + subsd %xmm15, %xmm5 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm6 + subsd %xmm6, %xmm3 + movsd %xmm3, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm5 + movaps %xmm6, %xmm12 + movaps %xmm5, %xmm3 + mulsd %xmm4, %xmm6 + mulsd %xmm4, %xmm3 + mulsd %xmm2, %xmm5 + mulsd %xmm2, %xmm12 + addsd %xmm3, %xmm6 + movaps %xmm12, %xmm7 + movaps %xmm12, %xmm8 + addsd %xmm5, %xmm6 + addsd %xmm2, %xmm7 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm5 + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm3 + subsd %xmm3, %xmm8 + movsd %xmm8, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm9 + addsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movsd -64(%rsp), %xmm11 + addsd %xmm11, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -56(%rsp), %xmm2 + subsd %xmm2, %xmm12 + movsd %xmm12, -56(%rsp) + movsd -64(%rsp), %xmm14 + movsd -56(%rsp), %xmm13 + addsd %xmm13, %xmm14 + movsd %xmm14, -56(%rsp) + movq -72(%rsp), %r11 + movsd -56(%rsp), 
%xmm15 + movq %r11, -40(%rsp) + addsd %xmm15, %xmm4 + movsd -40(%rsp), %xmm8 + addsd %xmm5, %xmm4 + movsd %xmm4, -32(%rsp) + movaps %xmm8, %xmm4 + movaps %xmm8, %xmm2 + addsd (%rsi,%rax,8), %xmm4 + movsd %xmm4, -72(%rsp) + movsd -72(%rsp), %xmm4 + subsd %xmm4, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm3 + addsd %xmm3, %xmm5 + movsd %xmm5, -56(%rsp) + movsd -64(%rsp), %xmm6 + addsd (%rsi,%rax,8), %xmm6 + movsd %xmm6, -64(%rsp) + movsd -56(%rsp), %xmm7 + subsd %xmm7, %xmm8 + movsd %xmm8, -56(%rsp) + movsd -64(%rsp), %xmm10 + movsd -56(%rsp), %xmm9 + addsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movq -72(%rsp), %rdi + movq %rdi, -40(%rsp) + movsd -56(%rsp), %xmm2 + movaps %xmm1, %xmm3 + shrq $56, %rdi + addsd -32(%rsp), %xmm2 + shlb $7, %cl + addsd 8(%rsi,%rax,8), %xmm2 + movb %dl, %al + andb $127, %dil + shlb $7, %al + movsd %xmm2, -32(%rsp) + orb %al, %dil + movb %dil, -33(%rsp) + movsd -40(%rsp), %xmm9 + movaps %xmm9, %xmm5 + addsd %xmm9, %xmm3 + movsd %xmm3, -72(%rsp) + movsd -72(%rsp), %xmm4 + movb -25(%rsp), %sil + movb %sil, %r9b + shrb $7, %sil + subsd %xmm4, %xmm5 + movsd %xmm5, -64(%rsp) + movsd -72(%rsp), %xmm7 + movsd -64(%rsp), %xmm6 + xorb %sil, %dl + andb $127, %r9b + shlb $7, %dl + addsd %xmm6, %xmm7 + movsd %xmm7, -56(%rsp) + movsd -64(%rsp), %xmm8 + addsd %xmm8, %xmm1 + movsd %xmm1, -64(%rsp) + orb %dl, %r9b + movsd -56(%rsp), %xmm1 + movb %r9b, -25(%rsp) + subsd %xmm1, %xmm9 + movsd %xmm9, -56(%rsp) + movsd -64(%rsp), %xmm11 + movsd -56(%rsp), %xmm10 + addsd %xmm10, %xmm11 + movsd %xmm11, -56(%rsp) + movq -72(%rsp), %rdx + movsd -56(%rsp), %xmm12 + movq %rdx, -40(%rsp) + addsd %xmm12, %xmm0 + movsd -40(%rsp), %xmm13 + addsd -32(%rsp), %xmm0 + movsd %xmm0, -32(%rsp) + addsd %xmm0, %xmm13 + movsd %xmm13, -24(%rsp) + movb -17(%rsp), %r10b + andb $127, %r10b + orb %cl, %r10b + movb %r10b, -17(%rsp) + movsd -24(%rsp), %xmm0 + cvtsd2ss %xmm0, %xmm0 + movss %xmm0, (%r8) + jmp L(23) + +L(14): + movsd -48(%rsp), %xmm12 + movb %dl, %dil + movaps %xmm12, %xmm7 + mulsd 2000+__satan2_la_CoutTab(%rip), %xmm7 + shlb $7, %dil + shlb $7, %cl + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm8 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm13 + movsd 1888+__satan2_la_CoutTab(%rip), %xmm7 + mulsd %xmm2, %xmm13 + subsd -48(%rsp), %xmm8 + movsd %xmm8, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm9 + subsd %xmm9, %xmm10 + movsd %xmm10, -72(%rsp) + movsd -72(%rsp), %xmm11 + subsd %xmm11, %xmm12 + movsd %xmm12, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm5 + movsd %xmm13, -72(%rsp) + movsd -72(%rsp), %xmm14 + subsd %xmm2, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -72(%rsp), %xmm4 + movsd -64(%rsp), %xmm15 + subsd %xmm15, %xmm4 + movsd %xmm4, -72(%rsp) + movsd -72(%rsp), %xmm3 + movsd 1888+__satan2_la_CoutTab(%rip), %xmm4 + subsd %xmm3, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -72(%rsp), %xmm12 + divsd %xmm12, %xmm7 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm2 + mulsd %xmm7, %xmm2 + movsd -64(%rsp), %xmm14 + movsd %xmm2, -64(%rsp) + movsd -64(%rsp), %xmm8 + subsd %xmm7, %xmm8 + movsd %xmm8, -56(%rsp) + movsd -64(%rsp), %xmm10 + movsd -56(%rsp), %xmm9 + subsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movsd -56(%rsp), %xmm11 + mulsd %xmm11, %xmm12 + movsd -56(%rsp), %xmm13 + subsd %xmm12, %xmm4 + mulsd %xmm13, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -64(%rsp), %xmm15 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm13 + subsd %xmm15, %xmm4 + movsd %xmm4, -64(%rsp) + movsd -64(%rsp), %xmm7 + movq -56(%rsp), %rax + movsd -64(%rsp), %xmm2 + movsd 
-56(%rsp), %xmm3 + movq %rax, -40(%rsp) + movsd -40(%rsp), %xmm8 + movaps %xmm8, %xmm9 + addsd 1888+__satan2_la_CoutTab(%rip), %xmm7 + mulsd %xmm6, %xmm9 + mulsd %xmm5, %xmm8 + mulsd %xmm2, %xmm7 + movsd -16(%rsp), %xmm2 + mulsd %xmm2, %xmm2 + mulsd %xmm3, %xmm7 + movsd 1872+__satan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + mulsd %xmm7, %xmm6 + mulsd %xmm5, %xmm7 + addsd 1864+__satan2_la_CoutTab(%rip), %xmm3 + addsd %xmm7, %xmm6 + mulsd %xmm2, %xmm3 + addsd %xmm8, %xmm6 + addsd 1856+__satan2_la_CoutTab(%rip), %xmm3 + mulsd %xmm2, %xmm3 + movaps %xmm9, %xmm5 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm4 + addsd 1848+__satan2_la_CoutTab(%rip), %xmm3 + addsd %xmm4, %xmm5 + mulsd %xmm2, %xmm3 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm6 + movsd 2000+__satan2_la_CoutTab(%rip), %xmm5 + subsd %xmm6, %xmm9 + addsd 1840+__satan2_la_CoutTab(%rip), %xmm3 + addsd %xmm4, %xmm9 + mulsd %xmm2, %xmm3 + movsd %xmm9, -64(%rsp) + movsd -72(%rsp), %xmm11 + mulsd %xmm11, %xmm5 + addsd 1832+__satan2_la_CoutTab(%rip), %xmm3 + movsd -64(%rsp), %xmm4 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm7 + mulsd %xmm2, %xmm3 + subsd %xmm11, %xmm7 + movsd %xmm7, -64(%rsp) + movsd -72(%rsp), %xmm8 + movsd -64(%rsp), %xmm6 + addsd 1824+__satan2_la_CoutTab(%rip), %xmm3 + subsd %xmm6, %xmm8 + mulsd %xmm2, %xmm3 + movsd %xmm8, -72(%rsp) + movsd -72(%rsp), %xmm10 + mulsd %xmm3, %xmm13 + subsd %xmm10, %xmm11 + movsd %xmm11, -64(%rsp) + movsd -72(%rsp), %xmm2 + movsd -64(%rsp), %xmm12 + movsd %xmm13, -72(%rsp) + addsd %xmm12, %xmm4 + movsd -72(%rsp), %xmm14 + subsd %xmm3, %xmm14 + movsd %xmm14, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm15 + subsd %xmm15, %xmm5 + movsd %xmm5, -72(%rsp) + movsd -72(%rsp), %xmm6 + subsd %xmm6, %xmm3 + movsd %xmm3, -64(%rsp) + movsd -72(%rsp), %xmm6 + movsd -64(%rsp), %xmm5 + movaps %xmm6, %xmm12 + movaps %xmm5, %xmm3 + mulsd %xmm4, %xmm6 + mulsd %xmm4, %xmm3 + mulsd %xmm2, %xmm5 + mulsd %xmm2, %xmm12 + addsd %xmm3, %xmm6 + movaps %xmm12, %xmm7 + movaps %xmm12, %xmm8 + addsd %xmm5, %xmm6 + addsd %xmm2, %xmm7 + movsd %xmm6, -72(%rsp) + movsd -72(%rsp), %xmm5 + movsd %xmm7, -72(%rsp) + movsd -72(%rsp), %xmm3 + subsd %xmm3, %xmm8 + movsd %xmm8, -64(%rsp) + movsd -72(%rsp), %xmm10 + movsd -64(%rsp), %xmm9 + addsd %xmm9, %xmm10 + movsd %xmm10, -56(%rsp) + movsd -64(%rsp), %xmm11 + addsd %xmm11, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -56(%rsp), %xmm2 + subsd %xmm2, %xmm12 + movsd %xmm12, -56(%rsp) + movsd -64(%rsp), %xmm14 + movsd -56(%rsp), %xmm13 + addsd %xmm13, %xmm14 + movsd %xmm14, -56(%rsp) + movq -72(%rsp), %rsi + movsd -56(%rsp), %xmm15 + movq %rsi, -40(%rsp) + addsd %xmm15, %xmm4 + shrq $56, %rsi + addsd %xmm5, %xmm4 + andb $127, %sil + orb %dil, %sil + movb %sil, -33(%rsp) + movsd %xmm4, -32(%rsp) + movaps %xmm1, %xmm4 + movsd -40(%rsp), %xmm7 + movaps %xmm7, %xmm2 + addsd %xmm7, %xmm4 + movsd %xmm4, -72(%rsp) + movsd -72(%rsp), %xmm4 + movb -25(%rsp), %r9b + movb %r9b, %r10b + shrb $7, %r9b + subsd %xmm4, %xmm2 + movsd %xmm2, -64(%rsp) + movsd -72(%rsp), %xmm5 + movsd -64(%rsp), %xmm3 + xorb %r9b, %dl + andb $127, %r10b + shlb $7, %dl + addsd %xmm3, %xmm5 + movsd %xmm5, -56(%rsp) + movsd -64(%rsp), %xmm6 + addsd %xmm6, %xmm1 + movsd %xmm1, -64(%rsp) + orb %dl, %r10b + movsd -56(%rsp), %xmm1 + movb %r10b, -25(%rsp) + subsd %xmm1, %xmm7 + movsd %xmm7, -56(%rsp) + movsd -64(%rsp), %xmm2 + movsd -56(%rsp), %xmm1 + addsd %xmm1, %xmm2 + movsd %xmm2, -56(%rsp) + movq -72(%rsp), %rdx + movsd -56(%rsp), %xmm3 + movq %rdx, -40(%rsp) + addsd %xmm3, %xmm0 + movsd -40(%rsp), 
%xmm4 + addsd -32(%rsp), %xmm0 + movsd %xmm0, -32(%rsp) + addsd %xmm0, %xmm4 + movsd %xmm4, -24(%rsp) + movb -17(%rsp), %r11b + andb $127, %r11b + orb %cl, %r11b + movb %r11b, -17(%rsp) + movsd -24(%rsp), %xmm0 + cvtsd2ss %xmm0, %xmm0 + movss %xmm0, (%r8) + jmp L(23) + +L(15): + cmpl $74, %r9d + jge L(16) + movb %dil, -41(%rsp) + divsd -48(%rsp), %xmm1 + movsd 1928+__satan2_la_CoutTab(%rip), %xmm0 + shlb $7, %cl + subsd %xmm1, %xmm0 + addsd 1920+__satan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm1 + cvtsd2ss %xmm1, %xmm1 + movss %xmm1, (%r8) + jmp L(23) + +L(16): + movsd 1920+__satan2_la_CoutTab(%rip), %xmm0 + shlb $7, %cl + addsd 1928+__satan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm1 + cvtsd2ss %xmm1, %xmm1 + movss %xmm1, (%r8) + jmp L(23) + +L(17): + testb %dl, %dl + jne L(22) + movb %dil, -41(%rsp) + pxor %xmm0, %xmm0 + movb %sil, -33(%rsp) + movsd -48(%rsp), %xmm2 + divsd -40(%rsp), %xmm2 + cvtsd2ss %xmm2, %xmm0 + movss %xmm0, -8(%rsp) + movzwl -6(%rsp), %eax + movsd %xmm2, -24(%rsp) + testl $32640, %eax + je L(18) + movsd 1888+__satan2_la_CoutTab(%rip), %xmm0 + shlb $7, %cl + addsd %xmm2, %xmm0 + movsd %xmm0, -72(%rsp) + movsd -72(%rsp), %xmm1 + mulsd %xmm1, %xmm2 + movsd %xmm2, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm2 + cvtsd2ss %xmm2, %xmm2 + movss %xmm2, (%r8) + jmp L(23) + +L(18): + movsd -24(%rsp), %xmm0 + cvtsd2ss %xmm0, %xmm0 + shlb $7, %cl + movss %xmm0, -8(%rsp) + movss -8(%rsp), %xmm2 + movss -8(%rsp), %xmm1 + mulss %xmm1, %xmm2 + movss %xmm2, -8(%rsp) + movss -8(%rsp), %xmm3 + cvtss2sd %xmm3, %xmm3 + addsd -24(%rsp), %xmm3 + movsd %xmm3, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm4 + cvtsd2ss %xmm4, %xmm4 + movss %xmm4, (%r8) + jmp L(23) + +L(19): + testl %eax, %eax + jne L(16) + testl $8388607, -32(%rsp) + jne L(16) + +L(20): + testb %dl, %dl + jne L(22) + +L(21): + shlb $7, %cl + movq 1976+__satan2_la_CoutTab(%rip), %rax + movq %rax, -24(%rsp) + shrq $56, %rax + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm0 + cvtsd2ss %xmm0, %xmm0 + movss %xmm0, (%r8) + jmp L(23) + +L(22): + movsd 1936+__satan2_la_CoutTab(%rip), %xmm0 + shlb $7, %cl + addsd 1944+__satan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm1 + cvtsd2ss %xmm1, %xmm1 + movss %xmm1, (%r8) + +L(23): + xorl %eax, %eax + ret + +L(24): + movsd 1984+__satan2_la_CoutTab(%rip), %xmm3 + movl $-1022, %eax + mulsd %xmm3, %xmm4 + movsd %xmm4, -48(%rsp) + jmp L(13) + +L(25): + cmpl $2047, %eax + je L(32) + +L(26): + cmpl $2047, %r9d + je L(30) + +L(27): + movzwl -26(%rsp), %eax + andl $32640, %eax + cmpl $32640, %eax + jne L(16) + cmpl $255, %edi + je L(28) + testb %dl, %dl + je L(21) + jmp L(22) + +L(28): + testb %dl, %dl + jne L(29) + movsd 1904+__satan2_la_CoutTab(%rip), %xmm0 + shlb $7, %cl + addsd 1912+__satan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, -24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm1 + cvtsd2ss %xmm1, %xmm1 + movss %xmm1, (%r8) + jmp L(23) + +L(29): + movsd 1952+__satan2_la_CoutTab(%rip), %xmm0 + shlb $7, %cl + addsd 1960+__satan2_la_CoutTab(%rip), %xmm0 + movsd %xmm0, 
-24(%rsp) + movb -17(%rsp), %al + andb $127, %al + orb %cl, %al + movb %al, -17(%rsp) + movsd -24(%rsp), %xmm1 + cvtsd2ss %xmm1, %xmm1 + movss %xmm1, (%r8) + jmp L(23) + +L(30): + testl $8388607, -28(%rsp) + je L(27) + +L(31): + addss %xmm2, %xmm3 + movss %xmm3, (%r8) + jmp L(23) + +L(32): + testl $8388607, -32(%rsp) + jne L(31) + jmp L(26) + + cfi_endproc + + .type __svml_satan2_cout_rare_internal,@function + .size __svml_satan2_cout_rare_internal,.-__svml_satan2_cout_rare_internal + + .section .rodata, "a" + .align 64 + +__svml_satan2_data_internal: + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .long 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .long 1073741824 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .long 2147483648 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .long 2147483647 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .long 1070141403 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .long 1078530011 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + 
.byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 993144000 + .long 993144000 + .long 993144000 + .long 993144000 + .long 993144000 + .long 993144000 + .long 993144000 + .long 993144000 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .long 3162449457 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .long 1026278276 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .long 3180885545 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .long 1037657204 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .long 3188810232 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .long 1045215135 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 3198855753 + 
.long 3198855753 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .long 3198855753 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .long 1065353216 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .long 2164260864 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .long 4227858432 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .byte 0 + .type __svml_satan2_data_internal,@object + .size __svml_satan2_data_internal,1152 + .align 32 + +__satan2_la_CoutTab: + .long 3892314112 + .long 1069799150 + .long 2332892550 + .long 1039715405 + .long 1342177280 + .long 1070305495 + .long 270726690 + .long 1041535749 + .long 939524096 + .long 1070817911 + .long 2253973841 + .long 3188654726 + .long 3221225472 + .long 1071277294 + .long 3853927037 + .long 1043226911 + .long 2818572288 + .long 1071767563 + .long 2677759107 + .long 1044314101 + .long 3355443200 + .long 1072103591 + .long 1636578514 + .long 3191094734 + .long 1476395008 + .long 1072475260 + .long 1864703685 + .long 3188646936 + .long 805306368 + .long 1072747407 + .long 192551812 + .long 3192726267 + .long 2013265920 + .long 1072892781 + .long 2240369452 + .long 1043768538 + .long 0 + .long 1072999953 + .long 3665168337 + .long 3192705970 + .long 402653184 + .long 1073084787 + .long 1227953434 + .long 3192313277 + .long 2013265920 + .long 1073142981 + .long 3853283127 + .long 1045277487 + .long 805306368 + .long 1073187261 + .long 1676192264 + .long 3192868861 + .long 134217728 + .long 1073217000 + .long 4290763938 + .long 1042034855 + .long 671088640 + .long 1073239386 + .long 994303084 + .long 3189643768 + .long 402653184 + .long 1073254338 + .long 1878067156 + .long 1042652475 + .long 1610612736 + .long 1073265562 + .long 670314820 + .long 1045138554 + .long 3221225472 + .long 1073273048 + .long 691126919 + .long 3189987794 + .long 3489660928 + .long 1073278664 + .long 1618990832 + .long 3188194509 + .long 1207959552 + .long 1073282409 + .long 2198872939 + .long 1044806069 + .long 3489660928 + .long 1073285217 + .long 2633982383 + .long 
1042307894 + .long 939524096 + .long 1073287090 + .long 1059367786 + .long 3189114230 + .long 2281701376 + .long 1073288494 + .long 3158525533 + .long 1044484961 + .long 3221225472 + .long 1073289430 + .long 286581777 + .long 1044893263 + .long 4026531840 + .long 1073290132 + .long 2000245215 + .long 3191647611 + .long 134217728 + .long 1073290601 + .long 4205071590 + .long 1045035927 + .long 536870912 + .long 1073290952 + .long 2334392229 + .long 1043447393 + .long 805306368 + .long 1073291186 + .long 2281458177 + .long 3188885569 + .long 3087007744 + .long 1073291361 + .long 691611507 + .long 1044733832 + .long 3221225472 + .long 1073291478 + .long 1816229550 + .long 1044363390 + .long 2281701376 + .long 1073291566 + .long 1993843750 + .long 3189837440 + .long 134217728 + .long 1073291625 + .long 3654754496 + .long 1044970837 + .long 4026531840 + .long 1073291668 + .long 3224300229 + .long 3191935390 + .long 805306368 + .long 1073291698 + .long 2988777976 + .long 3188950659 + .long 536870912 + .long 1073291720 + .long 1030371341 + .long 1043402665 + .long 3221225472 + .long 1073291734 + .long 1524463765 + .long 1044361356 + .long 3087007744 + .long 1073291745 + .long 2754295320 + .long 1044731036 + .long 134217728 + .long 1073291753 + .long 3099629057 + .long 1044970710 + .long 2281701376 + .long 1073291758 + .long 962914160 + .long 3189838838 + .long 805306368 + .long 1073291762 + .long 3543908206 + .long 3188950786 + .long 4026531840 + .long 1073291764 + .long 1849909620 + .long 3191935434 + .long 3221225472 + .long 1073291766 + .long 1641333636 + .long 1044361352 + .long 536870912 + .long 1073291768 + .long 1373968792 + .long 1043402654 + .long 134217728 + .long 1073291769 + .long 2033191599 + .long 1044970710 + .long 3087007744 + .long 1073291769 + .long 4117947437 + .long 1044731035 + .long 805306368 + .long 1073291770 + .long 315378368 + .long 3188950787 + .long 2281701376 + .long 1073291770 + .long 2428571750 + .long 3189838838 + .long 3221225472 + .long 1073291770 + .long 1608007466 + .long 1044361352 + .long 4026531840 + .long 1073291770 + .long 1895711420 + .long 3191935434 + .long 134217728 + .long 1073291771 + .long 2031108713 + .long 1044970710 + .long 536870912 + .long 1073291771 + .long 1362518342 + .long 1043402654 + .long 805306368 + .long 1073291771 + .long 317461253 + .long 3188950787 + .long 939524096 + .long 1073291771 + .long 4117231784 + .long 1044731035 + .long 1073741824 + .long 1073291771 + .long 1607942376 + .long 1044361352 + .long 1207959552 + .long 1073291771 + .long 2428929577 + .long 3189838838 + .long 1207959552 + .long 1073291771 + .long 2031104645 + .long 1044970710 + .long 1342177280 + .long 1073291771 + .long 1895722602 + .long 3191935434 + .long 1342177280 + .long 1073291771 + .long 317465322 + .long 3188950787 + .long 1342177280 + .long 1073291771 + .long 1362515546 + .long 1043402654 + .long 1342177280 + .long 1073291771 + .long 1607942248 + .long 1044361352 + .long 1342177280 + .long 1073291771 + .long 4117231610 + .long 1044731035 + .long 1342177280 + .long 1073291771 + .long 2031104637 + .long 1044970710 + .long 1342177280 + .long 1073291771 + .long 1540251232 + .long 1045150466 + .long 1342177280 + .long 1073291771 + .long 2644671394 + .long 1045270303 + .long 1342177280 + .long 1073291771 + .long 2399244691 + .long 1045360181 + .long 1342177280 + .long 1073291771 + .long 803971124 + .long 1045420100 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192879152 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 
3192849193 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192826724 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192811744 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192800509 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192793019 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192787402 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192783657 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192780848 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192778976 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192777572 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192776635 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192775933 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192775465 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192775114 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774880 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192774704 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774587 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192774500 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774441 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192774397 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774368 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192774346 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774331 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192774320 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774313 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192774308 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774304 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192774301 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774299 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192774298 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774297 + .long 1476395008 + .long 1073291771 + .long 3613709523 + .long 3192774296 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774296 + .long 1476395008 + .long 1073291771 + .long 177735686 + .long 3192774296 + .long 1476395008 + .long 1073291771 + .long 3490996172 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 2754716064 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 2263862659 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1895722605 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1650295902 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1466225875 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1343512524 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1251477510 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1190120835 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1144103328 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1113424990 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1090416237 + .long 3192774295 + .long 1476395008 + .long 1073291771 + .long 1075077068 + .long 3192774295 + .long 1431655765 + .long 3218429269 + .long 
2576978363
+	.long	1070176665
+	.long	2453154343
+	.long	3217180964
+	.long	4189149139
+	.long	1069314502
+	.long	1775019125
+	.long	3216459198
+	.long	273199057
+	.long	1068739452
+	.long	874748308
+	.long	3215993277
+	.long	0
+	.long	1069547520
+	.long	0
+	.long	1072693248
+	.long	0
+	.long	1073741824
+	.long	1413754136
+	.long	1072243195
+	.long	856972295
+	.long	1015129638
+	.long	1413754136
+	.long	1073291771
+	.long	856972295
+	.long	1016178214
+	.long	1413754136
+	.long	1074340347
+	.long	856972295
+	.long	1017226790
+	.long	2134057426
+	.long	1073928572
+	.long	1285458442
+	.long	1016756537
+	.long	0
+	.long	3220176896
+	.long	0
+	.long	0
+	.long	0
+	.long	2144337920
+	.long	0
+	.long	1048576
+	.long	33554432
+	.long	1101004800
+	.type	__satan2_la_CoutTab,@object
+	.size	__satan2_la_CoutTab,2008
diff --git a/sysdeps/x86_64/fpu/svml_d_atan22_core.S b/sysdeps/x86_64/fpu/svml_d_atan22_core.S
new file mode 100644
index 0000000000..f3089e70f9
--- /dev/null
+++ b/sysdeps/x86_64/fpu/svml_d_atan22_core.S
@@ -0,0 +1,29 @@
+/* Function atan2 vectorized with SSE2.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include "svml_d_wrapper_impl.h"
+
+	.text
+ENTRY (_ZGVbN2vv_atan2)
+WRAPPER_IMPL_SSE2_ff atan2
+END (_ZGVbN2vv_atan2)
+
+#ifndef USE_MULTIARCH
+libmvec_hidden_def (_ZGVbN2vv_atan2)
+#endif
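The plain *_core.S wrappers in this part of the patch exist so that every symbol required by the vector ABI is available even when no dedicated kernel is selected: WRAPPER_IMPL_SSE2_ff above applies the scalar atan2 to each of the two lanes, and the AVX-width wrappers that follow run the 2-lane kernel once per 128-bit half. A rough intrinsics model of that half-splitting (the model function name is hypothetical; _ZGVbN2vv_atan2 is the real 2-lane symbol from this patch):

#include <immintrin.h>

extern __m128d _ZGVbN2vv_atan2 (__m128d, __m128d);

/* Sketch of what WRAPPER_IMPL_AVX_ff does for _ZGVdN4vv_atan2:
   serve a 4-lane ymm request with two 2-lane xmm calls.  */
__m256d
zgvdn4vv_atan2_model (__m256d y, __m256d x)
{
  __m128d lo = _ZGVbN2vv_atan2 (_mm256_castpd256_pd128 (y),
				_mm256_castpd256_pd128 (x));
  __m128d hi = _ZGVbN2vv_atan2 (_mm256_extractf128_pd (y, 1),
				_mm256_extractf128_pd (x, 1));
  return _mm256_insertf128_pd (_mm256_castpd128_pd256 (lo), hi, 1);
}

This keeps every width correct at the cost of issuing the narrow kernel twice; the multiarch kernels earlier in the patch replace these wrappers when the CPU supports them.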
diff --git a/sysdeps/x86_64/fpu/svml_d_atan24_core.S b/sysdeps/x86_64/fpu/svml_d_atan24_core.S
new file mode 100644
index 0000000000..8a163d12d2
--- /dev/null
+++ b/sysdeps/x86_64/fpu/svml_d_atan24_core.S
@@ -0,0 +1,29 @@
+/* Function atan2 vectorized with AVX2, wrapper version.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include "svml_d_wrapper_impl.h"
+
+	.text
+ENTRY (_ZGVdN4vv_atan2)
+WRAPPER_IMPL_AVX_ff _ZGVbN2vv_atan2
+END (_ZGVdN4vv_atan2)
+
+#ifndef USE_MULTIARCH
+libmvec_hidden_def (_ZGVdN4vv_atan2)
+#endif
diff --git a/sysdeps/x86_64/fpu/svml_d_atan24_core_avx.S b/sysdeps/x86_64/fpu/svml_d_atan24_core_avx.S
new file mode 100644
index 0000000000..0ee5ae8faf
--- /dev/null
+++ b/sysdeps/x86_64/fpu/svml_d_atan24_core_avx.S
@@ -0,0 +1,25 @@
+/* Function atan2 vectorized in AVX ISA as wrapper to SSE4 ISA version.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include "svml_d_wrapper_impl.h"
+
+	.text
+ENTRY (_ZGVcN4vv_atan2)
+WRAPPER_IMPL_AVX_ff _ZGVbN2vv_atan2
+END (_ZGVcN4vv_atan2)
diff --git a/sysdeps/x86_64/fpu/svml_d_atan28_core.S b/sysdeps/x86_64/fpu/svml_d_atan28_core.S
new file mode 100644
index 0000000000..b85f696686
--- /dev/null
+++ b/sysdeps/x86_64/fpu/svml_d_atan28_core.S
@@ -0,0 +1,25 @@
+/* Function atan2 vectorized with AVX-512. Wrapper to AVX2 version.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include "svml_d_wrapper_impl.h"
+
+	.text
+ENTRY (_ZGVeN8vv_atan2)
+WRAPPER_IMPL_AVX512_ff _ZGVdN4vv_atan2
+END (_ZGVeN8vv_atan2)
diff --git a/sysdeps/x86_64/fpu/svml_s_atan2f16_core.S b/sysdeps/x86_64/fpu/svml_s_atan2f16_core.S
new file mode 100644
index 0000000000..25acb31dfb
--- /dev/null
+++ b/sysdeps/x86_64/fpu/svml_s_atan2f16_core.S
@@ -0,0 +1,25 @@
+/* Function atan2f vectorized with AVX-512. Wrapper to AVX2 version.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include "svml_s_wrapper_impl.h"
+
+	.text
+ENTRY (_ZGVeN16vv_atan2f)
+WRAPPER_IMPL_AVX512_ff _ZGVdN8vv_atan2f
+END (_ZGVeN16vv_atan2f)
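The exported names encode the x86_64 vector function ABI: _ZGV, an ISA letter (b = SSE4, c = AVX, d = AVX2, e = AVX-512), N for unmasked, the lane count, and one 'v' per vector argument, so _ZGVeN16vv_atan2f above is the unmasked 16-lane AVX-512 atan2f taking two vector operands. Applications normally reach these through the compiler rather than by name; a sketch of a call site GCC can vectorize onto them once the patch's math-vector.h declarations are in place (the build flags are illustrative and vary by compiler version):

/* Build sketch: gcc -O2 -fopenmp-simd ... -lmvec  */
#include <math.h>

void
polar_angles (const float *y, const float *x, float *t, int n)
{
  /* GCC may emit _ZGVbN4vv_atan2f, _ZGVdN8vv_atan2f or
     _ZGVeN16vv_atan2f for this loop, depending on the target ISA.  */
#pragma omp simd
  for (int i = 0; i < n; i++)
    t[i] = atan2f (y[i], x[i]);
}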
diff --git a/sysdeps/x86_64/fpu/svml_s_atan2f4_core.S b/sysdeps/x86_64/fpu/svml_s_atan2f4_core.S
new file mode 100644
index 0000000000..bc99f0ba10
--- /dev/null
+++ b/sysdeps/x86_64/fpu/svml_s_atan2f4_core.S
@@ -0,0 +1,29 @@
+/* Function atan2f vectorized with SSE2.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include "svml_s_wrapper_impl.h"
+
+	.text
+ENTRY (_ZGVbN4vv_atan2f)
+WRAPPER_IMPL_SSE2_ff atan2f
+END (_ZGVbN4vv_atan2f)
+
+#ifndef USE_MULTIARCH
+libmvec_hidden_def (_ZGVbN4vv_atan2f)
+#endif
diff --git a/sysdeps/x86_64/fpu/svml_s_atan2f8_core.S b/sysdeps/x86_64/fpu/svml_s_atan2f8_core.S
new file mode 100644
index 0000000000..bfcdb3c372
--- /dev/null
+++ b/sysdeps/x86_64/fpu/svml_s_atan2f8_core.S
@@ -0,0 +1,29 @@
+/* Function atan2f vectorized with AVX2, wrapper version.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include "svml_s_wrapper_impl.h"
+
+	.text
+ENTRY (_ZGVdN8vv_atan2f)
+WRAPPER_IMPL_AVX_ff _ZGVbN4vv_atan2f
+END (_ZGVdN8vv_atan2f)
+
+#ifndef USE_MULTIARCH
+libmvec_hidden_def (_ZGVdN8vv_atan2f)
+#endif
diff --git a/sysdeps/x86_64/fpu/svml_s_atan2f8_core_avx.S b/sysdeps/x86_64/fpu/svml_s_atan2f8_core_avx.S
new file mode 100644
index 0000000000..1aa8d05822
--- /dev/null
+++ b/sysdeps/x86_64/fpu/svml_s_atan2f8_core_avx.S
@@ -0,0 +1,25 @@
+/* Function atan2f vectorized in AVX ISA as wrapper to SSE4 ISA version.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include "svml_s_wrapper_impl.h"
+
+	.text
+ENTRY(_ZGVcN8vv_atan2f)
+WRAPPER_IMPL_AVX_ff _ZGVbN4vv_atan2f
+END(_ZGVcN8vv_atan2f)
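The test additions below follow the existing libmvec pattern: each one-line test-*-libmvec-atan2* file instantiates the generic two-argument ABI test (test-vector-abi-arg2.h) at one ISA level, and the VECTOR_WRAPPER_ff entries let the scalar accuracy testsuite drive the vector entry points. A simplified model of what such a two-argument wrapper does (the function name and the lane-agreement check are illustrative, not the exact macro expansion):

/* Broadcast the scalar arguments to all lanes, call the vector
   function, and hand one lane back to the scalar test driver.  */
typedef double vec2d __attribute__ ((vector_size (16)));
extern vec2d _ZGVbN2vv_atan2 (vec2d, vec2d);

double
atan2_vlen2_wrapper_model (double y, double x)
{
  vec2d my = { y, y };
  vec2d mx = { x, x };
  vec2d mr = _ZGVbN2vv_atan2 (my, mx);
  double lanes[2] = { mr[0], mr[1] };
  /* Identical inputs must produce bitwise-identical lanes.  */
  if (__builtin_memcmp (&lanes[0], &lanes[1], sizeof (double)) != 0)
    __builtin_abort ();
  return lanes[0];
}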
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include "svml_s_wrapper_impl.h"
+
+	.text
+ENTRY (_ZGVdN8vv_atan2f)
+WRAPPER_IMPL_AVX_ff _ZGVbN4vv_atan2f
+END (_ZGVdN8vv_atan2f)
+
+#ifndef USE_MULTIARCH
+ libmvec_hidden_def (_ZGVdN8vv_atan2f)
+#endif
diff --git a/sysdeps/x86_64/fpu/svml_s_atan2f8_core_avx.S b/sysdeps/x86_64/fpu/svml_s_atan2f8_core_avx.S
new file mode 100644
index 0000000000..1aa8d05822
--- /dev/null
+++ b/sysdeps/x86_64/fpu/svml_s_atan2f8_core_avx.S
@@ -0,0 +1,25 @@
+/* Function atan2f vectorized in AVX ISA as wrapper to SSE4 ISA version.
+   Copyright (C) 2021 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+#include "svml_s_wrapper_impl.h"
+
+	.text
+ENTRY(_ZGVcN8vv_atan2f)
+WRAPPER_IMPL_AVX_ff _ZGVbN4vv_atan2f
+END(_ZGVcN8vv_atan2f)
diff --git a/sysdeps/x86_64/fpu/test-double-libmvec-atan2-avx.c b/sysdeps/x86_64/fpu/test-double-libmvec-atan2-avx.c
new file mode 100644
index 0000000000..e423bce25b
--- /dev/null
+++ b/sysdeps/x86_64/fpu/test-double-libmvec-atan2-avx.c
@@ -0,0 +1 @@
+#include "test-double-libmvec-atan2.c"
diff --git a/sysdeps/x86_64/fpu/test-double-libmvec-atan2-avx2.c b/sysdeps/x86_64/fpu/test-double-libmvec-atan2-avx2.c
new file mode 100644
index 0000000000..e423bce25b
--- /dev/null
+++ b/sysdeps/x86_64/fpu/test-double-libmvec-atan2-avx2.c
@@ -0,0 +1 @@
+#include "test-double-libmvec-atan2.c"
diff --git a/sysdeps/x86_64/fpu/test-double-libmvec-atan2-avx512f.c b/sysdeps/x86_64/fpu/test-double-libmvec-atan2-avx512f.c
new file mode 100644
index 0000000000..e423bce25b
--- /dev/null
+++ b/sysdeps/x86_64/fpu/test-double-libmvec-atan2-avx512f.c
@@ -0,0 +1 @@
+#include "test-double-libmvec-atan2.c"
diff --git a/sysdeps/x86_64/fpu/test-double-libmvec-atan2.c b/sysdeps/x86_64/fpu/test-double-libmvec-atan2.c
new file mode 100644
index 0000000000..d0aa626d95
--- /dev/null
+++ b/sysdeps/x86_64/fpu/test-double-libmvec-atan2.c
@@ -0,0 +1,3 @@
+#define LIBMVEC_TYPE double
+#define LIBMVEC_FUNC atan2
+#include "test-vector-abi-arg2.h"
diff --git a/sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c b/sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c
index 7abe3211c8..cd802e0c6d 100644
--- a/sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c
+++ b/sysdeps/x86_64/fpu/test-double-vlen2-wrappers.c
@@ -32,6 +32,7 @@ VECTOR_WRAPPER (WRAPPER_NAME (acosh), _ZGVbN2v_acosh)
 VECTOR_WRAPPER (WRAPPER_NAME (asin), _ZGVbN2v_asin)
 VECTOR_WRAPPER (WRAPPER_NAME (asinh), _ZGVbN2v_asinh)
 VECTOR_WRAPPER (WRAPPER_NAME (atan), _ZGVbN2v_atan)
+VECTOR_WRAPPER_ff (WRAPPER_NAME (atan2), _ZGVbN2vv_atan2)
 
 #define VEC_INT_TYPE __m128i
 
diff --git a/sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c b/sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c
index 1537ed25cc..a04980e87a 100644
--- a/sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c
+++ b/sysdeps/x86_64/fpu/test-double-vlen4-avx2-wrappers.c
@@ -35,6 +35,7 @@ VECTOR_WRAPPER (WRAPPER_NAME (acosh), _ZGVdN4v_acosh)
 VECTOR_WRAPPER (WRAPPER_NAME (asin), _ZGVdN4v_asin)
 VECTOR_WRAPPER (WRAPPER_NAME (asinh), _ZGVdN4v_asinh)
 VECTOR_WRAPPER (WRAPPER_NAME (atan), _ZGVdN4v_atan)
+VECTOR_WRAPPER_ff (WRAPPER_NAME (atan2), _ZGVdN4vv_atan2)
 
 #ifndef __ILP32__
 # define VEC_INT_TYPE __m256i
diff --git a/sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c b/sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c
index 27bcc9c59a..9c602445e7 100644
--- a/sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c
+++ b/sysdeps/x86_64/fpu/test-double-vlen4-wrappers.c
@@ -32,6 +32,7 @@ VECTOR_WRAPPER (WRAPPER_NAME (acosh), _ZGVcN4v_acosh)
 VECTOR_WRAPPER (WRAPPER_NAME (asin), _ZGVcN4v_asin)
 VECTOR_WRAPPER (WRAPPER_NAME (asinh), _ZGVcN4v_asinh)
 VECTOR_WRAPPER (WRAPPER_NAME (atan), _ZGVcN4v_atan)
+VECTOR_WRAPPER_ff (WRAPPER_NAME (atan2), _ZGVcN4vv_atan2)
 
 #define VEC_INT_TYPE __m128i
 
diff --git a/sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c b/sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c
index 2333349893..d1e4b8dd01 100644
--- a/sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c
+++ b/sysdeps/x86_64/fpu/test-double-vlen8-wrappers.c
@@ -32,6 +32,7 @@ VECTOR_WRAPPER (WRAPPER_NAME (acosh), _ZGVeN8v_acosh)
 VECTOR_WRAPPER (WRAPPER_NAME (asin), _ZGVeN8v_asin)
 VECTOR_WRAPPER (WRAPPER_NAME (asinh), _ZGVeN8v_asinh)
 VECTOR_WRAPPER (WRAPPER_NAME (atan), _ZGVeN8v_atan)
+VECTOR_WRAPPER_ff (WRAPPER_NAME (atan2), _ZGVeN8vv_atan2)
 
 #ifndef __ILP32__
 # define VEC_INT_TYPE __m512i
diff --git a/sysdeps/x86_64/fpu/test-float-libmvec-atan2f-avx.c b/sysdeps/x86_64/fpu/test-float-libmvec-atan2f-avx.c
new file mode 100644
index 0000000000..5c7e2c9ad5
--- /dev/null
+++ b/sysdeps/x86_64/fpu/test-float-libmvec-atan2f-avx.c
@@ -0,0 +1 @@
+#include "test-float-libmvec-atan2f.c"
diff --git a/sysdeps/x86_64/fpu/test-float-libmvec-atan2f-avx2.c b/sysdeps/x86_64/fpu/test-float-libmvec-atan2f-avx2.c
new file mode 100644
index 0000000000..5c7e2c9ad5
--- /dev/null
+++ b/sysdeps/x86_64/fpu/test-float-libmvec-atan2f-avx2.c
@@ -0,0 +1 @@
+#include "test-float-libmvec-atan2f.c"
diff --git a/sysdeps/x86_64/fpu/test-float-libmvec-atan2f-avx512f.c b/sysdeps/x86_64/fpu/test-float-libmvec-atan2f-avx512f.c
new file mode 100644
index 0000000000..5c7e2c9ad5
--- /dev/null
+++ b/sysdeps/x86_64/fpu/test-float-libmvec-atan2f-avx512f.c
@@ -0,0 +1 @@
+#include "test-float-libmvec-atan2f.c"
diff --git a/sysdeps/x86_64/fpu/test-float-libmvec-atan2f.c b/sysdeps/x86_64/fpu/test-float-libmvec-atan2f.c
new file mode 100644
index 0000000000..beb5c745cb
--- /dev/null
+++ b/sysdeps/x86_64/fpu/test-float-libmvec-atan2f.c
@@ -0,0 +1,3 @@
+#define LIBMVEC_TYPE float
+#define LIBMVEC_FUNC atan2f
+#include "test-vector-abi-arg2.h"
diff --git a/sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c b/sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c
index 723651140e..65e0c2af7d 100644
--- a/sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c
+++ b/sysdeps/x86_64/fpu/test-float-vlen16-wrappers.c
@@ -32,6 +32,7 @@ VECTOR_WRAPPER (WRAPPER_NAME (acoshf), _ZGVeN16v_acoshf)
 VECTOR_WRAPPER (WRAPPER_NAME (asinf), _ZGVeN16v_asinf)
 VECTOR_WRAPPER (WRAPPER_NAME (asinhf), _ZGVeN16v_asinhf)
 VECTOR_WRAPPER (WRAPPER_NAME (atanf), _ZGVeN16v_atanf)
+VECTOR_WRAPPER_ff (WRAPPER_NAME (atan2f), _ZGVeN16vv_atan2f)
 
 #define VEC_INT_TYPE __m512i
 
diff --git a/sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c b/sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c
index da77149021..b0cad1e107 100644
--- a/sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c
+++ b/sysdeps/x86_64/fpu/test-float-vlen4-wrappers.c
@@ -32,6 +32,7 @@ VECTOR_WRAPPER (WRAPPER_NAME (acoshf), _ZGVbN4v_acoshf)
 VECTOR_WRAPPER (WRAPPER_NAME (asinf), _ZGVbN4v_asinf)
 VECTOR_WRAPPER (WRAPPER_NAME (asinhf), _ZGVbN4v_asinhf)
 VECTOR_WRAPPER (WRAPPER_NAME (atanf), _ZGVbN4v_atanf)
+VECTOR_WRAPPER_ff (WRAPPER_NAME (atan2f), _ZGVbN4vv_atan2f)
 
 #define VEC_INT_TYPE __m128i
 
diff --git a/sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c b/sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c
index a978f37e79..359aa445ba 100644
--- a/sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c
+++ b/sysdeps/x86_64/fpu/test-float-vlen8-avx2-wrappers.c
@@ -35,6 +35,7 @@ VECTOR_WRAPPER (WRAPPER_NAME (acoshf), _ZGVdN8v_acoshf)
 VECTOR_WRAPPER (WRAPPER_NAME (asinf), _ZGVdN8v_asinf)
 VECTOR_WRAPPER (WRAPPER_NAME (asinhf), _ZGVdN8v_asinhf)
 VECTOR_WRAPPER (WRAPPER_NAME (atanf), _ZGVdN8v_atanf)
+VECTOR_WRAPPER_ff (WRAPPER_NAME (atan2f), _ZGVdN8vv_atan2f)
 
 /* Redefinition of wrapper to be compatible with _ZGVdN8vvv_sincosf.  */
 #undef VECTOR_WRAPPER_fFF
diff --git a/sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c b/sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c
index 1ae9a8c3c0..80730777fc 100644
--- a/sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c
+++ b/sysdeps/x86_64/fpu/test-float-vlen8-wrappers.c
@@ -32,6 +32,7 @@ VECTOR_WRAPPER (WRAPPER_NAME (acoshf), _ZGVcN8v_acoshf)
 VECTOR_WRAPPER (WRAPPER_NAME (asinf), _ZGVcN8v_asinf)
 VECTOR_WRAPPER (WRAPPER_NAME (asinhf), _ZGVcN8v_asinhf)
 VECTOR_WRAPPER (WRAPPER_NAME (atanf), _ZGVcN8v_atanf)
+VECTOR_WRAPPER_ff (WRAPPER_NAME (atan2f), _ZGVcN8vv_atan2f)
 
 #define VEC_INT_TYPE __m128i
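
Note for reviewers (not part of the patch): the new symbols follow the
x86_64 vector ABI, so they are normally reached through the
"#pragma omp simd" declarations this series adds to math-vector.h, but
they can also be called directly.  Below is a minimal sketch exercising
the AVX2 double variant against scalar atan2; the hand-written
prototype and the compile line (gcc -O2 -mavx2 demo.c -lmvec -lm) are
illustrative assumptions, not something this patch installs in a
header.

#include <immintrin.h>
#include <math.h>
#include <stdio.h>

/* Vector-ABI entry point added by this patch: 4 doubles per call,
   arguments in the same (y, x) order as scalar atan2.  Declared by
   hand here purely for illustration.  */
extern __m256d _ZGVdN4vv_atan2 (__m256d y, __m256d x);

int
main (void)
{
  double y[4] = { 1.0, 1.0, -1.0, -1.0 };
  double x[4] = { 1.0, -1.0, 1.0, -1.0 };
  double r[4];

  /* One vector call computes all four quadrants at once.  */
  _mm256_storeu_pd (r, _ZGVdN4vv_atan2 (_mm256_loadu_pd (y),
					_mm256_loadu_pd (x)));
  for (int i = 0; i < 4; i++)
    printf ("vector % .17g  scalar % .17g\n", r[i], atan2 (y[i], x[i]));
  return 0;
}

The accuracy tests in this series bound the vector results to within
the regenerated ulps of the scalar ones, so the two columns printed
above should agree to a few ulps at most.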