From patchwork Thu Jul  4 11:09:39 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Szabolcs Nagy <szabolcs.nagy@arm.com>
X-Patchwork-Id: 1127490
From: Szabolcs Nagy <Szabolcs.Nagy@arm.com>
To: GNU C Library <libc-alpha@sourceware.org>, Steve Ellcey
	<sellcey@marvell.com>
CC: nd <nd@arm.com>
Subject: [RFC PATCH 1/2] Aarch64: Add simd exp/expf ABI symbols
Date: Thu, 4 Jul 2019 11:09:39 +0000
Message-ID: <0ae3af1b-cb2b-367f-8b86-f39971498291@arm.com>

The implementation is in assembly and just calls the scalar math code
for each lane. This ensures that old compilers without vector call ABI
support can build libmvec. The ABI is supported since GCC 9.1; the
specification is at

https://developer.arm.com/tools-and-software/server-and-hpc/arm-architecture-tools/arm-compiler-for-hpc/vector-function-abi
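
As a rough illustration only (not part of the patch), the per-lane
fallback can be sketched in C. The names vec2d, vlen2_fallback and
scalar_stub are hypothetical; the real code is the assembly template
libmvec_double_vlen2.h, which spills q0 to the stack, calls the scalar
function once per lane and reloads q0. A stub stands in for exp so the
sketch does not depend on libm.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a 2-lane double vector (float64x2_t).  */
typedef struct { double lane[2]; } vec2d;

/* Type of the scalar routine, e.g. exp in the patch.  */
typedef double (*scalar_fn) (double);

/* Call the scalar routine once per lane and collect the results,
   mirroring the ldr d0 / bl / str d0 sequence in the assembly.  */
vec2d
vlen2_fallback (scalar_fn scalar, vec2d x)
{
  assert (scalar != NULL);
  vec2d result;
  for (size_t i = 0; i < 2; i++)
    result.lane[i] = scalar (x.lane[i]);
  return result;
}

/* Stand-in scalar routine for illustration; the patch calls exp.  */
double
scalar_stub (double x)
{
  return 2.0 * x + 1.0;
}
```

Unlike this sketch, the assembly is entered with the vector PCS, under
which q8-q23 are callee-saved in full, while the scalar callee follows
the base PCS and only preserves the low 64 bits of v8-v15. That is why
the template saves and restores q8-q23 around the scalar calls.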

Vector functions require a STO_AARCH64_VARIANT_PCS marking in the
dynamic symbol table for lazily bound calls to work. This marking will
be missing in libmvec, which works because the marking only affects
the behaviour if there are calls to the symbols in the binary.
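
For reference, the marking is a bit in the st_other byte of the symbol
table entry. A minimal sketch of testing it, assuming the value 0x80
given in the AArch64 ELF ABI; symbol_is_variant_pcs is a hypothetical
helper and not a glibc function:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed value from the AArch64 ELF ABI; defined here in case the
   system <elf.h> does not provide it yet.  */
#ifndef STO_AARCH64_VARIANT_PCS
# define STO_AARCH64_VARIANT_PCS 0x80
#endif

static_assert (STO_AARCH64_VARIANT_PCS <= UINT8_MAX,
               "the marking must fit in the one-byte st_other field");

/* Hypothetical helper: return true if a symbol's st_other byte
   carries the variant PCS marking.  The low bits of st_other hold
   the symbol visibility and are ignored here.  */
bool
symbol_is_variant_pcs (uint8_t st_other)
{
  return (st_other & STO_AARCH64_VARIANT_PCS) != 0;
}
```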

(TODO: detect .variant_pcs asm support and use the directive if available.)

Testing requires vector call ABI support in the compiler, which is
detected at configure time.

Header declarations are not added yet, so the symbols will not be used
by the compiler: they are only added so that the ABI is in place, which
enables backporting later. Currently we cannot add correct declarations
that only declare the specific symbols we provide: the OpenMP pragma
mechanism would declare both AdvSIMD and SVE variants.

(TODO: figure out the C and Fortran header magic that works and is
backportable.)

This is a bit late for 2.30, but I wanted to show how I planned to add
libmvec symbols for backporting.

2019-07-04  Steve Ellcey  <sellcey@marvell.com>
	    Szabolcs Nagy  <szabolcs.nagy@arm.com>

	* sysdeps/aarch64/configure.ac (build_mathvec): Enable.
	(test-mathvec): Enable if ABI is supported.
	* sysdeps/aarch64/configure: Regenerate.
	* sysdeps/aarch64/fpu/Makefile
	(libmvec-support): Add libmvec_double_vlen2_exp,
	libmvec_float_vlen4_expf to list.
	(libmvec-tests): Add double-vlen2, float-vlen4 to list.
	(double-vlen2-funcs): Add new vector function name.
	(float-vlen4-funcs): Add new vector function name.
	* sysdeps/aarch64/fpu/Versions: New file.
	* sysdeps/aarch64/fpu/libmvec_double_vlen2.h: New file.
	* sysdeps/aarch64/fpu/libmvec_double_vlen2_exp.S: New file.
	* sysdeps/aarch64/fpu/libmvec_float_vlen4.h: New file.
	* sysdeps/aarch64/fpu/libmvec_float_vlen4_expf.S: New file.
	* sysdeps/aarch64/fpu/test-double-vlen2-wrappers.c: New file.
	* sysdeps/aarch64/fpu/test-float-vlen4-wrappers.c: New file.
	* sysdeps/aarch64/libm-test-ulps (exp_vlen2): New entry.
	(exp_vlen4): Likewise.
	* sysdeps/unix/sysv/linux/aarch64/libmvec.abilist: New file.
---
 sysdeps/aarch64/configure                     | 31 +++++++++
 sysdeps/aarch64/configure.ac                  | 24 +++++++
 sysdeps/aarch64/fpu/Makefile                  | 17 +++++
 sysdeps/aarch64/fpu/Versions                  |  5 ++
 sysdeps/aarch64/fpu/libmvec_double_vlen2.h    | 59 +++++++++++++++++
 .../aarch64/fpu/libmvec_double_vlen2_exp.S    | 21 ++++++
 sysdeps/aarch64/fpu/libmvec_float_vlen4.h     | 65 +++++++++++++++++++
 .../aarch64/fpu/libmvec_float_vlen4_expf.S    | 21 ++++++
 .../aarch64/fpu/test-double-vlen2-wrappers.c  | 28 ++++++++
 .../aarch64/fpu/test-float-vlen4-wrappers.c   | 28 ++++++++
 sysdeps/aarch64/libm-test-ulps                |  6 ++
 .../unix/sysv/linux/aarch64/libmvec.abilist   |  2 +
 12 files changed, 307 insertions(+)
 create mode 100644 sysdeps/aarch64/fpu/Versions
 create mode 100644 sysdeps/aarch64/fpu/libmvec_double_vlen2.h
 create mode 100644 sysdeps/aarch64/fpu/libmvec_double_vlen2_exp.S
 create mode 100644 sysdeps/aarch64/fpu/libmvec_float_vlen4.h
 create mode 100644 sysdeps/aarch64/fpu/libmvec_float_vlen4_expf.S
 create mode 100644 sysdeps/aarch64/fpu/test-double-vlen2-wrappers.c
 create mode 100644 sysdeps/aarch64/fpu/test-float-vlen4-wrappers.c
 create mode 100644 sysdeps/unix/sysv/linux/aarch64/libmvec.abilist
diff --git a/sysdeps/aarch64/configure b/sysdeps/aarch64/configure
index 5bd355a691..df15cdb02a 100644
--- a/sysdeps/aarch64/configure
+++ b/sysdeps/aarch64/configure
@@ -172,3 +172,34 @@ else
   config_vars="$config_vars
 default-abi = lp64"
 fi
+
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for pcs attribute support" >&5
+$as_echo_n "checking for pcs attribute support... " >&6; }
+if ${libc_cv_gcc_pcs_attribute+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  cat > conftest.c <<EOF
+__attribute__((aarch64_vector_pcs)) extern void foo (void);
+EOF
+libc_cv_gcc_pcs_attribute=no
+if ${CC-cc} -c -Wall -Werror conftest.c -o conftest.o 1>&5 \
+   2>&5 ; then
+  libc_cv_gcc_pcs_attribute=yes
+fi
+rm -f conftest*
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $libc_cv_gcc_pcs_attribute" >&5
+$as_echo "$libc_cv_gcc_pcs_attribute" >&6; }
+
+# Enable libmvec by default.
+if test x"$build_mathvec" = xnotset; then
+  build_mathvec=yes
+fi
+
+# Only test libmvec if the compiler supports aarch64_vector_pcs.
+if test x"$build_mathvec" = xyes; then
+  if test $libc_cv_gcc_pcs_attribute = yes; then
+    config_vars="$config_vars
+test-mathvec = yes"
+  fi
+fi
diff --git a/sysdeps/aarch64/configure.ac b/sysdeps/aarch64/configure.ac
index 7851dd4dac..eab411cad4 100644
--- a/sysdeps/aarch64/configure.ac
+++ b/sysdeps/aarch64/configure.ac
@@ -20,3 +20,27 @@ if test $libc_cv_aarch64_be = yes; then
 else
   LIBC_CONFIG_VAR([default-abi], [lp64])
 fi
+
+AC_CACHE_CHECK([for pcs attribute support],
+               libc_cv_gcc_pcs_attribute, [dnl
+cat > conftest.c <<EOF
+__attribute__((aarch64_vector_pcs)) extern void foo (void);
+EOF
+libc_cv_gcc_pcs_attribute=no
+if ${CC-cc} -c -Wall -Werror conftest.c -o conftest.o 1>&AS_MESSAGE_LOG_FD \
+   2>&AS_MESSAGE_LOG_FD ; then
+  libc_cv_gcc_pcs_attribute=yes
+fi
+rm -f conftest*])
+
+# Enable libmvec by default.
+if test x"$build_mathvec" = xnotset; then
+  build_mathvec=yes
+fi
+
+# Only test libmvec if the compiler supports aarch64_vector_pcs.
+if test x"$build_mathvec" = xyes; then
+  if test $libc_cv_gcc_pcs_attribute = yes; then
+    LIBC_CONFIG_VAR([test-mathvec], [yes])
+  fi
+fi
diff --git a/sysdeps/aarch64/fpu/Makefile b/sysdeps/aarch64/fpu/Makefile
index 4a182bd6d6..220b664323 100644
--- a/sysdeps/aarch64/fpu/Makefile
+++ b/sysdeps/aarch64/fpu/Makefile
@@ -12,3 +12,20 @@ CFLAGS-s_fmaxf.c += -ffinite-math-only
 CFLAGS-s_fmin.c += -ffinite-math-only
 CFLAGS-s_fminf.c += -ffinite-math-only
 endif
+
+ifeq ($(subdir),mathvec)
+libmvec-support += \
+  libmvec_double_vlen2_exp \
+  libmvec_float_vlen4_expf \
+
+endif
+
+ifeq ($(subdir),math)
+ifeq ($(build-mathvec),yes)
+double-vlen2-funcs = exp
+float-vlen4-funcs = exp
+ifeq ($(test-mathvec),yes)
+libmvec-tests += double-vlen2 float-vlen4
+endif
+endif
+endif
diff --git a/sysdeps/aarch64/fpu/Versions b/sysdeps/aarch64/fpu/Versions
new file mode 100644
index 0000000000..da36f3c495
--- /dev/null
+++ b/sysdeps/aarch64/fpu/Versions
@@ -0,0 +1,5 @@
+libmvec {
+  GLIBC_2.30 {
+    _ZGVnN2v_exp; _ZGVnN4v_expf;
+  }
+}
diff --git a/sysdeps/aarch64/fpu/libmvec_double_vlen2.h b/sysdeps/aarch64/fpu/libmvec_double_vlen2.h
new file mode 100644
index 0000000000..383980d6ef
--- /dev/null
+++ b/sysdeps/aarch64/fpu/libmvec_double_vlen2.h
@@ -0,0 +1,59 @@
+/* Double-precision 2 element vector function template.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+
+ENTRY (VECTOR_FUNCTION)
+	stp	x29, x30, [sp, -288]!
+	cfi_adjust_cfa_offset (288)
+	cfi_rel_offset (x29, 0)
+	cfi_rel_offset (x30, 8)
+	mov	x29, sp
+	stp	 q8,  q9, [sp, 16]
+	stp	q10, q11, [sp, 48]
+	stp	q12, q13, [sp, 80]
+	stp	q14, q15, [sp, 112]
+	stp	q16, q17, [sp, 144]
+	stp	q18, q19, [sp, 176]
+	stp	q20, q21, [sp, 208]
+	stp	q22, q23, [sp, 240]
+
+	// Use per lane load/store to avoid endianness issues.
+	str	q0, [sp, 272]
+	ldr	d0, [sp, 272]
+	bl SCALAR_FUNCTION
+	str	d0, [sp, 272]
+	ldr	d0, [sp, 280]
+	bl SCALAR_FUNCTION
+	str	d0, [sp, 280]
+	ldr	q0, [sp, 272]
+
+	ldp	q8, q9, [sp, 16]
+	ldp	q10, q11, [sp, 48]
+	ldp	q12, q13, [sp, 80]
+	ldp	q14, q15, [sp, 112]
+	ldp	q16, q17, [sp, 144]
+	ldp	q18, q19, [sp, 176]
+	ldp	q20, q21, [sp, 208]
+	ldp	q22, q23, [sp, 240]
+	ldp	x29, x30, [sp], 288
+	cfi_adjust_cfa_offset (-288)
+	cfi_restore (x29)
+	cfi_restore (x30)
+	ret
+END (VECTOR_FUNCTION)
diff --git a/sysdeps/aarch64/fpu/libmvec_double_vlen2_exp.S b/sysdeps/aarch64/fpu/libmvec_double_vlen2_exp.S
new file mode 100644
index 0000000000..644405cc4f
--- /dev/null
+++ b/sysdeps/aarch64/fpu/libmvec_double_vlen2_exp.S
@@ -0,0 +1,21 @@
+/* Double-precision 2 element vector e^x function.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#define SCALAR_FUNCTION exp
+#define VECTOR_FUNCTION _ZGVnN2v_exp
+#include "libmvec_double_vlen2.h"
diff --git a/sysdeps/aarch64/fpu/libmvec_float_vlen4.h b/sysdeps/aarch64/fpu/libmvec_float_vlen4.h
new file mode 100644
index 0000000000..2450309c13
--- /dev/null
+++ b/sysdeps/aarch64/fpu/libmvec_float_vlen4.h
@@ -0,0 +1,65 @@
+/* Single-precision 4 element vector function template.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+
+ENTRY (VECTOR_FUNCTION)
+	stp	x29, x30, [sp, -288]!
+	cfi_adjust_cfa_offset (288)
+	cfi_rel_offset (x29, 0)
+	cfi_rel_offset (x30, 8)
+	mov	x29, sp
+	stp	 q8,  q9, [sp, 16]
+	stp	q10, q11, [sp, 48]
+	stp	q12, q13, [sp, 80]
+	stp	q14, q15, [sp, 112]
+	stp	q16, q17, [sp, 144]
+	stp	q18, q19, [sp, 176]
+	stp	q20, q21, [sp, 208]
+	stp	q22, q23, [sp, 240]
+
+	// Use per lane load/store to avoid endianness issues.
+	str	q0, [sp, 272]
+	ldr	s0, [sp, 272]
+	bl SCALAR_FUNCTION
+	str	s0, [sp, 272]
+	ldr	s0, [sp, 276]
+	bl SCALAR_FUNCTION
+	str	s0, [sp, 276]
+	ldr	s0, [sp, 280]
+	bl SCALAR_FUNCTION
+	str	s0, [sp, 280]
+	ldr	s0, [sp, 284]
+	bl SCALAR_FUNCTION
+	str	s0, [sp, 284]
+	ldr	q0, [sp, 272]
+
+	ldp	q8, q9, [sp, 16]
+	ldp	q10, q11, [sp, 48]
+	ldp	q12, q13, [sp, 80]
+	ldp	q14, q15, [sp, 112]
+	ldp	q16, q17, [sp, 144]
+	ldp	q18, q19, [sp, 176]
+	ldp	q20, q21, [sp, 208]
+	ldp	q22, q23, [sp, 240]
+	ldp	x29, x30, [sp], 288
+	cfi_adjust_cfa_offset (-288)
+	cfi_restore (x29)
+	cfi_restore (x30)
+	ret
+END (VECTOR_FUNCTION)
diff --git a/sysdeps/aarch64/fpu/libmvec_float_vlen4_expf.S b/sysdeps/aarch64/fpu/libmvec_float_vlen4_expf.S
new file mode 100644
index 0000000000..ab76ea0c77
--- /dev/null
+++ b/sysdeps/aarch64/fpu/libmvec_float_vlen4_expf.S
@@ -0,0 +1,21 @@
+/* Single-precision 4 element vector e^x function.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#define SCALAR_FUNCTION expf
+#define VECTOR_FUNCTION _ZGVnN4v_expf
+#include "libmvec_float_vlen4.h"
diff --git a/sysdeps/aarch64/fpu/test-double-vlen2-wrappers.c b/sysdeps/aarch64/fpu/test-double-vlen2-wrappers.c
new file mode 100644
index 0000000000..6c6c44d6b5
--- /dev/null
+++ b/sysdeps/aarch64/fpu/test-double-vlen2-wrappers.c
@@ -0,0 +1,28 @@
+/* Wrapper part of tests for aarch64 double vector math functions.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <arm_neon.h>
+#include "test-double-vlen2.h"
+
+#define VEC_TYPE float64x2_t
+
+/* Hack: VECTOR_WRAPPER declares the vector function without the pcs attribute;
+   placing it here happens to work but should be fixed in test-math-vector.h.  */
+__attribute__ ((aarch64_vector_pcs))
+
+VECTOR_WRAPPER (WRAPPER_NAME (exp), _ZGVnN2v_exp)
diff --git a/sysdeps/aarch64/fpu/test-float-vlen4-wrappers.c b/sysdeps/aarch64/fpu/test-float-vlen4-wrappers.c
new file mode 100644
index 0000000000..5117633f1f
--- /dev/null
+++ b/sysdeps/aarch64/fpu/test-float-vlen4-wrappers.c
@@ -0,0 +1,28 @@
+/* Wrapper part of tests for float aarch64 vector math functions.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <arm_neon.h>
+#include "test-float-vlen4.h"
+
+#define VEC_TYPE float32x4_t
+
+/* Hack: VECTOR_WRAPPER declares the vector function without the pcs attribute;
+   placing it here happens to work but should be fixed in test-math-vector.h.  */
+__attribute__ ((aarch64_vector_pcs))
+
+VECTOR_WRAPPER (WRAPPER_NAME (expf), _ZGVnN4v_expf)
diff --git a/sysdeps/aarch64/libm-test-ulps b/sysdeps/aarch64/libm-test-ulps
index 585e5bbce7..1ed4af9e55 100644
--- a/sysdeps/aarch64/libm-test-ulps
+++ b/sysdeps/aarch64/libm-test-ulps
@@ -1601,6 +1601,12 @@ float: 1
 idouble: 1
 ifloat: 1
 
+Function: "exp_vlen2":
+double: 1
+
+Function: "exp_vlen4":
+float: 1
+
 Function: "expm1":
 double: 1
 float: 1
diff --git a/sysdeps/unix/sysv/linux/aarch64/libmvec.abilist b/sysdeps/unix/sysv/linux/aarch64/libmvec.abilist
new file mode 100644
index 0000000000..9e178253f7
--- /dev/null
+++ b/sysdeps/unix/sysv/linux/aarch64/libmvec.abilist
@@ -0,0 +1,2 @@
+GLIBC_2.30 _ZGVnN2v_exp F
+GLIBC_2.30 _ZGVnN4v_expf F

From patchwork Thu Jul  4 11:09:42 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Szabolcs Nagy <szabolcs.nagy@arm.com>
X-Patchwork-Id: 1127491
From: Szabolcs Nagy <Szabolcs.Nagy@arm.com>
To: GNU C Library <libc-alpha@sourceware.org>, Steve Ellcey
	<sellcey@marvell.com>
CC: nd <nd@arm.com>
Subject: [RFC PATCH 2/2] aarch64: add vector sin, cos,
	log and pow abi symbols
Date: Thu, 4 Jul 2019 11:09:42 +0000
Message-ID: <e8de3b3c-52d6-ac60-2ef1-65e0903b7027@arm.com>

Add simple assembly implementations that fall back to scalar code,
similar to the vector exp code.

These are the symbols we expect to optimize.
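
The symbol names follow the vector function ABI name mangling. As a
sketch only, assuming the layout "_ZGV", ISA letter 'n' for AdvSIMD,
mask letter 'N' for unmasked, the vector length, one 'v' per vector
argument, then '_' and the scalar name, the names in this series can
be composed as below. advsimd_vector_name is a hypothetical helper
and is not part of the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build an unmasked AdvSIMD vector function name such as
   _ZGVnN2v_exp or _ZGVnN4vv_powf.  Returns the length written,
   or -1 if the arguments or the buffer size are unsuitable.  */
int
advsimd_vector_name (char *buf, size_t size, unsigned vlen,
                     unsigned nargs, const char *scalar)
{
  assert (buf != NULL && scalar != NULL);
  char params[8];
  if (nargs == 0 || nargs >= sizeof params)
    return -1;
  /* One 'v' per argument that is passed as a vector.  */
  memset (params, 'v', nargs);
  params[nargs] = '\0';
  int n = snprintf (buf, size, "_ZGVnN%u%s_%s", vlen, params, scalar);
  return (n < 0 || (size_t) n >= size) ? -1 : n;
}
```

With vlen 2 and one argument the helper yields the double-precision
names; with vlen 4 it yields the single-precision ones, and pow/powf
take two vector arguments, hence the "vv" in their names.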

(TODO: deal with sincos)

2019-07-04  Szabolcs Nagy  <szabolcs.nagy@arm.com>

	* sysdeps/aarch64/fpu/Makefile: Add functions.
	* sysdeps/aarch64/fpu/Versions: Add symbols.
	* sysdeps/aarch64/fpu/libmvec_double_vlen2_cos.S: New file.
	* sysdeps/aarch64/fpu/libmvec_double_vlen2_log.S: New file.
	* sysdeps/aarch64/fpu/libmvec_double_vlen2_pow.S: New file.
	* sysdeps/aarch64/fpu/libmvec_double_vlen2_sin.S: New file.
	* sysdeps/aarch64/fpu/libmvec_float_vlen4_cosf.S: New file.
	* sysdeps/aarch64/fpu/libmvec_float_vlen4_logf.S: New file.
	* sysdeps/aarch64/fpu/libmvec_float_vlen4_powf.S: New file.
	* sysdeps/aarch64/fpu/libmvec_float_vlen4_sinf.S: New file.
	* sysdeps/aarch64/fpu/test-double-vlen2-wrappers.c: Add wrappers.
	* sysdeps/aarch64/fpu/test-float-vlen4-wrappers.c: Add wrappers.
	* sysdeps/aarch64/libm-test-ulps: Update.
	* sysdeps/unix/sysv/linux/aarch64/libmvec.abilist: Update.
---
 sysdeps/aarch64/fpu/Makefile                  | 12 +++-
 sysdeps/aarch64/fpu/Versions                  |  4 ++
 .../aarch64/fpu/libmvec_double_vlen2_cos.S    | 21 ++++++
 .../aarch64/fpu/libmvec_double_vlen2_log.S    | 21 ++++++
 .../aarch64/fpu/libmvec_double_vlen2_pow.S    | 62 ++++++++++++++++
 .../aarch64/fpu/libmvec_double_vlen2_sin.S    | 21 ++++++
 .../aarch64/fpu/libmvec_float_vlen4_cosf.S    | 21 ++++++
 .../aarch64/fpu/libmvec_float_vlen4_logf.S    | 21 ++++++
 .../aarch64/fpu/libmvec_float_vlen4_powf.S    | 70 +++++++++++++++++++
 .../aarch64/fpu/libmvec_float_vlen4_sinf.S    | 21 ++++++
 .../aarch64/fpu/test-double-vlen2-wrappers.c  | 12 ++++
 .../aarch64/fpu/test-float-vlen4-wrappers.c   | 12 ++++
 sysdeps/aarch64/libm-test-ulps                | 18 +++++
 .../unix/sysv/linux/aarch64/libmvec.abilist   |  8 +++
 14 files changed, 322 insertions(+), 2 deletions(-)
 create mode 100644 sysdeps/aarch64/fpu/libmvec_double_vlen2_cos.S
 create mode 100644 sysdeps/aarch64/fpu/libmvec_double_vlen2_log.S
 create mode 100644 sysdeps/aarch64/fpu/libmvec_double_vlen2_pow.S
 create mode 100644 sysdeps/aarch64/fpu/libmvec_double_vlen2_sin.S
 create mode 100644 sysdeps/aarch64/fpu/libmvec_float_vlen4_cosf.S
 create mode 100644 sysdeps/aarch64/fpu/libmvec_float_vlen4_logf.S
 create mode 100644 sysdeps/aarch64/fpu/libmvec_float_vlen4_powf.S
 create mode 100644 sysdeps/aarch64/fpu/libmvec_float_vlen4_sinf.S

diff --git a/sysdeps/aarch64/fpu/Makefile b/sysdeps/aarch64/fpu/Makefile
index 220b664323..fe72a74aec 100644
--- a/sysdeps/aarch64/fpu/Makefile
+++ b/sysdeps/aarch64/fpu/Makefile
@@ -15,15 +15,23 @@ endif
 
 ifeq ($(subdir),mathvec)
 libmvec-support += \
+  libmvec_double_vlen2_cos \
   libmvec_double_vlen2_exp \
+  libmvec_double_vlen2_log \
+  libmvec_double_vlen2_pow \
+  libmvec_double_vlen2_sin \
+  libmvec_float_vlen4_cosf \
   libmvec_float_vlen4_expf \
+  libmvec_float_vlen4_logf \
+  libmvec_float_vlen4_powf \
+  libmvec_float_vlen4_sinf \
 
 endif
 
 ifeq ($(subdir),math)
 ifeq ($(build-mathvec),yes)
-double-vlen2-funcs = exp
-float-vlen4-funcs = exp
+double-vlen2-funcs = cos exp log pow sin
+float-vlen4-funcs = cos exp log pow sin
 ifeq ($(test-mathvec),yes)
 libmvec-tests += double-vlen2 float-vlen4
 endif
diff --git a/sysdeps/aarch64/fpu/Versions b/sysdeps/aarch64/fpu/Versions
index da36f3c495..94ffaeee6d 100644
--- a/sysdeps/aarch64/fpu/Versions
+++ b/sysdeps/aarch64/fpu/Versions
@@ -1,5 +1,9 @@
 libmvec {
   GLIBC_2.30 {
+    _ZGVnN2v_cos; _ZGVnN4v_cosf;
     _ZGVnN2v_exp; _ZGVnN4v_expf;
+    _ZGVnN2v_log; _ZGVnN4v_logf;
+    _ZGVnN2vv_pow; _ZGVnN4vv_powf;
+    _ZGVnN2v_sin; _ZGVnN4v_sinf;
   }
 }
diff --git a/sysdeps/aarch64/fpu/libmvec_double_vlen2_cos.S b/sysdeps/aarch64/fpu/libmvec_double_vlen2_cos.S
new file mode 100644
index 0000000000..f4ad3c75f4
--- /dev/null
+++ b/sysdeps/aarch64/fpu/libmvec_double_vlen2_cos.S
@@ -0,0 +1,21 @@
+/* Double-precision 2 element vector cos function.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#define SCALAR_FUNCTION cos
+#define VECTOR_FUNCTION _ZGVnN2v_cos
+#include "libmvec_double_vlen2.h"
diff --git a/sysdeps/aarch64/fpu/libmvec_double_vlen2_log.S b/sysdeps/aarch64/fpu/libmvec_double_vlen2_log.S
new file mode 100644
index 0000000000..b802a2608a
--- /dev/null
+++ b/sysdeps/aarch64/fpu/libmvec_double_vlen2_log.S
@@ -0,0 +1,21 @@
+/* Double-precision 2 element vector log function.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#define SCALAR_FUNCTION log
+#define VECTOR_FUNCTION _ZGVnN2v_log
+#include "libmvec_double_vlen2.h"
diff --git a/sysdeps/aarch64/fpu/libmvec_double_vlen2_pow.S b/sysdeps/aarch64/fpu/libmvec_double_vlen2_pow.S
new file mode 100644
index 0000000000..85151482bf
--- /dev/null
+++ b/sysdeps/aarch64/fpu/libmvec_double_vlen2_pow.S
@@ -0,0 +1,62 @@
+/* Double-precision 2 element vector x^y function.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+
+ENTRY (_ZGVnN2vv_pow)
+	stp	x29, x30, [sp, -304]!
+	cfi_adjust_cfa_offset (304)
+	cfi_rel_offset (x29, 0)
+	cfi_rel_offset (x30, 8)
+	mov	x29, sp
+	stp	 q8,  q9, [sp, 16]
+	stp	q10, q11, [sp, 48]
+	stp	q12, q13, [sp, 80]
+	stp	q14, q15, [sp, 112]
+	stp	q16, q17, [sp, 144]
+	stp	q18, q19, [sp, 176]
+	stp	q20, q21, [sp, 208]
+	stp	q22, q23, [sp, 240]
+
+	// Per-lane load/store at matching offsets avoids endianness issues.
+	str	q0, [sp, 272]
+	str	q1, [sp, 288]
+	ldr	d0, [sp, 272]
+	ldr	d1, [sp, 288]
+	bl	pow
+	str	d0, [sp, 272]
+	ldr	d0, [sp, 280]
+	ldr	d1, [sp, 296]
+	bl	pow
+	str	d0, [sp, 280]
+	ldr	q0, [sp, 272]
+
+	ldp	q8, q9, [sp, 16]
+	ldp	q10, q11, [sp, 48]
+	ldp	q12, q13, [sp, 80]
+	ldp	q14, q15, [sp, 112]
+	ldp	q16, q17, [sp, 144]
+	ldp	q18, q19, [sp, 176]
+	ldp	q20, q21, [sp, 208]
+	ldp	q22, q23, [sp, 240]
+	ldp	x29, x30, [sp], 304
+	cfi_adjust_cfa_offset (-304)
+	cfi_restore (x29)
+	cfi_restore (x30)
+	ret
+END (_ZGVnN2vv_pow)
diff --git a/sysdeps/aarch64/fpu/libmvec_double_vlen2_sin.S b/sysdeps/aarch64/fpu/libmvec_double_vlen2_sin.S
new file mode 100644
index 0000000000..c01e4399cd
--- /dev/null
+++ b/sysdeps/aarch64/fpu/libmvec_double_vlen2_sin.S
@@ -0,0 +1,21 @@
+/* Double-precision 2 element vector sin function.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#define SCALAR_FUNCTION sin
+#define VECTOR_FUNCTION _ZGVnN2v_sin
+#include "libmvec_double_vlen2.h"
diff --git a/sysdeps/aarch64/fpu/libmvec_float_vlen4_cosf.S b/sysdeps/aarch64/fpu/libmvec_float_vlen4_cosf.S
new file mode 100644
index 0000000000..2d9ea9fb36
--- /dev/null
+++ b/sysdeps/aarch64/fpu/libmvec_float_vlen4_cosf.S
@@ -0,0 +1,21 @@
+/* Single-precision 4 element vector cos function.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#define SCALAR_FUNCTION cosf
+#define VECTOR_FUNCTION _ZGVnN4v_cosf
+#include "libmvec_float_vlen4.h"
diff --git a/sysdeps/aarch64/fpu/libmvec_float_vlen4_logf.S b/sysdeps/aarch64/fpu/libmvec_float_vlen4_logf.S
new file mode 100644
index 0000000000..df961eadba
--- /dev/null
+++ b/sysdeps/aarch64/fpu/libmvec_float_vlen4_logf.S
@@ -0,0 +1,21 @@
+/* Single-precision 4 element vector log function.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#define SCALAR_FUNCTION logf
+#define VECTOR_FUNCTION _ZGVnN4v_logf
+#include "libmvec_float_vlen4.h"
diff --git a/sysdeps/aarch64/fpu/libmvec_float_vlen4_powf.S b/sysdeps/aarch64/fpu/libmvec_float_vlen4_powf.S
new file mode 100644
index 0000000000..95e593c151
--- /dev/null
+++ b/sysdeps/aarch64/fpu/libmvec_float_vlen4_powf.S
@@ -0,0 +1,70 @@
+/* Single-precision 4 element vector x^y function.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#include <sysdep.h>
+
+ENTRY (_ZGVnN4vv_powf)
+	stp	x29, x30, [sp, -304]!
+	cfi_adjust_cfa_offset (304)
+	cfi_rel_offset (x29, 0)
+	cfi_rel_offset (x30, 8)
+	mov	x29, sp
+	stp	 q8,  q9, [sp, 16]
+	stp	q10, q11, [sp, 48]
+	stp	q12, q13, [sp, 80]
+	stp	q14, q15, [sp, 112]
+	stp	q16, q17, [sp, 144]
+	stp	q18, q19, [sp, 176]
+	stp	q20, q21, [sp, 208]
+	stp	q22, q23, [sp, 240]
+
+	// Per-lane load/store at matching offsets avoids endianness issues.
+	str	q0, [sp, 272]
+	str	q1, [sp, 288]
+	ldr	s0, [sp, 272]
+	ldr	s1, [sp, 288]
+	bl	powf
+	str	s0, [sp, 272]
+	ldr	s0, [sp, 276]
+	ldr	s1, [sp, 292]
+	bl	powf
+	str	s0, [sp, 276]
+	ldr	s0, [sp, 280]
+	ldr	s1, [sp, 296]
+	bl	powf
+	str	s0, [sp, 280]
+	ldr	s0, [sp, 284]
+	ldr	s1, [sp, 300]
+	bl	powf
+	str	s0, [sp, 284]
+	ldr	q0, [sp, 272]
+
+	ldp	q8, q9, [sp, 16]
+	ldp	q10, q11, [sp, 48]
+	ldp	q12, q13, [sp, 80]
+	ldp	q14, q15, [sp, 112]
+	ldp	q16, q17, [sp, 144]
+	ldp	q18, q19, [sp, 176]
+	ldp	q20, q21, [sp, 208]
+	ldp	q22, q23, [sp, 240]
+	ldp	x29, x30, [sp], 304
+	cfi_adjust_cfa_offset (-304)
+	cfi_restore (x29)
+	cfi_restore (x30)
+	ret
+END (_ZGVnN4vv_powf)
diff --git a/sysdeps/aarch64/fpu/libmvec_float_vlen4_sinf.S b/sysdeps/aarch64/fpu/libmvec_float_vlen4_sinf.S
new file mode 100644
index 0000000000..49b8e95a91
--- /dev/null
+++ b/sysdeps/aarch64/fpu/libmvec_float_vlen4_sinf.S
@@ -0,0 +1,21 @@
+/* Single-precision 4 element vector sin function.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+#define SCALAR_FUNCTION sinf
+#define VECTOR_FUNCTION _ZGVnN4v_sinf
+#include "libmvec_float_vlen4.h"
diff --git a/sysdeps/aarch64/fpu/test-double-vlen2-wrappers.c b/sysdeps/aarch64/fpu/test-double-vlen2-wrappers.c
index 6c6c44d6b5..00c5f5bd4b 100644
--- a/sysdeps/aarch64/fpu/test-double-vlen2-wrappers.c
+++ b/sysdeps/aarch64/fpu/test-double-vlen2-wrappers.c
@@ -25,4 +25,16 @@
    placing it here happens to work, should be fixed in test-math-vector.h.  */
 __attribute__ ((aarch64_vector_pcs))
 
+VECTOR_WRAPPER (WRAPPER_NAME (cos), _ZGVnN2v_cos)
+
+__attribute__ ((aarch64_vector_pcs))
 VECTOR_WRAPPER (WRAPPER_NAME (exp), _ZGVnN2v_exp)
+
+__attribute__ ((aarch64_vector_pcs))
+VECTOR_WRAPPER (WRAPPER_NAME (log), _ZGVnN2v_log)
+
+__attribute__ ((aarch64_vector_pcs))
+VECTOR_WRAPPER_ff (WRAPPER_NAME (pow), _ZGVnN2vv_pow)
+
+__attribute__ ((aarch64_vector_pcs))
+VECTOR_WRAPPER (WRAPPER_NAME (sin), _ZGVnN2v_sin)
diff --git a/sysdeps/aarch64/fpu/test-float-vlen4-wrappers.c b/sysdeps/aarch64/fpu/test-float-vlen4-wrappers.c
index 5117633f1f..2b9cf6d31f 100644
--- a/sysdeps/aarch64/fpu/test-float-vlen4-wrappers.c
+++ b/sysdeps/aarch64/fpu/test-float-vlen4-wrappers.c
@@ -25,4 +25,16 @@
    placing it here happens to work, should be fixed in test-math-vector.h.  */
 __attribute__ ((aarch64_vector_pcs))
 
+VECTOR_WRAPPER (WRAPPER_NAME (cosf), _ZGVnN4v_cosf)
+
+__attribute__ ((aarch64_vector_pcs))
 VECTOR_WRAPPER (WRAPPER_NAME (expf), _ZGVnN4v_expf)
+
+__attribute__ ((aarch64_vector_pcs))
+VECTOR_WRAPPER (WRAPPER_NAME (logf), _ZGVnN4v_logf)
+
+__attribute__ ((aarch64_vector_pcs))
+VECTOR_WRAPPER_ff (WRAPPER_NAME (powf), _ZGVnN4vv_powf)
+
+__attribute__ ((aarch64_vector_pcs))
+VECTOR_WRAPPER (WRAPPER_NAME (sinf), _ZGVnN4v_sinf)
diff --git a/sysdeps/aarch64/libm-test-ulps b/sysdeps/aarch64/libm-test-ulps
index 1ed4af9e55..f83213c48c 100644
--- a/sysdeps/aarch64/libm-test-ulps
+++ b/sysdeps/aarch64/libm-test-ulps
@@ -1043,6 +1043,12 @@ ifloat: 1
 ildouble: 2
 ldouble: 2
 
+Function: "cos_vlen2":
+double: 1
+
+Function: "cos_vlen4":
+float: 1
+
 Function: "cosh":
 double: 1
 float: 1
@@ -1977,6 +1983,12 @@ ifloat: 1
 ildouble: 2
 ldouble: 2
 
+Function: "pow_vlen2":
+double: 1
+
+Function: "pow_vlen4":
+float: 1
+
 Function: "sin":
 double: 1
 float: 1
@@ -2009,6 +2021,12 @@ ifloat: 1
 ildouble: 3
 ldouble: 3
 
+Function: "sin_vlen2":
+double: 1
+
+Function: "sin_vlen4":
+float: 1
+
 Function: "sincos":
 double: 1
 float: 1
diff --git a/sysdeps/unix/sysv/linux/aarch64/libmvec.abilist b/sysdeps/unix/sysv/linux/aarch64/libmvec.abilist
index 9e178253f7..20cc3dcd7f 100644
--- a/sysdeps/unix/sysv/linux/aarch64/libmvec.abilist
+++ b/sysdeps/unix/sysv/linux/aarch64/libmvec.abilist
@@ -1,2 +1,10 @@
+GLIBC_2.30 _ZGVnN2v_cos F
 GLIBC_2.30 _ZGVnN2v_exp F
+GLIBC_2.30 _ZGVnN2v_log F
+GLIBC_2.30 _ZGVnN2v_sin F
+GLIBC_2.30 _ZGVnN2vv_pow F
+GLIBC_2.30 _ZGVnN4v_cosf F
 GLIBC_2.30 _ZGVnN4v_expf F
+GLIBC_2.30 _ZGVnN4v_logf F
+GLIBC_2.30 _ZGVnN4v_sinf F
+GLIBC_2.30 _ZGVnN4vv_powf F
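
The _ZGVnN2vv_pow and _ZGVnN4vv_powf entry points above fall back to the
scalar routine: they spill both argument vectors to the stack, call the
scalar function once per lane, and reload the result vector. The portable C
below is only a rough model of that structure, not code from the patch. The
names scalar_fn2, mul2 and apply_per_lane2 are hypothetical, and mul2 merely
stands in for pow so the sketch builds without libm. The real wrappers are
written in assembly because they must also preserve q8-q23 across the scalar
calls, which plain C cannot express.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type: any two-argument scalar function, such as pow.  */
typedef double (*scalar_fn2) (double, double);

/* Hypothetical stand-in for the scalar routine, used so that the
   sketch does not depend on libm.  */
static double
mul2 (double a, double b)
{
  return a * b;
}

/* Model of the fallback: both argument vectors live in memory, the
   scalar function is called once per lane, and each result is written
   back to the slot its first argument came from.  */
static void
apply_per_lane2 (scalar_fn2 f, const double x[2], const double y[2],
		 double out[2])
{
  assert (f != NULL);
  for (size_t lane = 0; lane < 2; lane++)
    out[lane] = f (x[lane], y[lane]);
}
```

Because each lane is read and written at the same offset, the order in which
the lanes sit in memory does not affect the result, which is the property the
assembly relies on for big-endian targets.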