From patchwork Fri Oct 14 22:39:08 2022
X-Patchwork-Submitter: Noah Goldstein
X-Patchwork-Id: 1690225
To: libc-alpha@sourceware.org
Subject: [PATCH v6 1/7] x86: Update and move evex256/512 vec macros
Date: Fri, 14 Oct 2022 17:39:08 -0500
Message-Id: <20221014223914.700492-1-goldstein.w.n@gmail.com>
In-Reply-To: <20221014164008.1325863-1-goldstein.w.n@gmail.com>
References: <20221014164008.1325863-1-goldstein.w.n@gmail.com>
From: Noah Goldstein

1) Copy so that backporting will be easier.
2) Only define SECTION if there is no previous definition.
3) Add a `VMM_lo` definition that has the proper register width but
   stays in the ymm/zmm0-15 range.

This commit does not change libc.so

Tested build on x86-64
---
 sysdeps/x86_64/multiarch/x86-avx-rtm-vecs.h   | 35 ++++++++
 sysdeps/x86_64/multiarch/x86-avx-vecs.h       | 47 ++++++++++
 .../x86_64/multiarch/x86-evex-vecs-common.h   | 39 ++++++++
 sysdeps/x86_64/multiarch/x86-evex256-vecs.h   | 38 ++++++++
 sysdeps/x86_64/multiarch/x86-evex512-vecs.h   | 38 ++++++++
 sysdeps/x86_64/multiarch/x86-sse2-vecs.h      | 47 ++++++++++
 sysdeps/x86_64/multiarch/x86-vec-macros.h     | 90 +++++++++++++++++++
 7 files changed, 334 insertions(+)
 create mode 100644 sysdeps/x86_64/multiarch/x86-avx-rtm-vecs.h
 create mode 100644 sysdeps/x86_64/multiarch/x86-avx-vecs.h
 create mode 100644 sysdeps/x86_64/multiarch/x86-evex-vecs-common.h
 create mode 100644 sysdeps/x86_64/multiarch/x86-evex256-vecs.h
 create mode 100644 sysdeps/x86_64/multiarch/x86-evex512-vecs.h
 create mode 100644 sysdeps/x86_64/multiarch/x86-sse2-vecs.h
 create mode 100644 sysdeps/x86_64/multiarch/x86-vec-macros.h

diff --git a/sysdeps/x86_64/multiarch/x86-avx-rtm-vecs.h b/sysdeps/x86_64/multiarch/x86-avx-rtm-vecs.h
new file mode 100644
index 0000000000..0b326c8a70
--- /dev/null
+++ b/sysdeps/x86_64/multiarch/x86-avx-rtm-vecs.h
@@ -0,0 +1,35 @@
+/* Common config for AVX-RTM VECs
+   All versions must be listed in ifunc-impl-list.c.
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>. */
+
+#ifndef _X86_AVX_RTM_VECS_H
+#define _X86_AVX_RTM_VECS_H 1
+
+#define COND_VZEROUPPER COND_VZEROUPPER_XTEST
+#define ZERO_UPPER_VEC_REGISTERS_RETURN \
+	ZERO_UPPER_VEC_REGISTERS_RETURN_XTEST
+
+#define VZEROUPPER_RETURN jmp L(return_vzeroupper)
+
+#define USE_WITH_RTM 1
+#include "x86-avx-vecs.h"
+
+#undef SECTION
+#define SECTION(p) p##.avx.rtm
+
+#endif
diff --git a/sysdeps/x86_64/multiarch/x86-avx-vecs.h b/sysdeps/x86_64/multiarch/x86-avx-vecs.h
new file mode 100644
index 0000000000..dca1089060
--- /dev/null
+++ b/sysdeps/x86_64/multiarch/x86-avx-vecs.h
@@ -0,0 +1,47 @@
+/* Common config for AVX VECs
+   All versions must be listed in ifunc-impl-list.c.
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>. */
+
+#ifndef _X86_AVX_VECS_H
+#define _X86_AVX_VECS_H 1
+
+#ifdef VEC_SIZE
+# error "Multiple VEC configs included!"
+#endif
+
+#define VEC_SIZE 32
+#include "x86-vec-macros.h"
+
+#define USE_WITH_AVX 1
+#define SECTION(p) p##.avx
+
+/* 4-byte mov instructions with AVX2. */
+#define MOV_SIZE 4
+/* 1 (ret) + 3 (vzeroupper). */
+#define RET_SIZE 4
+#define VZEROUPPER vzeroupper
+
+#define VMOVU vmovdqu
+#define VMOVA vmovdqa
+#define VMOVNT vmovntdq
+
+/* Often need to access xmm portion. */
+#define VMM_128 VMM_any_xmm
+#define VMM VMM_any_ymm
+
+#endif
diff --git a/sysdeps/x86_64/multiarch/x86-evex-vecs-common.h b/sysdeps/x86_64/multiarch/x86-evex-vecs-common.h
new file mode 100644
index 0000000000..f331e9d8ec
--- /dev/null
+++ b/sysdeps/x86_64/multiarch/x86-evex-vecs-common.h
@@ -0,0 +1,39 @@
+/* Common config for EVEX256 and EVEX512 VECs
+   All versions must be listed in ifunc-impl-list.c.
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>. */
+
+#ifndef _X86_EVEX_VECS_COMMON_H
+#define _X86_EVEX_VECS_COMMON_H 1
+
+#include "x86-vec-macros.h"
+
+/* 6-byte mov instructions with EVEX. */
+#define MOV_SIZE 6
+/* No vzeroupper needed. */
+#define RET_SIZE 1
+#define VZEROUPPER
+
+#define VMOVU vmovdqu64
+#define VMOVA vmovdqa64
+#define VMOVNT vmovntdq
+
+#define VMM_128 VMM_hi_xmm
+#define VMM_256 VMM_hi_ymm
+#define VMM_512 VMM_hi_zmm
+
+#endif
diff --git a/sysdeps/x86_64/multiarch/x86-evex256-vecs.h b/sysdeps/x86_64/multiarch/x86-evex256-vecs.h
new file mode 100644
index 0000000000..8337b95504
--- /dev/null
+++ b/sysdeps/x86_64/multiarch/x86-evex256-vecs.h
@@ -0,0 +1,38 @@
+/* Common config for EVEX256 VECs
+   All versions must be listed in ifunc-impl-list.c.
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>. */
+
+#ifndef _EVEX256_VECS_H
+#define _EVEX256_VECS_H 1
+
+#ifdef VEC_SIZE
+# error "Multiple VEC configs included!"
+#endif
+
+#define VEC_SIZE 32
+#include "x86-evex-vecs-common.h"
+
+#define USE_WITH_EVEX256 1
+
+#ifndef SECTION
+# define SECTION(p) p##.evex
+#endif
+
+#define VMM VMM_256
+#define VMM_lo VMM_any_ymm
+#endif
diff --git a/sysdeps/x86_64/multiarch/x86-evex512-vecs.h b/sysdeps/x86_64/multiarch/x86-evex512-vecs.h
new file mode 100644
index 0000000000..7dc5c23ad0
--- /dev/null
+++ b/sysdeps/x86_64/multiarch/x86-evex512-vecs.h
@@ -0,0 +1,38 @@
+/* Common config for EVEX512 VECs
+   All versions must be listed in ifunc-impl-list.c.
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>. */
+
+#ifndef _EVEX512_VECS_H
+#define _EVEX512_VECS_H 1
+
+#ifdef VEC_SIZE
+# error "Multiple VEC configs included!"
+#endif
+
+#define VEC_SIZE 64
+#include "x86-evex-vecs-common.h"
+
+#define USE_WITH_EVEX512 1
+
+#ifndef SECTION
+# define SECTION(p) p##.evex512
+#endif
+
+#define VMM VMM_512
+#define VMM_lo VMM_any_zmm
+#endif
diff --git a/sysdeps/x86_64/multiarch/x86-sse2-vecs.h b/sysdeps/x86_64/multiarch/x86-sse2-vecs.h
new file mode 100644
index 0000000000..b8bbd5dc29
--- /dev/null
+++ b/sysdeps/x86_64/multiarch/x86-sse2-vecs.h
@@ -0,0 +1,47 @@
+/* Common config for SSE2 VECs
+   All versions must be listed in ifunc-impl-list.c.
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>. */
+
+#ifndef _X86_SSE2_VECS_H
+#define _X86_SSE2_VECS_H 1
+
+#ifdef VEC_SIZE
+# error "Multiple VEC configs included!"
+#endif
+
+#define VEC_SIZE 16
+#include "x86-vec-macros.h"
+
+#define USE_WITH_SSE2 1
+#define SECTION(p) p
+
+/* 3-byte mov instructions with SSE2. */
+#define MOV_SIZE 3
+/* No vzeroupper needed. */
+#define RET_SIZE 1
+#define VZEROUPPER
+
+#define VMOVU movups
+#define VMOVA movaps
+#define VMOVNT movntdq
+
+#define VMM_128 VMM_any_xmm
+#define VMM VMM_any_xmm
+
+
+#endif
diff --git a/sysdeps/x86_64/multiarch/x86-vec-macros.h b/sysdeps/x86_64/multiarch/x86-vec-macros.h
new file mode 100644
index 0000000000..7d6bb31d55
--- /dev/null
+++ b/sysdeps/x86_64/multiarch/x86-vec-macros.h
@@ -0,0 +1,90 @@
+/* Macro helpers for VEC_{type}({vec_num})
+   All versions must be listed in ifunc-impl-list.c.
+   Copyright (C) 2022 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <https://www.gnu.org/licenses/>. */
+
+#ifndef _X86_VEC_MACROS_H
+#define _X86_VEC_MACROS_H 1
+
+#ifndef VEC_SIZE
+# error "Never include this file directly. Always include a vector config."
+#endif
+
+/* Defines so we can use SSE2 / AVX2 / EVEX / EVEX512 encoding with same
+   VMM(N) values. */
+#define VMM_hi_xmm0 xmm16
+#define VMM_hi_xmm1 xmm17
+#define VMM_hi_xmm2 xmm18
+#define VMM_hi_xmm3 xmm19
+#define VMM_hi_xmm4 xmm20
+#define VMM_hi_xmm5 xmm21
+#define VMM_hi_xmm6 xmm22
+#define VMM_hi_xmm7 xmm23
+#define VMM_hi_xmm8 xmm24
+#define VMM_hi_xmm9 xmm25
+#define VMM_hi_xmm10 xmm26
+#define VMM_hi_xmm11 xmm27
+#define VMM_hi_xmm12 xmm28
+#define VMM_hi_xmm13 xmm29
+#define VMM_hi_xmm14 xmm30
+#define VMM_hi_xmm15 xmm31
+
+#define VMM_hi_ymm0 ymm16
+#define VMM_hi_ymm1 ymm17
+#define VMM_hi_ymm2 ymm18
+#define VMM_hi_ymm3 ymm19
+#define VMM_hi_ymm4 ymm20
+#define VMM_hi_ymm5 ymm21
+#define VMM_hi_ymm6 ymm22
+#define VMM_hi_ymm7 ymm23
+#define VMM_hi_ymm8 ymm24
+#define VMM_hi_ymm9 ymm25
+#define VMM_hi_ymm10 ymm26
+#define VMM_hi_ymm11 ymm27
+#define VMM_hi_ymm12 ymm28
+#define VMM_hi_ymm13 ymm29
+#define VMM_hi_ymm14 ymm30
+#define VMM_hi_ymm15 ymm31
+
+#define VMM_hi_zmm0 zmm16
+#define VMM_hi_zmm1 zmm17
+#define VMM_hi_zmm2 zmm18
+#define VMM_hi_zmm3 zmm19
+#define VMM_hi_zmm4 zmm20
+#define VMM_hi_zmm5 zmm21
+#define VMM_hi_zmm6 zmm22
+#define VMM_hi_zmm7 zmm23
+#define VMM_hi_zmm8 zmm24
+#define VMM_hi_zmm9 zmm25
+#define VMM_hi_zmm10 zmm26
+#define VMM_hi_zmm11 zmm27
+#define VMM_hi_zmm12 zmm28
+#define VMM_hi_zmm13 zmm29
+#define VMM_hi_zmm14 zmm30
+#define VMM_hi_zmm15 zmm31
+
+#define PRIMITIVE_VMM(vec, num) vec##num
+
+#define VMM_any_xmm(i) PRIMITIVE_VMM(xmm, i)
+#define VMM_any_ymm(i) PRIMITIVE_VMM(ymm, i)
+#define VMM_any_zmm(i) PRIMITIVE_VMM(zmm, i)
+
+#define VMM_hi_xmm(i) PRIMITIVE_VMM(VMM_hi_xmm, i)
+#define VMM_hi_ymm(i) PRIMITIVE_VMM(VMM_hi_ymm, i)
+#define VMM_hi_zmm(i) PRIMITIVE_VMM(VMM_hi_zmm, i)
+
+#endif
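
Note for readers (not part of the patch): the way `VMM(N)`, `VMM_128(N)` and `VMM_lo(N)` resolve under the EVEX256 config can be sketched in plain C. This is only an illustration under stated assumptions: it copies three entries of the `VMM_hi_*` table instead of all sixteen per width, it mirrors the selector definitions from x86-evex-vecs-common.h and x86-evex256-vecs.h by hand rather than including the real headers, and the `STR`/`STR_1` helpers and the `print_vmm_mapping` function are hypothetical names introduced here so the expansion can be seen as strings.

```c
#include <stdio.h>

/* Three entries copied from the table in x86-vec-macros.h.  The patch
   defines indices 0-15 for each width; indices not listed here would
   be left unexpanded in this sketch.  */
#define VMM_hi_xmm1 xmm17
#define VMM_hi_ymm0 ymm16
#define VMM_hi_ymm15 ymm31

/* Token pasting, as in the patch.  */
#define PRIMITIVE_VMM(vec, num) vec##num

#define VMM_any_ymm(i) PRIMITIVE_VMM(ymm, i)
#define VMM_hi_xmm(i) PRIMITIVE_VMM(VMM_hi_xmm, i)
#define VMM_hi_ymm(i) PRIMITIVE_VMM(VMM_hi_ymm, i)

/* Selectors mirrored from x86-evex-vecs-common.h and
   x86-evex256-vecs.h.  */
#define VMM_128 VMM_hi_xmm
#define VMM_256 VMM_hi_ymm
#define VMM VMM_256
#define VMM_lo VMM_any_ymm

/* Hypothetical helpers, not in the patch: the extra level makes the
   argument fully macro-expanded before it is stringified.  */
#define STR_1(x) #x
#define STR(x) STR_1(x)

/* Hypothetical function, not in the patch: print the register name
   each selector resolves to.  */
void
print_vmm_mapping (void)
{
  printf ("VMM(0)     -> %s\n", STR (VMM (0)));
  printf ("VMM(15)    -> %s\n", STR (VMM (15)));
  printf ("VMM_128(1) -> %s\n", STR (VMM_128 (1)));
  printf ("VMM_lo(0)  -> %s\n", STR (VMM_lo (0)));
}
```

In this sketch `VMM(0)` pastes to `VMM_hi_ymm0`, which is rescanned and becomes `ymm16`, while `VMM_lo(0)` pastes directly to `ymm0`. That is the distinction item 3 of the commit message describes: the same register width, but taken from the 0-15 range instead of 16-31.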