From patchwork Wed Nov 6 18:21:04 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Sandiford
X-Patchwork-Id: 2007656
From: Richard Sandiford
To: gcc-patches@gcc.gnu.org
Cc: richard.earnshaw@arm.com, ktkachov@gcc.gnu.org
Subject: [PATCH 11/15] aarch64: Define arm_neon.h types in arm_sve.h too
In-Reply-To: (Richard Sandiford's message of "Wed, 06 Nov 2024 18:16:05 +0000")
Date: Wed, 06 Nov 2024 18:21:04 +0000
List-Id: Gcc-patches mailing list

This patch moves the scalar and single-vector Advanced SIMD types
from arm_neon.h into a private header, so that they can be defined
by arm_sve.h as well.  This is needed for the upcoming SVE2.1
hybrid-VLA reductions, which return 128-bit Advanced SIMD vectors.

The approach follows Claudio's patch for FP8.

gcc/
	* config.gcc (extra_headers): Add arm_private_neon_types.h.
	* config/aarch64/arm_private_neon_types.h: New file, split out from...
	* config/aarch64/arm_neon.h: ...here.
	* config/aarch64/arm_sve.h: Include arm_private_neon_types.h
---
 gcc/config.gcc                              |  2 +-
 gcc/config/aarch64/arm_neon.h               | 49 +------------
 gcc/config/aarch64/arm_private_neon_types.h | 79 +++++++++++++++++++++
 gcc/config/aarch64/arm_sve.h                |  5 +-
 4 files changed, 84 insertions(+), 51 deletions(-)
 create mode 100644 gcc/config/aarch64/arm_private_neon_types.h

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 1b0637d7ff8..7e0108e2154 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -347,7 +347,7 @@ m32c*-*-*)
 	;;
 aarch64*-*-*)
 	cpu_type=aarch64
-	extra_headers="arm_fp16.h arm_neon.h arm_bf16.h arm_acle.h arm_sve.h arm_sme.h arm_neon_sve_bridge.h arm_private_fp8.h"
+	extra_headers="arm_fp16.h arm_neon.h arm_bf16.h arm_acle.h arm_sve.h arm_sme.h arm_neon_sve_bridge.h arm_private_fp8.h arm_private_neon_types.h"
 	c_target_objs="aarch64-c.o"
 	cxx_target_objs="aarch64-c.o"
 	d_target_objs="aarch64-d.o"
diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index d3533f3ee6f..c727302ac75 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -30,58 +30,15 @@
 #pragma GCC push_options
 #pragma GCC target ("+nothing+simd")
 
+#include
 #include
-#pragma GCC aarch64 "arm_neon.h"
+#include
 
-#include
+#pragma GCC aarch64 "arm_neon.h"
 
 #define __AARCH64_UINT64_C(__C) ((uint64_t) __C)
 #define __AARCH64_INT64_C(__C) ((int64_t) __C)
 
-typedef __Int8x8_t int8x8_t;
-typedef __Int16x4_t int16x4_t;
-typedef __Int32x2_t int32x2_t;
-typedef __Int64x1_t int64x1_t;
-typedef __Float16x4_t float16x4_t;
-typedef __Float32x2_t float32x2_t;
-typedef __Poly8x8_t poly8x8_t;
-typedef __Poly16x4_t poly16x4_t;
-typedef __Uint8x8_t uint8x8_t;
-typedef __Uint16x4_t uint16x4_t;
-typedef __Uint32x2_t uint32x2_t;
-typedef __Float64x1_t float64x1_t;
-typedef __Uint64x1_t uint64x1_t;
-typedef __Int8x16_t int8x16_t;
-typedef __Int16x8_t int16x8_t;
-typedef __Int32x4_t int32x4_t;
-typedef __Int64x2_t int64x2_t;
-typedef __Float16x8_t float16x8_t;
-typedef __Float32x4_t float32x4_t;
-typedef __Float64x2_t float64x2_t;
-typedef __Poly8x16_t poly8x16_t;
-typedef __Poly16x8_t poly16x8_t;
-typedef __Poly64x2_t poly64x2_t;
-typedef __Poly64x1_t poly64x1_t;
-typedef __Uint8x16_t uint8x16_t;
-typedef __Uint16x8_t uint16x8_t;
-typedef __Uint32x4_t uint32x4_t;
-typedef __Uint64x2_t uint64x2_t;
-
-typedef __Poly8_t poly8_t;
-typedef __Poly16_t poly16_t;
-typedef __Poly64_t poly64_t;
-typedef __Poly128_t poly128_t;
-
-typedef __Mfloat8x8_t mfloat8x8_t;
-typedef __Mfloat8x16_t mfloat8x16_t;
-
-typedef __fp16 float16_t;
-typedef float float32_t;
-typedef double float64_t;
-
-typedef __Bfloat16x4_t bfloat16x4_t;
-typedef __Bfloat16x8_t bfloat16x8_t;
-
 /* __aarch64_vdup_lane internal macros.  */
 #define __aarch64_vdup_lane_any(__size, __q, __a, __b) \
   vdup##__q##_n_##__size (__aarch64_vget_lane_any (__a, __b))
diff --git a/gcc/config/aarch64/arm_private_neon_types.h b/gcc/config/aarch64/arm_private_neon_types.h
new file mode 100644
index 00000000000..0f588f026b7
--- /dev/null
+++ b/gcc/config/aarch64/arm_private_neon_types.h
@@ -0,0 +1,79 @@
+/* AArch64 type definitions for arm_neon.h
+   Do not include this file directly.  Use one of arm_neon.h, arm_sme.h,
+   or arm_sve.h instead.
+
+   Copyright (C) 2024 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published
+   by the Free Software Foundation; either version 3, or (at your
+   option) any later version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT
+   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+   License for more details.
+
+   Under Section 7 of GPL version 3, you are granted additional
+   permissions described in the GCC Runtime Library Exception, version
+   3.1, as published by the Free Software Foundation.
+
+   You should have received a copy of the GNU General Public License and
+   a copy of the GCC Runtime Library Exception along with this program;
+   see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+   .  */
+
+#ifndef _GCC_ARM_PRIVATE_NEON_TYPES_H
+#define _GCC_ARM_PRIVATE_NEON_TYPES_H
+
+#if !defined(_AARCH64_NEON_H_) && !defined(_ARM_SVE_H_)
+#error "This file should not be used standalone. Please include one of arm_neon.h arm_sve.h arm_sme.h instead."
+#endif
+
+typedef __Int8x8_t int8x8_t;
+typedef __Int16x4_t int16x4_t;
+typedef __Int32x2_t int32x2_t;
+typedef __Int64x1_t int64x1_t;
+typedef __Float16x4_t float16x4_t;
+typedef __Float32x2_t float32x2_t;
+typedef __Poly8x8_t poly8x8_t;
+typedef __Poly16x4_t poly16x4_t;
+typedef __Uint8x8_t uint8x8_t;
+typedef __Uint16x4_t uint16x4_t;
+typedef __Uint32x2_t uint32x2_t;
+typedef __Float64x1_t float64x1_t;
+typedef __Uint64x1_t uint64x1_t;
+typedef __Int8x16_t int8x16_t;
+typedef __Int16x8_t int16x8_t;
+typedef __Int32x4_t int32x4_t;
+typedef __Int64x2_t int64x2_t;
+typedef __Float16x8_t float16x8_t;
+typedef __Float32x4_t float32x4_t;
+typedef __Float64x2_t float64x2_t;
+typedef __Poly8x16_t poly8x16_t;
+typedef __Poly16x8_t poly16x8_t;
+typedef __Poly64x2_t poly64x2_t;
+typedef __Poly64x1_t poly64x1_t;
+typedef __Uint8x16_t uint8x16_t;
+typedef __Uint16x8_t uint16x8_t;
+typedef __Uint32x4_t uint32x4_t;
+typedef __Uint64x2_t uint64x2_t;
+
+typedef __Poly8_t poly8_t;
+typedef __Poly16_t poly16_t;
+typedef __Poly64_t poly64_t;
+typedef __Poly128_t poly128_t;
+
+typedef __Mfloat8x8_t mfloat8x8_t;
+typedef __Mfloat8x16_t mfloat8x16_t;
+
+typedef __fp16 float16_t;
+typedef float float32_t;
+typedef double float64_t;
+
+typedef __Bfloat16x4_t bfloat16x4_t;
+typedef __Bfloat16x8_t bfloat16x8_t;
+
+#endif
diff --git a/gcc/config/aarch64/arm_sve.h b/gcc/config/aarch64/arm_sve.h
index aa0bd9909f9..a887c0f2f45 100644
--- a/gcc/config/aarch64/arm_sve.h
+++ b/gcc/config/aarch64/arm_sve.h
@@ -27,12 +27,9 @@
 
 #include
 #include
+#include
 #include
 
-typedef __fp16 float16_t;
-typedef float float32_t;
-typedef double float64_t;
-
 /* NOTE: This implementation of arm_sve.h is intentionally short.  It does
    not define the SVE types and intrinsic functions directly in C and C++
    code, but instead uses the following pragma to tell GCC to insert the