From patchwork Thu Dec 15 16:06:50 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Richard Earnshaw (lists)"
X-Patchwork-Id: 706168
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from sourceware.org (server1.sourceware.org [209.132.180.131])
 (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
 (No client certificate requested)
 by ozlabs.org (Postfix) with ESMTPS id 3tfdfK3Xqnz9ssP
 for ; Fri, 16 Dec 2016 03:08:41 +1100 (AEDT)
Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected)
 header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="S1tRO5Li";
 dkim-atps=neutral
DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id
 :list-unsubscribe:list-archive:list-post:list-help:sender:from
 :subject:to:references:message-id:date:mime-version:in-reply-to
 :content-type; q=dns; s=default; b=OlYbc8/6HaiQChXjPiIKnDUMMBVFh
 AIMN3XD3CE1lhKHTS5WapTzlVBOyYetyyOggZwuCIIJox/TL6vWfDNQowQSqGzo/
 vUKaneHRE313J10GIaxtJeYPpVDOIgWWykGRMaf0dcFvellnOxU6IPvWGkTkzTJs
 INIb/cTtZLxfKM=
DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id
 :list-unsubscribe:list-archive:list-post:list-help:sender:from
 :subject:to:references:message-id:date:mime-version:in-reply-to
 :content-type; s=default; bh=JnerS/xvtH0ONIs/gHeKS2MDPdc=; b=S1t
 RO5LikZkwkhUWmHN/V8tayI95LDyz32QvNFnUKEoTred8xO+yxt27K+N8INFqXZ4
 Y2ayxAd3KtfivFGYdi+zZah+wlhls/Sna5thuuLhkS1tmbFx8KehY8JS5krZV0Qz
 jgx85PsCpjdKrmSsaUCJhvXIi2eCrWnzQx9iM8kM=
Received: (qmail 37216 invoked by alias); 15 Dec 2016 16:06:57 -0000
Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm
Precedence: bulk
List-Id:
List-Unsubscribe:
List-Archive:
List-Post:
List-Help:
Sender: gcc-patches-owner@gcc.gnu.org
Delivered-To: mailing list gcc-patches@gcc.gnu.org
Received: (qmail 37137 invoked by uid 89); 15 Dec 2016 16:06:56 -0000
Authentication-Results: sourceware.org;
 auth=none
X-Virus-Found: No
X-Spam-SWARE-Status: No, score=-5.0 required=5.0 tests=BAYES_00,
 RP_MATCHES_RCVD, SPF_PASS autolearn=ham version=3.3.2
 spammy=differing, REV, sf, SF
X-HELO: foss.arm.com
Received: from foss.arm.com (HELO foss.arm.com) (217.140.101.70)
 by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP;
 Thu, 15 Dec 2016 16:06:53 +0000
Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249])
 by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 8BE331516;
 Thu, 15 Dec 2016 08:06:52 -0800 (PST)
Received: from e105689-lin.cambridge.arm.com
 (e105689-lin.cambridge.arm.com [10.2.207.32])
 by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 14CA03F445
 for ; Thu, 15 Dec 2016 08:06:51 -0800 (PST)
From: "Richard Earnshaw (lists)"
Subject: [PATCH 12/21] [arm] Eliminate vfp_reg_type
To: gcc-patches@gcc.gnu.org
References:
Message-ID: <02d17e0d-7e6f-6698-f6b4-fcb0597d0b40@arm.com>
Date: Thu, 15 Dec 2016 16:06:50 +0000
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101
 Thunderbird/45.5.1
MIME-Version: 1.0
In-Reply-To:

Remove the VFP_REGS field by converting its meanings into flag
attributes.  The new flag attributes build on each other, describing
increasing capabilities.  This allows us to do a better job when
inlining functions with differing requirements on the FPU environment:
we can now inline A into B if B has at least the same register set
properties as A (previously we required identical register set
properties).

	* arm.h (vfp_reg_type): Delete.
	(TARGET_FPU_REGS): Delete.
	(arm_fpu_desc): Delete regs field.
	(FPU_FL_NONE, FPU_FL_NEON, FPU_FL_FP16, FPU_FL_CRYPTO): Use unsigned
	values.
	(FPU_FL_DBL, FPU_FL_D32): Define.
	(TARGET_VFPD32): Use feature test.
	(TARGET_VFP_SINGLE): Likewise.
	(TARGET_VFP_DOUBLE): Likewise.
	* arm-fpus.def: Update all entries for new feature bits.
	* arm.c (all_fpus): Update initializer macro.
	(arm_can_inline_p): Remove test on fpu regs.
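
For illustration only (this is not part of the patch): the inlining rule
described above comes down to a bit-set superset test.  The sketch below
copies the FPU_FL_* values from the arm.h hunk; the helper name
fpu_features_allow_inline is hypothetical, and it models only the
feature comparison that arm_can_inline_p keeps, not any other check that
function may perform.

```c
#include <stdbool.h>

/* Feature bits, copied from the arm.h hunk of this patch.  */
#define FPU_FL_NONE   (0u)
#define FPU_FL_NEON   (1u << 0)   /* NEON instructions.  */
#define FPU_FL_FP16   (1u << 1)   /* Half-precision.  */
#define FPU_FL_CRYPTO (1u << 2)   /* Crypto extensions.  */
#define FPU_FL_DBL    (1u << 3)   /* Has double precision.  */
#define FPU_FL_D32    (1u << 4)   /* Has 32 double precision regs.  */

typedef unsigned long arm_fpu_feature_set;

/* Hypothetical helper, not a GCC function: the callee may be inlined
   (as far as FPU features go) when every feature bit it needs is also
   set for the caller.  */
static bool
fpu_features_allow_inline (arm_fpu_feature_set caller,
                           arm_fpu_feature_set callee)
{
  return (caller & callee) == callee;
}
```

With these definitions a "vfpv3-d16" callee (FPU_FL_DBL) passes the test
for a "neon" caller (FPU_FL_D32 | FPU_FL_DBL | FPU_FL_NEON), while the
reverse direction fails because the caller lacks FPU_FL_D32 and
FPU_FL_NEON.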
---
 gcc/config/arm/arm-fpus.def | 44 ++++++++++++++++++++++----------------------
 gcc/config/arm/arm.c        |  8 ++------
 gcc/config/arm/arm.h        | 26 +++++++++-----------------
 3 files changed, 33 insertions(+), 45 deletions(-)

diff --git a/gcc/config/arm/arm-fpus.def b/gcc/config/arm/arm-fpus.def
index 04b2ef1..eca03bb 100644
--- a/gcc/config/arm/arm-fpus.def
+++ b/gcc/config/arm/arm-fpus.def
@@ -19,31 +19,31 @@
 
 /* Before using #include to read this file, define a macro:
 
-      ARM_FPU(NAME, REV, VFP_REGS, FEATURES)
+      ARM_FPU(NAME, REV, FEATURES)
 
    The arguments are the fields of struct arm_fpu_desc.
 
    genopt.sh assumes no whitespace up to the first "," in each entry. */
 
-ARM_FPU("vfp", 2, VFP_REG_D16, FPU_FL_NONE)
-ARM_FPU("vfpv2", 2, VFP_REG_D16, FPU_FL_NONE)
-ARM_FPU("vfpv3", 3, VFP_REG_D32, FPU_FL_NONE)
-ARM_FPU("vfpv3-fp16", 3, VFP_REG_D32, FPU_FL_FP16)
-ARM_FPU("vfpv3-d16", 3, VFP_REG_D16, FPU_FL_NONE)
-ARM_FPU("vfpv3-d16-fp16", 3, VFP_REG_D16, FPU_FL_FP16)
-ARM_FPU("vfpv3xd", 3, VFP_REG_SINGLE, FPU_FL_NONE)
-ARM_FPU("vfpv3xd-fp16", 3, VFP_REG_SINGLE, FPU_FL_FP16)
-ARM_FPU("neon", 3, VFP_REG_D32, FPU_FL_NEON)
-ARM_FPU("neon-vfpv3", 3, VFP_REG_D32, FPU_FL_NEON)
-ARM_FPU("neon-fp16", 3, VFP_REG_D32, FPU_FL_NEON | FPU_FL_FP16)
-ARM_FPU("vfpv4", 4, VFP_REG_D32, FPU_FL_FP16)
-ARM_FPU("vfpv4-d16", 4, VFP_REG_D16, FPU_FL_FP16)
-ARM_FPU("fpv4-sp-d16", 4, VFP_REG_SINGLE, FPU_FL_FP16)
-ARM_FPU("fpv5-sp-d16", 5, VFP_REG_SINGLE, FPU_FL_FP16)
-ARM_FPU("fpv5-d16", 5, VFP_REG_D16, FPU_FL_FP16)
-ARM_FPU("neon-vfpv4", 4, VFP_REG_D32, FPU_FL_NEON | FPU_FL_FP16)
-ARM_FPU("fp-armv8", 8, VFP_REG_D32, FPU_FL_FP16)
-ARM_FPU("neon-fp-armv8", 8, VFP_REG_D32, FPU_FL_NEON | FPU_FL_FP16)
-ARM_FPU("crypto-neon-fp-armv8", 8, VFP_REG_D32, FPU_FL_NEON | FPU_FL_FP16 | FPU_FL_CRYPTO)
+ARM_FPU("vfp", 2, FPU_FL_DBL)
+ARM_FPU("vfpv2", 2, FPU_FL_DBL)
+ARM_FPU("vfpv3", 3, FPU_FL_D32 | FPU_FL_DBL)
+ARM_FPU("vfpv3-fp16", 3, FPU_FL_D32 | FPU_FL_DBL | FPU_FL_FP16)
+ARM_FPU("vfpv3-d16", 3, FPU_FL_DBL)
+ARM_FPU("vfpv3-d16-fp16", 3, FPU_FL_DBL | FPU_FL_FP16)
+ARM_FPU("vfpv3xd", 3, FPU_FL_NONE)
+ARM_FPU("vfpv3xd-fp16", 3, FPU_FL_FP16)
+ARM_FPU("neon", 3, FPU_FL_D32 | FPU_FL_DBL | FPU_FL_NEON)
+ARM_FPU("neon-vfpv3", 3, FPU_FL_D32 | FPU_FL_DBL | FPU_FL_NEON)
+ARM_FPU("neon-fp16", 3, FPU_FL_D32 | FPU_FL_DBL | FPU_FL_NEON | FPU_FL_FP16)
+ARM_FPU("vfpv4", 4, FPU_FL_D32 | FPU_FL_DBL | FPU_FL_FP16)
+ARM_FPU("vfpv4-d16", 4, FPU_FL_DBL | FPU_FL_FP16)
+ARM_FPU("fpv4-sp-d16", 4, FPU_FL_FP16)
+ARM_FPU("fpv5-sp-d16", 5, FPU_FL_FP16)
+ARM_FPU("fpv5-d16", 5, FPU_FL_DBL | FPU_FL_FP16)
+ARM_FPU("neon-vfpv4", 4, FPU_FL_D32 | FPU_FL_DBL | FPU_FL_NEON | FPU_FL_FP16)
+ARM_FPU("fp-armv8", 8, FPU_FL_D32 | FPU_FL_DBL | FPU_FL_FP16)
+ARM_FPU("neon-fp-armv8", 8, FPU_FL_D32 | FPU_FL_DBL | FPU_FL_NEON | FPU_FL_FP16)
+ARM_FPU("crypto-neon-fp-armv8", 8, FPU_FL_D32 | FPU_FL_DBL | FPU_FL_NEON | FPU_FL_FP16 | FPU_FL_CRYPTO)
 /* Compatibility aliases. */
-ARM_FPU("vfp3", 3, VFP_REG_D32, FPU_FL_NONE)
+ARM_FPU("vfp3", 3, FPU_FL_D32 | FPU_FL_DBL)
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 822ef14..820a6ab 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -2323,8 +2323,8 @@ char arm_arch_name[] = "__ARM_ARCH_PROFILE__";
 
 const struct arm_fpu_desc all_fpus[] =
 {
-#define ARM_FPU(NAME, REV, VFP_REGS, FEATURES) \
-  { NAME, REV, VFP_REGS, FEATURES },
+#define ARM_FPU(NAME, REV, FEATURES) \
+  { NAME, REV, FEATURES },
 #include "arm-fpus.def"
 #undef ARM_FPU
 };
@@ -30218,10 +30218,6 @@ arm_can_inline_p (tree caller, tree callee)
   if ((caller_fpu->features & callee_fpu->features) != callee_fpu->features)
     return false;
 
-  /* Need same FPU regs. */
-  if (callee_fpu->regs != callee_fpu->regs)
-    return false;
-
   /* OK to inline between different modes.
      Function with mode specific instructions, e.g using asm,
      must be explicitly protected with noinline. */
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 7690e70..a412fb1 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -161,7 +161,7 @@ extern tree arm_fp16_type_node;
    to be more careful with TARGET_NEON as noted below. */
 
 /* FPU is has the full VFPv3/NEON register file of 32 D registers. */
-#define TARGET_VFPD32 (TARGET_FPU_REGS == VFP_REG_D32)
+#define TARGET_VFPD32 (TARGET_FPU_FEATURES & FPU_FL_D32)
 
 /* FPU supports VFPv3 instructions. */
 #define TARGET_VFP3 (TARGET_FPU_REV >= 3)
@@ -170,10 +170,10 @@ extern tree arm_fp16_type_node;
 #define TARGET_VFP5 (TARGET_FPU_REV >= 5)
 
 /* FPU only supports VFP single-precision instructions. */
-#define TARGET_VFP_SINGLE (TARGET_FPU_REGS == VFP_REG_SINGLE)
+#define TARGET_VFP_SINGLE ((TARGET_FPU_FEATURES & FPU_FL_DBL) == 0)
 
 /* FPU supports VFP double-precision instructions. */
-#define TARGET_VFP_DOUBLE (TARGET_FPU_REGS != VFP_REG_SINGLE)
+#define TARGET_VFP_DOUBLE (TARGET_FPU_FEATURES & FPU_FL_DBL)
 
 /* FPU supports half-precision floating-point with NEON element load/store. */
 #define TARGET_NEON_FP16 \
@@ -335,24 +335,17 @@ typedef unsigned long arm_fpu_feature_set;
 #define ARM_FPU_FSET_HAS(S,F) (((S) & (F)) == (F))
 
 /* FPU Features. */
-#define FPU_FL_NONE (0)
-#define FPU_FL_NEON (1 << 0) /* NEON instructions. */
-#define FPU_FL_FP16 (1 << 1) /* Half-precision. */
-#define FPU_FL_CRYPTO (1 << 2) /* Crypto extensions. */
-
-enum vfp_reg_type
-{
-  VFP_NONE = 0,
-  VFP_REG_D16,
-  VFP_REG_D32,
-  VFP_REG_SINGLE
-};
+#define FPU_FL_NONE (0u)
+#define FPU_FL_NEON (1u << 0) /* NEON instructions. */
+#define FPU_FL_FP16 (1u << 1) /* Half-precision. */
+#define FPU_FL_CRYPTO (1u << 2) /* Crypto extensions. */
+#define FPU_FL_DBL (1u << 3) /* Has double precision. */
+#define FPU_FL_D32 (1u << 4) /* Has 32 double precision regs. */
 
 extern const struct arm_fpu_desc
 {
   const char *name;
   int rev;
-  enum vfp_reg_type regs;
   arm_fpu_feature_set features;
 } all_fpus[];
 
@@ -360,7 +353,6 @@ extern const struct arm_fpu_desc
 
 #define TARGET_FPU_NAME (all_fpus[arm_fpu_index].name)
 #define TARGET_FPU_REV (all_fpus[arm_fpu_index].rev)
-#define TARGET_FPU_REGS (all_fpus[arm_fpu_index].regs)
 #define TARGET_FPU_FEATURES (all_fpus[arm_fpu_index].features)
 
 /* Which floating point hardware to schedule for. */
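
Not part of the patch: a small standalone sketch of how the rewritten
TARGET_VFPD32 and TARGET_VFP_SINGLE tests read a feature word.  The
struct sample_fpu_desc, the sample_fpus table and the helper names are
hypothetical stand-ins for arm_fpu_desc and all_fpus; the four rows are
copied from the updated arm-fpus.def.  The layering check encodes an
observation about the entries in this patch (D32 never appears without
DBL, NEON never without D32), which is an assumption about intent rather
than something GCC enforces.

```c
#include <stdbool.h>
#include <stddef.h>

/* Feature bits, copied from the arm.h hunk of this patch.  */
#define FPU_FL_NONE   (0u)
#define FPU_FL_NEON   (1u << 0)
#define FPU_FL_FP16   (1u << 1)
#define FPU_FL_CRYPTO (1u << 2)
#define FPU_FL_DBL    (1u << 3)
#define FPU_FL_D32    (1u << 4)

typedef unsigned long arm_fpu_feature_set;

/* Hypothetical stand-in for struct arm_fpu_desc after the patch.  */
struct sample_fpu_desc
{
  const char *name;
  int rev;
  arm_fpu_feature_set features;
};

/* A few rows taken from the updated arm-fpus.def.  */
static const struct sample_fpu_desc sample_fpus[] =
{
  { "vfpv3xd",   3, FPU_FL_NONE },
  { "vfpv3-d16", 3, FPU_FL_DBL },
  { "vfpv3",     3, FPU_FL_D32 | FPU_FL_DBL },
  { "neon-fp16", 3, FPU_FL_D32 | FPU_FL_DBL | FPU_FL_NEON | FPU_FL_FP16 },
};

#define N_SAMPLE_FPUS (sizeof sample_fpus / sizeof sample_fpus[0])

/* Same test as the new TARGET_VFPD32, applied to one table entry.  */
static bool
fpu_has_d32 (const struct sample_fpu_desc *fpu)
{
  return (fpu->features & FPU_FL_D32) != 0;
}

/* Same test as the new TARGET_VFP_SINGLE, applied to one table entry.  */
static bool
fpu_single_only (const struct sample_fpu_desc *fpu)
{
  return (fpu->features & FPU_FL_DBL) == 0;
}

/* Check that the sample rows are layered: D32 only with DBL, and NEON
   only with D32.  */
static bool
sample_flags_are_layered (void)
{
  for (size_t i = 0; i < N_SAMPLE_FPUS; i++)
    {
      arm_fpu_feature_set f = sample_fpus[i].features;
      if ((f & FPU_FL_D32) && !(f & FPU_FL_DBL))
        return false;
      if ((f & FPU_FL_NEON) && !(f & FPU_FL_D32))
        return false;
    }
  return true;
}
```

Because the capabilities are layered in this way, a single subset test
on the feature word is enough to compare register-file properties, which
is why the separate regs comparison in arm_can_inline_p can go.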