From patchwork Tue Sep 15 10:47:58 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Bruel X-Patchwork-Id: 517812 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 81838140180 for ; Tue, 15 Sep 2015 20:48:18 +1000 (AEST) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b=hOxLX96r; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :subject:to:references:cc:from:message-id:date:mime-version :in-reply-to:content-type; q=dns; s=default; b=qGbIh/IqfqOtkf20t WZny888CgZTYiwMuWImWrbo3wqRh0Hv7qTyfGXysq/sujFlqGEJDaAsOYbuQi2Uc S0H7Js8IP48XXpUu6SB8fWz+4B68Pe5IG9Zz9qf7DsObITtBMxhqI5B86Czc9bih yR3dBOjZC8DRcz2dK8Y4eA8h30= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :subject:to:references:cc:from:message-id:date:mime-version :in-reply-to:content-type; s=default; bh=QEf3BElF16UBhqdbNLxgeG0 aXSw=; b=hOxLX96rsX65Fx81COkjf7sV1iI+qhU1+hzO6YHCtpWmSnXjGt1bWXQ rmX4dMqXa6tKCu+DxrLvPRniUksCwXiVvyuGNdaPGkGXK02i1cM2yMuvNH12s9P5 +JWgiSgSDXMNp326gg9vx9ucIc+xt0VZeR/pA+Cu8Xu5vMKsM8WY= Received: (qmail 77949 invoked by alias); 15 Sep 2015 10:48:09 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 75810 invoked by uid 89); 15 Sep 2015 10:48:07 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=1.3 required=5.0 tests=AWL, BAYES_99, BAYES_999, KAM_LAZY_DOMAIN_SECURITY, RCVD_IN_DNSWL_LOW autolearn=no version=3.3.2 X-HELO: mx07-00178001.pphosted.com Received: from mx08-00178001.pphosted.com (HELO mx07-00178001.pphosted.com) (91.207.212.93) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-SHA encrypted) ESMTPS; Tue, 15 Sep 2015 10:48:05 +0000 Received: from pps.filterd (m0046660.ppops.net [127.0.0.1]) by mx08-00178001.pphosted.com (8.14.5/8.14.5) with SMTP id t8FAcHce024647; Tue, 15 Sep 2015 12:48:01 +0200 Received: from beta.dmz-eu.st.com (beta.dmz-eu.st.com [164.129.1.35]) by mx08-00178001.pphosted.com with ESMTP id 1www7dpe2d-1 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT); Tue, 15 Sep 2015 12:48:01 +0200 Received: from zeta.dmz-eu.st.com (zeta.dmz-eu.st.com [164.129.230.9]) by beta.dmz-eu.st.com (STMicroelectronics) with ESMTP id EC27D3A; Tue, 15 Sep 2015 10:47:46 +0000 (GMT) Received: from Webmail-eu.st.com (safex1hubcas4.st.com [10.75.90.69]) by zeta.dmz-eu.st.com (STMicroelectronics) with ESMTP id BB093AF83; Tue, 15 Sep 2015 10:47:59 +0000 (GMT) Received: from [164.129.122.197] (164.129.122.197) by webmail-eu.st.com (10.75.90.13) with Microsoft SMTP Server (TLS) id 8.3.342.0; Tue, 15 Sep 2015 12:47:59 +0200 Subject: Re: [PATCH 4/4] [ARM] Add attribute/pragma target fpu= To: , , References: <55F6D9FF.4030600@st.com> CC: From: Christian Bruel X-No-Archive: yes Message-ID: <55F7F75E.4070800@st.com> Date: Tue, 15 Sep 2015 12:47:58 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.2.0 MIME-Version: 1.0 In-Reply-To: <55F6D9FF.4030600@st.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:5.14.151, 1.0.33, 0.0.0000 definitions=2015-09-15_05:2015-09-15, 2015-09-15, 1970-01-01 signatures=0 X-IsSubscribed: yes On 09/14/2015 04:30 PM, Christian Bruel wrote: > Finally, the final part of the patch set does the attribute target > parsing and checking, redefines the preprocessor macros and implements > the inlining rules. > > testcases and documentation included. > new version to remove a shadowed remnant piece of code. > thanks > > Christian > 2015-09-14 Christian Bruel PR target/65837 * config/arm/arm-c.c (arm_cpu_builtins): Set or reset __ARM_FEATURE_CRYPTO, __VFP_FP__, __ARM_NEON__ (arm_pragma_target_parse): Change check for arm_cpu_builtins. undefine __ARM_FP. * doc/invoke.texi (-mfpu=): Mention attribute and pragma. * doc/extend.texi (-mfpu=): Describe attribute. 2015-09-14 Christian Bruel PR target/65837 gcc.target/arm/lto/pr65837_0.c gcc.target/arm/attr-neon2.c gcc.target/arm/attr-neon.c gcc.target/arm/attr-neon-builtin-fail.c gcc.target/arm/attr-crypto.c diff -ruN gnu_trunk.p3/gcc/gcc/config/arm/arm.c gnu_trunk.p4/gcc/gcc/config/arm/arm.c --- gnu_trunk.p3/gcc/gcc/config/arm/arm.c 2015-09-11 16:26:33.869000746 +0200 +++ gnu_trunk.p4/gcc/gcc/config/arm/arm.c 2015-09-15 12:26:12.756161709 +0200 @@ -29486,11 +29486,42 @@ /* Hook to determine if one function can safely inline another. */ static bool -arm_can_inline_p (tree caller ATTRIBUTE_UNUSED, tree callee ATTRIBUTE_UNUSED) +arm_can_inline_p (tree caller, tree callee) { - /* Overidde default hook: Always OK to inline between different modes. - Function with mode specific instructions, e.g using asm, must be explicitely - protected with noinline. */ + tree caller_tree = DECL_FUNCTION_SPECIFIC_TARGET (caller); + tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee); + + struct cl_target_option *caller_opts + = TREE_TARGET_OPTION (caller_tree ? caller_tree + : target_option_default_node); + + struct cl_target_option *callee_opts + = TREE_TARGET_OPTION (callee_tree ? callee_tree + : target_option_default_node); + + const struct arm_fpu_desc *fpu_desc1 + = &all_fpus[caller_opts->x_arm_fpu_index]; + const struct arm_fpu_desc *fpu_desc2 + = &all_fpus[callee_opts->x_arm_fpu_index]; + + /* Can't inline NEON extension if the caller doesn't support it. */ + if (ARM_FPU_FSET_HAS (fpu_desc2->features, FPU_FL_NEON) + && ! ARM_FPU_FSET_HAS (fpu_desc1->features, FPU_FL_NEON)) + return false; + + /* Can't inline CRYPTO extension if the caller doesn't support it. */ + if (ARM_FPU_FSET_HAS (fpu_desc2->features, FPU_FL_CRYPTO) + && ! ARM_FPU_FSET_HAS (fpu_desc1->features, FPU_FL_CRYPTO)) + return false; + + /* Need same model and regs. */ + if (fpu_desc2->model != fpu_desc1->model + || fpu_desc2->regs != fpu_desc1->regs) + return false; + + /* OK to inline between different modes. + Function with mode specific instructions, e.g using asm, + must be explicitely protected with noinline. */ return true; } @@ -29504,6 +29535,7 @@ if (TREE_CODE (args) == TREE_LIST) { bool ret = true; + for (; args; args = TREE_CHAIN (args)) if (TREE_VALUE (args) && !arm_valid_target_attribute_rec (TREE_VALUE (args), opts)) @@ -29518,30 +29550,38 @@ } char *argstr = ASTRDUP (TREE_STRING_POINTER (args)); - while (argstr && *argstr != '\0') + char *q; + + while ((q = strtok (argstr, ",")) != NULL) { - while (ISSPACE (*argstr)) - argstr++; + while (ISSPACE (*q)) ++q; - if (!strcmp (argstr, "thumb")) - { + argstr = NULL; + if (!strncmp (q, "thumb", 5)) opts->x_target_flags |= MASK_THUMB; - arm_option_check_internal (opts); - return true; - } - if (!strcmp (argstr, "arm")) - { + else if (!strncmp (q, "arm", 3)) opts->x_target_flags &= ~MASK_THUMB; - arm_option_check_internal (opts); - return true; + + else if (!strncmp (q, "fpu=", 4)) + { + if (! opt_enum_arg_to_value (OPT_mfpu_, q+4, + &opts->x_arm_fpu_index, CL_TARGET)) + { + error ("invalid fpu for attribute(target(\"%s\"))", q); + return false; + } + } + else + { + error ("attribute(target(\"%s\")) is unknown", q); + return false; } - warning (0, "attribute(target(\"%s\")) is unknown", argstr); - return false; + arm_option_check_internal (opts); } - return false; + return true; } /* Return a TARGET_OPTION_NODE tree of the target options listed or NULL. */ diff -ruN gnu_trunk.p3/gcc/gcc/config/arm/arm-c.c gnu_trunk.p4/gcc/gcc/config/arm/arm-c.c --- gnu_trunk.p3/gcc/gcc/config/arm/arm-c.c 2015-09-11 16:25:32.180858606 +0200 +++ gnu_trunk.p4/gcc/gcc/config/arm/arm-c.c 2015-09-11 17:00:26.085645968 +0200 @@ -68,8 +68,8 @@ def_or_undef_macro (pfile, "__ARM_FEATURE_DSP", TARGET_DSP_MULTIPLY); def_or_undef_macro (pfile, "__ARM_FEATURE_QBIT", TARGET_ARM_QBIT); def_or_undef_macro (pfile, "__ARM_FEATURE_SAT", TARGET_ARM_SAT); - if (TARGET_CRYPTO) - builtin_define ("__ARM_FEATURE_CRYPTO"); + def_or_undef_macro (pfile, "__ARM_FEATURE_CRYPTO", TARGET_CRYPTO); + if (unaligned_access) builtin_define ("__ARM_FEATURE_UNALIGNED"); if (TARGET_CRC32) @@ -129,8 +129,7 @@ if (TARGET_SOFT_FLOAT) builtin_define ("__SOFTFP__"); - if (TARGET_VFP) - builtin_define ("__VFP_FP__"); + def_or_undef_macro (pfile, "__VFP_FP__", TARGET_VFP); if (TARGET_ARM_FP) builtin_define_with_int_value ("__ARM_FP", TARGET_ARM_FP); @@ -141,11 +140,9 @@ if (TARGET_FMA) builtin_define ("__ARM_FEATURE_FMA"); - if (TARGET_NEON) - { - builtin_define ("__ARM_NEON__"); - builtin_define ("__ARM_NEON"); - } + def_or_undef_macro (pfile, "__ARM_NEON__", TARGET_NEON); + def_or_undef_macro (pfile, "__ARM_NEON", TARGET_NEON); + if (TARGET_NEON_FP) builtin_define_with_int_value ("__ARM_NEON_FP", TARGET_NEON_FP); @@ -231,7 +228,7 @@ gcc_assert (prev_opt); gcc_assert (cur_opt); - if (cur_opt->x_target_flags != prev_opt->x_target_flags) + if (cur_opt != prev_opt) { /* For the definitions, ensure all newly defined macros are considered as used for -Wunused-macros. There is no point warning about the @@ -242,6 +239,8 @@ /* Update macros. */ gcc_assert (cur_opt->x_target_flags == target_flags); + /* This one can be redefined by the pragma without warning. */ + cpp_undef (parse_in, "__ARM_FP"); arm_cpu_builtins (parse_in); cpp_opts->warn_unused_macros = saved_warn_unused_macros; diff -ruN gnu_trunk.p3/gcc/gcc/doc/extend.texi gnu_trunk.p4/gcc/gcc/doc/extend.texi --- gnu_trunk.p3/gcc/gcc/doc/extend.texi 2015-09-07 13:35:20.777683005 +0200 +++ gnu_trunk.p4/gcc/gcc/doc/extend.texi 2015-09-14 13:58:49.271385001 +0200 @@ -3606,10 +3606,17 @@ @item arm @cindex @code{target("arm")} function attribute, ARM Force code generation in the ARM (A32) ISA. -@end table Functions from different modes can be inlined in the caller's mode. +@item fpu= +@cindex @code{target("fpu=")} function attribute, ARM +Specifies the fpu for which to tune the performance of this function. +The behavior and permissible arguments are the same as for the @option{-mfpu=} +command-line option. + +@end table + @end table @node AVR Function Attributes diff -ruN gnu_trunk.p3/gcc/gcc/doc/invoke.texi gnu_trunk.p4/gcc/gcc/doc/invoke.texi --- gnu_trunk.p3/gcc/gcc/doc/invoke.texi 2015-09-10 12:21:00.698911244 +0200 +++ gnu_trunk.p4/gcc/gcc/doc/invoke.texi 2015-09-14 10:27:20.281932581 +0200 @@ -13360,6 +13363,8 @@ floating-point arithmetic (in particular denormal values are treated as zero), so the use of NEON instructions may lead to a loss of precision. +You can also set the fpu name at function level by using the @code{target("mfpu=")} function attributes (@pxref{ARM Function Attributes}) or pragmas (@pxref{Function Specific Option Pragmas}). + @item -mfp16-format=@var{name} @opindex mfp16-format Specify the format of the @code{__fp16} half-precision floating-point type. diff -ruN gnu_trunk.p3/gcc/gcc/testsuite/gcc.target/arm/attr-crypto.c gnu_trunk.p4/gcc/gcc/testsuite/gcc.target/arm/attr-crypto.c --- gnu_trunk.p3/gcc/gcc/testsuite/gcc.target/arm/attr-crypto.c 1970-01-01 01:00:00.000000000 +0100 +++ gnu_trunk.p4/gcc/gcc/testsuite/gcc.target/arm/attr-crypto.c 2015-09-14 15:58:24.967898634 +0200 @@ -0,0 +1,38 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_crypto_ok } */ + +#pragma GCC target ("fpu=crypto-neon-fp-armv8") + +#ifndef __ARM_FEATURE_CRYPTO +#error __ARM_FEATURE_CRYPTO not defined. +#endif + +#ifndef __ARM_NEON +#error __ARM_NEON not defined. +#endif + +#if !defined(__ARM_FP) || (__ARM_FP != 14) +#error __ARM_FP +#endif + +#include "arm_neon.h" + +int +foo (void) +{ + uint32x4_t a = {0xd, 0xe, 0xa, 0xd}; + uint32x4_t b = {0, 1, 2, 3}; + + uint32x4_t res = vsha256su0q_u32 (a, b); + return res[0]; +} + +#pragma GCC reset_options + +/* Check that the FP version is correctly reset. */ + +#if !defined(__ARM_FP) || (__ARM_FP != 12) +#error __ARM_FP +#endif + +/* { dg-final { scan-assembler "sha256su0.32\tq\[0-9\]+, q\[0-9\]+" } } */ diff -ruN gnu_trunk.p3/gcc/gcc/testsuite/gcc.target/arm/attr-neon2.c gnu_trunk.p4/gcc/gcc/testsuite/gcc.target/arm/attr-neon2.c --- gnu_trunk.p3/gcc/gcc/testsuite/gcc.target/arm/attr-neon2.c 1970-01-01 01:00:00.000000000 +0100 +++ gnu_trunk.p4/gcc/gcc/testsuite/gcc.target/arm/attr-neon2.c 2015-09-14 15:58:24.967898634 +0200 @@ -0,0 +1,28 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_neon_ok } */ +/* { dg-options "-O2 -mfloat-abi=softfp -mfpu=vfp" } */ + +#pragma GCC target ("fpu=neon") +#include + +/* Check that pragma target is used. */ +int8x8_t +my (int8x8_t __a, int8x8_t __b) +{ + return __a + __b; +} + +#pragma GCC reset_options + +/* Check that command line option is restored. */ +int8x8_t +my1 (int8x8_t __a, int8x8_t __b) +{ + return __a + __b; +} + +/* { dg-final { scan-assembler-times "\.fpu vfp" 1 } } */ +/* { dg-final { scan-assembler-times "\.fpu neon" 1 } } */ +/* { dg-final { scan-assembler "vadd" } } */ + + diff -ruN gnu_trunk.p3/gcc/gcc/testsuite/gcc.target/arm/attr-neon-builtin-fail.c gnu_trunk.p4/gcc/gcc/testsuite/gcc.target/arm/attr-neon-builtin-fail.c --- gnu_trunk.p3/gcc/gcc/testsuite/gcc.target/arm/attr-neon-builtin-fail.c 1970-01-01 01:00:00.000000000 +0100 +++ gnu_trunk.p4/gcc/gcc/testsuite/gcc.target/arm/attr-neon-builtin-fail.c 2015-09-14 15:58:24.967898634 +0200 @@ -0,0 +1,18 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_neon_ok } */ +/* { dg-options "-O2 -mfloat-abi=softfp -mfpu=neon" } */ + +#include + +void __attribute__ ((target ("fpu=vfp"))) +foo (uint8x16_t *p) +{ + *p = vmovq_n_u8 (3); /* { dg-error "called from here" } */ + +} + + +/* { dg-error "inlining failed in call to always_inline" "" { target *-*-* } 0 } */ + + + diff -ruN gnu_trunk.p3/gcc/gcc/testsuite/gcc.target/arm/attr-neon.c gnu_trunk.p4/gcc/gcc/testsuite/gcc.target/arm/attr-neon.c --- gnu_trunk.p3/gcc/gcc/testsuite/gcc.target/arm/attr-neon.c 1970-01-01 01:00:00.000000000 +0100 +++ gnu_trunk.p4/gcc/gcc/testsuite/gcc.target/arm/attr-neon.c 2015-09-14 16:12:08.449698268 +0200 @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_neon_ok } */ +/* { dg-options "-O3 -mfloat-abi=softfp -ftree-vectorize" } */ + +void +f3(int n, int x[], int y[]) { + int i; + for (i = 0; i < n; ++i) + y[i] = x[i] << 3; +} + +/* Verify that neon instructions are emitted once. */ +void __attribute__ ((target("fpu=neon"))) + f1(int n, int x[], int y[]) { + int i; + for (i = 0; i < n; ++i) + y[i] = x[i] << 3; +} + +/* { dg-final { scan-assembler-times "\.fpu vfp" 1 } } */ +/* { dg-final { scan-assembler-times "\.fpu neon" 1 } } */ +/* { dg-final { scan-assembler-times "vshl" 1 } } */ + + + + diff -ruN gnu_trunk.p3/gcc/gcc/testsuite/gcc.target/arm/lto/pr65837_0.c gnu_trunk.p4/gcc/gcc/testsuite/gcc.target/arm/lto/pr65837_0.c --- gnu_trunk.p3/gcc/gcc/testsuite/gcc.target/arm/lto/pr65837_0.c 1970-01-01 01:00:00.000000000 +0100 +++ gnu_trunk.p4/gcc/gcc/testsuite/gcc.target/arm/lto/pr65837_0.c 2015-09-14 15:58:13.899874587 +0200 @@ -0,0 +1,14 @@ +/* { dg-lto-do run } */ +/* { dg-lto-options {{-flto -mfpu=neon}} } */ +/* { dg-suppress-ld-options {-mfpu=neon} } */ + +#include "arm_neon.h" + +float32x2_t a, b, c, e; + +int main() +{ + e = __builtin_neon_vmls_lanev2sf (a, b, c, 0); + return 0; +} +