From patchwork Mon Sep 14 14:30:23 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Christian Bruel X-Patchwork-Id: 517450 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 2994E140770 for ; Tue, 15 Sep 2015 00:30:46 +1000 (AEST) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b=DpFnt39Q; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:to:cc :from:subject:message-id:date:mime-version:content-type; q=dns; s=default; b=c5oXkb9YKvq8j5VEV35zfwWgaSnE6a9GGN7d99A7RrHrSxz7sU LFzAc4qCYBWTEHizZiTh3RvVEx/ehLY3jr9+WFNbVYl98kvpqvd3BEt6ewam9WsE tY+NDzSaIrBgFvhfVXdSwYMPIENFlpm/KENe/KayLf6xroeRgJo0NvSWo= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender:to:cc :from:subject:message-id:date:mime-version:content-type; s= default; bh=0aRSLLXwaTlBm1wudmRs5XFJBHc=; b=DpFnt39Q6rll+bx0ICid np13PI2tVP6jtBbalOOV0kdKzzrnOzlnIC3KD8LVy5DH9Qt+xBSyxzrFSvm/ZKw6 i1qNlrhlczls8iyjAh8QaN79oUuDLddRSd+u1j5zlgRFN6c5Y3qFETMUXiCZ5Ptf 8NXCT1f1sPBEhgHmte2ktPE= Received: (qmail 24766 invoked by alias); 14 Sep 2015 14:30:33 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 24711 invoked by uid 89); 14 Sep 2015 14:30:33 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-0.1 required=5.0 tests=AWL, BAYES_50, KAM_LAZY_DOMAIN_SECURITY, RCVD_IN_DNSWL_LOW autolearn=no version=3.3.2 X-HELO: mx07-00178001.pphosted.com Received: from mx07-00178001.pphosted.com (HELO mx07-00178001.pphosted.com) (62.209.51.94) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-SHA encrypted) ESMTPS; Mon, 14 Sep 2015 14:30:31 +0000 Received: from pps.filterd (m0046668.ppops.net [127.0.0.1]) by mx07-00178001.pphosted.com (8.14.5/8.14.5) with SMTP id t8EEOXE0022762; Mon, 14 Sep 2015 16:30:26 +0200 Received: from beta.dmz-eu.st.com (beta.dmz-eu.st.com [164.129.1.35]) by mx07-00178001.pphosted.com with ESMTP id 1ww4bgy5ny-1 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NOT); Mon, 14 Sep 2015 16:30:26 +0200 Received: from zeta.dmz-eu.st.com (zeta.dmz-eu.st.com [164.129.230.9]) by beta.dmz-eu.st.com (STMicroelectronics) with ESMTP id D92143A; Mon, 14 Sep 2015 14:30:11 +0000 (GMT) Received: from Webmail-eu.st.com (safex1hubcas4.st.com [10.75.90.69]) by zeta.dmz-eu.st.com (STMicroelectronics) with ESMTP id 271CF12DE; Mon, 14 Sep 2015 14:30:24 +0000 (GMT) Received: from [164.129.122.197] (164.129.122.197) by webmail-eu.st.com (10.75.90.13) with Microsoft SMTP Server (TLS) id 8.3.342.0; Mon, 14 Sep 2015 16:30:23 +0200 To: , CC: From: Christian Bruel Subject: [PATCH 4/4] [ARM] Add attribute/pragma target fpu= X-No-Archive: yes Message-ID: <55F6D9FF.4030600@st.com> Date: Mon, 14 Sep 2015 16:30:23 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.2.0 MIME-Version: 1.0 X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:5.14.151, 1.0.33, 0.0.0000 definitions=2015-09-14_02:2015-09-14, 2015-09-14, 1970-01-01 signatures=0 X-IsSubscribed: yes Finally, the final part of the patch set does the attribute target parsing and checking, redefines the preprocessor macros and implements the inlining rules. testcases and documentation included. thanks Christian 2015-05-26 Christian Bruel PR target/65837 * config/arm/arm-c.c (arm_cpu_builtins): Set or reset __ARM_FEATURE_CRYPTO, __VFP_FP__, __ARM_NEON__ (arm_pragma_target_parse): Change check for arm_cpu_builtins. undefine __ARM_FP. * doc/invoke.texi (-mfpu=): Mention attribute and pragma. * doc/extend.texi (-mfpu=): Describe attribute. 2015-09-14 Christian Bruel PR target/65837 gcc.target/arm/lto/pr65837_0.c gcc.target/arm/attr-neon2.c gcc.target/arm/attr-neon.c gcc.target/arm/attr-neon-builtin-fail.c gcc.target/arm/attr-crypto.c diff -ruN gnu_trunk.p3/gcc/gcc/config/arm/arm.c gnu_trunk.p4/gcc/gcc/config/arm/arm.c --- gnu_trunk.p3/gcc/gcc/config/arm/arm.c 2015-09-11 16:26:33.869000746 +0200 +++ gnu_trunk.p4/gcc/gcc/config/arm/arm.c 2015-09-11 17:24:23.636876647 +0200 @@ -29486,11 +29486,42 @@ /* Hook to determine if one function can safely inline another. */ static bool -arm_can_inline_p (tree caller ATTRIBUTE_UNUSED, tree callee ATTRIBUTE_UNUSED) +arm_can_inline_p (tree caller, tree callee) { - /* Overidde default hook: Always OK to inline between different modes. - Function with mode specific instructions, e.g using asm, must be explicitely - protected with noinline. */ + tree caller_tree = DECL_FUNCTION_SPECIFIC_TARGET (caller); + tree callee_tree = DECL_FUNCTION_SPECIFIC_TARGET (callee); + + struct cl_target_option *caller_opts + = TREE_TARGET_OPTION (caller_tree ? caller_tree + : target_option_default_node); + + struct cl_target_option *callee_opts + = TREE_TARGET_OPTION (callee_tree ? callee_tree + : target_option_default_node); + + const struct arm_fpu_desc *fpu_desc1 + = &all_fpus[caller_opts->x_arm_fpu_index]; + const struct arm_fpu_desc *fpu_desc2 + = &all_fpus[callee_opts->x_arm_fpu_index]; + + /* Can't inline NEON extension if the caller doesn't support it. */ + if (ARM_FPU_FSET_HAS (fpu_desc2->features, FPU_FL_NEON) + && ! ARM_FPU_FSET_HAS (fpu_desc1->features, FPU_FL_NEON)) + return false; + + /* Can't inline CRYPTO extension if the caller doesn't support it. */ + if (ARM_FPU_FSET_HAS (fpu_desc2->features, FPU_FL_CRYPTO) + && ! ARM_FPU_FSET_HAS (fpu_desc1->features, FPU_FL_CRYPTO)) + return false; + + /* Need same model and regs. */ + if (fpu_desc2->model != fpu_desc1->model + || fpu_desc2->regs != fpu_desc1->regs) + return false; + + /* OK to inline between different modes. + Function with mode specific instructions, e.g using asm, + must be explicitely protected with noinline. */ return true; } @@ -29501,6 +29532,8 @@ static bool arm_valid_target_attribute_rec (tree args, struct gcc_options *opts) { + int ret=true; + if (TREE_CODE (args) == TREE_LIST) { bool ret = true; @@ -29518,30 +29551,35 @@ } char *argstr = ASTRDUP (TREE_STRING_POINTER (args)); - while (argstr && *argstr != '\0') + char *q; + + while ((q = strtok (argstr, ",")) != NULL) { - while (ISSPACE (*argstr)) - argstr++; + while (ISSPACE (*q)) ++q; - if (!strcmp (argstr, "thumb")) - { + argstr = NULL; + if (!strncmp (q, "thumb", 5)) opts->x_target_flags |= MASK_THUMB; - arm_option_check_internal (opts); - return true; - } - if (!strcmp (argstr, "arm")) - { + else if (!strncmp (q, "arm", 3)) opts->x_target_flags &= ~MASK_THUMB; - arm_option_check_internal (opts); - return true; + + else if (!strncmp (q, "fpu=", 4)) + { + if (! opt_enum_arg_to_value (OPT_mfpu_, q+4, + &opts->x_arm_fpu_index, CL_TARGET)) + { + error ("invalid fpu for attribute(target(\"%s\"))", q); + return false; + } } + else + warning (0, "attribute(target(\"%s\")) is unknown", argstr); - warning (0, "attribute(target(\"%s\")) is unknown", argstr); - return false; + arm_option_check_internal (opts); } - return false; + return ret; } /* Return a TARGET_OPTION_NODE tree of the target options listed or NULL. */ diff -ruN gnu_trunk.p3/gcc/gcc/config/arm/arm-c.c gnu_trunk.p4/gcc/gcc/config/arm/arm-c.c --- gnu_trunk.p3/gcc/gcc/config/arm/arm-c.c 2015-09-11 16:25:32.180858606 +0200 +++ gnu_trunk.p4/gcc/gcc/config/arm/arm-c.c 2015-09-11 17:00:26.085645968 +0200 @@ -68,8 +68,8 @@ def_or_undef_macro (pfile, "__ARM_FEATURE_DSP", TARGET_DSP_MULTIPLY); def_or_undef_macro (pfile, "__ARM_FEATURE_QBIT", TARGET_ARM_QBIT); def_or_undef_macro (pfile, "__ARM_FEATURE_SAT", TARGET_ARM_SAT); - if (TARGET_CRYPTO) - builtin_define ("__ARM_FEATURE_CRYPTO"); + def_or_undef_macro (pfile, "__ARM_FEATURE_CRYPTO", TARGET_CRYPTO); + if (unaligned_access) builtin_define ("__ARM_FEATURE_UNALIGNED"); if (TARGET_CRC32) @@ -129,8 +129,7 @@ if (TARGET_SOFT_FLOAT) builtin_define ("__SOFTFP__"); - if (TARGET_VFP) - builtin_define ("__VFP_FP__"); + def_or_undef_macro (pfile, "__VFP_FP__", TARGET_VFP); if (TARGET_ARM_FP) builtin_define_with_int_value ("__ARM_FP", TARGET_ARM_FP); @@ -141,11 +140,9 @@ if (TARGET_FMA) builtin_define ("__ARM_FEATURE_FMA"); - if (TARGET_NEON) - { - builtin_define ("__ARM_NEON__"); - builtin_define ("__ARM_NEON"); - } + def_or_undef_macro (pfile, "__ARM_NEON__", TARGET_NEON); + def_or_undef_macro (pfile, "__ARM_NEON", TARGET_NEON); + if (TARGET_NEON_FP) builtin_define_with_int_value ("__ARM_NEON_FP", TARGET_NEON_FP); @@ -231,7 +228,7 @@ gcc_assert (prev_opt); gcc_assert (cur_opt); - if (cur_opt->x_target_flags != prev_opt->x_target_flags) + if (cur_opt != prev_opt) { /* For the definitions, ensure all newly defined macros are considered as used for -Wunused-macros. There is no point warning about the @@ -242,6 +239,8 @@ /* Update macros. */ gcc_assert (cur_opt->x_target_flags == target_flags); + /* This one can be redefined by the pragma without warning. */ + cpp_undef (parse_in, "__ARM_FP"); arm_cpu_builtins (parse_in); cpp_opts->warn_unused_macros = saved_warn_unused_macros; diff -ruN gnu_trunk.p3/gcc/gcc/doc/extend.texi gnu_trunk.p4/gcc/gcc/doc/extend.texi --- gnu_trunk.p3/gcc/gcc/doc/extend.texi 2015-09-07 13:35:20.777683005 +0200 +++ gnu_trunk.p4/gcc/gcc/doc/extend.texi 2015-09-14 13:58:49.271385001 +0200 @@ -3606,10 +3606,17 @@ @item arm @cindex @code{target("arm")} function attribute, ARM Force code generation in the ARM (A32) ISA. -@end table Functions from different modes can be inlined in the caller's mode. +@item fpu= +@cindex @code{target("fpu=")} function attribute, ARM +Specifies the fpu for which to tune the performance of this function. +The behavior and permissible arguments are the same as for the @option{-mfpu=} +command-line option. + +@end table + @end table @node AVR Function Attributes diff -ruN gnu_trunk.p3/gcc/gcc/doc/invoke.texi gnu_trunk.p4/gcc/gcc/doc/invoke.texi --- gnu_trunk.p3/gcc/gcc/doc/invoke.texi 2015-09-10 12:21:00.698911244 +0200 +++ gnu_trunk.p4/gcc/gcc/doc/invoke.texi 2015-09-14 10:27:20.281932581 +0200 @@ -13360,6 +13363,8 @@ floating-point arithmetic (in particular denormal values are treated as zero), so the use of NEON instructions may lead to a loss of precision. +You can also set the fpu name at function level by using the @code{target("mfpu=")} function attributes (@pxref{ARM Function Attributes}) or pragmas (@pxref{Function Specific Option Pragmas}). + @item -mfp16-format=@var{name} @opindex mfp16-format Specify the format of the @code{__fp16} half-precision floating-point type. diff -ruN gnu_trunk.p3/gcc/gcc/testsuite/gcc.target/arm/attr-crypto.c gnu_trunk.p4/gcc/gcc/testsuite/gcc.target/arm/attr-crypto.c --- gnu_trunk.p3/gcc/gcc/testsuite/gcc.target/arm/attr-crypto.c 1970-01-01 01:00:00.000000000 +0100 +++ gnu_trunk.p4/gcc/gcc/testsuite/gcc.target/arm/attr-crypto.c 2015-09-14 15:58:24.967898634 +0200 @@ -0,0 +1,38 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_crypto_ok } */ + +#pragma GCC target ("fpu=crypto-neon-fp-armv8") + +#ifndef __ARM_FEATURE_CRYPTO +#error __ARM_FEATURE_CRYPTO not defined. +#endif + +#ifndef __ARM_NEON +#error __ARM_NEON not defined. +#endif + +#if !defined(__ARM_FP) || (__ARM_FP != 14) +#error __ARM_FP +#endif + +#include "arm_neon.h" + +int +foo (void) +{ + uint32x4_t a = {0xd, 0xe, 0xa, 0xd}; + uint32x4_t b = {0, 1, 2, 3}; + + uint32x4_t res = vsha256su0q_u32 (a, b); + return res[0]; +} + +#pragma GCC reset_options + +/* Check that the FP version is correctly reset. */ + +#if !defined(__ARM_FP) || (__ARM_FP != 12) +#error __ARM_FP +#endif + +/* { dg-final { scan-assembler "sha256su0.32\tq\[0-9\]+, q\[0-9\]+" } } */ diff -ruN gnu_trunk.p3/gcc/gcc/testsuite/gcc.target/arm/attr-neon2.c gnu_trunk.p4/gcc/gcc/testsuite/gcc.target/arm/attr-neon2.c --- gnu_trunk.p3/gcc/gcc/testsuite/gcc.target/arm/attr-neon2.c 1970-01-01 01:00:00.000000000 +0100 +++ gnu_trunk.p4/gcc/gcc/testsuite/gcc.target/arm/attr-neon2.c 2015-09-14 15:58:24.967898634 +0200 @@ -0,0 +1,28 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_neon_ok } */ +/* { dg-options "-O2 -mfloat-abi=softfp -mfpu=vfp" } */ + +#pragma GCC target ("fpu=neon") +#include + +/* Check that pragma target is used. */ +int8x8_t +my (int8x8_t __a, int8x8_t __b) +{ + return __a + __b; +} + +#pragma GCC reset_options + +/* Check that command line option is restored. */ +int8x8_t +my1 (int8x8_t __a, int8x8_t __b) +{ + return __a + __b; +} + +/* { dg-final { scan-assembler-times "\.fpu vfp" 1 } } */ +/* { dg-final { scan-assembler-times "\.fpu neon" 1 } } */ +/* { dg-final { scan-assembler "vadd" } } */ + + diff -ruN gnu_trunk.p3/gcc/gcc/testsuite/gcc.target/arm/attr-neon-builtin-fail.c gnu_trunk.p4/gcc/gcc/testsuite/gcc.target/arm/attr-neon-builtin-fail.c --- gnu_trunk.p3/gcc/gcc/testsuite/gcc.target/arm/attr-neon-builtin-fail.c 1970-01-01 01:00:00.000000000 +0100 +++ gnu_trunk.p4/gcc/gcc/testsuite/gcc.target/arm/attr-neon-builtin-fail.c 2015-09-14 15:58:24.967898634 +0200 @@ -0,0 +1,18 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_neon_ok } */ +/* { dg-options "-O2 -mfloat-abi=softfp -mfpu=neon" } */ + +#include + +void __attribute__ ((target ("fpu=vfp"))) +foo (uint8x16_t *p) +{ + *p = vmovq_n_u8 (3); /* { dg-error "called from here" } */ + +} + + +/* { dg-error "inlining failed in call to always_inline" "" { target *-*-* } 0 } */ + + + diff -ruN gnu_trunk.p3/gcc/gcc/testsuite/gcc.target/arm/attr-neon.c gnu_trunk.p4/gcc/gcc/testsuite/gcc.target/arm/attr-neon.c --- gnu_trunk.p3/gcc/gcc/testsuite/gcc.target/arm/attr-neon.c 1970-01-01 01:00:00.000000000 +0100 +++ gnu_trunk.p4/gcc/gcc/testsuite/gcc.target/arm/attr-neon.c 2015-09-14 16:12:08.449698268 +0200 @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_neon_ok } */ +/* { dg-options "-O3 -mfloat-abi=softfp -ftree-vectorize" } */ + +void +f3(int n, int x[], int y[]) { + int i; + for (i = 0; i < n; ++i) + y[i] = x[i] << 3; +} + +/* Verify that neon instructions are emitted once. */ +void __attribute__ ((target("fpu=neon"))) + f1(int n, int x[], int y[]) { + int i; + for (i = 0; i < n; ++i) + y[i] = x[i] << 3; +} + +/* { dg-final { scan-assembler-times "\.fpu vfp" 1 } } */ +/* { dg-final { scan-assembler-times "\.fpu neon" 1 } } */ +/* { dg-final { scan-assembler-times "vshl" 1 } } */ + + + + diff -ruN gnu_trunk.p3/gcc/gcc/testsuite/gcc.target/arm/lto/pr65837_0.c gnu_trunk.p4/gcc/gcc/testsuite/gcc.target/arm/lto/pr65837_0.c --- gnu_trunk.p3/gcc/gcc/testsuite/gcc.target/arm/lto/pr65837_0.c 1970-01-01 01:00:00.000000000 +0100 +++ gnu_trunk.p4/gcc/gcc/testsuite/gcc.target/arm/lto/pr65837_0.c 2015-09-14 15:58:13.899874587 +0200 @@ -0,0 +1,14 @@ +/* { dg-lto-do run } */ +/* { dg-lto-options {{-flto -mfpu=neon}} } */ +/* { dg-suppress-ld-options {-mfpu=neon} } */ + +#include "arm_neon.h" + +float32x2_t a, b, c, e; + +int main() +{ + e = __builtin_neon_vmls_lanev2sf (a, b, c, 0); + return 0; +} +