From patchwork Thu Feb 4 13:50:31 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Kyrill Tkachov X-Patchwork-Id: 578936 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 8F3CD1402DD for ; Fri, 5 Feb 2016 00:50:48 +1100 (AEDT) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b=c9dzbJFz; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :message-id:date:from:mime-version:to:cc:subject:content-type; q=dns; s=default; b=tOUORMXAO+LGCPu6LFAOT7qkn6Ot3e1gmmuXYQ3/gky 19chK2zqV37wUV3cfI4LX7SyvAfURNDE1afd/9NoGS3rL6dv2L5ER3obq30B2CDi L6IK6wCT41NkjHo0iHi1tNFooN1szZHfKfYnUcdTT/M64jGRMbNCbOlMTdfZhxrk = DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :message-id:date:from:mime-version:to:cc:subject:content-type; s=default; bh=iSkyTCxiDh+UQf8y8vbpOIizeHY=; b=c9dzbJFzPm1AcUX6s 43pnD6LABz50bR91faoMtBFSjMhb0ZP0N6DeB8XdB8P2M0zai/9W8gv18XjmgClR Gpe8lYiCLYGs00mnF602YmIsyDdqe4w3qFv/uyQ65sIHYqNbCw0aEnQ/OSxNIqGl VgCO6DLcPB4Yy3Vj9FJCGt8aCM= Received: (qmail 124187 invoked by alias); 4 Feb 2016 13:50:38 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 124162 invoked by uid 89); 4 Feb 2016 13:50:37 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.4 required=5.0 tests=BAYES_00, KAM_LAZY_DOMAIN_SECURITY, RP_MATCHES_RCVD autolearn=no version=3.3.2 spammy=H*u:31.2.0, H*UA:31.2.0, lse, 10, 6 X-HELO: foss.arm.com Received: from foss.arm.com (HELO foss.arm.com) (217.140.101.70) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Thu, 04 Feb 2016 13:50:35 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 590663A1; Thu, 4 Feb 2016 05:49:49 -0800 (PST) Received: from [10.2.206.200] (e100706-lin.cambridge.arm.com [10.2.206.200]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 29FBD3F21A; Thu, 4 Feb 2016 05:50:33 -0800 (PST) Message-ID: <56B35727.6030104@foss.arm.com> Date: Thu, 04 Feb 2016 13:50:31 +0000 From: Kyrill Tkachov User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.2.0 MIME-Version: 1.0 To: GCC Patches CC: Marcus Shawcroft , Richard Earnshaw , James Greenhalgh Subject: [PATCH][AArch64] Only update assembler .arch directive when necessary Hi all, As part of the target attributes and pragmas support for GCC 6 I changed the aarch64 port to emit a .arch assembly directive for each function that describes the architectural features used by that function. This is a change from GCC 5 behaviour where we output a single .arch directive at the beginning of the assembly file corresponding to architectural features given on the command line. We need that change to accommodate use cases where a user tags a single function with a target attribute that changes the arch features from those given on the command line and then proceeds to use an inline assembly instruction that needs those new features. We need to emit the new architecture state for that function as a .arch directive to prevent the assembler from rejecting the assembly. Unfortunately, the new behaviour has caused some headaches when building the Linux kernel. There, they have a header called lse.h that is used to access some armv8.1-a-specific inline assembly (using the "lse" architecture feature in particular) This header has the line: __asm__(".arch_extension lse"); to tell the assembler to accept LSE instructions. This header is included in various .c files so the .arch_extension appears at the beginning of the assembly file. But since we now emit a .arch directive for every function, we effectively reset the architectural state on each function, rendering that .arch_extension ineffective. GCC is arguably right in emitting the .arch directive on each function, but I'd like to make life easier for the kernel folk so this patch helps with that. With this patch we go back to emitting the .arch directive at the top of the assembly file, same in GCC 5. We will still emit per-function .arch directives but only if they differ from the last .arch directive that we emitted i.e. when the architecture features were changed due to a target attribute or pragma. For code that doesn't make use of target attributes or pragmas the behaviour of GCC 5 is thus preserved and we still get the correct .arch updates when needed. The TARGET_ASM_FILE_START hook implementation is brought back after I deleted it during the target attributes work last August and is used again to output the .arch directive at the top of the file. We keep track of the architecture and the architecture features we last output in the asm directive in a new static variable (yuck, but unavoidable IMO) and output the new .arch directive only if what we want to output is different from the previously output directive. This still allows the kernel to do naughty things with .arch_extension directives in header files, but this usecase is one where they really know what they're doing and don't want the assembler to get in their way. Bootstrapped and tested on aarch64-none-linux-gnu. With this patch I managed to build a recent allyesconfig Linux kernel where before the build would fail when assembling the LSE instructions. Ok for trunk? Thanks, Kyrill 2016-02-04 Kyrylo Tkachov * config/aarch64/aarch64.c (struct aarch64_output_asm_info): New struct definition. (aarch64_previous_asm_output): New variable. (aarch64_declare_function_name): Only output .arch assembler directive if it will be different from the previously output directive. (aarch64_start_file): New function. (TARGET_ASM_FILE_START): Define. 2016-02-04 Kyrylo Tkachov * gcc.target/aarch64/assembler_arch_1.c: Add -dA to dg-options. Delete unneeded -save-temps. * gcc.target/aarch64/assembler_arch_7.c: Likewise. * gcc.target/aarch64/target_attr_15.c: Scan assembly for .arch armv8-a\n. * gcc.target/aarch64/assembler_arch_1.c: New test. commit 2df0f24332e316b8d18d4571438f76726a0326e7 Author: Kyrylo Tkachov Date: Wed Jan 27 12:54:54 2016 +0000 [AArch64] Only update assembler .arch directive when necessary diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 5ca2ae8..0751440 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -11163,6 +11163,17 @@ aarch64_asm_preferred_eh_data_format (int code ATTRIBUTE_UNUSED, int global) return (global ? DW_EH_PE_indirect : 0) | DW_EH_PE_pcrel | type; } +struct aarch64_output_asm_info +{ + const struct processor *arch; + const struct processor *cpu; + unsigned long isa_flags; +}; + +/* A record of the last architecture state that we output in assembly + i.e. the .arch directive. */ +static struct aarch64_output_asm_info aarch64_previous_asm_output; + /* Implement ASM_DECLARE_FUNCTION_NAME. Output the ISA features used by the function fndecl. */ @@ -11183,25 +11194,60 @@ aarch64_declare_function_name (FILE *stream, const char* name, = aarch64_get_arch (targ_options->x_explicit_arch); unsigned long isa_flags = targ_options->x_aarch64_isa_flags; - std::string extension - = aarch64_get_extension_string_for_isa_flags (isa_flags); - asm_fprintf (asm_out_file, "\t.arch %s%s\n", - this_arch->name, extension.c_str ()); + /* Only update the assembler .arch state if we are changing to a + distinct architecture state. */ + if (this_arch != aarch64_previous_asm_output.arch + || isa_flags != aarch64_previous_asm_output.isa_flags) + { + std::string extension + = aarch64_get_extension_string_for_isa_flags (isa_flags); + asm_fprintf (asm_out_file, "\t.arch %s%s\n", + this_arch->name, extension.c_str ()); + aarch64_previous_asm_output.arch = this_arch; + aarch64_previous_asm_output.isa_flags = isa_flags; + } /* Print the cpu name we're tuning for in the comments, might be useful to readers of the generated asm. */ const struct processor *this_tune = aarch64_get_tune_cpu (targ_options->x_explicit_tune_core); - asm_fprintf (asm_out_file, "\t" ASM_COMMENT_START ".tune %s\n", - this_tune->name); + if (flag_debug_asm && this_tune != aarch64_previous_asm_output.cpu) + { + asm_fprintf (asm_out_file, "\t" ASM_COMMENT_START ".tune %s\n", + this_tune->name); + aarch64_previous_asm_output.cpu = this_tune; + } /* Don't forget the type directive for ELF. */ ASM_OUTPUT_TYPE_DIRECTIVE (stream, name, "function"); ASM_OUTPUT_LABEL (stream, name); } +/* Implements TARGET_ASM_FILE_START. Output the assembly header. */ + +static void +aarch64_start_file (void) +{ + struct cl_target_option *default_options + = TREE_TARGET_OPTION (target_option_default_node); + + const struct processor *default_arch + = aarch64_get_arch (default_options->x_explicit_arch); + unsigned long default_isa_flags = default_options->x_aarch64_isa_flags; + std::string extension + = aarch64_get_extension_string_for_isa_flags (default_isa_flags); + + asm_fprintf (asm_out_file, "\t.arch %s%s\n", default_arch->name, + extension.c_str ()); + aarch64_previous_asm_output.arch = default_arch; + aarch64_previous_asm_output.isa_flags = default_isa_flags; + aarch64_previous_asm_output.cpu = NULL; + + default_file_start (); +} + /* Emit load exclusive. */ static void @@ -13935,6 +13981,9 @@ aarch64_optab_supported_p (int op, machine_mode, machine_mode, #define TARGET_ASM_CAN_OUTPUT_MI_THUNK \ hook_bool_const_tree_hwi_hwi_const_tree_true +#undef TARGET_ASM_FILE_START +#define TARGET_ASM_FILE_START aarch64_start_file + #undef TARGET_ASM_OUTPUT_MI_THUNK #define TARGET_ASM_OUTPUT_MI_THUNK aarch64_output_mi_thunk diff --git a/gcc/testsuite/gcc.target/aarch64/assembler_arch_1.c b/gcc/testsuite/gcc.target/aarch64/assembler_arch_1.c new file mode 100644 index 0000000..901e50a --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/assembler_arch_1.c @@ -0,0 +1,20 @@ +/* { dg-do assemble } */ +/* { dg-options "-march=armv8-a" } */ + +/* Make sure that the function header in assembly doesn't override + user asm arch_extension directives. */ + +__asm__ (".arch_extension lse"); + +void +foo (int i, int *v) +{ + register int w0 asm ("w0") = i; + register int *x1 asm ("x1") = v; + + asm volatile ( + "\tstset %w[i], %[v]\n" + : [i] "+r" (w0), [v] "+Q" (v) + : "r" (x1) + : "x30"); +} diff --git a/gcc/testsuite/gcc.target/aarch64/target_attr_1.c b/gcc/testsuite/gcc.target/aarch64/target_attr_1.c index 852ce1e..0527d0c 100644 --- a/gcc/testsuite/gcc.target/aarch64/target_attr_1.c +++ b/gcc/testsuite/gcc.target/aarch64/target_attr_1.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -mcpu=thunderx -save-temps" } */ +/* { dg-options "-O2 -mcpu=thunderx -dA" } */ /* Test that cpu attribute overrides the command-line -mcpu. */ diff --git a/gcc/testsuite/gcc.target/aarch64/target_attr_15.c b/gcc/testsuite/gcc.target/aarch64/target_attr_15.c index 02091c6..f72bec8 100644 --- a/gcc/testsuite/gcc.target/aarch64/target_attr_15.c +++ b/gcc/testsuite/gcc.target/aarch64/target_attr_15.c @@ -10,6 +10,4 @@ foo (int a) return a + 1; } -/* { dg-final { scan-assembler-not "\\+fp" } } */ -/* { dg-final { scan-assembler-not "\\+crypto" } } */ -/* { dg-final { scan-assembler-not "\\+simd" } } */ +/* { dg-final { scan-assembler-times "\\.arch armv8-a\n" 1 } } */ diff --git a/gcc/testsuite/gcc.target/aarch64/target_attr_7.c b/gcc/testsuite/gcc.target/aarch64/target_attr_7.c index 32a8403..818d327 100644 --- a/gcc/testsuite/gcc.target/aarch64/target_attr_7.c +++ b/gcc/testsuite/gcc.target/aarch64/target_attr_7.c @@ -1,5 +1,5 @@ /* { dg-do compile } */ -/* { dg-options "-O2 -mcpu=thunderx -save-temps" } */ +/* { dg-options "-O2 -mcpu=thunderx -dA" } */ /* Make sure that #pragma overrides command line option and target attribute overrides the pragma. */