From patchwork Tue Nov 29 09:45:06 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Andre Vieira (lists)" X-Patchwork-Id: 700370 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 3tSdvy0zHsz9t2C for ; Tue, 29 Nov 2016 20:45:49 +1100 (AEDT) Authentication-Results: ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b="TklGK62O"; dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :subject:to:references:cc:from:message-id:date:mime-version :in-reply-to:content-type; q=dns; s=default; b=deO681f+3ihy4IAyN DEpoZx7EXmip1CyhyIPJ13afyxTjcnLrx7ifRRdfST4TOQoyXY725xO2qEHCcq6B KOnzwUXLLColCbyyJftUzLN2/CM9cTjt/neX6OLchkuA2fVW7nlmviL/gHI9RAqp DEaZe+BC+rcA1ulUfi6x09wMl0= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :subject:to:references:cc:from:message-id:date:mime-version :in-reply-to:content-type; s=default; bh=XJYMBTTpHtjsdTY1OqTH5V3 YTpo=; b=TklGK62OmVPm7Xsp1Rmh3JeyJx82c0wRjv8wr+9SRQl97Lf3fWByhOj zXQ9kITyT87BGk2JYXHlNsSp1UZM/aruLUI1bzeqeNKsEjElNWtUpQoRqe6xkrqz PcsEwNmkqI45JeoRYy5ekp1G76XTvJYmAYlKMmDiWUy91INLcV1o= Received: (qmail 29786 invoked by alias); 29 Nov 2016 09:45:38 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 29680 invoked by uid 89); 29 Nov 2016 09:45:31 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-4.5 required=5.0 tests=BAYES_00, KAM_LOTSOFHASH, RP_MATCHES_RCVD, SPF_PASS autolearn=ham version=3.3.2 spammy=accepting, match_operand, simode, SImode X-HELO: foss.arm.com Received: from foss.arm.com (HELO foss.arm.com) (217.140.101.70) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Tue, 29 Nov 2016 09:45:17 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 284B416; Tue, 29 Nov 2016 01:45:09 -0800 (PST) Received: from [10.2.206.251] (e107157-lin.cambridge.arm.com [10.2.206.251]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 63F703F445; Tue, 29 Nov 2016 01:45:08 -0800 (PST) Subject: Re: [PATCH, ARM] PR71607: New approach to arm_disable_literal_pool To: Ramana Radhakrishnan References: <57F6583E.4030002@arm.com> Cc: GCC Patches , Ramana Radhakrishnan , Kyrill Tkachov From: "Andre Vieira (lists)" Message-ID: <583D4E22.5050400@arm.com> Date: Tue, 29 Nov 2016 09:45:06 +0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.2.0 MIME-Version: 1.0 In-Reply-To: X-IsSubscribed: yes On 17/11/16 10:00, Ramana Radhakrishnan wrote: > On Thu, Oct 6, 2016 at 2:57 PM, Andre Vieira (lists) > wrote: >> Hello, >> >> This patch tackles the issue reported in PR71607. This patch takes a >> different approach for disabling the creation of literal pools. Instead >> of disabling the patterns that would normally transform the rtl into >> actual literal pools, it disables the creation of this literal pool rtl >> by making the target hook TARGET_CANNOT_FORCE_CONST_MEM return true if >> arm_disable_literal_pool is true. I added patterns to split floating >> point constants for both SF and DFmode. A pattern to handle the >> addressing of label_refs had to be included as well since all >> "memory_operand" patterns are disabled when >> TARGET_CANNOT_FORCE_CONST_MEM returns true. Also the pattern for >> splitting 32-bit immediates had to be changed, it was not accepting >> unsigned 32-bit unsigned integers with the MSB set. I believe >> const_int_operand expects the mode of the operand to be set to VOIDmode >> and not SImode. I have only changed it in the patterns that were >> affecting this code, though I suggest looking into changing it in the >> rest of the ARM backend. >> >> I added more test cases. No regressions for arm-none-eabi with >> Cortex-M0, Cortex-M3 and Cortex-M7. >> >> Is this OK for trunk? > > Including -mslow-flash-data in your multilib flags ? If no regressions > with that ok . > > > regards > Ramana > >> Hello, I found some new ICE's with the -mslow-flash-data testing so I had to rework this patch. I took the opportunity to rebase it as well. The problem was with the way the old version of the patch handled label references. After some digging I found I wasn't using the right target hook and so I implemented the 'TARGET_USE_BLOCKS_FOR_CONSTANT_P' for ARM. This target hook determines whether a literal pool ends up in an 'object_block' structure. So I reverted the changes made in the old version of the patch to the ARM implementation of the 'TARGET_CANNOT_FORCE_CONST_MEM' hook and rely on 'TARGET_USE_BLOCKS_FOR_CONSTANT_P' instead. This patch adds an ARM implementation for this hook that returns false if 'arm_disable_literal_pool' is set to true and true otherwise. This version of the patch also reverts back to using the check for 'SYMBOL_REF' in 'thumb2_legitimate_address_p' that was removed in the last version, this code is required to place the label references in rodata sections. Another thing this patch does is revert the changes made to the 32-bit constant split in arm.md. The reason this was needed before was because 'real_to_target' returns a long array and does not sign-extend values in it, which would make sense on hosts with 64-bit longs. To fix this the value is now casted to 'int' first. It would probably be a good idea to change the 'real_to_target' function to return an array with 'HOST_WIDE_INT' elements instead and either use all 64-bits or sign-extend them. Something for the future? I added more test cases in this patch and reran regression tests for: Cortex-M0, Cortex-M4 with and without -mslow-flash-data. Also did a bootstrap+regressions on arm-none-linux-gnueabihf. Is this OK for trunk? Cheers, Andre gcc/ChangeLog: 2016-11-29 Andre Vieira PR target/71607 * config/arm/arm.md (use_literal_pool): Removes. (64-bit immediate split): No longer takes cost into consideration if 'arm_disable_literal_pool' is enabled. * config/arm/arm.c (arm_use_blocks_for_constant_p): New. (TARGET_USE_BLOCKS_FOR_CONSTANT_P): Define. (arm_max_const_double_inline_cost): Remove use of arm_disable_literal_pool. * config/arm/vfp.md (no_literal_pool_df_immediate): New. (no_literal_pool_sf_immediate): New. gcc/testsuite/ChangeLog: 2016-11-29 Andre Vieira Thomas Preud'homme PR target/71607 * gcc.target/arm/thumb2-slow-flash-data.c: Renamed to ... * gcc.target/arm/thumb2-slow-flash-data-1.c: ... this. * gcc.target/arm/thumb2-slow-flash-data-2.c: New. * gcc.target/arm/thumb2-slow-flash-data-3.c: New. * gcc.target/arm/thumb2-slow-flash-data-4.c: New. * gcc.target/arm/thumb2-slow-flash-data-5.c: New. diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index abd3276f13125e87fe7a88b60f0bf98cd580e7fb..1fcf57ccd9bda6c477db7a98084fd6f0e359de21 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -300,6 +300,8 @@ static bool arm_asm_elf_flags_numeric (unsigned int flags, unsigned int *num); static unsigned int arm_elf_section_type_flags (tree decl, const char *name, int reloc); static void arm_expand_divmod_libfunc (rtx, machine_mode, rtx, rtx, rtx *, rtx *); +static bool arm_use_blocks_for_constant_p (machine_mode var, const_rtx x); + /* Table of machine attributes. */ static const struct attribute_spec arm_attribute_table[] = @@ -738,6 +740,9 @@ static const struct attribute_spec arm_attribute_table[] = #undef TARGET_EXPAND_DIVMOD_LIBFUNC #define TARGET_EXPAND_DIVMOD_LIBFUNC arm_expand_divmod_libfunc +#undef TARGET_USE_BLOCKS_FOR_CONSTANT_P +#define TARGET_USE_BLOCKS_FOR_CONSTANT_P arm_use_blocks_for_constant_p + struct gcc_target targetm = TARGET_INITIALIZER; /* Obstack for minipool constant handling. */ @@ -15983,10 +15988,6 @@ push_minipool_fix (rtx_insn *insn, HOST_WIDE_INT address, rtx *loc, int arm_max_const_double_inline_cost () { - /* Let the value get synthesized to avoid the use of literal pools. */ - if (arm_disable_literal_pool) - return 99; - return ((optimize_size || arm_ld_sched) ? 3 : 4); } @@ -29668,4 +29669,14 @@ arm_expand_divmod_libfunc (rtx libfunc, machine_mode mode, *rem_p = remainder; } +/* Implements the TARGET_USE_BLOCKS_FOR_CONSTANT_P hook. + + If we have disabled the generation of constants inside a literal pool, then + this function returns false. Otherwise, return true. */ + +static bool +arm_use_blocks_for_constant_p (machine_mode /* var */, const_rtx /* x */) +{ + return !arm_disable_literal_pool; +} #include "gt-arm.h" diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index ccae728bd16e403cbcada323463b473ad0397b67..10b79687b1690f7ecb168272e57707c9143aad8f 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -233,10 +233,6 @@ (match_test "arm_restrict_it")) (const_string "no") - (and (eq_attr "use_literal_pool" "yes") - (match_test "arm_disable_literal_pool")) - (const_string "no") - (eq_attr "arch_enabled" "no") (const_string "no")] (const_string "yes"))) @@ -5846,8 +5842,9 @@ (match_operand:ANY64 1 "immediate_operand" ""))] "TARGET_32BIT && reload_completed - && (arm_const_double_inline_cost (operands[1]) - <= arm_max_const_double_inline_cost ())" + && (arm_disable_literal_pool + || (arm_const_double_inline_cost (operands[1]) + <= arm_max_const_double_inline_cost ()))" [(const_int 0)] " arm_split_constant (SET, SImode, curr_insn, diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md index ce56e160c043264fd41077e075896760644cd9a5..396e5a27fe61c95dc188fc8eae7434481746304e 100644 --- a/gcc/config/arm/vfp.md +++ b/gcc/config/arm/vfp.md @@ -2059,3 +2059,40 @@ ;; fmdhr et al (VFPv1) ;; Support for xD (single precision only) variants. ;; fmrrs, fmsrr + +;; Split an immediate DF move to two immediate SI moves. +(define_insn_and_split "no_literal_pool_df_immediate" + [(set (match_operand 0 "s_register_operand" "") + (match_operand:DF 1 "const_double_operand" ""))] + "TARGET_THUMB2 && arm_disable_literal_pool + && !(TARGET_HARD_FLOAT && TARGET_VFP_DOUBLE + && vfp3_const_double_rtx (operands[1]))" + "#" + "&& !reload_completed" + [(set (subreg:SI (match_dup 1) 0) (match_dup 2)) + (set (subreg:SI (match_dup 1) 4) (match_dup 3)) + (set (match_dup 0) (match_dup 1))] + " + long buf[2]; + real_to_target (buf, CONST_DOUBLE_REAL_VALUE (operands[1]), DFmode); + operands[2] = GEN_INT ((int) buf[0]); + operands[3] = GEN_INT ((int) buf[1]); + operands[1] = gen_reg_rtx (DFmode); + ") + +;; Split an immediate SF move to one immediate SI move. +(define_insn_and_split "no_literal_pool_sf_immediate" + [(set (match_operand 0 "s_register_operand" "") + (match_operand:SF 1 "const_double_operand" ""))] + "TARGET_THUMB2 && arm_disable_literal_pool + && !(TARGET_HARD_FLOAT && vfp3_const_double_rtx (operands[1]))" + "#" + "&& !reload_completed" + [(set (subreg:SI (match_dup 1) 0) (match_dup 2)) + (set (match_dup 0) (match_dup 1))] + " + long buf; + real_to_target (&buf, CONST_DOUBLE_REAL_VALUE (operands[1]), SFmode); + operands[2] = GEN_INT ((int) buf); + operands[1] = gen_reg_rtx (SFmode); + ") diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-2.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-2.c new file mode 100644 index 0000000000000000000000000000000000000000..c52de113809b63986a4621524d68686fc795f7ee --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-2.c @@ -0,0 +1,27 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_cortex_m } */ +/* { dg-require-effective-target arm_thumb2_ok } */ +/* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */ +/* { dg-options "-O2 -mthumb -mfloat-abi=hard -mslow-flash-data" } */ + +float f (float); + +const float max = 0.01f; + +int +g (float in) +{ + if (f (in) + f (in) < max) + return 0; + return 1; +} + +double foo (void) +{ + return 0xF1EC7A5239123AF; +} + +double bar (void) +{ + return 0.0f; +} diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-3.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-3.c new file mode 100644 index 0000000000000000000000000000000000000000..d25ba87413cb949e5e2162dbf4e695e205b01a34 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-3.c @@ -0,0 +1,24 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_cortex_m } */ +/* { dg-require-effective-target arm_thumb2_ok } */ +/* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */ +/* { dg-options "-mthumb -mfloat-abi=hard -mslow-flash-data" } */ + +/* From PR71607 */ + +float b; +void fn1 (); + +float +fn2 () +{ + return 1.1f; +} + +void +fn3 () +{ + float a[2]; + a[1] = b; + fn1 (a); +} diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-4.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-4.c new file mode 100644 index 0000000000000000000000000000000000000000..dbe129168d7dbc86232587ad5ba5ec0438ed2622 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-4.c @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_cortex_m } */ +/* { dg-require-effective-target arm_thumb2_ok } */ +/* { dg-require-effective-target arm_vfp_ok } */ +/* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */ +/* { dg-options "-O2 -mthumb -mfloat-abi=hard -mslow-flash-data" } */ + +double __attribute__ ((target ("fpu=fpv5-d16"))) +foo (void) +{ + return 1.0f; +} + +float __attribute__ ((target ("fpu=fpv5-d16"))) +bar (void) +{ + return 1.0f; +} + +float __attribute__ ((target ("fpu=fpv5-sp-d16"))) +baz (void) +{ + return 1.0f; +} + +/* { dg-final { scan-assembler-times "#1\\.0e\\+0" 3 } } */ diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-5.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-5.c new file mode 100644 index 0000000000000000000000000000000000000000..7d1b2384738c7f495be257a46e6587bf43b6534a --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-5.c @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_cortex_m } */ +/* { dg-require-effective-target arm_vfp_ok } */ +/* { dg-require-effective-target arm_thumb2_ok } */ +/* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */ +/* { dg-options "-O2 -mthumb -mfloat-abi=hard -mslow-flash-data" } */ + +double __attribute__ ((target ("fpu=fpv5-sp-d16"))) +foo (void) +{ + return 1.0f; +} + +/* { dg-final { scan-assembler-not "#1\\.0e\\+0" } } */ diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-1.c similarity index 100% rename from gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data.c rename to gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-1.c