Message ID | 58638CBC.9000303@arm.com |
---|---|
State | New |
Headers | show |
On 28/12/16 09:58, Andre Vieira (lists) wrote: > On 29/11/16 09:45, Andre Vieira (lists) wrote: >> On 17/11/16 10:00, Ramana Radhakrishnan wrote: >>> On Thu, Oct 6, 2016 at 2:57 PM, Andre Vieira (lists) >>> <Andre.SimoesDiasVieira@arm.com> wrote: >>>> Hello, >>>> >>>> This patch tackles the issue reported in PR71607. This patch takes a >>>> different approach for disabling the creation of literal pools. Instead >>>> of disabling the patterns that would normally transform the rtl into >>>> actual literal pools, it disables the creation of this literal pool rtl >>>> by making the target hook TARGET_CANNOT_FORCE_CONST_MEM return true if >>>> arm_disable_literal_pool is true. I added patterns to split floating >>>> point constants for both SF and DFmode. A pattern to handle the >>>> addressing of label_refs had to be included as well since all >>>> "memory_operand" patterns are disabled when >>>> TARGET_CANNOT_FORCE_CONST_MEM returns true. Also the pattern for >>>> splitting 32-bit immediates had to be changed, it was not accepting >>>> unsigned 32-bit unsigned integers with the MSB set. I believe >>>> const_int_operand expects the mode of the operand to be set to VOIDmode >>>> and not SImode. I have only changed it in the patterns that were >>>> affecting this code, though I suggest looking into changing it in the >>>> rest of the ARM backend. >>>> >>>> I added more test cases. No regressions for arm-none-eabi with >>>> Cortex-M0, Cortex-M3 and Cortex-M7. >>>> >>>> Is this OK for trunk? >>> >>> Including -mslow-flash-data in your multilib flags ? If no regressions >>> with that ok . >>> >>> >>> regards >>> Ramana >>> >>>> >> >> Hello, >> >> I found some new ICE's with the -mslow-flash-data testing so I had to >> rework this patch. I took the opportunity to rebase it as well. >> >> The problem was with the way the old version of the patch handled label >> references. After some digging I found I wasn't using the right target >> hook and so I implemented the 'TARGET_USE_BLOCKS_FOR_CONSTANT_P' for >> ARM. This target hook determines whether a literal pool ends up in an >> 'object_block' structure. So I reverted the changes made in the old >> version of the patch to the ARM implementation of the >> 'TARGET_CANNOT_FORCE_CONST_MEM' hook and rely on >> 'TARGET_USE_BLOCKS_FOR_CONSTANT_P' instead. This patch adds an ARM >> implementation for this hook that returns false if >> 'arm_disable_literal_pool' is set to true and true otherwise. >> >> This version of the patch also reverts back to using the check for >> 'SYMBOL_REF' in 'thumb2_legitimate_address_p' that was removed in the >> last version, this code is required to place the label references in >> rodata sections. >> >> Another thing this patch does is revert the changes made to the 32-bit >> constant split in arm.md. The reason this was needed before was because >> 'real_to_target' returns a long array and does not sign-extend values in >> it, which would make sense on hosts with 64-bit longs. To fix this the >> value is now casted to 'int' first. It would probably be a good idea to >> change the 'real_to_target' function to return an array with >> 'HOST_WIDE_INT' elements instead and either use all 64-bits or >> sign-extend them. Something for the future? >> >> I added more test cases in this patch and reran regression tests for: >> Cortex-M0, Cortex-M4 with and without -mslow-flash-data. Also did a >> bootstrap+regressions on arm-none-linux-gnueabihf. >> >> Is this OK for trunk? >> >> Cheers, >> Andre >> >> gcc/ChangeLog: >> >> 2016-11-29 Andre Vieira <andre.simoesdiasvieira@arm.com> >> >> PR target/71607 >> * config/arm/arm.md (use_literal_pool): Removes. >> (64-bit immediate split): No longer takes cost into consideration >> if 'arm_disable_literal_pool' is enabled. >> * config/arm/arm.c (arm_use_blocks_for_constant_p): New. >> (TARGET_USE_BLOCKS_FOR_CONSTANT_P): Define. >> (arm_max_const_double_inline_cost): Remove use of >> arm_disable_literal_pool. >> * config/arm/vfp.md (no_literal_pool_df_immediate): New. >> (no_literal_pool_sf_immediate): New. >> >> >> gcc/testsuite/ChangeLog: >> >> 2016-11-29 Andre Vieira <andre.simoesdiasvieira@arm.com> >> Thomas Preud'homme <thomas.preudhomme@arm.com> >> >> PR target/71607 >> * gcc.target/arm/thumb2-slow-flash-data.c: Renamed to ... >> * gcc.target/arm/thumb2-slow-flash-data-1.c: ... this. >> * gcc.target/arm/thumb2-slow-flash-data-2.c: New. >> * gcc.target/arm/thumb2-slow-flash-data-3.c: New. >> * gcc.target/arm/thumb2-slow-flash-data-4.c: New. >> * gcc.target/arm/thumb2-slow-flash-data-5.c: New. >> > Hello, > > As I have mentioned in my commit message for the fix on embedded-6 (see > https://gcc.gnu.org/ml/gcc-patches/2016-12/msg01304.html) I found an > issue with this code when testing its backport on the embedded-6-branch. > > I misread the definition of the target hook > TARGET_USE_BLOCKS_FOR_CONSTANT_P and it seems the way I implemented it > before only changed code generation if the -mslow-flash-data option > wasn't passed. I.e. I don't need to implement it to solve this issue. > The other changes in this patch are sufficient... > > I reran regressions for arm-none-eabi-gcc with the following targets: > Cortex-M0, Cortex-M3, Cortex-M7 and ARMv8-M Baseline. I also ran > bootstraps and full regressions for arm-none-linux-gnueabihf both in ARM > and THUMB mode. All green! > > Is this OK for trunk? > > Cheers, > Andre > > gcc/ChangeLog: > > 2016-12-28 Andre Vieira <andre.simoesdiasvieira@arm.com> > > PR target/71607 > * config/arm/arm.md (use_literal_pool): Removes. > (64-bit immediate split): No longer takes cost into consideration > if 'arm_disable_literal_pool' is enabled. > * config/arm/arm.c (arm_max_const_double_inline_cost): Remove use > of arm_disable_literal_pool. > * config/arm/vfp.md (no_literal_pool_df_immediate): New. > (no_literal_pool_sf_immediate): New. > > > gcc/testsuite/ChangeLog: > > 2016-12-28 Andre Vieira <andre.simoesdiasvieira@arm.com> > Thomas Preud'homme <thomas.preudhomme@arm.com> > > PR target/71607 > * gcc.target/arm/thumb2-slow-flash-data.c: Renamed to ... > * gcc.target/arm/thumb2-slow-flash-data-1.c: ... this. > * gcc.target/arm/thumb2-slow-flash-data-2.c: New. > * gcc.target/arm/thumb2-slow-flash-data-3.c: New. > * gcc.target/arm/thumb2-slow-flash-data-4.c: New. > * gcc.target/arm/thumb2-slow-flash-data-5.c: New. > Ping. Cheers, Andre
On Wed, Dec 28, 2016 at 9:58 AM, Andre Vieira (lists) <Andre.SimoesDiasVieira@arm.com> wrote: > On 29/11/16 09:45, Andre Vieira (lists) wrote: >> On 17/11/16 10:00, Ramana Radhakrishnan wrote: >>> On Thu, Oct 6, 2016 at 2:57 PM, Andre Vieira (lists) >>> <Andre.SimoesDiasVieira@arm.com> wrote: >>>> Hello, >>>> >>>> This patch tackles the issue reported in PR71607. This patch takes a >>>> different approach for disabling the creation of literal pools. Instead >>>> of disabling the patterns that would normally transform the rtl into >>>> actual literal pools, it disables the creation of this literal pool rtl >>>> by making the target hook TARGET_CANNOT_FORCE_CONST_MEM return true if >>>> arm_disable_literal_pool is true. I added patterns to split floating >>>> point constants for both SF and DFmode. A pattern to handle the >>>> addressing of label_refs had to be included as well since all >>>> "memory_operand" patterns are disabled when >>>> TARGET_CANNOT_FORCE_CONST_MEM returns true. Also the pattern for >>>> splitting 32-bit immediates had to be changed, it was not accepting >>>> unsigned 32-bit unsigned integers with the MSB set. I believe >>>> const_int_operand expects the mode of the operand to be set to VOIDmode >>>> and not SImode. I have only changed it in the patterns that were >>>> affecting this code, though I suggest looking into changing it in the >>>> rest of the ARM backend. >>>> >>>> I added more test cases. No regressions for arm-none-eabi with >>>> Cortex-M0, Cortex-M3 and Cortex-M7. >>>> >>>> Is this OK for trunk? >>> >>> Including -mslow-flash-data in your multilib flags ? If no regressions >>> with that ok . >>> >>> >>> regards >>> Ramana >>> >>>> >> >> Hello, >> >> I found some new ICE's with the -mslow-flash-data testing so I had to >> rework this patch. I took the opportunity to rebase it as well. >> >> The problem was with the way the old version of the patch handled label >> references. After some digging I found I wasn't using the right target >> hook and so I implemented the 'TARGET_USE_BLOCKS_FOR_CONSTANT_P' for >> ARM. This target hook determines whether a literal pool ends up in an >> 'object_block' structure. So I reverted the changes made in the old >> version of the patch to the ARM implementation of the >> 'TARGET_CANNOT_FORCE_CONST_MEM' hook and rely on >> 'TARGET_USE_BLOCKS_FOR_CONSTANT_P' instead. This patch adds an ARM >> implementation for this hook that returns false if >> 'arm_disable_literal_pool' is set to true and true otherwise. >> >> This version of the patch also reverts back to using the check for >> 'SYMBOL_REF' in 'thumb2_legitimate_address_p' that was removed in the >> last version, this code is required to place the label references in >> rodata sections. >> >> Another thing this patch does is revert the changes made to the 32-bit >> constant split in arm.md. The reason this was needed before was because >> 'real_to_target' returns a long array and does not sign-extend values in >> it, which would make sense on hosts with 64-bit longs. To fix this the >> value is now casted to 'int' first. It would probably be a good idea to >> change the 'real_to_target' function to return an array with >> 'HOST_WIDE_INT' elements instead and either use all 64-bits or >> sign-extend them. Something for the future? >> >> I added more test cases in this patch and reran regression tests for: >> Cortex-M0, Cortex-M4 with and without -mslow-flash-data. Also did a >> bootstrap+regressions on arm-none-linux-gnueabihf. >> >> Is this OK for trunk? >> >> Cheers, >> Andre >> >> gcc/ChangeLog: >> >> 2016-11-29 Andre Vieira <andre.simoesdiasvieira@arm.com> >> >> PR target/71607 >> * config/arm/arm.md (use_literal_pool): Removes. >> (64-bit immediate split): No longer takes cost into consideration >> if 'arm_disable_literal_pool' is enabled. >> * config/arm/arm.c (arm_use_blocks_for_constant_p): New. >> (TARGET_USE_BLOCKS_FOR_CONSTANT_P): Define. >> (arm_max_const_double_inline_cost): Remove use of >> arm_disable_literal_pool. >> * config/arm/vfp.md (no_literal_pool_df_immediate): New. >> (no_literal_pool_sf_immediate): New. >> >> >> gcc/testsuite/ChangeLog: >> >> 2016-11-29 Andre Vieira <andre.simoesdiasvieira@arm.com> >> Thomas Preud'homme <thomas.preudhomme@arm.com> >> >> PR target/71607 >> * gcc.target/arm/thumb2-slow-flash-data.c: Renamed to ... >> * gcc.target/arm/thumb2-slow-flash-data-1.c: ... this. >> * gcc.target/arm/thumb2-slow-flash-data-2.c: New. >> * gcc.target/arm/thumb2-slow-flash-data-3.c: New. >> * gcc.target/arm/thumb2-slow-flash-data-4.c: New. >> * gcc.target/arm/thumb2-slow-flash-data-5.c: New. >> > Hello, > > As I have mentioned in my commit message for the fix on embedded-6 (see > https://gcc.gnu.org/ml/gcc-patches/2016-12/msg01304.html) I found an > issue with this code when testing its backport on the embedded-6-branch. > > I misread the definition of the target hook > TARGET_USE_BLOCKS_FOR_CONSTANT_P and it seems the way I implemented it > before only changed code generation if the -mslow-flash-data option > wasn't passed. I.e. I don't need to implement it to solve this issue. > The other changes in this patch are sufficient... > > I reran regressions for arm-none-eabi-gcc with the following targets: > Cortex-M0, Cortex-M3, Cortex-M7 and ARMv8-M Baseline. I also ran > bootstraps and full regressions for arm-none-linux-gnueabihf both in ARM > and THUMB mode. All green! And presumably with -mslow-flash-data as a multilib again ? > > Is this OK for trunk? We should also be adding to arm_reorg if (arm_disable_literal_pool) return ; after we split all the insns. After all if the splitters added had done their job, there's no reason to run the minipool placement code at all - is there :) ? Furthermore is it guaranteed that we will not have any references to the literal pool post reload - IIRC , reload in its older days could easily just push any old constant into a standard constant pool entry and make things work . Is LRA likely not to produce any constant pool entries in this form ? The bad thing here would be producing literal pool entries when the user had asked the compiler to do so - thus having more belts and braces for this would be good (thus , why is a change to arm_cannot_force_const_mem to return false in the face of arm_disable_literal_pool not appropriate ? ) Nick also had a couple of good points on making the test more usable across all testers in private conversation - so please make those changes. regards Ramana > > Cheers, > Andre > > gcc/ChangeLog: > > 2016-12-28 Andre Vieira <andre.simoesdiasvieira@arm.com> > > PR target/71607 > * config/arm/arm.md (use_literal_pool): Removes. > (64-bit immediate split): No longer takes cost into consideration > if 'arm_disable_literal_pool' is enabled. > * config/arm/arm.c (arm_max_const_double_inline_cost): Remove use > of arm_disable_literal_pool. > * config/arm/vfp.md (no_literal_pool_df_immediate): New. > (no_literal_pool_sf_immediate): New. > > > gcc/testsuite/ChangeLog: > > 2016-12-28 Andre Vieira <andre.simoesdiasvieira@arm.com> > Thomas Preud'homme <thomas.preudhomme@arm.com> > > PR target/71607 > * gcc.target/arm/thumb2-slow-flash-data.c: Renamed to ... > * gcc.target/arm/thumb2-slow-flash-data-1.c: ... this. > * gcc.target/arm/thumb2-slow-flash-data-2.c: New. > * gcc.target/arm/thumb2-slow-flash-data-3.c: New. > * gcc.target/arm/thumb2-slow-flash-data-4.c: New. > * gcc.target/arm/thumb2-slow-flash-data-5.c: New.
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index bbf10f23987e58a1e066715e2168772da0245ff1..1f5c7b3e6108b8931e286f5b86de9c3aa1804d98 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -16373,10 +16373,6 @@ push_minipool_fix (rtx_insn *insn, HOST_WIDE_INT address, rtx *loc, int arm_max_const_double_inline_cost () { - /* Let the value get synthesized to avoid the use of literal pools. */ - if (arm_disable_literal_pool) - return 99; - return ((optimize_size || arm_ld_sched) ? 3 : 4); } @@ -30887,5 +30883,4 @@ arm_expand_divmod_libfunc (rtx libfunc, machine_mode mode, *quot_p = quotient; *rem_p = remainder; } - #include "gt-arm.h" diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md index ff1f565b850d1b9e7378461c5a808e7d486936bc..7d16e412765ccff4c527871b6b5d6142aa088b56 100644 --- a/gcc/config/arm/arm.md +++ b/gcc/config/arm/arm.md @@ -233,10 +233,6 @@ (match_test "arm_restrict_it")) (const_string "no") - (and (eq_attr "use_literal_pool" "yes") - (match_test "arm_disable_literal_pool")) - (const_string "no") - (eq_attr "arch_enabled" "no") (const_string "no")] (const_string "yes"))) @@ -5871,8 +5867,9 @@ (match_operand:ANY64 1 "immediate_operand" ""))] "TARGET_32BIT && reload_completed - && (arm_const_double_inline_cost (operands[1]) - <= arm_max_const_double_inline_cost ())" + && (arm_disable_literal_pool + || (arm_const_double_inline_cost (operands[1]) + <= arm_max_const_double_inline_cost ()))" [(const_int 0)] " arm_split_constant (SET, SImode, curr_insn, diff --git a/gcc/config/arm/vfp.md b/gcc/config/arm/vfp.md index f83dc9b130e4288faa49e40ee8285a0cd7d325b1..b8db34a607e4fac75789fa6a7c09b3b6104f45d3 100644 --- a/gcc/config/arm/vfp.md +++ b/gcc/config/arm/vfp.md @@ -2079,3 +2079,40 @@ ;; fmdhr et al (VFPv1) ;; Support for xD (single precision only) variants. ;; fmrrs, fmsrr + +;; Split an immediate DF move to two immediate SI moves. +(define_insn_and_split "no_literal_pool_df_immediate" + [(set (match_operand 0 "s_register_operand" "") + (match_operand:DF 1 "const_double_operand" ""))] + "TARGET_THUMB2 && arm_disable_literal_pool + && !(TARGET_HARD_FLOAT && TARGET_VFP_DOUBLE + && vfp3_const_double_rtx (operands[1]))" + "#" + "&& !reload_completed" + [(set (subreg:SI (match_dup 1) 0) (match_dup 2)) + (set (subreg:SI (match_dup 1) 4) (match_dup 3)) + (set (match_dup 0) (match_dup 1))] + " + long buf[2]; + real_to_target (buf, CONST_DOUBLE_REAL_VALUE (operands[1]), DFmode); + operands[2] = GEN_INT ((int) buf[0]); + operands[3] = GEN_INT ((int) buf[1]); + operands[1] = gen_reg_rtx (DFmode); + ") + +;; Split an immediate SF move to one immediate SI move. +(define_insn_and_split "no_literal_pool_sf_immediate" + [(set (match_operand 0 "s_register_operand" "") + (match_operand:SF 1 "const_double_operand" ""))] + "TARGET_THUMB2 && arm_disable_literal_pool + && !(TARGET_HARD_FLOAT && vfp3_const_double_rtx (operands[1]))" + "#" + "&& !reload_completed" + [(set (subreg:SI (match_dup 1) 0) (match_dup 2)) + (set (match_dup 0) (match_dup 1))] + " + long buf; + real_to_target (&buf, CONST_DOUBLE_REAL_VALUE (operands[1]), SFmode); + operands[2] = GEN_INT ((int) buf); + operands[1] = gen_reg_rtx (SFmode); + ") diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-2.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-2.c new file mode 100644 index 0000000000000000000000000000000000000000..c52de113809b63986a4621524d68686fc795f7ee --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-2.c @@ -0,0 +1,27 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_cortex_m } */ +/* { dg-require-effective-target arm_thumb2_ok } */ +/* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */ +/* { dg-options "-O2 -mthumb -mfloat-abi=hard -mslow-flash-data" } */ + +float f (float); + +const float max = 0.01f; + +int +g (float in) +{ + if (f (in) + f (in) < max) + return 0; + return 1; +} + +double foo (void) +{ + return 0xF1EC7A5239123AF; +} + +double bar (void) +{ + return 0.0f; +} diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-3.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-3.c new file mode 100644 index 0000000000000000000000000000000000000000..d25ba87413cb949e5e2162dbf4e695e205b01a34 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-3.c @@ -0,0 +1,24 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_cortex_m } */ +/* { dg-require-effective-target arm_thumb2_ok } */ +/* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */ +/* { dg-options "-mthumb -mfloat-abi=hard -mslow-flash-data" } */ + +/* From PR71607 */ + +float b; +void fn1 (); + +float +fn2 () +{ + return 1.1f; +} + +void +fn3 () +{ + float a[2]; + a[1] = b; + fn1 (a); +} diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-4.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-4.c new file mode 100644 index 0000000000000000000000000000000000000000..dbe129168d7dbc86232587ad5ba5ec0438ed2622 --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-4.c @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_cortex_m } */ +/* { dg-require-effective-target arm_thumb2_ok } */ +/* { dg-require-effective-target arm_vfp_ok } */ +/* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */ +/* { dg-options "-O2 -mthumb -mfloat-abi=hard -mslow-flash-data" } */ + +double __attribute__ ((target ("fpu=fpv5-d16"))) +foo (void) +{ + return 1.0f; +} + +float __attribute__ ((target ("fpu=fpv5-d16"))) +bar (void) +{ + return 1.0f; +} + +float __attribute__ ((target ("fpu=fpv5-sp-d16"))) +baz (void) +{ + return 1.0f; +} + +/* { dg-final { scan-assembler-times "#1\\.0e\\+0" 3 } } */ diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-5.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-5.c new file mode 100644 index 0000000000000000000000000000000000000000..7d1b2384738c7f495be257a46e6587bf43b6534a --- /dev/null +++ b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-5.c @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target arm_cortex_m } */ +/* { dg-require-effective-target arm_vfp_ok } */ +/* { dg-require-effective-target arm_thumb2_ok } */ +/* { dg-skip-if "do not override -mfloat-abi" { *-*-* } { "-mfloat-abi=*" } { "-mfloat-abi=hard" } } */ +/* { dg-options "-O2 -mthumb -mfloat-abi=hard -mslow-flash-data" } */ + +double __attribute__ ((target ("fpu=fpv5-sp-d16"))) +foo (void) +{ + return 1.0f; +} + +/* { dg-final { scan-assembler-not "#1\\.0e\\+0" } } */ diff --git a/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data.c b/gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-1.c similarity index 100% rename from gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data.c rename to gcc/testsuite/gcc.target/arm/thumb2-slow-flash-data-1.c