Message ID | 930934bd-0341-6607-3d7e-96c34cebbabc@redhat.com |
---|---|
State | New |
Headers | show |
Hi Vladimir, On 10 April 2017 at 17:05, Vladimir Makarov <vmakarov@redhat.com> wrote: > This is the second try to fix > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70478 > > The first try patch triggered a latent bug and broke one Fortran testcase > on x86-64. > > The patch was successfully bootstrapped on x86-64 and tested on x86-64, > ppc64, and aarch64. > > Committed as rev. 246808. > > This patch causes regression on arm*hf configurations: Executed from: gcc.target/arm/arm.exp gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-times ldrh\\tr[0-9]+ 2 gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-times strh\\tr[0-9]+ 2 gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-times vld1\\.16\\t{d[0-9]+\\[[0-9]+\\]}, \\[r[0-9]+\\] 2 gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-times vmov\\.f16\\tr[0-9]+, s[0-9]+ 4 gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-times vmov\\.f16\\ts[0-9]+, r[0-9]+ 4 gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-times vst1\\.16\\t{d[0-9]+\\[[0-9]+\\]}, \\[r[0-9]+\\] 2 See http://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/246809/report-build-info.html Is it just a matter of adjusting the testcases? (note that there is no regression when forcing either: -march=armv5t -mthumb/-march=armv8-a/-mfpu=crypto-neon-fp-armv8/-mfloat-abi=hard in the runtestflags. I would have to re--run the build/test manually to get the generated code, let me know if it's needed. Thanks, Christophe
On 04/11/2017 03:30 AM, Christophe Lyon wrote: > Hi Vladimir, > > On 10 April 2017 at 17:05, Vladimir Makarov <vmakarov@redhat.com> wrote: >> This is the second try to fix >> >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70478 >> >> The first try patch triggered a latent bug and broke one Fortran testcase >> on x86-64. >> >> The patch was successfully bootstrapped on x86-64 and tested on x86-64, >> ppc64, and aarch64. >> >> Committed as rev. 246808. >> >> > I would have to re--run the build/test manually to get the generated > code, let me know if it's needed. Yes, Christophe. It would be helpful. I've tried to reproduce it but I don't see the difference in the generated code.
On 04/11/2017 11:42 AM, Vladimir Makarov wrote: > > > Yes, Christophe. It would be helpful. I've tried to reproduce it but > I don't see the difference in the generated code. > Never mind. I've reproduced it. Thanks.
On 04/11/2017 03:30 AM, Christophe Lyon wrote: > Hi Vladimir, > > On 10 April 2017 at 17:05, Vladimir Makarov <vmakarov@redhat.com> wrote: >> This is the second try to fix >> >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70478 >> >> The first try patch triggered a latent bug and broke one Fortran testcase >> on x86-64. >> >> The patch was successfully bootstrapped on x86-64 and tested on x86-64, >> ppc64, and aarch64. >> >> Committed as rev. 246808. >> >> > This patch causes regression on arm*hf configurations: > Executed from: gcc.target/arm/arm.exp > gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-times ldrh\\tr[0-9]+ 2 > gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-times strh\\tr[0-9]+ 2 > gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-times > vld1\\.16\\t{d[0-9]+\\[[0-9]+\\]}, \\[r[0-9]+\\] 2 > gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-times > vmov\\.f16\\tr[0-9]+, s[0-9]+ 4 > gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-times > vmov\\.f16\\ts[0-9]+, r[0-9]+ 4 > gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-times > vst1\\.16\\t{d[0-9]+\\[[0-9]+\\]}, \\[r[0-9]+\\] 2 > > I've committed a patch which is supposed to fix the regression.
On 11 April 2017 at 21:43, Vladimir Makarov <vmakarov@redhat.com> wrote: > > > On 04/11/2017 03:30 AM, Christophe Lyon wrote: >> >> Hi Vladimir, >> >> On 10 April 2017 at 17:05, Vladimir Makarov <vmakarov@redhat.com> wrote: >>> >>> This is the second try to fix >>> >>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70478 >>> >>> The first try patch triggered a latent bug and broke one Fortran >>> testcase >>> on x86-64. >>> >>> The patch was successfully bootstrapped on x86-64 and tested on >>> x86-64, >>> ppc64, and aarch64. >>> >>> Committed as rev. 246808. >>> >>> >> This patch causes regression on arm*hf configurations: >> Executed from: gcc.target/arm/arm.exp >> gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-times >> ldrh\\tr[0-9]+ 2 >> gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-times >> strh\\tr[0-9]+ 2 >> gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-times >> vld1\\.16\\t{d[0-9]+\\[[0-9]+\\]}, \\[r[0-9]+\\] 2 >> gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-times >> vmov\\.f16\\tr[0-9]+, s[0-9]+ 4 >> gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-times >> vmov\\.f16\\ts[0-9]+, r[0-9]+ 4 >> gcc.target/arm/armv8_2-fp16-move-1.c scan-assembler-times >> vst1\\.16\\t{d[0-9]+\\[[0-9]+\\]}, \\[r[0-9]+\\] 2 >> >> > I've committed a patch which is supposed to fix the regression. > I confirm it's now OK. Thanks for the prompt fix! Christophe
Index: ChangeLog =================================================================== --- ChangeLog (revision 246807) +++ ChangeLog (working copy) @@ -1,3 +1,12 @@ +2017-04-10 Vladimir Makarov <vmakarov@redhat.com> + + PR rtl-optimization/70478 + * lra-constraints.c (curr_small_class_check): New. + (update_and_check_small_class_inputs): New. + (process_alt_operands): Update curr_small_class_check. Disfavor + alternative insn memory operands. Check available regs for small + class operands. + 2017-03-31 Matthew Fortune <matthew.fortune@imgtec.com> PR target/80057 Index: lra-constraints.c =================================================================== --- lra-constraints.c (revision 246789) +++ lra-constraints.c (working copy) @@ -1852,6 +1852,42 @@ prohibited_class_reg_set_mode_p (enum re (temp, ira_prohibited_class_mode_regs[rclass][mode])); } + +/* Used to check validity info about small class input operands. It + should be incremented at start of processing an insn + alternative. */ +static unsigned int curr_small_class_check = 0; + +/* Update number of used inputs of class OP_CLASS for operand NOP. + Return true if we have more such class operands than the number of + available regs. */ +static bool +update_and_check_small_class_inputs (int nop, enum reg_class op_class) +{ + static unsigned int small_class_check[LIM_REG_CLASSES]; + static int small_class_input_nums[LIM_REG_CLASSES]; + + if (SMALL_REGISTER_CLASS_P (op_class) + /* We are interesting in classes became small because of fixing + some hard regs, e.g. by an user through GCC options. */ + && hard_reg_set_intersect_p (reg_class_contents[op_class], + ira_no_alloc_regs) + && (curr_static_id->operand[nop].type != OP_OUT + || curr_static_id->operand[nop].early_clobber)) + { + if (small_class_check[op_class] == curr_small_class_check) + small_class_input_nums[op_class]++; + else + { + small_class_check[op_class] = curr_small_class_check; + small_class_input_nums[op_class] = 1; + } + if (small_class_input_nums[op_class] > ira_class_hard_regs_num[op_class]) + return true; + } + return false; +} + /* Major function to choose the current insn alternative and what operands should be reloaded and how. If ONLY_ALTERNATIVE is not negative we should consider only this alternative. Return false if @@ -1952,6 +1988,7 @@ process_alt_operands (int only_alternati if (!TEST_BIT (preferred, nalt)) continue; + curr_small_class_check++; overall = losers = addr_losers = 0; static_reject = reject = reload_nregs = reload_sum = 0; for (nop = 0; nop < n_operands; nop++) @@ -2685,6 +2722,21 @@ process_alt_operands (int only_alternati } } + /* When we use memory operand, the insn should read the + value from memory and even if we just wrote a value + into the memory it is costly in comparison with an + insn alternative which does not use memory + (e.g. register or immediate operand). */ + if (no_regs_p && offmemok) + { + if (lra_dump_file != NULL) + fprintf + (lra_dump_file, + " Using memory insn operand %d: reject+=3\n", + nop); + reject += 3; + } + #ifdef SECONDARY_MEMORY_NEEDED /* If reload requires moving value through secondary memory, it will need one more insn at least. */ @@ -2747,6 +2799,14 @@ process_alt_operands (int only_alternati goto fail; } + if (update_and_check_small_class_inputs (nop, this_alternative)) + { + if (lra_dump_file != NULL) + fprintf (lra_dump_file, + " alt=%d, not enough small class regs -- refuse\n", + nalt); + goto fail; + } curr_alt[nop] = this_alternative; COPY_HARD_REG_SET (curr_alt_set[nop], this_alternative_set); curr_alt_win[nop] = this_alternative_win;