Message ID | a9956825-4423-c479-8ad3-06d6bad4c782@linux.ibm.com |
---|---|
State | New |
Headers | show |
Series | [rs6000] Fix PR88845: ICE in lra_set_insn_recog_data | expand |
Hi Peter, On Fri, Mar 01, 2019 at 01:33:27PM -0600, Peter Bergner wrote: > PR88845 shows a problem where LRA spilled an input operand of an inline > asm statement by calling our generic movsf pattern which ended up generating > an insn we don't have a pattern for, so we ICE. The insn was: > > (insn (set (reg:SF 125) > (subreg:SF (reg:SI 124) 0))) > > The problem is that rs6000_emit_move_si_sf_subreg() is disabled for LRA > and so wasn't able to call gen_movsf_from_si() which generates the correct > pattern for moving a 32-bit value from a GPR to a FPR. The patch below > fixes the issue by allowing rs6000_emit_move_si_sf_subreg() to be called > during LRA as well as creating an expander so that when it is called during > LRA, we can create the scratch register that is required for its associated > splitter. We have to do this, since LRA has already converted all of the > scratches into real registers before it does any spilling. > + /* If LRA is generating a direct move from a GPR to a FPR, > + then the splitter is going to need a scratch register. */ > + rtx insn = gen_movsf_from_si_internal (operands[0], operands[1]); > + XEXP (XVECEXP (insn, 0, 1), 0) = gen_reg_rtx (DImode); > + emit_insn (insn); > + DONE; This part isn't so great, needing detailed knowledge of the RTL generated by some other pattern. Maybe there already exists some function that generates a register for every scratch in an insn, or you can make such a function? Okay for trunk with or without such an improvement. Also backports, if you want those. But note (on trunk): > +/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */ > +/* { dg-options "-mcpu=power8 -O2" } */ These two lines should now be just /* { dg-options "-mdejagnu-cpu=power8 -O2" } */ Thanks! Segher
On 3/4/19 1:27 PM, Segher Boessenkool wrote: >> + /* If LRA is generating a direct move from a GPR to a FPR, >> + then the splitter is going to need a scratch register. */ >> + rtx insn = gen_movsf_from_si_internal (operands[0], operands[1]); >> + XEXP (XVECEXP (insn, 0, 1), 0) = gen_reg_rtx (DImode); >> + emit_insn (insn); >> + DONE; > > This part isn't so great, needing detailed knowledge of the RTL generated > by some other pattern. Maybe there already exists some function that > generates a register for every scratch in an insn, or you can make such > a function? A function that updates one insn does not exist. There is remove_scratches(), but that works on the entire cfg. As part of my earlier attempts, I did split remove_scratches() into a function that traverses the cfg and another that replaces the scratches in one insn. I've included it below. I went with the current patch, because that doesn't touch anything outside of the port. If you prefer the patch below, we can go with that instead. Let me know which you prefer. >> +/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */ >> +/* { dg-options "-mcpu=power8 -O2" } */ > > These two lines should now be just > > /* { dg-options "-mdejagnu-cpu=power8 -O2" } */ Ok, will update that. Peter Index: gcc/config/rs6000/rs6000.c =================================================================== --- gcc/config/rs6000/rs6000.c (revision 269028) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -9887,7 +9887,7 @@ valid_sf_si_move (rtx dest, rtx src, mac static bool rs6000_emit_move_si_sf_subreg (rtx dest, rtx source, machine_mode mode) { - if (TARGET_DIRECT_MOVE_64BIT && !lra_in_progress && !reload_completed + if (TARGET_DIRECT_MOVE_64BIT && !reload_completed && (!SUBREG_P (dest) || !sf_subreg_operand (dest, mode)) && SUBREG_P (source) && sf_subreg_operand (source, mode)) { @@ -9902,7 +9902,9 @@ rs6000_emit_move_si_sf_subreg (rtx dest, if (mode == SFmode && inner_mode == SImode) { - emit_insn (gen_movsf_from_si (dest, inner_source)); + rtx_insn *insn = emit_insn (gen_movsf_from_si (dest, inner_source)); + if (lra_in_progress) + remove_scratches_1 (insn); return true; } } Index: gcc/lra.c =================================================================== --- gcc/lra.c (revision 269028) +++ gcc/lra.c (working copy) @@ -2077,7 +2077,40 @@ lra_register_new_scratch_op (rtx_insn *i add_reg_note (insn, REG_UNUSED, op); } -/* Change scratches onto pseudos and save their location. */ +/* Change INSN's scratches into pseudos and save their location. */ +void +remove_scratches_1 (rtx_insn *insn) +{ + int i; + bool insn_changed_p; + rtx reg; + lra_insn_recog_data_t id; + struct lra_static_insn_data *static_id; + + id = lra_get_insn_recog_data (insn); + static_id = id->insn_static_data; + insn_changed_p = false; + for (i = 0; i < static_id->n_operands; i++) + if (GET_CODE (*id->operand_loc[i]) == SCRATCH + && GET_MODE (*id->operand_loc[i]) != VOIDmode) + { + insn_changed_p = true; + *id->operand_loc[i] = reg + = lra_create_new_reg (static_id->operand[i].mode, + *id->operand_loc[i], ALL_REGS, NULL); + lra_register_new_scratch_op (insn, i, id->icode); + if (lra_dump_file != NULL) + fprintf (lra_dump_file, + "Removing SCRATCH in insn #%u (nop %d)\n", + INSN_UID (insn), i); + } + if (insn_changed_p) + /* Because we might use DF right after caller-saves sub-pass + we need to keep DF info up to date. */ + df_insn_rescan (insn); +} + +/* Change scratches into pseudos and save their location. */ static void remove_scratches (void) { @@ -2095,29 +2128,7 @@ remove_scratches (void) FOR_EACH_BB_FN (bb, cfun) FOR_BB_INSNS (bb, insn) if (INSN_P (insn)) - { - id = lra_get_insn_recog_data (insn); - static_id = id->insn_static_data; - insn_changed_p = false; - for (i = 0; i < static_id->n_operands; i++) - if (GET_CODE (*id->operand_loc[i]) == SCRATCH - && GET_MODE (*id->operand_loc[i]) != VOIDmode) - { - insn_changed_p = true; - *id->operand_loc[i] = reg - = lra_create_new_reg (static_id->operand[i].mode, - *id->operand_loc[i], ALL_REGS, NULL); - lra_register_new_scratch_op (insn, i, id->icode); - if (lra_dump_file != NULL) - fprintf (lra_dump_file, - "Removing SCRATCH in insn #%u (nop %d)\n", - INSN_UID (insn), i); - } - if (insn_changed_p) - /* Because we might use DF right after caller-saves sub-pass - we need to keep DF info up to date. */ - df_insn_rescan (insn); - } + remove_scratches_1 (insn); } /* Changes pseudos created by function remove_scratches onto scratches. */ Index: gcc/rtl.h =================================================================== --- gcc/rtl.h (revision 269028) +++ gcc/rtl.h (working copy) @@ -4087,6 +4087,9 @@ extern rtx fis_get_condition (rtx_insn * extern HARD_REG_SET eliminable_regset; extern void mark_elimination (int, int); +/* In lra.c */ +extern void remove_scratches_1 (rtx_insn *); + /* In reginfo.c */ extern int reg_classes_intersect_p (reg_class_t, reg_class_t); extern int reg_class_subset_p (reg_class_t, reg_class_t);
On 3/4/19 4:16 PM, Peter Bergner wrote: > Index: gcc/config/rs6000/rs6000.c > =================================================================== > --- gcc/config/rs6000/rs6000.c (revision 269028) > +++ gcc/config/rs6000/rs6000.c (working copy) > @@ -9887,7 +9887,7 @@ valid_sf_si_move (rtx dest, rtx src, mac > static bool > rs6000_emit_move_si_sf_subreg (rtx dest, rtx source, machine_mode mode) > { > - if (TARGET_DIRECT_MOVE_64BIT && !lra_in_progress && !reload_completed > + if (TARGET_DIRECT_MOVE_64BIT && !reload_completed > && (!SUBREG_P (dest) || !sf_subreg_operand (dest, mode)) > && SUBREG_P (source) && sf_subreg_operand (source, mode)) > { > @@ -9902,7 +9902,9 @@ rs6000_emit_move_si_sf_subreg (rtx dest, > > if (mode == SFmode && inner_mode == SImode) > { > - emit_insn (gen_movsf_from_si (dest, inner_source)); > + rtx_insn *insn = emit_insn (gen_movsf_from_si (dest, inner_source)); > + if (lra_in_progress) > + remove_scratches_1 (insn); > return true; > } > } But maybe the call to remove_scratches_1() should move to lra_emit_move(), which is how we get to this code in the first place? Who knows what other generic move patterns might need scratches? Peter
On 3/4/19 4:24 PM, Peter Bergner wrote: > On 3/4/19 4:16 PM, Peter Bergner wrote: >> Index: gcc/config/rs6000/rs6000.c >> =================================================================== >> --- gcc/config/rs6000/rs6000.c (revision 269028) >> +++ gcc/config/rs6000/rs6000.c (working copy) >> @@ -9887,7 +9887,7 @@ valid_sf_si_move (rtx dest, rtx src, mac >> static bool >> rs6000_emit_move_si_sf_subreg (rtx dest, rtx source, machine_mode mode) >> { >> - if (TARGET_DIRECT_MOVE_64BIT && !lra_in_progress && !reload_completed >> + if (TARGET_DIRECT_MOVE_64BIT && !reload_completed >> && (!SUBREG_P (dest) || !sf_subreg_operand (dest, mode)) >> && SUBREG_P (source) && sf_subreg_operand (source, mode)) >> { >> @@ -9902,7 +9902,9 @@ rs6000_emit_move_si_sf_subreg (rtx dest, >> >> if (mode == SFmode && inner_mode == SImode) >> { >> - emit_insn (gen_movsf_from_si (dest, inner_source)); >> + rtx_insn *insn = emit_insn (gen_movsf_from_si (dest, inner_source)); >> + if (lra_in_progress) >> + remove_scratches_1 (insn); >> return true; >> } >> } > > But maybe the call to remove_scratches_1() should move to lra_emit_move(), > which is how we get to this code in the first place? Who knows what other > generic move patterns might need scratches too? Like this. This bootstraps and regtests with no regressions. Do you prefer this instead? If so, we'll need Vlad or Jeff or ... to approve the LRA changes. Vlad and Jeff, The original problem and patch is described here: https://gcc.gnu.org/ml/gcc-patches/2019-03/msg00061.html Short answer is, after enabling a rs6000 move pattern we need for spilling, we ICE when spilling, because the move pattern uses a scratch register and scratch registers are replaced early on during LRA initialization. The patch below just extracts out the code that fixes up one insn and makes it a function itself. I then changed lra_emit_move() to then call that function after generating the move insn so as to replace the scratch register the move pattern generated. Thoughts on this patch compared to the rs6000 only patch linked above? Peter gcc/ PR rtl-optimization/88845 * config/rs6000/rs6000.c (rs6000_emit_move_si_sf_subreg): Enable during LRA. * lra.c (remove_scratches_1): New function. (remove_scratches): Use it. (lra_emit_move): Likewise. gcc/testsuite/ PR rtl-optimization/88845 * gcc.target/powerpc/pr88845.c: New test. Index: gcc/config/rs6000/rs6000.c =================================================================== --- gcc/config/rs6000/rs6000.c (revision 269263) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -9887,7 +9887,7 @@ valid_sf_si_move (rtx dest, rtx src, mac static bool rs6000_emit_move_si_sf_subreg (rtx dest, rtx source, machine_mode mode) { - if (TARGET_DIRECT_MOVE_64BIT && !lra_in_progress && !reload_completed + if (TARGET_DIRECT_MOVE_64BIT && !reload_completed && (!SUBREG_P (dest) || !sf_subreg_operand (dest, mode)) && SUBREG_P (source) && sf_subreg_operand (source, mode)) { Index: gcc/lra.c =================================================================== --- gcc/lra.c (revision 269263) +++ gcc/lra.c (working copy) @@ -159,6 +159,7 @@ static void invalidate_insn_recog_data ( static int get_insn_freq (rtx_insn *); static void invalidate_insn_data_regno_info (lra_insn_recog_data_t, rtx_insn *, int); +static void remove_scratches_1 (rtx_insn *); /* Expand all regno related info needed for LRA. */ static void @@ -494,7 +495,11 @@ lra_emit_move (rtx x, rtx y) if (rtx_equal_p (x, y)) return; old = max_reg_num (); - emit_move_insn (x, y); + rtx_insn *insn = emit_move_insn (x, y); + /* The move pattern may require scratch registers, so convert them + into real registers now. */ + if (insn != NULL_RTX) + remove_scratches_1 (insn); if (REG_P (x)) lra_reg_info[ORIGINAL_REGNO (x)].last_reload = ++lra_curr_reload_num; /* Function emit_move can create pseudos -- so expand the pseudo @@ -2077,47 +2082,53 @@ lra_register_new_scratch_op (rtx_insn *i add_reg_note (insn, REG_UNUSED, op); } -/* Change scratches onto pseudos and save their location. */ +/* Change INSN's scratches into pseudos and save their location. */ static void -remove_scratches (void) +remove_scratches_1 (rtx_insn *insn) { int i; bool insn_changed_p; - basic_block bb; - rtx_insn *insn; rtx reg; lra_insn_recog_data_t id; struct lra_static_insn_data *static_id; + id = lra_get_insn_recog_data (insn); + static_id = id->insn_static_data; + insn_changed_p = false; + for (i = 0; i < static_id->n_operands; i++) + if (GET_CODE (*id->operand_loc[i]) == SCRATCH + && GET_MODE (*id->operand_loc[i]) != VOIDmode) + { + insn_changed_p = true; + *id->operand_loc[i] = reg + = lra_create_new_reg (static_id->operand[i].mode, + *id->operand_loc[i], ALL_REGS, NULL); + lra_register_new_scratch_op (insn, i, id->icode); + if (lra_dump_file != NULL) + fprintf (lra_dump_file, + "Removing SCRATCH in insn #%u (nop %d)\n", + INSN_UID (insn), i); + } + if (insn_changed_p) + /* Because we might use DF right after caller-saves sub-pass + we need to keep DF info up to date. */ + df_insn_rescan (insn); +} + +/* Change scratches into pseudos and save their location. */ +static void +remove_scratches (void) +{ + basic_block bb; + rtx_insn *insn; + scratches.create (get_max_uid ()); bitmap_initialize (&scratch_bitmap, ®_obstack); bitmap_initialize (&scratch_operand_bitmap, ®_obstack); FOR_EACH_BB_FN (bb, cfun) FOR_BB_INSNS (bb, insn) if (INSN_P (insn)) - { - id = lra_get_insn_recog_data (insn); - static_id = id->insn_static_data; - insn_changed_p = false; - for (i = 0; i < static_id->n_operands; i++) - if (GET_CODE (*id->operand_loc[i]) == SCRATCH - && GET_MODE (*id->operand_loc[i]) != VOIDmode) - { - insn_changed_p = true; - *id->operand_loc[i] = reg - = lra_create_new_reg (static_id->operand[i].mode, - *id->operand_loc[i], ALL_REGS, NULL); - lra_register_new_scratch_op (insn, i, id->icode); - if (lra_dump_file != NULL) - fprintf (lra_dump_file, - "Removing SCRATCH in insn #%u (nop %d)\n", - INSN_UID (insn), i); - } - if (insn_changed_p) - /* Because we might use DF right after caller-saves sub-pass - we need to keep DF info up to date. */ - df_insn_rescan (insn); - } + remove_scratches_1 (insn); } /* Changes pseudos created by function remove_scratches onto scratches. */ Index: gcc/testsuite/gcc.target/powerpc/pr88845.c =================================================================== --- gcc/testsuite/gcc.target/powerpc/pr88845.c (nonexistent) +++ gcc/testsuite/gcc.target/powerpc/pr88845.c (working copy) @@ -0,0 +1,24 @@ +/* { dg-do compile { target powerpc*-*-linux* } } */ +/* { dg-skip-if "" { powerpc*-*-darwin* } } */ +/* { dg-skip-if "" { powerpc*-*-*spe* } } */ +/* { dg-require-effective-target powerpc_p8vector_ok } */ +/* { dg-options "-mdejagnu-cpu=power8 -O2" } */ +/* { dg-final { scan-assembler {\mmtvsrd\M} { target { lp64 } } } } */ +/* { dg-final { scan-assembler {\mxscvspdpn\M} { target { lp64 } } } } */ + +/* Verify that we do not ICE and that we generate a direct move + for float types when compiling for 64-bit. */ + +struct a { + unsigned ui; + float f; +}; + +void +foo (void) +{ + float e; + struct a s; + e = s.f; + __asm__("" : : "d" (e)); +}
Hi Peter, On Mon, Mar 04, 2019 at 04:16:23PM -0600, Peter Bergner wrote: > On 3/4/19 1:27 PM, Segher Boessenkool wrote: > >> + /* If LRA is generating a direct move from a GPR to a FPR, > >> + then the splitter is going to need a scratch register. */ > >> + rtx insn = gen_movsf_from_si_internal (operands[0], operands[1]); > >> + XEXP (XVECEXP (insn, 0, 1), 0) = gen_reg_rtx (DImode); > >> + emit_insn (insn); > >> + DONE; > > > > This part isn't so great, needing detailed knowledge of the RTL generated > > by some other pattern. Maybe there already exists some function that > > generates a register for every scratch in an insn, or you can make such > > a function? > > A function that updates one insn does not exist. There is remove_scratches(), > but that works on the entire cfg. As part of my earlier attempts, I did split > remove_scratches() into a function that traverses the cfg and another that > replaces the scratches in one insn. I've included it below. I went with the > current patch, because that doesn't touch anything outside of the port. > If you prefer the patch below, we can go with that instead. Let me know which > you prefer. I meant just like you did in your original patch, just gen_reg_rtx and nothing more, but finding the SCRATCH locations by itself. This might also be useful for the very many splitters we have that allocate regs for SCRATCHes. Something like (totally untested) void itch_scratches (rtx_insn *insn) { subrtx_ptr_iterator::array_type array; FOR_EACH_SUBRTX_PTR (iter, array, PATTERN (insn), NONCONST) if (GET_CODE (**iter) == SCRATCH) **iter = gen_reg_rtx (GET_MODE (**iter)); } But, it seems you need to keep track of other things on the side for LRA? Segher
On Mon, Mar 04, 2019 at 06:41:16PM -0600, Peter Bergner wrote: > Like this. This bootstraps and regtests with no regressions. Do you prefer > this instead? If so, we'll need Vlad or Jeff or ... to approve the LRA > changes. Either solution is fine with me, whatever the LRA experts prefer. Even your original patch is okay, just a bit shaky; I don't expect us to need more like this though, SF is a bit special, so I can just close my eyes and move on, no problem :-) [ I do wonder why we are having these problems, it's not a very special problem, other targets should hit similar... do they use secondary reloads maybe? ] Segher
On 3/5/19 5:25 AM, Segher Boessenkool wrote:
> But, it seems you need to keep track of other things on the side for LRA?
The extra LRA info is to keep track of scratches that are not needed.
In our case, only the one alternative in movsf_from_si requires a
scratch register. The rest use an X constraint. For those alternatives,
LRA changes the scratch reg back to just a scratch when it's all done.
Peter
On 3/5/19 5:51 AM, Segher Boessenkool wrote: > On Mon, Mar 04, 2019 at 06:41:16PM -0600, Peter Bergner wrote: >> Like this. This bootstraps and regtests with no regressions. Do you prefer >> this instead? If so, we'll need Vlad or Jeff or ... to approve the LRA >> changes. > > Either solution is fine with me, whatever the LRA experts prefer. Even > your original patch is okay, just a bit shaky; I don't expect us to need > more like this though, SF is a bit special, so I can just close my eyes > and move on, no problem :-) I'd like to hold off until Vlad and/or Jeff...or anyone else chimes in. It seems like LRA is assuming the move insn it generates won't have a scratch, since it replaces all scratches early, but as we can see for our 32-bit GPR to FPR copy, we do need one. > [ I do wonder why we are having these problems, it's not a very special > problem, other targets should hit similar... do they use secondary reloads > maybe? ] Secondary reload can't fix the problem of the scratch replacement, since we try and immediately recog the move insn we just created and we ICE because the scratch hasn't yet been replaced with a pseudo. Peter
On 03/04/2019 07:41 PM, Peter Bergner wrote: > On 3/4/19 4:24 PM, Peter Bergner wrote: >> On 3/4/19 4:16 PM, Peter Bergner wrote: >>> Index: gcc/config/rs6000/rs6000.c >>> =================================================================== >>> --- gcc/config/rs6000/rs6000.c (revision 269028) >>> +++ gcc/config/rs6000/rs6000.c (working copy) >>> @@ -9887,7 +9887,7 @@ valid_sf_si_move (rtx dest, rtx src, mac >>> static bool >>> rs6000_emit_move_si_sf_subreg (rtx dest, rtx source, machine_mode mode) >>> { >>> - if (TARGET_DIRECT_MOVE_64BIT && !lra_in_progress && !reload_completed >>> + if (TARGET_DIRECT_MOVE_64BIT && !reload_completed >>> && (!SUBREG_P (dest) || !sf_subreg_operand (dest, mode)) >>> && SUBREG_P (source) && sf_subreg_operand (source, mode)) >>> { >>> @@ -9902,7 +9902,9 @@ rs6000_emit_move_si_sf_subreg (rtx dest, >>> >>> if (mode == SFmode && inner_mode == SImode) >>> { >>> - emit_insn (gen_movsf_from_si (dest, inner_source)); >>> + rtx_insn *insn = emit_insn (gen_movsf_from_si (dest, inner_source)); >>> + if (lra_in_progress) >>> + remove_scratches_1 (insn); >>> return true; >>> } >>> } >> But maybe the call to remove_scratches_1() should move to lra_emit_move(), >> which is how we get to this code in the first place? Who knows what other >> generic move patterns might need scratches too? > Like this. This bootstraps and regtests with no regressions. Do you prefer > this instead? If so, we'll need Vlad or Jeff or ... to approve the LRA > changes. > > Vlad and Jeff, > > The original problem and patch is described here: > > https://gcc.gnu.org/ml/gcc-patches/2019-03/msg00061.html > > Short answer is, after enabling a rs6000 move pattern we need for spilling, > we ICE when spilling, because the move pattern uses a scratch register > and scratch registers are replaced early on during LRA initialization. > The patch below just extracts out the code that fixes up one insn and > makes it a function itself. I then changed lra_emit_move() to then call > that function after generating the move insn so as to replace the scratch > register the move pattern generated. Thoughts on this patch compared to > the rs6000 only patch linked above? I recommend don't use scratches at all because IRA ignores them and it might result in worse allocation. Unfortunately, removing scratches for all targets requires a lot of work. This patch deals with scratches during all LRA work which is good. So I like the patch and LRA part of it is ok to commit. Peter, thank you for adding the new functionality to LRA. > > > gcc/ > PR rtl-optimization/88845 > * config/rs6000/rs6000.c (rs6000_emit_move_si_sf_subreg): Enable during > LRA. > * lra.c (remove_scratches_1): New function. > (remove_scratches): Use it. > (lra_emit_move): Likewise. > > gcc/testsuite/ > PR rtl-optimization/88845 > * gcc.target/powerpc/pr88845.c: New test. > > > Index: gcc/lra.c > =================================================================== > --- gcc/lra.c (revision 269263) > +++ gcc/lra.c (working copy) > @@ -159,6 +159,7 @@ static void invalidate_insn_recog_data ( > static int get_insn_freq (rtx_insn *); > static void invalidate_insn_data_regno_info (lra_insn_recog_data_t, > rtx_insn *, int); > +static void remove_scratches_1 (rtx_insn *); > > /* Expand all regno related info needed for LRA. */ > static void > @@ -494,7 +495,11 @@ lra_emit_move (rtx x, rtx y) > if (rtx_equal_p (x, y)) > return; > old = max_reg_num (); > - emit_move_insn (x, y); > + rtx_insn *insn = emit_move_insn (x, y); > + /* The move pattern may require scratch registers, so convert them > + into real registers now. */ > + if (insn != NULL_RTX) > + remove_scratches_1 (insn); > if (REG_P (x)) > lra_reg_info[ORIGINAL_REGNO (x)].last_reload = ++lra_curr_reload_num; > /* Function emit_move can create pseudos -- so expand the pseudo > @@ -2077,47 +2082,53 @@ lra_register_new_scratch_op (rtx_insn *i > add_reg_note (insn, REG_UNUSED, op); > } > > -/* Change scratches onto pseudos and save their location. */ > +/* Change INSN's scratches into pseudos and save their location. */ > static void > -remove_scratches (void) > +remove_scratches_1 (rtx_insn *insn) > { > int i; > bool insn_changed_p; > - basic_block bb; > - rtx_insn *insn; > rtx reg; > lra_insn_recog_data_t id; > struct lra_static_insn_data *static_id; > > + id = lra_get_insn_recog_data (insn); > + static_id = id->insn_static_data; > + insn_changed_p = false; > + for (i = 0; i < static_id->n_operands; i++) > + if (GET_CODE (*id->operand_loc[i]) == SCRATCH > + && GET_MODE (*id->operand_loc[i]) != VOIDmode) > + { > + insn_changed_p = true; > + *id->operand_loc[i] = reg > + = lra_create_new_reg (static_id->operand[i].mode, > + *id->operand_loc[i], ALL_REGS, NULL); > + lra_register_new_scratch_op (insn, i, id->icode); > + if (lra_dump_file != NULL) > + fprintf (lra_dump_file, > + "Removing SCRATCH in insn #%u (nop %d)\n", > + INSN_UID (insn), i); > + } > + if (insn_changed_p) > + /* Because we might use DF right after caller-saves sub-pass > + we need to keep DF info up to date. */ > + df_insn_rescan (insn); > +} > + > +/* Change scratches into pseudos and save their location. */ > +static void > +remove_scratches (void) > +{ > + basic_block bb; > + rtx_insn *insn; > + > scratches.create (get_max_uid ()); > bitmap_initialize (&scratch_bitmap, ®_obstack); > bitmap_initialize (&scratch_operand_bitmap, ®_obstack); > FOR_EACH_BB_FN (bb, cfun) > FOR_BB_INSNS (bb, insn) > if (INSN_P (insn)) > - { > - id = lra_get_insn_recog_data (insn); > - static_id = id->insn_static_data; > - insn_changed_p = false; > - for (i = 0; i < static_id->n_operands; i++) > - if (GET_CODE (*id->operand_loc[i]) == SCRATCH > - && GET_MODE (*id->operand_loc[i]) != VOIDmode) > - { > - insn_changed_p = true; > - *id->operand_loc[i] = reg > - = lra_create_new_reg (static_id->operand[i].mode, > - *id->operand_loc[i], ALL_REGS, NULL); > - lra_register_new_scratch_op (insn, i, id->icode); > - if (lra_dump_file != NULL) > - fprintf (lra_dump_file, > - "Removing SCRATCH in insn #%u (nop %d)\n", > - INSN_UID (insn), i); > - } > - if (insn_changed_p) > - /* Because we might use DF right after caller-saves sub-pass > - we need to keep DF info up to date. */ > - df_insn_rescan (insn); > - } > + remove_scratches_1 (insn); > } >
Index: gcc/config/rs6000/rs6000.c =================================================================== --- gcc/config/rs6000/rs6000.c (revision 269263) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -9887,7 +9887,7 @@ valid_sf_si_move (rtx dest, rtx src, mac static bool rs6000_emit_move_si_sf_subreg (rtx dest, rtx source, machine_mode mode) { - if (TARGET_DIRECT_MOVE_64BIT && !lra_in_progress && !reload_completed + if (TARGET_DIRECT_MOVE_64BIT && !reload_completed && (!SUBREG_P (dest) || !sf_subreg_operand (dest, mode)) && SUBREG_P (source) && sf_subreg_operand (source, mode)) { Index: gcc/config/rs6000/rs6000.md =================================================================== --- gcc/config/rs6000/rs6000.md (revision 269263) +++ gcc/config/rs6000/rs6000.md (working copy) @@ -7353,9 +7353,31 @@ (define_insn "*mov<mode>_softfloat" ;; This function is called before reload, and it creates the temporary as ;; needed. +(define_expand "movsf_from_si" + [(parallel [(set (match_operand:SF 0 "nonimmediate_operand") + (unspec:SF [(match_operand:SI 1 "input_operand" )] + UNSPEC_SF_FROM_SI)) + (clobber (match_scratch:DI 2))])] + "TARGET_NO_SF_SUBREG + && (register_operand (operands[0], SFmode) + || register_operand (operands[1], SImode))" +{ + if (lra_in_progress + && REG_P (operands[0]) + && REG_P (operands[1])) + { + /* If LRA is generating a direct move from a GPR to a FPR, + then the splitter is going to need a scratch register. */ + rtx insn = gen_movsf_from_si_internal (operands[0], operands[1]); + XEXP (XVECEXP (insn, 0, 1), 0) = gen_reg_rtx (DImode); + emit_insn (insn); + DONE; + } +}) + ;; LWZ LFS LXSSP LXSSPX STW STFIWX ;; STXSIWX GPR->VSX VSX->GPR GPR->GPR -(define_insn_and_split "movsf_from_si" +(define_insn_and_split "movsf_from_si_internal" [(set (match_operand:SF 0 "nonimmediate_operand" "=!r, f, wb, wu, m, Z, Z, wy, ?r, !r") Index: gcc/testsuite/gcc.target/powerpc/pr88845.c =================================================================== --- gcc/testsuite/gcc.target/powerpc/pr88845.c (nonexistent) +++ gcc/testsuite/gcc.target/powerpc/pr88845.c (working copy) @@ -0,0 +1,25 @@ +/* { dg-do compile { target powerpc*-*-linux* } } */ +/* { dg-skip-if "" { powerpc*-*-darwin* } } */ +/* { dg-skip-if "" { powerpc*-*-*spe* } } */ +/* { dg-require-effective-target powerpc_p8vector_ok } */ +/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */ +/* { dg-options "-mcpu=power8 -O2" } */ +/* { dg-final { scan-assembler {\mmtvsrd\M} { target { lp64 } } } } */ +/* { dg-final { scan-assembler {\mxscvspdpn\M} { target { lp64 } } } } */ + +/* Verify that we do not ICE and that we generate a direct move + for float types when compiling for 64-bit. */ + +struct a { + unsigned ui; + float f; +}; + +void +foo (void) +{ + float e; + struct a s; + e = s.f; + __asm__("" : : "d" (e)); +}