Message ID | 53E3EB6C.7080302@redhat.com |
---|---|
State | New |
Headers | show |
Hi, Vlad Why does rs6000_emit_move need special support for SDmode with LRA and not with reload? In other words, why is this a specific fix for rs6000 instead of a general improvement for reload? SDmode has some bizarre restrictions on PPC, but one of the historical advantages of GCC was reload's ability to support strange addressing modes of processors. Why fix this in a target-specific way and then possibly replicated this in other architectures instead of fixing it more generally? Thanks, David On Thu, Aug 7, 2014 at 5:11 PM, Vladimir Makarov <vmakarov@redhat.com> wrote: > Hi, Mike. > > Here is the patch which solves LRA SDmode problem which we discussed on > IRC on Friday. > > The patch was successfully tested and bootstrapped on ppc64 with -mlra > switched on. > > David, is it ok to commit it to the trunk? > > 2014-08-07 Vladimir Makarov <vmakarov@redhat.com> > > * config/rs6000/rs6000.c (rs6000_emit_move): Use SDmode to > load/store from/to non-floating class pseudo. >
On 2014-08-08, 2:53 PM, David Edelsohn wrote: > Hi, Vlad > > Why does rs6000_emit_move need special support for SDmode with LRA and > not with reload? rs6000_emit_move has also special SDmode support for *reload too*. It is not implemented generically in reload as you wrote. Please, look at the code with reload_in_progress nearby. So I am trying to do basically the same as done by rs6000 back end for reload but by means of LRA. The difference between code for reload and LRA is in that a special stack slot is created for reload. Reload through a hook SECONDARY_MEMORY_NEEDED_RTX provided by rs6000 back-end generates a move using the stack slot and that triggers a special code generation in rs6000_emit_move. So it is a trick, not a general support in reload. LRA uses a spilled pseudo for this and don't use the hook at all. It means that the stack slot created for reload is used only for SD moves. LRA uses a common stack slot allocation techniques for spilled pseudos which permits to share this slot for other purposes, generates a better code, and uses less one hook. But basically code for LRA in rs600_emit_move do the same as analogous code for reload. But if you mean that general support you mentioned is the usage of absolutely the same trick for LRA as for reload (through SECONDARY_MEMORY_NEEDED_RTX), I could do this. But as I wrote it would be worse code generation than LRA currently produces and using an additional hook out of many already used by LRA. > In other words, why is this a specific fix for > rs6000 instead of a general improvement for reload? > Because it is too specific. It is about how rs6000 too specifically implements sd load/store. For example, SECONDARY_MEMORY_NEEDED_RTX is a creation only for rs6000. No other port uses it because the hook is used for the trick I wrote above. I don't think other port will use the trick in future. > SDmode has some bizarre restrictions on PPC, but one of the historical > advantages of GCC was reload's ability to support strange addressing > modes of processors. Why fix this in a target-specific way and then > possibly replicated this in other architectures instead of fixing it > more generally? > It is not about addressing. It is about machine description of insns. If movsd_{load,store} insns were part of common move insns, we would have no problem at all. But I guess, achieving this would be not possible or easy. I believe Mike had serious reasons to do this such way. I am not such a specialist in md writing as Mike. If I could solve this problem only in LRA, I'd prefer to do this because it is simpler for me as I don't need an approval.
Hi, Vlad Thanks for the explanation. The patch is okay. Thanks, David On Fri, Aug 8, 2014 at 5:26 PM, Vladimir Makarov <vmakarov@redhat.com> wrote: > On 2014-08-08, 2:53 PM, David Edelsohn wrote: >> >> Hi, Vlad >> >> Why does rs6000_emit_move need special support for SDmode with LRA and >> not with reload? > > > rs6000_emit_move has also special SDmode support for *reload too*. It is > not implemented generically in reload as you wrote. Please, look at the > code with reload_in_progress nearby. So I am trying to do basically the > same as done by rs6000 back end for reload but by means of LRA. The > difference between code for reload and LRA is in that a special stack slot > is created for reload. Reload through a hook SECONDARY_MEMORY_NEEDED_RTX > provided by rs6000 back-end generates a move using the stack slot and that > triggers a special code generation in rs6000_emit_move. So it is a trick, > not a general support in reload. > > LRA uses a spilled pseudo for this and don't use the hook at all. It means > that the stack slot created for reload is used only for SD moves. LRA uses > a common stack slot allocation techniques for spilled pseudos which permits > to share this slot for other purposes, generates a better code, and uses > less one hook. But basically code for LRA in rs600_emit_move do the same as > analogous code for reload. > > But if you mean that general support you mentioned is the usage of > absolutely the same trick for LRA as for reload (through > SECONDARY_MEMORY_NEEDED_RTX), I could do this. But as I wrote it would be > worse code generation than LRA currently produces and using an additional > hook out of many already used by LRA. > > >> In other words, why is this a specific fix for >> rs6000 instead of a general improvement for reload? >> > > Because it is too specific. It is about how rs6000 too specifically > implements sd load/store. For example, SECONDARY_MEMORY_NEEDED_RTX is a > creation only for rs6000. No other port uses it because the hook is used > for the trick I wrote above. I don't think other port will use the trick in > future. > > >> SDmode has some bizarre restrictions on PPC, but one of the historical >> advantages of GCC was reload's ability to support strange addressing >> modes of processors. Why fix this in a target-specific way and then >> possibly replicated this in other architectures instead of fixing it >> more generally? >> > > It is not about addressing. It is about machine description of insns. If > movsd_{load,store} insns were part of common move insns, we would have no > problem at all. But I guess, achieving this would be not possible or easy. > I believe Mike had serious reasons to do this such way. I am not such a > specialist in md writing as Mike. > > If I could solve this problem only in LRA, I'd prefer to do this because it > is simpler for me as I don't need an approval. >
Index: gcc/config/rs6000/rs6000.c =================================================================== --- gcc/config/rs6000/rs6000.c (revision 213506) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -8271,6 +8271,30 @@ rs6000_emit_move (rtx dest, rtx source, eliminate_regs (cfun->machine->sdmode_stack_slot, VOIDmode, NULL_RTX); + /* Transform (p0:DD, (SUBREG:DD p1:SD)) to ((SUBREG:SD p0:DD), + p1:SD) if p1 is not of floating point class and p0 is spilled as + we can have no analogous movsd_store for this. */ + if (lra_in_progress && mode == DDmode + && REG_P (operands[0]) && REGNO (operands[0]) >= FIRST_PSEUDO_REGISTER + && reg_preferred_class (REGNO (operands[0])) == NO_REGS + && GET_CODE (operands[1]) == SUBREG && REG_P (SUBREG_REG (operands[1])) + && GET_MODE (SUBREG_REG (operands[1])) == SDmode) + { + enum reg_class cl; + int regno = REGNO (SUBREG_REG (operands[1])); + + if (regno >= FIRST_PSEUDO_REGISTER) + { + cl = reg_preferred_class (regno); + regno = cl == NO_REGS ? -1 : ira_class_hard_regs[cl][1]; + } + if (regno >= 0 && ! FP_REGNO_P (regno)) + { + mode = SDmode; + operands[0] = gen_lowpart_SUBREG (SDmode, operands[0]); + operands[1] = SUBREG_REG (operands[1]); + } + } if (lra_in_progress && mode == SDmode && REG_P (operands[0]) && REGNO (operands[0]) >= FIRST_PSEUDO_REGISTER @@ -8301,6 +8325,30 @@ rs6000_emit_move (rtx dest, rtx source, gcc_unreachable(); return; } + /* Transform ((SUBREG:DD p0:SD), p1:DD) to (p0:SD, (SUBREG:SD + p:DD)) if p0 is not of floating point class and p1 is spilled as + we can have no analogous movsd_load for this. */ + if (lra_in_progress && mode == DDmode + && GET_CODE (operands[0]) == SUBREG && REG_P (SUBREG_REG (operands[0])) + && GET_MODE (SUBREG_REG (operands[0])) == SDmode + && REG_P (operands[1]) && REGNO (operands[1]) >= FIRST_PSEUDO_REGISTER + && reg_preferred_class (REGNO (operands[1])) == NO_REGS) + { + enum reg_class cl; + int regno = REGNO (SUBREG_REG (operands[0])); + + if (regno >= FIRST_PSEUDO_REGISTER) + { + cl = reg_preferred_class (regno); + regno = cl == NO_REGS ? -1 : ira_class_hard_regs[cl][0]; + } + if (regno >= 0 && ! FP_REGNO_P (regno)) + { + mode = SDmode; + operands[0] = SUBREG_REG (operands[0]); + operands[1] = gen_lowpart_SUBREG (SDmode, operands[1]); + } + } if (lra_in_progress && mode == SDmode && (REG_P (operands[0])