Message ID | AANLkTi=UapBhS8xBT5LdOFn7YL_dOKJPVzu7S-NN0uBs@mail.gmail.com |
---|---|
State | New |
Headers | show |
Hi, On Fri, 30 Jul 2010, Martin Thuresson wrote: > This patch updates the handling of temporary registers in postreload.c > to avoid very long build times in certain files. > > By creating new registers instead of updating one existing one using > SET_REGNO it avoids going through the scan df structure. But this also means generating heaps of garbage for the collector to collect. I think it would be better to not loose the current optimization of doing just one gen_reg_RTX(), and rather create a variant of SET_REGNO that doesn't call df_ref_change_reg_with_loc; SET_REGNO_RAW or something (or just inline the "XCUINT (RTX, 0, REG) = N" instead of using SET_REGNO at that place). Ciao, Michael.
On Fri, Jul 30, 2010 at 4:57 PM, Michael Matz <matz@suse.de> wrote: > > Hi, > > On Fri, 30 Jul 2010, Martin Thuresson wrote: > > > This patch updates the handling of temporary registers in postreload.c > > to avoid very long build times in certain files. > > > > By creating new registers instead of updating one existing one using > > SET_REGNO it avoids going through the scan df structure. > > But this also means generating heaps of garbage for the collector to > collect. I think it would be better to not loose the current optimization > of doing just one gen_reg_RTX(), and rather create a variant of SET_REGNO > that doesn't call df_ref_change_reg_with_loc; SET_REGNO_RAW or something > (or just inline the "XCUINT (RTX, 0, REG) = N" instead of using SET_REGNO > at that place). Thanks for the feedback. I like your suggestion of SET_REGNO_RAW. I'll update my patch. Martin > > Ciao, > Michael.
On 07/31/2010 01:07 AM, Martin Thuresson wrote: > This patch updates the handling of temporary registers in postreload.c > to avoid very long build times in certain files. > > By creating new registers instead of updating one existing one using > SET_REGNO it avoids going through the scan df structure. > > Im new to gcc RTL and appreciate any feedback. > > In the (inspired from real code) attached source, the time > for "reload CSE regs" went down from 20 usr seconds to less than 1. I have trouble reproducing the slowness. Which target are you using? It's a bit unfortunate, and probably avoidable, to create additional garbage RTL. Maybe what's really needed is df_set_flags (DF_DEFER_INSN_RESCAN); at the top of postreload? Can you try that? Or, even simpler, avoid the SET_REGNO macro, and check all other occurrences of it. There seem to be a few, e.g. in caller-save.c or ira.c, which also probably shouldn't invoke df. Bernd
On 07/31/2010 02:03 AM, Bernd Schmidt wrote: > It's a bit unfortunate, and probably avoidable, to create additional > garbage RTL. Maybe what's really needed is > df_set_flags (DF_DEFER_INSN_RESCAN); > at the top of postreload? Can you try that? That's a possibility, but considering postreload is a heavy user of note_uses/note_stores, deferring rescans makes it impossible to switch it to use the operand caches someday. > Or, even simpler, avoid > the SET_REGNO macro, and check all other occurrences of it. There seem > to be a few, e.g. in caller-save.c or ira.c, which also probably > shouldn't invoke df. Better, yes. Paolo
--- postreload.c.orig 2010-07-27 09:37:16.000000000 -0700 +++ postreload.c 2010-07-30 13:53:37.000000000 -0700 @@ -51,10 +51,10 @@ along with GCC; see the file COPYING3. #include "dbgcnt.h" static int reload_cse_noop_set_p (rtx); -static void reload_cse_simplify (rtx, rtx); +static void reload_cse_simplify (rtx); static void reload_cse_regs_1 (rtx); static int reload_cse_simplify_set (rtx, rtx); -static int reload_cse_simplify_operands (rtx, rtx); +static int reload_cse_simplify_operands (rtx); static void reload_combine (void); static void reload_combine_note_use (rtx *, rtx, int, rtx); @@ -92,7 +92,7 @@ reload_cse_noop_set_p (rtx set) /* Try to simplify INSN. */ static void -reload_cse_simplify (rtx insn, rtx testreg) +reload_cse_simplify (rtx insn) { rtx body = PATTERN (insn); @@ -120,7 +120,7 @@ reload_cse_simplify (rtx insn, rtx testr if (count > 0) apply_change_group (); else - reload_cse_simplify_operands (insn, testreg); + reload_cse_simplify_operands (insn); } else if (GET_CODE (body) == PARALLEL) { @@ -177,7 +177,7 @@ reload_cse_simplify (rtx insn, rtx testr if (count > 0) apply_change_group (); else - reload_cse_simplify_operands (insn, testreg); + reload_cse_simplify_operands (insn); } } @@ -202,7 +202,6 @@ static void reload_cse_regs_1 (rtx first) { rtx insn; - rtx testreg = gen_rtx_REG (VOIDmode, -1); cselib_init (CSELIB_RECORD_MEMORY); init_alias_analysis (); @@ -210,7 +209,7 @@ reload_cse_regs_1 (rtx first) for (insn = first; insn; insn = NEXT_INSN (insn)) { if (INSN_P (insn)) - reload_cse_simplify (insn, testreg); + reload_cse_simplify (insn); cselib_process_insn (insn); } @@ -371,7 +370,7 @@ reload_cse_simplify_set (rtx set, rtx in hard registers. */ static int -reload_cse_simplify_operands (rtx insn, rtx testreg) +reload_cse_simplify_operands (rtx insn) { int i, j; @@ -469,7 +468,7 @@ reload_cse_simplify_operands (rtx insn, 1); if (! apply_change_group ()) return 0; - return reload_cse_simplify_operands (insn, testreg); + return reload_cse_simplify_operands (insn); } else /* ??? There might be arithmetic operations with memory that are @@ -524,13 +523,11 @@ reload_cse_simplify_operands (rtx insn, for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++) { enum reg_class rclass = NO_REGS; + rtx testreg = gen_rtx_REG (mode, regno); if (! TEST_HARD_REG_BIT (equiv_regs[i], regno)) continue; - SET_REGNO (testreg, regno); - PUT_MODE (testreg, mode); - /* We found a register equal to this operand. Now look for all alternatives that can accept this register and have not been assigned a register they can use yet. */