Message ID | 4E96F664.1020602@gjlay.de |
---|---|
State | New |
Headers | show |
> -----Original Message----- > From: Georg-Johann Lay [mailto:avr@gjlay.de] > Sent: Thursday, October 13, 2011 8:32 AM > To: gcc-patches@gcc.gnu.org > Cc: Anatoly Sokolov; Denis Chertykov; Weddington, Eric > Subject: [Patch,AVR] Fix PR46278, Take #3 > > This is yet another attempt to fix PR46278 (fake X addressing). > > After the previous clean-ups it is just a small change. > > caller-saves.c tries to eliminate call-clobbered hard-regs allocated to > pseudos > around function calls and that leads to situations that reload is no more > capable to perform all requested spills because of the very few AVR's > address > registers. > > Thus, the patch adds a new target option -mstrict-X so that the user can > turn > that option if he like to do so, and then -fcaller-save is disabled. > > The patch passes the testsuite without regressions. Moreover, the > testsuite > passes without regressions if all test cases are run with -mstrict-X and > all > libraries (libgcc, avr-libc) are built with the new option turned on. Hi Johann, Sorry, I haven't been keeping up with the discussion on this PR. But if all test cases pass with running -mstrict-X and everything built with that option on, then why is this even an option? Is it because that it may not always reduce code size?... Thanks, Eric
Weddington, Eric a écrit: > > Georg-Johann Lay wrote: >> >> This is yet another attempt to fix PR46278 (fake X addressing). >> >> After the previous clean-ups it is just a small change. >> >> caller-saves.c tries to eliminate call-clobbered hard-regs >> allocated to pseudos around function calls and that leads to >> situations that reload is no more capable to perform all requested >> spills because of the very few AVR's address registers. >> >> Thus, the patch adds a new target option -mstrict-X so that the >> user can turn that option if he like to do so, and then >> -fcaller-save is disabled. >> >> The patch passes the testsuite without regressions. Moreover, the >> testsuite passes without regressions if all test cases are run with >> -mstrict-X and all libraries (libgcc, avr-libc) are built with the >> new option turned on. > > Hi Johann, > > But if all test cases pass with running -mstrict-X and everything > built with that option on, then why is this even an option? Is it > because that it may not always reduce code size?... As with any other optimization, I'd guess yes. But the major problem with this patch -- or any patch that addresses this PR -- is that taking away the X+const addressing might render the challenge of register allocation for AVR to a too big one so that reload cannot cope with it any more and ICEs with spill failure. Denis' analysis showed that the root cause of these spill fails is that there is just one register that can perform R+const addressing besides FP but that register (Z) is call-clobbered. Dunno if these problems were also triggered by caller-saves. Thus, if some real world code breaks with a spill fail, the option provides a fallback to cure reload's shortcomings. It's just a kludge, of course, but trying to fix reload is nothing I would do (I know my limits) and the prospects are thin that a target as unimportant as AVR will draw attention of some reload expert. But looking at the results, I think it's worth it. 30% less for an already optimized program is not bad -- or the other way round: 50% bloat without this option is horrible. And these test programs are not fancy; they are just accessing struct components via pstruct->component which is not uncommon code. > Thanks, Eric Johann
Weddington, Eric schrieb: > >> This is yet another attempt to fix PR46278 (fake X addressing). >> >> After the previous clean-ups it is just a small change. >> >> caller-saves.c tries to eliminate call-clobbered hard-regs allocated to >> pseudos around function calls and that leads to situations that reload is >> no more capable to perform all requested spills because of the very few >> AVR's address registers. >> >> Thus, the patch adds a new target option -mstrict-X so that the user can >> turn that option if he like to do so, and then -fcaller-save is disabled. >> >> The patch passes the testsuite without regressions. Moreover, the >> testsuite passes without regressions if all test cases are run with >> -mstrict-X and all libraries (libgcc, avr-libc) are built with the new >> option turned on. > > Hi Johann, > > Sorry, I haven't been keeping up with the discussion on this PR. > > But if all test cases pass with running -mstrict-X and everything built with > that option on, then why is this even an option? Is it because that it may > not always reduce code size?... An alternative would be to set -mstrict-X per default if -O or higher. Let's see what Denis thinks. Johann > > Thanks, Eric >
2011/10/14 Georg-Johann Lay <avr@gjlay.de>: > Weddington, Eric schrieb: >> >>> This is yet another attempt to fix PR46278 (fake X addressing). >>> >>> After the previous clean-ups it is just a small change. >>> >>> caller-saves.c tries to eliminate call-clobbered hard-regs allocated to >>> pseudos around function calls and that leads to situations that reload is >>> no more capable to perform all requested spills because of the very few >>> AVR's address registers. >>> >>> Thus, the patch adds a new target option -mstrict-X so that the user can >>> turn that option if he like to do so, and then -fcaller-save is disabled. >>> >>> The patch passes the testsuite without regressions. Moreover, the >>> testsuite passes without regressions if all test cases are run with >>> -mstrict-X and all libraries (libgcc, avr-libc) are built with the new >>> option turned on. >> >> Hi Johann, >> >> Sorry, I haven't been keeping up with the discussion on this PR. >> >> But if all test cases pass with running -mstrict-X and everything built with >> that option on, then why is this even an option? Is it because that it may >> not always reduce code size?... > > An alternative would be to set -mstrict-X per default if -O or higher. > Let's see what Denis thinks. I think that it's just a great results. I vote for committing this patch. About "to set -mstrict-X per default": if it's possible to print something like "Please use -mno-strict-X" instead of "Spill error failure...." then we can use -mstrict-X by default. i.e. how user can get a knowledge about a correlation between "Spill error..." and -mstrict-X Denis.
Denis Chertykov schrieb: > Georg-Johann Lay : >> Weddington, Eric schrieb: >>>> This is yet another attempt to fix PR46278 (fake X addressing). >>>> >>>> After the previous clean-ups it is just a small change. >>>> >>>> caller-saves.c tries to eliminate call-clobbered hard-regs allocated to >>>> pseudos around function calls and that leads to situations that reload is >>>> no more capable to perform all requested spills because of the very few >>>> AVR's address registers. >>>> >>>> Thus, the patch adds a new target option -mstrict-X so that the user can >>>> turn that option if he like to do so, and then -fcaller-save is disabled. >>>> >>>> The patch passes the testsuite without regressions. Moreover, the >>>> testsuite passes without regressions if all test cases are run with >>>> -mstrict-X and all libraries (libgcc, avr-libc) are built with the new >>>> option turned on. >>> Hi Johann, >>> >>> Sorry, I haven't been keeping up with the discussion on this PR. >>> >>> But if all test cases pass with running -mstrict-X and everything built with >>> that option on, then why is this even an option? Is it because that it may >>> not always reduce code size?... >> An alternative would be to set -mstrict-X per default if -O or higher. >> Let's see what Denis thinks. > > I think that it's just a great results. > I vote for committing this patch. So I will proceed and commit this patch if there are no objections or propositions to improve it. > About "to set -mstrict-X per default": if it's possible to print > something like "Please use -mno-strict-X" instead of "Spill error > failure...." then we can use -mstrict-X by default. I don't see a way to hook in spill_failure or find_reload_regs or have backend target specific customization of diagnostics. Except someone experienced in that field recommends, say, hooking into diagnostic printer somehow. If so, I'd prefer to do that in a separate patch. > i.e. how user can get a knowledge about a correlation between "Spill > error..." and -mstrict-X ...or between "spill error" and -fcaller-saves... Johann > Denis.
Index: config/avr/avr.opt =================================================================== --- config/avr/avr.opt (revision 179842) +++ config/avr/avr.opt (working copy) @@ -61,3 +61,7 @@ Relax branches mpmem-wrap-around Target Report Make the linker relaxation machine assume that a program counter wrap-around occurs. + +mstrict-X +Target Report Var(avr_strict_X) Init(0) +When accessing RAM, use X as imposed by the hardware, i.e. just use pre-decrement, post-increment and indirect addressing with the X register. Without this option, the compiler may assume that there is an addressing mode X+const similar to Y+const and Z+const and emit instructions to emulate such an addressing mode for X. Index: config/avr/avr.c =================================================================== --- config/avr/avr.c (revision 179843) +++ config/avr/avr.c (working copy) @@ -351,6 +351,17 @@ avr_option_override (void) { flag_delete_null_pointer_checks = 0; + /* caller-save.c looks for call-clobbered hard registers that are assigned + to pseudos that cross calls and tries so save-restore them around calls + in order to reduce the number of stack slots needed. + + This might leads to situations where reload is no more able to cope + with the challenge of AVR's very few address registers and fails to + perform the requested spills. */ + + if (avr_strict_X) + flag_caller_saves = 0; + /* Unwind tables currently require a frame pointer for correctness, see toplev.c:process_options(). */ @@ -1205,11 +1216,12 @@ avr_cannot_modify_jumps_p (void) /* Helper function for `avr_legitimate_address_p'. */ static inline bool -avr_reg_ok_for_addr_p (rtx reg, addr_space_t as ATTRIBUTE_UNUSED, int strict) +avr_reg_ok_for_addr_p (rtx reg, addr_space_t as ATTRIBUTE_UNUSED, + RTX_CODE outer_code, bool strict) { return (REG_P (reg) && (avr_regno_mode_code_ok_for_base_p (REGNO (reg), - QImode, MEM, UNKNOWN) + QImode, outer_code, UNKNOWN) || (!strict && REGNO (reg) >= FIRST_PSEUDO_REGISTER))); } @@ -1221,58 +1233,69 @@ avr_reg_ok_for_addr_p (rtx reg, addr_spa static bool avr_legitimate_address_p (enum machine_mode mode, rtx x, bool strict) { - reg_class_t r = NO_REGS; + bool ok = CONSTANT_ADDRESS_P (x); - if (REG_P (x) - && avr_reg_ok_for_addr_p (x, ADDR_SPACE_GENERIC, strict)) - { - r = POINTER_REGS; - } - else if (CONSTANT_ADDRESS_P (x)) - { - r = ALL_REGS; - } - else if (GET_CODE (x) == PLUS - && REG_P (XEXP (x, 0)) - && CONST_INT_P (XEXP (x, 1)) - && INTVAL (XEXP (x, 1)) >= 0) + switch (GET_CODE (x)) { - rtx reg = XEXP (x, 0); - bool fit = INTVAL (XEXP (x, 1)) <= MAX_LD_OFFSET (mode); - - if (fit) - { - if (! strict - || REGNO (reg) == REG_X - || REGNO (reg) == REG_Y - || REGNO (reg) == REG_Z) - { - r = BASE_POINTER_REGS; - } - - if (reg == frame_pointer_rtx - || reg == arg_pointer_rtx) - { - r = BASE_POINTER_REGS; - } - } - else if (frame_pointer_needed && reg == frame_pointer_rtx) + case REG: + ok = avr_reg_ok_for_addr_p (x, ADDR_SPACE_GENERIC, + MEM, strict); + + if (strict + && DImode == mode + && REG_X == REGNO (x)) { - r = POINTER_Y_REGS; + ok = false; } - } - else if ((GET_CODE (x) == PRE_DEC || GET_CODE (x) == POST_INC) - && REG_P (XEXP (x, 0)) - && avr_reg_ok_for_addr_p (XEXP (x, 0), ADDR_SPACE_GENERIC, strict)) - { - r = POINTER_REGS; - } + break; + + case POST_INC: + case PRE_DEC: + ok = avr_reg_ok_for_addr_p (XEXP (x, 0), ADDR_SPACE_GENERIC, + GET_CODE (x), strict); + break; + case PLUS: + { + rtx reg = XEXP (x, 0); + rtx op1 = XEXP (x, 1); + + if (REG_P (reg) + && CONST_INT_P (op1) + && INTVAL (op1) >= 0) + { + bool fit = IN_RANGE (INTVAL (op1), 0, MAX_LD_OFFSET (mode)); + + if (fit) + { + ok = (! strict + || avr_reg_ok_for_addr_p (reg, ADDR_SPACE_GENERIC, + PLUS, strict)); + + if (reg == frame_pointer_rtx + || reg == arg_pointer_rtx) + { + ok = true; + } + } + else if (frame_pointer_needed + && reg == frame_pointer_rtx) + { + ok = true; + } + } + } + break; + + default: + break; + } + if (avr_log.legitimate_address_p) { - avr_edump ("\n%?: ret=%d=%R, mode=%m strict=%d " + avr_edump ("\n%?: ret=%d, mode=%m strict=%d " "reload_completed=%d reload_in_progress=%d %s:", - !!r, r, mode, strict, reload_completed, reload_in_progress, + ok, mode, strict, reload_completed, reload_in_progress, reg_renumber ? "(reg_renumber)" : ""); if (GET_CODE (x) == PLUS @@ -1288,7 +1311,7 @@ avr_legitimate_address_p (enum machine_m avr_edump ("\n%r\n", x); } - return r == NO_REGS ? 0 : (int)r; + return ok; } /* Attempts to replace X with a valid @@ -7304,10 +7327,13 @@ avr_hard_regno_mode_ok (int regno, enum reg_class_t avr_mode_code_base_reg_class (enum machine_mode mode ATTRIBUTE_UNUSED, - RTX_CODE outer_code ATTRIBUTE_UNUSED, + RTX_CODE outer_code, RTX_CODE index_code ATTRIBUTE_UNUSED) { - return reload_completed ? BASE_POINTER_REGS : POINTER_REGS; + if (!avr_strict_X) + return reload_completed ? BASE_POINTER_REGS : POINTER_REGS; + + return PLUS == outer_code ? BASE_POINTER_REGS : POINTER_REGS; } @@ -7316,19 +7342,20 @@ avr_mode_code_base_reg_class (enum machi bool avr_regno_mode_code_ok_for_base_p (int regno, enum machine_mode mode ATTRIBUTE_UNUSED, - RTX_CODE outer_code ATTRIBUTE_UNUSED, + RTX_CODE outer_code, RTX_CODE index_code ATTRIBUTE_UNUSED) { + bool ok = false; + if (regno < FIRST_PSEUDO_REGISTER && (regno == REG_X || regno == REG_Y || regno == REG_Z || regno == ARG_POINTER_REGNUM)) { - return true; + ok = true; } - - if (reg_renumber) + else if (reg_renumber) { regno = reg_renumber[regno]; @@ -7337,11 +7364,18 @@ avr_regno_mode_code_ok_for_base_p (int r || regno == REG_Z || regno == ARG_POINTER_REGNUM) { - return true; + ok = true; } } - - return false; + + if (avr_strict_X + && PLUS == outer_code + && regno == REG_X) + { + ok = false; + } + + return ok; } Index: doc/invoke.texi =================================================================== --- doc/invoke.texi (revision 179765) +++ doc/invoke.texi (working copy) @@ -486,7 +486,7 @@ Objective-C and Objective-C++ Dialects}. @emph{AVR Options} @gccoptlist{-mmcu=@var{mcu} -mno-interrupts @gol --mcall-prologues -mtiny-stack -mint8} +-mcall-prologues -mtiny-stack -mint8 -mstrict-X} @emph{Blackfin Options} @gccoptlist{-mcpu=@var{cpu}@r{[}-@var{sirevision}@r{]} @gol @@ -10668,6 +10668,23 @@ char will be 1 byte, an int will be 1 by and long long will be 4 bytes. Please note that this option does not comply to the C standards, but it will provide you with smaller code size. + +@item -mstrict-X +@opindex mstrict-X +Use register @code{X} in the way imposed by the hardware. This means +that @code{X} can only be used in indirect, post-increment or +pre-decrement addressing. + +Without this option, the @code{X} register may be used in the same way +as @code{Y} or @code{Z} which then is emulated by additional +instructions. +For example, loading a value with @code{X+const} addressing with a +small @code{const @leq{} 63} to a register @var{Rn} will be printed as +@example +adiw r26, const +ld @var{Rn}, X +sbiw r26, const +@end example @end table @node Blackfin Options