Message ID | 000f01cdde97$d7a90a20$86fb1e60$@ye@arm.com |
---|---|
State | New |
Headers | show |
Ping > -----Original Message----- > From: Joey Ye > Sent: Thursday, December 20, 2012 17:53 > To: gcc-patches@gcc.gnu.org; Ramana Radhakrishnan > Cc: Joey Ye > Subject: [PATCH][ARM][thumb1] Reduce lr save for leaf function with non- > far jump > > Current GCC thumb1 has an annoying problem that always assuming far > branch. > So it forces to save lr, even when unnecessarily. The most extreme case > complained by partner is: > > // compiled with "-mthumb -mcpu=cortex-m0 -Os". > void foo() { for (;;); } > => > foo: > push {lr} // Crazy!!! > .L2: > b .L2 > > The reason is that thumb1 far jump is only resolved in the very late > pass > "shorten_branch". Prologue/epilogue pass doesn't actually know a branch > is > far or not from its attribute. It has to conservatively save/restore lr > whenever there is a branch. > > This patch tries to fix it with a simple heuristic, i.e., using function > size to decide if a far jump will likely be used. Function size > information > is meaningful in prologue/epilogue pass. The heuristic uses following > check > to decide if lr should be saved for far jump: > > function_size * 3 >= 2048 // yes: save lr for possible far jump. No: > don't > save lr for far jump > > The scheme has an issue: if some corner case does break above condition, > there is no chance to fix-up but to ICE. But the heuristic condition is > very > conservative. It is base on the worse normal condition that each > instruction > is associated with a 4 byte literal ( (2+4)/2=3, blooming size by 3 > times ). > I can't think of a real case to trigger the ICE. So I think it should > work. > > Other approaches than the heuristic scheme are too expensive to > implement > for this small size/performance issue. I did explored some but none of > them > persuaded myself. > > Tests passed: > * build libgcc, libstdc++, newlib, libm > * make check-gcc with cpu=cortex-m0 > * Small and extreme test cases > > ChangeLog: > > 2012-12-20 Joey Ye <joey.ye@arm.com> > > * config/arm/arm.c(thumb1_final_prescan_insn): > Assert lr save for real far jump. > (thumb_far_jump_used_p): Count instruction size and set > far_jump_used. > > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c > index 327ef22..ad79451 100644 > --- a/gcc/config/arm/arm.c > +++ b/gcc/config/arm/arm.c > @@ -21790,6 +21857,11 @@ thumb1_final_prescan_insn (rtx insn) > else if (conds != CONDS_NOCOND) > cfun->machine->thumb1_cc_insn = NULL_RTX; > } > + > + /* Check if unexpected far jump is used. */ > + if (cfun->machine->lr_save_eliminated > + && get_attr_far_jump (insn) == FAR_JUMP_YES) > + internal_error("Unexpected thumb1 far jump"); > } > > int > @@ -21815,6 +21887,8 @@ static int > thumb_far_jump_used_p (void) > { > rtx insn; > + bool far_jump = false; > + unsigned int func_size = 0; > > /* This test is only important for leaf functions. */ > /* assert (!leaf_function_p ()); */ > @@ -21870,6 +21944,26 @@ thumb_far_jump_used_p (void) > && get_attr_far_jump (insn) == FAR_JUMP_YES > ) > { > + far_jump = true; > + } > + func_size += get_attr_length (insn); > + } > + > + /* Attribute far_jump will always be true for thumb1 before > shorten_branch > + pass. So checking far_jump attribute before shorten_branch isn't > much > + useful. > + > + Following heuristic tries to estimate more accruately if a far > jump > may > + finally be used. The heuristic is very conservative as there is no > chance > + to roll-back the decision of not to use far jump. > + > + Thumb1 long branch offset is -2048 to 2046. The worst case is each > 2-byte > + insn is assiociated with a 4 byte constant pool. Using function > size > + 2048/3 as the threshold is conservative enough. */ > + if (far_jump) > + { > + if ((func_size * 3) >= 2048) > + { > /* Record the fact that we have decided that > the function does use far jumps. */ > cfun->machine->far_jump_used = 1; > > >
Ping^2 > -----Original Message----- > From: Joey Ye > Sent: Saturday, January 05, 2013 15:41 > To: Ramana Radhakrishnan > Cc: Joey Ye; gcc-patches@gcc.gnu.org > Subject: RE: [PATCH][ARM][thumb1] Reduce lr save for leaf function with > non-far jump > > Ping > > > -----Original Message----- > > From: Joey Ye > > Sent: Thursday, December 20, 2012 17:53 > > To: gcc-patches@gcc.gnu.org; Ramana Radhakrishnan > > Cc: Joey Ye > > Subject: [PATCH][ARM][thumb1] Reduce lr save for leaf function with > non- > > far jump > > > > Current GCC thumb1 has an annoying problem that always assuming far > > branch. > > So it forces to save lr, even when unnecessarily. The most extreme > case > > complained by partner is: > > > > // compiled with "-mthumb -mcpu=cortex-m0 -Os". > > void foo() { for (;;); } > > => > > foo: > > push {lr} // Crazy!!! > > .L2: > > b .L2 > > > > The reason is that thumb1 far jump is only resolved in the very late > > pass > > "shorten_branch". Prologue/epilogue pass doesn't actually know a > branch > > is > > far or not from its attribute. It has to conservatively save/restore > lr > > whenever there is a branch. > > > > This patch tries to fix it with a simple heuristic, i.e., using > function > > size to decide if a far jump will likely be used. Function size > > information > > is meaningful in prologue/epilogue pass. The heuristic uses following > > check > > to decide if lr should be saved for far jump: > > > > function_size * 3 >= 2048 // yes: save lr for possible far jump. No: > > don't > > save lr for far jump > > > > The scheme has an issue: if some corner case does break above > condition, > > there is no chance to fix-up but to ICE. But the heuristic condition > is > > very > > conservative. It is base on the worse normal condition that each > > instruction > > is associated with a 4 byte literal ( (2+4)/2=3, blooming size by 3 > > times ). > > I can't think of a real case to trigger the ICE. So I think it should > > work. > > > > Other approaches than the heuristic scheme are too expensive to > > implement > > for this small size/performance issue. I did explored some but none of > > them > > persuaded myself. > > > > Tests passed: > > * build libgcc, libstdc++, newlib, libm > > * make check-gcc with cpu=cortex-m0 > > * Small and extreme test cases > > > > ChangeLog: > > > > 2012-12-20 Joey Ye <joey.ye@arm.com> > > > > * config/arm/arm.c(thumb1_final_prescan_insn): > > Assert lr save for real far jump. > > (thumb_far_jump_used_p): Count instruction size and set > > far_jump_used. > > > > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c > > index 327ef22..ad79451 100644 > > --- a/gcc/config/arm/arm.c > > +++ b/gcc/config/arm/arm.c > > @@ -21790,6 +21857,11 @@ thumb1_final_prescan_insn (rtx insn) > > else if (conds != CONDS_NOCOND) > > cfun->machine->thumb1_cc_insn = NULL_RTX; > > } > > + > > + /* Check if unexpected far jump is used. */ > > + if (cfun->machine->lr_save_eliminated > > + && get_attr_far_jump (insn) == FAR_JUMP_YES) > > + internal_error("Unexpected thumb1 far jump"); > > } > > > > int > > @@ -21815,6 +21887,8 @@ static int > > thumb_far_jump_used_p (void) > > { > > rtx insn; > > + bool far_jump = false; > > + unsigned int func_size = 0; > > > > /* This test is only important for leaf functions. */ > > /* assert (!leaf_function_p ()); */ > > @@ -21870,6 +21944,26 @@ thumb_far_jump_used_p (void) > > && get_attr_far_jump (insn) == FAR_JUMP_YES > > ) > > { > > + far_jump = true; > > + } > > + func_size += get_attr_length (insn); > > + } > > + > > + /* Attribute far_jump will always be true for thumb1 before > > shorten_branch > > + pass. So checking far_jump attribute before shorten_branch isn't > > much > > + useful. > > + > > + Following heuristic tries to estimate more accruately if a far > > jump > > may > > + finally be used. The heuristic is very conservative as there is > no > > chance > > + to roll-back the decision of not to use far jump. > > + > > + Thumb1 long branch offset is -2048 to 2046. The worst case is > each > > 2-byte > > + insn is assiociated with a 4 byte constant pool. Using function > > size > > + 2048/3 as the threshold is conservative enough. */ > > + if (far_jump) > > + { > > + if ((func_size * 3) >= 2048) > > + { > > /* Record the fact that we have decided that > > the function does use far jumps. */ > > cfun->machine->far_jump_used = 1; > > > > > > > >
Ping ^ 2 > -----Original Message----- > From: Joey Ye > Sent: Saturday, January 05, 2013 3:41 PM > To: Ramana Radhakrishnan > Cc: Joey Ye; gcc-patches@gcc.gnu.org > Subject: RE: [PATCH][ARM][thumb1] Reduce lr save for leaf function with > non-far jump > > Ping > > > -----Original Message----- > > From: Joey Ye > > Sent: Thursday, December 20, 2012 17:53 > > To: gcc-patches@gcc.gnu.org; Ramana Radhakrishnan > > Cc: Joey Ye > > Subject: [PATCH][ARM][thumb1] Reduce lr save for leaf function with > > non- far jump > > > > Current GCC thumb1 has an annoying problem that always assuming far > > branch. > > So it forces to save lr, even when unnecessarily. The most extreme > > case complained by partner is: > > > > // compiled with "-mthumb -mcpu=cortex-m0 -Os". > > void foo() { for (;;); } > > => > > foo: > > push {lr} // Crazy!!! > > .L2: > > b .L2 > > > > The reason is that thumb1 far jump is only resolved in the very late > > pass "shorten_branch". Prologue/epilogue pass doesn't actually know a > > branch is far or not from its attribute. It has to conservatively > > save/restore lr whenever there is a branch. > > > > This patch tries to fix it with a simple heuristic, i.e., using > > function size to decide if a far jump will likely be used. Function > > size information is meaningful in prologue/epilogue pass. The > > heuristic uses following check to decide if lr should be saved for far > > jump: > > > > function_size * 3 >= 2048 // yes: save lr for possible far jump. No: > > don't > > save lr for far jump > > > > The scheme has an issue: if some corner case does break above > > condition, there is no chance to fix-up but to ICE. But the heuristic > > condition is very conservative. It is base on the worse normal > > condition that each instruction is associated with a 4 byte literal ( > > (2+4)/2=3, blooming size by 3 times ). > > I can't think of a real case to trigger the ICE. So I think it should > > work. > > > > Other approaches than the heuristic scheme are too expensive to > > implement for this small size/performance issue. I did explored some > > but none of them persuaded myself. > > > > Tests passed: > > * build libgcc, libstdc++, newlib, libm > > * make check-gcc with cpu=cortex-m0 > > * Small and extreme test cases > > > > ChangeLog: > > > > 2012-12-20 Joey Ye <joey.ye@arm.com> > > > > * config/arm/arm.c(thumb1_final_prescan_insn): > > Assert lr save for real far jump. > > (thumb_far_jump_used_p): Count instruction size and set > > far_jump_used. > > > > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index > > 327ef22..ad79451 100644 > > --- a/gcc/config/arm/arm.c > > +++ b/gcc/config/arm/arm.c > > @@ -21790,6 +21857,11 @@ thumb1_final_prescan_insn (rtx insn) > > else if (conds != CONDS_NOCOND) > > cfun->machine->thumb1_cc_insn = NULL_RTX; > > } > > + > > + /* Check if unexpected far jump is used. */ > > + if (cfun->machine->lr_save_eliminated > > + && get_attr_far_jump (insn) == FAR_JUMP_YES) > > + internal_error("Unexpected thumb1 far jump"); > > } > > > > int > > @@ -21815,6 +21887,8 @@ static int > > thumb_far_jump_used_p (void) > > { > > rtx insn; > > + bool far_jump = false; > > + unsigned int func_size = 0; > > > > /* This test is only important for leaf functions. */ > > /* assert (!leaf_function_p ()); */ @@ -21870,6 +21944,26 @@ > > thumb_far_jump_used_p (void) > > && get_attr_far_jump (insn) == FAR_JUMP_YES > > ) > > { > > + far_jump = true; > > + } > > + func_size += get_attr_length (insn); > > + } > > + > > + /* Attribute far_jump will always be true for thumb1 before > > shorten_branch > > + pass. So checking far_jump attribute before shorten_branch isn't > > much > > + useful. > > + > > + Following heuristic tries to estimate more accruately if a far > > jump > > may > > + finally be used. The heuristic is very conservative as there is > > + no > > chance > > + to roll-back the decision of not to use far jump. > > + > > + Thumb1 long branch offset is -2048 to 2046. The worst case is > > + each > > 2-byte > > + insn is assiociated with a 4 byte constant pool. Using function > > size > > + 2048/3 as the threshold is conservative enough. */ if > > + (far_jump) > > + { > > + if ((func_size * 3) >= 2048) > > + { > > /* Record the fact that we have decided that > > the function does use far jumps. */ > > cfun->machine->far_jump_used = 1; > > > > > > > >
On 12/20/12 09:53, Joey Ye wrote: > Current GCC thumb1 has an annoying problem that always assuming far branch. > So it forces to save lr, even when unnecessarily. The most extreme case > complained by partner is: > > // compiled with "-mthumb -mcpu=cortex-m0 -Os". > void foo() { for (;;); } > => > foo: > push {lr} // Crazy!!! > .L2: > b .L2 > > The reason is that thumb1 far jump is only resolved in the very late pass > "shorten_branch". Prologue/epilogue pass doesn't actually know a branch is > far or not from its attribute. It has to conservatively save/restore lr > whenever there is a branch. > > This patch tries to fix it with a simple heuristic, i.e., using function > size to decide if a far jump will likely be used. Function size information > is meaningful in prologue/epilogue pass. The heuristic uses following check > to decide if lr should be saved for far jump: > > function_size * 3 >= 2048 // yes: save lr for possible far jump. No: don't > save lr for far jump > > The scheme has an issue: if some corner case does break above condition, > there is no chance to fix-up but to ICE. But the heuristic condition is very > conservative. It is base on the worse normal condition that each instruction > is associated with a 4 byte literal ( (2+4)/2=3, blooming size by 3 times ). > I can't think of a real case to trigger the ICE. So I think it should work. > > Other approaches than the heuristic scheme are too expensive to implement > for this small size/performance issue. I did explored some but none of them > persuaded myself. > > Tests passed: > * build libgcc, libstdc++, newlib, libm > * make check-gcc with cpu=cortex-m0 > * Small and extreme test cases > > ChangeLog: > > 2012-12-20 Joey Ye <joey.ye@arm.com> > > * config/arm/arm.c(thumb1_final_prescan_insn): > Assert lr save for real far jump. > (thumb_far_jump_used_p): Count instruction size and set > far_jump_used. > > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c > index 327ef22..ad79451 100644 > --- a/gcc/config/arm/arm.c > +++ b/gcc/config/arm/arm.c > @@ -21790,6 +21857,11 @@ thumb1_final_prescan_insn (rtx insn) > else if (conds != CONDS_NOCOND) > cfun->machine->thumb1_cc_insn = NULL_RTX; > } > + > + /* Check if unexpected far jump is used. */ > + if (cfun->machine->lr_save_eliminated > + && get_attr_far_jump (insn) == FAR_JUMP_YES) > + internal_error("Unexpected thumb1 far jump"); > } > > int > @@ -21815,6 +21887,8 @@ static int > thumb_far_jump_used_p (void) > { > rtx insn; > + bool far_jump = false; > + unsigned int func_size = 0; > > /* This test is only important for leaf functions. */ > /* assert (!leaf_function_p ()); */ > @@ -21870,6 +21944,26 @@ thumb_far_jump_used_p (void) > && get_attr_far_jump (insn) == FAR_JUMP_YES > ) > { > + far_jump = true; > + } > + func_size += get_attr_length (insn); > + } > + > + /* Attribute far_jump will always be true for thumb1 before > shorten_branch > + pass. So checking far_jump attribute before shorten_branch isn't much > + useful. > + > + Following heuristic tries to estimate more accruately if a far jump > may > + finally be used. The heuristic is very conservative as there is no > chance > + to roll-back the decision of not to use far jump. > + > + Thumb1 long branch offset is -2048 to 2046. The worst case is each > 2-byte > + insn is assiociated with a 4 byte constant pool. Using function size > + 2048/3 as the threshold is conservative enough. */ > + if (far_jump) > + { > + if ((func_size * 3) >= 2048) > + { > /* Record the fact that we have decided that > the function does use far jumps. */ > cfun->machine->far_jump_used = 1; > > > Check for 80 character line length in the comments above - I can never tell if it is my mail client or yours. Otherwise ok if no regressions.. regards Ramana >
> -----Original Message----- > From: Ramana Radhakrishnan > Sent: Thursday, April 11, 2013 4:40 PM > To: Joey Ye > Cc: gcc-patches@gcc.gnu.org > Subject: Re: [PATCH][ARM][thumb1] Reduce lr save for leaf function with > non-far jump > > On 12/20/12 09:53, Joey Ye wrote: > > Current GCC thumb1 has an annoying problem that always assuming far > branch. > > So it forces to save lr, even when unnecessarily. The most extreme > > case complained by partner is: > > > > // compiled with "-mthumb -mcpu=cortex-m0 -Os". > > void foo() { for (;;); } > > => > > foo: > > push {lr} // Crazy!!! > > .L2: > > b .L2 > > > > The reason is that thumb1 far jump is only resolved in the very late > > pass "shorten_branch". Prologue/epilogue pass doesn't actually know a > > branch is far or not from its attribute. It has to conservatively > > save/restore lr whenever there is a branch. > > > > This patch tries to fix it with a simple heuristic, i.e., using > > function size to decide if a far jump will likely be used. Function > > size information is meaningful in prologue/epilogue pass. The > > heuristic uses following check to decide if lr should be saved for far jump: > > > > function_size * 3 >= 2048 // yes: save lr for possible far jump. No: > > don't save lr for far jump > > > > The scheme has an issue: if some corner case does break above > > condition, there is no chance to fix-up but to ICE. But the heuristic > > condition is very conservative. It is base on the worse normal > > condition that each instruction is associated with a 4 byte literal ( (2+4)/2=3, > blooming size by 3 times ). > > I can't think of a real case to trigger the ICE. So I think it should work. > > > > Other approaches than the heuristic scheme are too expensive to > > implement for this small size/performance issue. I did explored some > > but none of them persuaded myself. > > > > Tests passed: > > * build libgcc, libstdc++, newlib, libm > > * make check-gcc with cpu=cortex-m0 > > * Small and extreme test cases > > > > ChangeLog: > > > > 2012-12-20 Joey Ye <joey.ye@arm.com> > > > > * config/arm/arm.c(thumb1_final_prescan_insn): > > Assert lr save for real far jump. > > (thumb_far_jump_used_p): Count instruction size and set > > far_jump_used. > > > > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index > > 327ef22..ad79451 100644 > > --- a/gcc/config/arm/arm.c > > +++ b/gcc/config/arm/arm.c > > @@ -21790,6 +21857,11 @@ thumb1_final_prescan_insn (rtx insn) > > else if (conds != CONDS_NOCOND) > > cfun->machine->thumb1_cc_insn = NULL_RTX; > > } > > + > > + /* Check if unexpected far jump is used. */ > > + if (cfun->machine->lr_save_eliminated > > + && get_attr_far_jump (insn) == FAR_JUMP_YES) > > + internal_error("Unexpected thumb1 far jump"); > > } > > > > int > > @@ -21815,6 +21887,8 @@ static int > > thumb_far_jump_used_p (void) > > { > > rtx insn; > > + bool far_jump = false; > > + unsigned int func_size = 0; > > > > /* This test is only important for leaf functions. */ > > /* assert (!leaf_function_p ()); */ @@ -21870,6 +21944,26 @@ > > thumb_far_jump_used_p (void) > > && get_attr_far_jump (insn) == FAR_JUMP_YES > > ) > > { > > + far_jump = true; > > + } > > + func_size += get_attr_length (insn); > > + } > > + > > + /* Attribute far_jump will always be true for thumb1 before > > shorten_branch > > + pass. So checking far_jump attribute before shorten_branch isn't much > > + useful. > > + > > + Following heuristic tries to estimate more accurately if a far > > + jump > > may > > + finally be used. The heuristic is very conservative as there is > > + no > > chance > > + to roll-back the decision of not to use far jump. > > + > > + Thumb1 long branch offset is -2048 to 2046. The worst case is > > + each > > 2-byte > > + insn is associated with a 4 byte constant pool. Using function size > > + 2048/3 as the threshold is conservative enough. */ if > > + (far_jump) > > + { > > + if ((func_size * 3) >= 2048) > > + { > > /* Record the fact that we have decided that > > the function does use far jumps. */ > > cfun->machine->far_jump_used = 1; > > > > > > > > Check for 80 character line length in the comments above - I can never tell if > it is my mail client or yours. Further shorten the lines. > > Otherwise ok if no regressions.. Make check targeting Cortex-M0/M3 with qemu. No regression. Committed as 197956 - Joey
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c index 327ef22..ad79451 100644 --- a/gcc/config/arm/arm.c +++ b/gcc/config/arm/arm.c @@ -21790,6 +21857,11 @@ thumb1_final_prescan_insn (rtx insn) else if (conds != CONDS_NOCOND) cfun->machine->thumb1_cc_insn = NULL_RTX; } + + /* Check if unexpected far jump is used. */ + if (cfun->machine->lr_save_eliminated + && get_attr_far_jump (insn) == FAR_JUMP_YES) + internal_error("Unexpected thumb1 far jump"); } int @@ -21815,6 +21887,8 @@ static int thumb_far_jump_used_p (void) { rtx insn; + bool far_jump = false; + unsigned int func_size = 0; /* This test is only important for leaf functions. */ /* assert (!leaf_function_p ()); */ @@ -21870,6 +21944,26 @@ thumb_far_jump_used_p (void) && get_attr_far_jump (insn) == FAR_JUMP_YES ) { + far_jump = true; + } + func_size += get_attr_length (insn); + } + + /* Attribute far_jump will always be true for thumb1 before shorten_branch + pass. So checking far_jump attribute before shorten_branch isn't much + useful. + + Following heuristic tries to estimate more accruately if a far jump may + finally be used. The heuristic is very conservative as there is no chance + to roll-back the decision of not to use far jump. + + Thumb1 long branch offset is -2048 to 2046. The worst case is each 2-byte + insn is assiociated with a 4 byte constant pool. Using function size + 2048/3 as the threshold is conservative enough. */ + if (far_jump) + { + if ((func_size * 3) >= 2048)