Message ID | 56949BBA.60006@linaro.org |
---|---|
State | New |
Hi Kugan,

On 12/01/16 06:22, kugan wrote:
>
> When promote_function_mode and promote_ssa_mode change the sign differently, the following is the cause of the problem in PR67714.
>
>   _8 = fn1D.5055 ();
>   f_13 = _8;
>
> The function returns -15, and in _8 it is sign extended. In the second statement, we say that the value is SUBREG_PROMOTED and that the promoted sign is unsigned, which is wrong. Had the value in _8 come from anywhere other than a function call, this would be correct (as it would be zero extended). The attached patch checks for that and uses the correct promoted sign in this case.
>
> The problem with this approach is that with the following piece of code we can still fail. But I don't think it will ever happen. Any thoughts?
>
>   _8 = fn1D.5055 ();
>   _9 = _8;
>   f_13 = _9;
>
> This is similar to PR65932, where a sign change in PROMOTE_MODE causes a problem for a parameter. But that needs a different fix.
>
> Regression tested on arm-none-linux-gnu with no new regressions. I also bootstrapped and regression tested (on an earlier version of trunk) for x86_64-none-linux-gnu with no new regressions. If this is OK, I will do a full testing again. Comments?
>
> Thanks,
> Kugan
>
> gcc/ChangeLog:
>
> 2016-01-12  Kugan Vivekanandarajah  <kuganv@linaro.org>
>
>     * expr.c (expand_expr_real_1): Fix promoted sign in SUBREG_PRMOTED
>     for SSA_NAME when rhs has a value returned from function call.
>

Thanks for working on this. I'll leave it to others to comment on this part, as I'm not overly familiar with that area, but...

> gcc/testsuite/ChangeLog:
>
> 2016-01-12  Kugan Vivekanandarajah  <kuganv@linaro.org>
>
>     * gcc.target/arm/pr67714.c: New test.

This test doesn't contain any arm-specific code, so can you please put it in gcc.c-torture/execute/

Thanks,
Kyrill
On Tue, Jan 12, 2016 at 12:04:22PM +0000, Kyrill Tkachov wrote:
> > 2016-01-12  Kugan Vivekanandarajah  <kuganv@linaro.org>
> >
> >     * expr.c (expand_expr_real_1): Fix promoted sign in SUBREG_PRMOTED

I'd like to just point at the ChangeLog typo - PRMOTED instead of PROMOTED.

Jakub
On 12/01/16 12:08, Jakub Jelinek wrote:
> On Tue, Jan 12, 2016 at 12:04:22PM +0000, Kyrill Tkachov wrote:
>>> 2016-01-12  Kugan Vivekanandarajah  <kuganv@linaro.org>
>>>
>>>     * expr.c (expand_expr_real_1): Fix promoted sign in SUBREG_PRMOTED
> I'd like to just point at the ChangeLog typo - PRMOTED instead of PROMOTED.

Since we're on the subject of the ChangeLog... It should also refer to the PR:

    PR middle-end/67714

Kyrill
On Mon, Jan 11, 2016 at 10:22 PM, kugan <kugan.vivekanandarajah@linaro.org> wrote:
> When promote_function_mode and promote_ssa_mode change the sign
> differently, the following is the cause of the problem in PR67714.

> This is similar to PR65932, where a sign change in PROMOTE_MODE causes a
> problem for a parameter. But that needs a different fix.

One of the proposed fixes for PR65932 was to make PROMOTE_MODE work the same as promote_function_mode. That should fix both bugs, and avoid some of the weirdness necessary to work around the problem where they disagree. However, that fix is stalled, because it causes potential performance regressions for some older ARM versions. I've been meaning to look at that again. It is probably a better fix than the one you are proposing here if we can make it work.

Jim
On 13/01/16 10:19, Jim Wilson wrote:
> On Mon, Jan 11, 2016 at 10:22 PM, kugan
> <kugan.vivekanandarajah@linaro.org> wrote:
>> When promote_function_mode and promote_ssa_mode change the sign
>> differently, the following is the cause of the problem in PR67714.
>
>> This is similar to PR65932, where a sign change in PROMOTE_MODE causes a
>> problem for a parameter. But that needs a different fix.
>
> One of the proposed fixes for PR65932 was to make PROMOTE_MODE work
> the same as promote_function_mode. That should fix both bugs, and
> avoid some of the weirdness necessary to work around the problem where
> they disagree. However, that fix is stalled, because it causes
> potential performance regressions for some older ARM versions. I've
> been meaning to look at that again. It is probably a better fix than
> the one you are proposing here if we can make it work.

Yes, making PROMOTE_MODE to work the same way as in promote_function_mode in arm will fix this. Can you please point me to the test cases that are regressing so that I can also start looking at them.

Thanks,
Kugan
On Tue, Jan 12, 2016 at 5:10 PM, Kugan <kugan.vivekanandarajah@linaro.org> wrote:
> Yes, making PROMOTE_MODE to work the same way as in
> promote_function_mode in arm will fix this. Can you please point me to
> the test cases that are regressing so that I can also start looking at them.

The info is in here
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65932
See the comments on gcc.target/arm/wmul-[123].c which no longer generate smulbb etc instructions, which are 16x16=32 expanding multiplies which are faster on some older parts that have them. They are present in armv5e and higher architecture versions.

Kyrylo looked at this in November, but the situation looks even worse now, as some of the redundant sign extends are gone even before the first rtl pass. That may make it harder to get the smulbb instructions back.

Jim
On Tue, Jan 12, 2016 at 5:40 PM, Jim Wilson <jim.wilson@linaro.org> wrote:
> The info is in here
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65932
> See the comments on gcc.target/arm/wmul-[123].c which no longer
> generate smulbb etc instructions, which are 16x16=32 expanding
> multiplies which are faster on some older parts that have them. They
> are present in armv5e and higher architecture versions.

I forgot about the ldrub/ldrsb problem. ldrub is preferred, particularly for older targets, e.g. thumb1, as it accepts more addressing modes than ldrsb. We can't get ldrub if PROMOTE_MODE doesn't do unsigned extension.

So we have a number of bad choices here:
1) We can remove sign-changing promotions from PROMOTE_MODE, and accept slower code for pre-thumb2 architectures.
2) We can add sign-changing promotions to promote_function_mode, and accept a minor ABI change.
3) We can add strange and probably fragile extensions to the middle end to work around the ARM back end problem.
4) We can just leave the ARM port broken and let it occasionally generate incorrect code.

Option 4 is the one that we've been using for the last 8 months or so. I think we should do either 1 or 2, though that depends on what the ARM maintainers are willing to accept.

Jim
Hi all,

On 13/01/16 01:40, Jim Wilson wrote:
> On Tue, Jan 12, 2016 at 5:10 PM, Kugan
> <kugan.vivekanandarajah@linaro.org> wrote:
>> Yes, making PROMOTE_MODE to work the same way as in
>> promote_function_mode in arm will fix this. Can you please point me to
>> the test cases that are regressing so that I can also start looking at them.
> The info is in here
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65932
> See the comments on gcc.target/arm/wmul-[123].c which no longer
> generate smulbb etc instructions, which are 16x16=32 expanding
> multiplies which are faster on some older parts that have them. They
> are present in armv5e and higher architecture versions.
>
> Kyrylo looked at this in November, but the situation looks even worse
> now, as some of the redundant sign extends are gone even before the
> first rtl pass. That may make it harder to get the smulbb
> instructions back.

I've done some more investigation on this yesterday. The situation is actually not so bad. The sign-extends are being removed from the arms of the multiply (turning it into an mla rather than smlabb) in the RTL cse pass. Something's going off in the costing logic in CSE because it generates a simpler RTX (a multiply-add versus a multiply-add-extend) but one which is more expensive. I think I've found the bug in there and I hope to post a separate thread on that soon.

With the fix to CSE and a couple of arm backend rtx costing issues I can get wmul-1.c and wmul-2.c to pass even with the change to promote_mode. For wmul-3.c I get the sequence:

    ldrsh   r1, [ip, #2]!
    ldrsh   r4, [r0, #2]!
    cmp     r5, ip
    mls     r2, r1, r1, r2
    mls     lr, r1, r4, lr

instead of the previous:

    ldrh    ip, [lr, #2]!
    ldrh    r1, [r0, #2]!
    cmp     r6, lr
    smulbb  r5, ip, ip
    smulbb  r1, ip, r1
    sub     r2, r2, r5
    sub     r4, r4, r1

That is, two instructions shorter (no subs) but using the more expensive mls instruction rather than smulbb. Is the new sequence preferable?

I hope to post the backend cost bug fixes soon and an RFC for the cse issue.

Thanks,
Kyrill

> Jim
>
On 13/01/16 06:59, Jim Wilson wrote:
> On Tue, Jan 12, 2016 at 5:40 PM, Jim Wilson <jim.wilson@linaro.org> wrote:
>> The info is in here
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65932
>> See the comments on gcc.target/arm/wmul-[123].c which no longer
>> generate smulbb etc instructions, which are 16x16=32 expanding
>> multiplies which are faster on some older parts that have them. They
>> are present in armv5e and higher architecture versions.
> I forgot about the ldrub/ldrsb problem. ldrub is preferred,
> particularly for older targets, e.g. thumb1, as it accepts more
> addressing modes than ldrsb.

Well, Thumb1 is used in armv6-m i.e. for currently used microcontrollers where codesize is important, so we should gather data on how much this actually hurts us there.

Building some popular embedded benchmarks for Thumb1 with various optimisation levels (including -Os!) with and without the change in option 1 is something we should do.

> We can't get ldrub if PROMOTE_MODE
> doesn't do unsigned extension.
>
> So we have a number of bad choices here
> 1) We can remove sign-changing promotions from PROMOTE_MODE, and
> accept slower code for pre-thumb2 architectures.

I personally think this is the most promising way forward, as long as we fully understand the effect on codegen and work towards minimising any negative effects.

Thanks,
Kyrill

> 2) We can add sign-changing promotions to promote_function_mode, and
> accept a minor ABI change.
> 3) We can add strange and probably fragile extensions to the middle
> end to work around the ARM back end problem.
> 4) We can just leave the ARM port broken and let it occasionally
> generate incorrect code.
>
> Option 4 is the one that we've been using for the last 8 months or so.
> I think we should do either 1 or 2, though that depends on what the
> ARM maintainers are willing to accept.
>
> Jim
>
On Wed, Jan 13, 2016 at 11:06 AM, Kyrill Tkachov <kyrylo.tkachov@foss.arm.com> wrote:
>
> On 13/01/16 06:59, Jim Wilson wrote:
>>
>> On Tue, Jan 12, 2016 at 5:40 PM, Jim Wilson <jim.wilson@linaro.org> wrote:
>>>
>>> The info is in here
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65932
>>> See the comments on gcc.target/arm/wmul-[123].c which no longer
>>> generate smulbb etc instructions, which are 16x16=32 expanding
>>> multiplies which are faster on some older parts that have them. They
>>> are present in armv5e and higher architecture versions.
>>
>> I forgot about the ldrub/ldrsb problem. ldrub is preferred,
>> particularly for older targets, e.g. thumb1, as it accepts more
>> addressing modes than ldrsb.
>
> Well, Thumb1 is used in armv6-m i.e. for currently used
> microcontrollers where codesize is important, so we should
> gather data on how much this actually hurts us there.
>
> Building some popular embedded benchmarks for Thumb1 with various
> optimisation levels (including -Os!) with and without the
> change in option 1 is something we should do.
>
>> We can't get ldrub if PROMOTE_MODE
>> doesn't do unsigned extension.
>>
>> So we have a number of bad choices here
>> 1) We can remove sign-changing promotions from PROMOTE_MODE, and
>> accept slower code for pre-thumb2 architectures.
>
> I personally think this is the most promising way forward,
> as long as we fully understand the effect on codegen and work
> towards minimising any negative effects.

I think the only way forward is to make PROMOTE_MODE and promote_function_mode agree. I hope nobody else is going to ack hacks here and there (esp. with "I think it cannot happen" comments...).

Oh, and document that they have to agree.

Richard.

> Thanks,
> Kyrill
>
>> 2) We can add sign-changing promotions to promote_function_mode, and
>> accept a minor ABI change.
>> 3) We can add strange and probably fragile extensions to the middle
>> end to work around the ARM back end problem.
>> 4) We can just leave the ARM port broken and let it occasionally
>> generate incorrect code.
>>
>> Option 4 is the one that we've been using for the last 8 months or so.
>> I think we should do either 1 or 2, though that depends on what the
>> ARM maintainers are willing to accept.
>>
>> Jim
>>
>
On 01/13/2016 03:06 AM, Kyrill Tkachov wrote:
>
> On 13/01/16 06:59, Jim Wilson wrote:
>> On Tue, Jan 12, 2016 at 5:40 PM, Jim Wilson <jim.wilson@linaro.org>
>> wrote:
>>> The info is in here
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65932
>>> See the comments on gcc.target/arm/wmul-[123].c which no longer
>>> generate smulbb etc instructions, which are 16x16=32 expanding
>>> multiplies which are faster on some older parts that have them. They
>>> are present in armv5e and higher architecture versions.
>> I forgot about the ldrub/ldrsb problem. ldrub is preferred,
>> particularly for older targets, e.g. thumb1, as it accepts more
>> addressing modes than ldrsb.
>
> Well, Thumb1 is used in armv6-m i.e. for currently used
> microcontrollers where codesize is important, so we should
> gather data on how much this actually hurts us there.
>
> Building some popular embedded benchmarks for Thumb1 with various
> optimisation levels (including -Os!) with and without the
> change in option 1 is something we should do.
>
>> We can't get ldrub if PROMOTE_MODE
>> doesn't do unsigned extension.
>>
>> So we have a number of bad choices here
>> 1) We can remove sign-changing promotions from PROMOTE_MODE, and
>> accept slower code for pre-thumb2 architectures.
>
> I personally think this is the most promising way forward,
> as long as we fully understand the effect on codegen and work
> towards minimising any negative effects.

I'd lean that way as well, but will defer to those more knowledgeable about the ARM world than myself if they want to go a different direction.

jeff
diff --git a/gcc/expr.c b/gcc/expr.c
index bd43dc4..6a2b3c0 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -9710,7 +9710,25 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
                                           gimple_call_fntype (g),
                                           2);
           else
-            pmode = promote_ssa_mode (ssa_name, &unsignedp);
+            {
+              tree rhs;
+              gimple *stmt;
+              /* When this is a SSA copy from a value returned from a
+                 function call, use the correct promoted sign for SUBREG_PROMOTED_P
+                 (PR67714).  */
+              if (code == SSA_NAME
+                  && is_gimple_assign (g)
+                  && (rhs = gimple_assign_rhs1 (g))
+                  && TREE_CODE (rhs) == SSA_NAME
+                  && (stmt = SSA_NAME_DEF_STMT (rhs))
+                  && gimple_code (stmt) == GIMPLE_CALL
+                  && !gimple_call_internal_p (stmt))
+                pmode = promote_function_mode (type, mode, &unsignedp,
+                                               gimple_call_fntype (stmt),
+                                               2);
+              else
+                pmode = promote_ssa_mode (ssa_name, &unsignedp);
+            }
           gcc_assert (GET_MODE (decl_rtl) == pmode);
 
           temp = gen_lowpart_SUBREG (mode, decl_rtl);
diff --git a/gcc/testsuite/gcc.target/arm/pr67714.c b/gcc/testsuite/gcc.target/arm/pr67714.c
index e69de29..355b559 100644
--- a/gcc/testsuite/gcc.target/arm/pr67714.c
+++ b/gcc/testsuite/gcc.target/arm/pr67714.c
@@ -0,0 +1,26 @@
+
+/* PR target/67714 */
+/* { dg-do run } */
+/* { dg-options "-O1" } */
+
+unsigned int b;
+int c;
+
+signed char fn1 ()
+{
+  signed char d;
+  for (int i = 0; i < 1; i++)
+    d = -15;
+  return d;
+}
+
+int main()
+{
+  for (c = 0; c < 1; c++)
+    b = 0;
+  char e = fn1();
+  signed char f = e ^ b;
+  __builtin_printf("checksum = %x\n", (int)f);
+  if ((int)f != 0xfffffff1)
+    __builtin_abort ();
+}