Message ID | 5649E333.4090904@arm.com |
---|---|
State | New |
Headers | show |
On 11/16/2015 03:07 PM, Kyrill Tkachov wrote: > I've explained in the comments in the patch what's going on but the > short version is trying to change the destination of a defining insn > that feeds into an extend insn is not valid if the defining insn > doesn't feed directly into the extend insn. In the ree pass the only > way this can happen is if there is an intermediate conditional move > that the pass tries to handle in a special way. An equivalent fix > would have been to check on that path (when copy_needed in > combine_reaching_defs is true) that the state->copies_list vector > (that contains the conditional move insns feeding into the extend > insn) is empty. I ran this through gdb, and I think I see what's going on. For reference, here's a comment from the source: /* Considering transformation of (set (reg1) (expression)) ... (set (reg2) (any_extend (reg1))) into (set (reg2) (any_extend (expression))) (set (reg1) (reg2)) ... */ I was thinking that another possible fix would be to also check !reg_used_between_p for reg1 to ensure it's not used. I'm thinking this might be a little clearer - what is your opinion? The added comment could lead to some confusion since it's placed in front of an existing if statement that also tests a different condition. Also, if we go with your fix, > + || !reg_overlap_mentioned_p (tmp_reg, SET_SRC (PATTERN (cand->insn)))) Shouldn't this really be !rtx_equal_p? Bernd
Hi Bernd, On 16/11/15 18:40, Bernd Schmidt wrote: > On 11/16/2015 03:07 PM, Kyrill Tkachov wrote: > >> I've explained in the comments in the patch what's going on but the >> short version is trying to change the destination of a defining insn >> that feeds into an extend insn is not valid if the defining insn >> doesn't feed directly into the extend insn. In the ree pass the only >> way this can happen is if there is an intermediate conditional move >> that the pass tries to handle in a special way. An equivalent fix >> would have been to check on that path (when copy_needed in >> combine_reaching_defs is true) that the state->copies_list vector >> (that contains the conditional move insns feeding into the extend >> insn) is empty. > > I ran this through gdb, and I think I see what's going on. For reference, here's a comment from the source: > > /* Considering transformation of > (set (reg1) (expression)) > ... > (set (reg2) (any_extend (reg1))) > > into > > (set (reg2) (any_extend (expression))) > (set (reg1) (reg2)) > ... */ > > I was thinking that another possible fix would be to also check !reg_used_between_p for reg1 to ensure it's not used. I'm thinking this might be a little clearer - what is your opinion? Yes, I had considered that as well. It should be equivalent. I didn't use !reg_used_between_p because I thought it'd be more expensive than checking reg_overlap_mentioned_p since we must iterate over a number of instructions and call reg_overlap_mentioned_p on each one. But I suppose this case is rare enough that it wouldn't make any measurable difference. Would you prefer to use !reg_used_between_p here? > > The added comment could lead to some confusion since it's placed in front of an existing if statement that also tests a different condition. Also, if we go with your fix, > >> + || !reg_overlap_mentioned_p (tmp_reg, SET_SRC (PATTERN (cand->insn)))) > > Shouldn't this really be !rtx_equal_p? > Maybe, will it behave the right way if the two regs have different modes or when subregs are involved? (will we even hit such a case in this path?) Thanks, Kyrill > > Bernd >
On 17/11/15 09:08, Kyrill Tkachov wrote: > Hi Bernd, > > On 16/11/15 18:40, Bernd Schmidt wrote: >> On 11/16/2015 03:07 PM, Kyrill Tkachov wrote: >> >>> I've explained in the comments in the patch what's going on but the >>> short version is trying to change the destination of a defining insn >>> that feeds into an extend insn is not valid if the defining insn >>> doesn't feed directly into the extend insn. In the ree pass the only >>> way this can happen is if there is an intermediate conditional move >>> that the pass tries to handle in a special way. An equivalent fix >>> would have been to check on that path (when copy_needed in >>> combine_reaching_defs is true) that the state->copies_list vector >>> (that contains the conditional move insns feeding into the extend >>> insn) is empty. >> >> I ran this through gdb, and I think I see what's going on. For reference, here's a comment from the source: >> >> /* Considering transformation of >> (set (reg1) (expression)) >> ... >> (set (reg2) (any_extend (reg1))) >> >> into >> >> (set (reg2) (any_extend (expression))) >> (set (reg1) (reg2)) >> ... */ >> >> I was thinking that another possible fix would be to also check !reg_used_between_p for reg1 to ensure it's not used. I'm thinking this might be a little clearer - what is your opinion? > > Yes, I had considered that as well. It should be equivalent. I didn't use !reg_used_between_p because I thought > it'd be more expensive than checking reg_overlap_mentioned_p since we must iterate over a number of instructions > and call reg_overlap_mentioned_p on each one. But I suppose this case is rare enough that it wouldn't make any > measurable difference. Actually, I tried it out. And while a check reg_used_between_p fixed the testcase, it caused code quality regressions on aarch64. Seems it's too aggressive in restricting ree. I'll have a closer look. Kyrill > > Would you prefer to use !reg_used_between_p here? > >> >> The added comment could lead to some confusion since it's placed in front of an existing if statement that also tests a different condition. Also, if we go with your fix, >> >>> + || !reg_overlap_mentioned_p (tmp_reg, SET_SRC (PATTERN (cand->insn)))) >> >> Shouldn't this really be !rtx_equal_p? >> > > Maybe, will it behave the right way if the two regs have different modes or when subregs are involved? > (will we even hit such a case in this path?) > > Thanks, > Kyrill >> >> Bernd >> >
On 17/11/15 09:49, Kyrill Tkachov wrote: > > On 17/11/15 09:08, Kyrill Tkachov wrote: >> Hi Bernd, >> >> On 16/11/15 18:40, Bernd Schmidt wrote: >>> On 11/16/2015 03:07 PM, Kyrill Tkachov wrote: >>> >>>> I've explained in the comments in the patch what's going on but the >>>> short version is trying to change the destination of a defining insn >>>> that feeds into an extend insn is not valid if the defining insn >>>> doesn't feed directly into the extend insn. In the ree pass the only >>>> way this can happen is if there is an intermediate conditional move >>>> that the pass tries to handle in a special way. An equivalent fix >>>> would have been to check on that path (when copy_needed in >>>> combine_reaching_defs is true) that the state->copies_list vector >>>> (that contains the conditional move insns feeding into the extend >>>> insn) is empty. >>> >>> I ran this through gdb, and I think I see what's going on. For reference, here's a comment from the source: >>> >>> /* Considering transformation of >>> (set (reg1) (expression)) >>> ... >>> (set (reg2) (any_extend (reg1))) >>> >>> into >>> >>> (set (reg2) (any_extend (expression))) >>> (set (reg1) (reg2)) >>> ... */ >>> >>> I was thinking that another possible fix would be to also check !reg_used_between_p for reg1 to ensure it's not used. I'm thinking this might be a little clearer - what is your opinion? >> >> Yes, I had considered that as well. It should be equivalent. I didn't use !reg_used_between_p because I thought >> it'd be more expensive than checking reg_overlap_mentioned_p since we must iterate over a number of instructions >> and call reg_overlap_mentioned_p on each one. But I suppose this case is rare enough that it wouldn't make any >> measurable difference. > > Actually, I tried it out. And while a check reg_used_between_p fixed the testcase, it caused code quality regressions > on aarch64. Seems it's too aggressive in restricting ree. > > I'll have a closer look. Ok, so the testcases that regress code-quality-wise on aarch64 look like this before ree: (insn 48 57 49 7 (set (reg:SI 7 x7) (zero_extend:SI (mem:QI (plus:DI (reg/f:DI 1 x1) (const_int 2 [0x2])))))) (insn 49 48 52 7 (set (reg/v:SI 2 x2) (reg:SI 7 x7))) (insn 52 49 53 7 (set (reg:DI 8 x8) (zero_extend:DI (reg:SI 7 x7)))) ree wants to transform this into: (insn 48 57 296 7 (set (reg:DI 8 x8) (zero_extend:DI (mem:QI (plus:DI (reg/f:DI 1 x1) (const_int 2 [0x2])))))) (insn 296 48 49 7 (set (reg:DI 7 x7) (reg:DI 8 x8))) (insn 49 296 53 7 (set (reg/v:SI 2 x2) (reg:SI 7 x7))) which is valid, but we reject that with the reg_used_between_p check because x7 is used in the intermediate insn 49. Note that no conditional move is present here. So, I think that the crucial element here is that the destination of the def_insn should feed directly into the extend, and that is what my original patch was testing for. So, I'd like to keep my original proposed patch as is, although I think I'll add a couple of testcases from the duplicate PRs to it for the testsuite. What do you think? Thanks, Kyrill > > Kyrill > >> >> Would you prefer to use !reg_used_between_p here? >> >>> >>> The added comment could lead to some confusion since it's placed in front of an existing if statement that also tests a different condition. Also, if we go with your fix, >>> >>>> + || !reg_overlap_mentioned_p (tmp_reg, SET_SRC (PATTERN (cand->insn)))) >>> >>> Shouldn't this really be !rtx_equal_p? >>> >> >> Maybe, will it behave the right way if the two regs have different modes or when subregs are involved? >> (will we even hit such a case in this path?) >> >> Thanks, >> Kyrill >>> >>> Bernd >>> >> >
On 11/17/2015 10:08 AM, Kyrill Tkachov wrote: > Yes, I had considered that as well. It should be equivalent. I didn't > use !reg_used_between_p because I thought > it'd be more expensive than checking reg_overlap_mentioned_p since we > must iterate over a number of instructions > and call reg_overlap_mentioned_p on each one. But I suppose this case is > rare enough that it wouldn't make any > measurable difference. > > Would you prefer to use !reg_used_between_p here? I would but apparently it doesn't work, so that's kind of neither here nor there. >> The added comment could lead to some confusion since it's placed in >> front of an existing if statement that also tests a different >> condition. Also, if we go with your fix, >> >>> + || !reg_overlap_mentioned_p (tmp_reg, SET_SRC (PATTERN >>> (cand->insn)))) >> >> Shouldn't this really be !rtx_equal_p? >> > > Maybe, will it behave the right way if the two regs have different modes > or when subregs are involved? It would return false, in which case we'll conservatively fail here. I think that's desirable? Bernd
On 17/11/15 12:10, Bernd Schmidt wrote: > On 11/17/2015 10:08 AM, Kyrill Tkachov wrote: >> Yes, I had considered that as well. It should be equivalent. I didn't >> use !reg_used_between_p because I thought >> it'd be more expensive than checking reg_overlap_mentioned_p since we >> must iterate over a number of instructions >> and call reg_overlap_mentioned_p on each one. But I suppose this case is >> rare enough that it wouldn't make any >> measurable difference. >> >> Would you prefer to use !reg_used_between_p here? > > I would but apparently it doesn't work, so that's kind of neither here nor there. > >>> The added comment could lead to some confusion since it's placed in >>> front of an existing if statement that also tests a different >>> condition. Also, if we go with your fix, >>> >>>> + || !reg_overlap_mentioned_p (tmp_reg, SET_SRC (PATTERN >>>> (cand->insn)))) >>> >>> Shouldn't this really be !rtx_equal_p? >>> >> >> Maybe, will it behave the right way if the two regs have different modes >> or when subregs are involved? > > It would return false, in which case we'll conservatively fail here. I think that's desirable? > Well, I think the statement we want to make is "return false from this function if the two expressions contain the same register number". if (!rtx_equal_p (..., ...)) return false; will only return false if the two expressions are the same REG with the same mode. if (!reg_overlap_mentioned_p (..., ...)) return false; should return false even if the modes are different or one is a subreg, which is what we want. I did not see any codegen regressions using reg_overlap_mentioned_p on aarch64, so I don't think it will restrict any legitimate cases. Thanks, Kyrill > > Bernd >
On 11/17/2015 02:03 PM, Kyrill Tkachov wrote: > + || !reg_overlap_mentioned_p (tmp_reg, SET_SRC (PATTERN (cand->insn)))) > return false; > Well, I think the statement we want to make is > "return false from this function if the two expressions contain the same > register number". I looked at it again and I think I'll need more time to really figure out what's going on in this pass. However, about the above... either I'm very confused, or the actual statement here is "return false _unless_ the two expressions contain the same register number". In the testcase, the regs in question are ax and bx, which is then rejected if the patch has been applied. (gdb) p tmp_reg (reg:SI 0 ax) (gdb) p cand->insn (insn 59 117 60 7 (set (reg:SI 4 si [orig:107 D.1813 ] [107]) (sign_extend:SI (reg:HI 3 bx [orig:99 D.1813 ] [99]))) And I think it would be better to strengthen that to "return false unless registers are identical". (From the above it's clear however that a plain rtx_equal_p wouldn't work since there's an extension in the operand). Also, I had another look at the testcase. It uses __builtin_printf and dg-output, which is at least unusual. Try to use the normal "if (cond) abort ()". Bernd
On 17/11/15 23:11, Bernd Schmidt wrote: > On 11/17/2015 02:03 PM, Kyrill Tkachov wrote: >> + || !reg_overlap_mentioned_p (tmp_reg, SET_SRC (PATTERN (cand->insn)))) >> return false; > >> Well, I think the statement we want to make is >> "return false from this function if the two expressions contain the same >> register number". > > I looked at it again and I think I'll need more time to really figure out what's going on in this pass. > > However, about the above... either I'm very confused, or the actual statement here is "return false _unless_ the two expressions contain the same register number". In the testcase, the regs in question are ax and bx, which is then > rejected if the patch has been applied. I'm sorry, it is my mistake in explaining. I meant to say: "return false from this function unless the two expressions contain the same register number" > > (gdb) p tmp_reg > (reg:SI 0 ax) > (gdb) p cand->insn > (insn 59 117 60 7 (set (reg:SI 4 si [orig:107 D.1813 ] [107]) > (sign_extend:SI (reg:HI 3 bx [orig:99 D.1813 ] [99]))) > > And I think it would be better to strengthen that to "return false unless registers are identical". (From the above it's clear however that a plain rtx_equal_p wouldn't work since there's an extension in the operand). So the three relevant instructions are: I1: (set (reg:HI 0 ax) (mem:HI (symbol_ref:DI ("f")))) I2: (set (reg:SI 3 bx) (if_then_else:SI (eq (reg:CCZ 17 flags) (const_int 0)) (reg:SI 0 ax) (reg:SI 3 bx))) I3: (set (reg:SI 4 si) (sign_extend:SI (reg:HI 3 bx))) I1 is def_insn, I3 is cand->insn. tmp_reg is 'ax'. What we want to do is reject this transformation because the destination of def_insn (aka I1), that is 'ax', is not the operand of the extend operation in cand->insn (aka I3). As you said, rtx_equal won't work on just SET_SRC (PATTERN (cand->insn)) because it's an extend operation. So reg_overlap_mentioned should be appropriate. Hope this helps. > > Also, I had another look at the testcase. It uses __builtin_printf and dg-output, which is at least unusual. Try to use the normal "if (cond) abort ()". > I did not try to remove the __builtin_printf because the use of 'f' there is needed to trigger this. There are at least two other dups of this PR: 68328 and 68185 the testcases for which have an "if (cond) abort ();" form. I'll use them instead. > > Bernd >
commit 33131f774705b936afc1a26c145e1214b388771f Author: Kyrylo Tkachov <kyrylo.tkachov@arm.com> Date: Fri Nov 13 15:01:47 2015 +0000 [RTL-ree] PR rtl-optimization/68194: Restrict copy instruction in presence of conditional moves diff --git a/gcc/ree.c b/gcc/ree.c index b8436f2..e91d164 100644 --- a/gcc/ree.c +++ b/gcc/ree.c @@ -814,7 +814,30 @@ combine_reaching_defs (ext_cand *cand, const_rtx set_pat, ext_state *state) rtx tmp_reg = gen_rtx_REG (GET_MODE (SET_DEST (PATTERN (cand->insn))), REGNO (SET_DEST (*dest_sub_rtx))); - if (reg_overlap_mentioned_p (tmp_reg, SET_DEST (PATTERN (cand->insn)))) + + /* When transforming: + (set (reg1) (expression)) + ... + (set (reg2) (any_extend (reg1))) + + into + + (set (reg2) (any_extend (expression))) + (set (reg1) (reg2)) + make sure that reg1 from the first set feeds directly into the extend. + This may not hold in a situation with an intermediate + conditional copy i.e. + I1: (set (reg3) (expression)) + I2: (set (reg1) (cond ? reg3 : reg1)) + I3: (set (reg2) (any_extend (reg1))) + + where I3 is cand, I1 is def_insn and I2 is a conditional copy. + We want to avoid transforming that into: + (set (reg2) (any_extend (expression))) + (set (reg1) (reg2)) + (set (reg1) (cond ? reg3 : reg1)). */ + if (reg_overlap_mentioned_p (tmp_reg, SET_DEST (PATTERN (cand->insn))) + || !reg_overlap_mentioned_p (tmp_reg, SET_SRC (PATTERN (cand->insn)))) return false; /* The destination register of the extension insn must not be diff --git a/gcc/testsuite/gcc.dg/pr68194.c b/gcc/testsuite/gcc.dg/pr68194.c new file mode 100644 index 0000000..b4855ea --- /dev/null +++ b/gcc/testsuite/gcc.dg/pr68194.c @@ -0,0 +1,35 @@ +/* { dg-do run } */ +/* { dg-options "-O2" } */ + +int a, c, d, e, g, h; +short f; + +short +fn1 (void) +{ + int j[2]; + for (; e; e++) + if (j[0]) + for (;;) + ; + if (!g) + return f; +} + +int +main (void) +{ + for (; a < 1; a++) + { + for (c = 0; c < 2; c++) + { + d && (f = 0); + h = fn1 (); + } + __builtin_printf ("%d\n", (char) f); + } + + return 0; +} + +/* { dg-output "0" } */