Message ID | CAEwic4bt2VW9kaTg+ZegPq-9TLO2iQ5GAsgqD=Wts+TVNTpNEA@mail.gmail.com |
---|---|
State | New |
Headers | show |
Hi, On Mon, 10 Oct 2011, Kai Tietz wrote: > To ensure that we use simple_operand_p in all cases, beside for > branching AND/OR chains, in same way as before, I added to this function > an additional argument, by which the looking into comparisons can be > activated. Better make it a separate function the first tests your new conditions, and then calls simple_operand_p. > +fold_truth_andor_1 (location_t loc, enum tree_code code, tree truth_type, > + tree lhs, tree rhs) > { > /* If this is the "or" of two comparisons, we can do something if > the comparisons are NE_EXPR. If this is the "and", we can do something > @@ -5149,13 +5176,6 @@ fold_truthop (location_t loc, enum tree_ > build2 (BIT_IOR_EXPR, TREE_TYPE (ll_arg), > ll_arg, rl_arg), > build_int_cst (TREE_TYPE (ll_arg), 0)); > - > - if (LOGICAL_OP_NON_SHORT_CIRCUIT) > - { > - if (code != orig_code || lhs != orig_lhs || rhs != orig_rhs) > - return build2_loc (loc, code, truth_type, lhs, rhs); > - return NULL_TREE; > - } Why do you remove this hunk? Shouldn't you instead move the hunk you added to fold_truth_andor() here. I realize this needs some TLC to fold_truth_andor_1, because right now it early-outs for non-comparisons, but it seems the better place. I.e. somehow move the below code into the above branch, with the associated diddling on fold_truth_andor_1 that it gets called. > + if ((code == TRUTH_ANDIF_EXPR || code == TRUTH_ORIF_EXPR) > + && (BRANCH_COST (optimize_function_for_speed_p (cfun), > + false) >= 2) > + && !TREE_SIDE_EFFECTS (arg1) > + && LOGICAL_OP_NON_SHORT_CIRCUIT > + && simple_operand_p (arg1, true)) > + { > + enum tree_code ncode = (code == TRUTH_ANDIF_EXPR ? TRUTH_AND_EXPR > + : TRUTH_OR_EXPR); > + > + /* We don't want to pack more then two leafs to an non-IF Missing continuation of the sentence? > + If tree-code of left-hand operand isn't an AND/OR-IF code and not > + equal to CODE, then we don't want to add right-hand operand. > + If the inner right-hand side of left-hand operand has side-effects, > + or isn't simple, then we can't add to it, as otherwise we might > + destroy if-sequence. */ > + if (TREE_CODE (arg0) == code > + /* Needed for sequence points to handle trappings, and > + side-effects. */ > + && !TREE_SIDE_EFFECTS (TREE_OPERAND (arg0, 1)) > + && simple_operand_p (TREE_OPERAND (arg0, 1), true)) > + { > + tem = fold_build2_loc (loc, ncode, type, TREE_OPERAND (arg0, 1), > + arg1); > + return fold_build2_loc (loc, code, type, TREE_OPERAND (arg0, 0), > + tem); > + } > + /* Needed for sequence points to handle trappings, and side-effects. */ > + else if (!TREE_SIDE_EFFECTS (arg0) > + && simple_operand_p (arg0, true)) > + return fold_build2_loc (loc, ncode, type, arg0, arg1); > + } > + Ciao, Michael.
2011/10/11 Michael Matz <matz@suse.de>: > Hi, > > On Mon, 10 Oct 2011, Kai Tietz wrote: > >> To ensure that we use simple_operand_p in all cases, beside for >> branching AND/OR chains, in same way as before, I added to this function >> an additional argument, by which the looking into comparisons can be >> activated. > > Better make it a separate function the first tests your new conditions, > and then calls simple_operand_p. Well, either I make it a new function and call it instead of simple_operand_p, or I need to modify old simple_operand_p. The logic of new and old version is incompatible and has not to be called on same tree, as otherwise new check is vain. >> +fold_truth_andor_1 (location_t loc, enum tree_code code, tree truth_type, >> + tree lhs, tree rhs) >> { >> /* If this is the "or" of two comparisons, we can do something if >> the comparisons are NE_EXPR. If this is the "and", we can do something >> @@ -5149,13 +5176,6 @@ fold_truthop (location_t loc, enum tree_ >> build2 (BIT_IOR_EXPR, TREE_TYPE (ll_arg), >> ll_arg, rl_arg), >> build_int_cst (TREE_TYPE (ll_arg), 0)); >> - >> - if (LOGICAL_OP_NON_SHORT_CIRCUIT) >> - { >> - if (code != orig_code || lhs != orig_lhs || rhs != orig_rhs) >> - return build2_loc (loc, code, truth_type, lhs, rhs); >> - return NULL_TREE; >> - } > > Why do you remove this hunk? Shouldn't you instead move the hunk you > added to fold_truth_andor() here. I realize this needs some TLC to > fold_truth_andor_1, because right now it early-outs for non-comparisons, > but it seems the better place. I.e. somehow move the below code into the > above branch, with the associated diddling on fold_truth_andor_1 that it > gets called. This hunk is removed, as it is vain to do here. Btw richi asked for it, and I agree that new TRUTH-AND/OR packing is better done at a single place in fold_truth_andor only. >> + if ((code == TRUTH_ANDIF_EXPR || code == TRUTH_ORIF_EXPR) >> + && (BRANCH_COST (optimize_function_for_speed_p (cfun), >> + false) >= 2) >> + && !TREE_SIDE_EFFECTS (arg1) >> + && LOGICAL_OP_NON_SHORT_CIRCUIT >> + && simple_operand_p (arg1, true)) >> + { >> + enum tree_code ncode = (code == TRUTH_ANDIF_EXPR ? TRUTH_AND_EXPR >> + : TRUTH_OR_EXPR); >> + >> + /* We don't want to pack more then two leafs to an non-IF > > Missing continuation of the sentence? Well, here is a colon missing. >> + If tree-code of left-hand operand isn't an AND/OR-IF code and not >> + equal to CODE, then we don't want to add right-hand operand. >> + If the inner right-hand side of left-hand operand has side-effects, >> + or isn't simple, then we can't add to it, as otherwise we might >> + destroy if-sequence. */ > > >> + if (TREE_CODE (arg0) == code >> + /* Needed for sequence points to handle trappings, and >> + side-effects. */ >> + && !TREE_SIDE_EFFECTS (TREE_OPERAND (arg0, 1)) >> + && simple_operand_p (TREE_OPERAND (arg0, 1), true)) >> + { >> + tem = fold_build2_loc (loc, ncode, type, TREE_OPERAND (arg0, 1), >> + arg1); >> + return fold_build2_loc (loc, code, type, TREE_OPERAND (arg0, 0), >> + tem); >> + } >> + /* Needed for sequence points to handle trappings, and side-effects. */ >> + else if (!TREE_SIDE_EFFECTS (arg0) >> + && simple_operand_p (arg0, true)) >> + return fold_build2_loc (loc, ncode, type, arg0, arg1); >> + } >> + > > > Ciao, > Michael. > Kai
Hi, On Tue, 11 Oct 2011, Kai Tietz wrote: > > Better make it a separate function the first tests your new > > conditions, and then calls simple_operand_p. > > Well, either I make it a new function and call it instead of > simple_operand_p, That's what I meant, yes. > >> @@ -5149,13 +5176,6 @@ fold_truthop (location_t loc, enum tree_ > >> build2 (BIT_IOR_EXPR, TREE_TYPE (ll_arg), > >> ll_arg, rl_arg), > >> build_int_cst (TREE_TYPE (ll_arg), 0)); > >> - > >> - if (LOGICAL_OP_NON_SHORT_CIRCUIT) > >> - { > >> - if (code != orig_code || lhs != orig_lhs || rhs != orig_rhs) > >> - return build2_loc (loc, code, truth_type, lhs, rhs); > >> - return NULL_TREE; > >> - } > > > > Why do you remove this hunk? Shouldn't you instead move the hunk you > > added to fold_truth_andor() here. I realize this needs some TLC to > > fold_truth_andor_1, because right now it early-outs for non-comparisons, > > but it seems the better place. I.e. somehow move the below code into the > > above branch, with the associated diddling on fold_truth_andor_1 that it > > gets called. > > This hunk is removed, as it is vain to do here. There is a fallthrough now, that wasn't there before. I don't know if it's harmless, I just wanted to mention it. > Btw richi asked for it, and I agree that new TRUTH-AND/OR packing is > better done at a single place in fold_truth_andor only. As fold_truthop is called twice by fold_truth_andor, the latter might indeed be the better place. Ciao, Michael.
2011/10/11 Michael Matz <matz@suse.de>: > Hi, > > On Tue, 11 Oct 2011, Kai Tietz wrote: > >> > Better make it a separate function the first tests your new >> > conditions, and then calls simple_operand_p. >> >> Well, either I make it a new function and call it instead of >> simple_operand_p, > > That's what I meant, yes. > >> >> @@ -5149,13 +5176,6 @@ fold_truthop (location_t loc, enum tree_ >> >> build2 (BIT_IOR_EXPR, TREE_TYPE (ll_arg), >> >> ll_arg, rl_arg), >> >> build_int_cst (TREE_TYPE (ll_arg), 0)); >> >> - >> >> - if (LOGICAL_OP_NON_SHORT_CIRCUIT) >> >> - { >> >> - if (code != orig_code || lhs != orig_lhs || rhs != orig_rhs) >> >> - return build2_loc (loc, code, truth_type, lhs, rhs); >> >> - return NULL_TREE; >> >> - } >> > >> > Why do you remove this hunk? Shouldn't you instead move the hunk you >> > added to fold_truth_andor() here. I realize this needs some TLC to >> > fold_truth_andor_1, because right now it early-outs for non-comparisons, >> > but it seems the better place. I.e. somehow move the below code into the >> > above branch, with the associated diddling on fold_truth_andor_1 that it >> > gets called. >> >> This hunk is removed, as it is vain to do here. > > There is a fallthrough now, that wasn't there before. I don't know if > it's harmless, I just wanted to mention it. It is. Before we changed expression here and recurse here with the non-IF AND/OR expression later. So there is no need to do this recursion. >> Btw richi asked for it, and I agree that new TRUTH-AND/OR packing is >> better done at a single place in fold_truth_andor only. > > As fold_truthop is called twice by fold_truth_andor, the latter might > indeed be the better place. > > > Ciao, > Michael. Kai
> -----Original Message----- > From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches- > owner@gcc.gnu.org] On Behalf Of Michael Matz > Sent: Tuesday, October 11, 2011 10:45 PM > To: Kai Tietz > Cc: Richard Guenther; Kai Tietz; gcc-patches@gcc.gnu.org; Richard > Henderson > Subject: Re: [patch tree-optimization]: Improve handling of > conditional-branches on targets with high branch costs > > Hi, > > On Tue, 11 Oct 2011, Kai Tietz wrote: > > > > Better make it a separate function the first tests your new > > > conditions, and then calls simple_operand_p. > > > > Well, either I make it a new function and call it instead of > > simple_operand_p, > > That's what I meant, yes. > > > >> @@ -5149,13 +5176,6 @@ fold_truthop (location_t loc, enum tree_ > > >> build2 (BIT_IOR_EXPR, TREE_TYPE (ll_arg), > > >> ll_arg, rl_arg), > > >> build_int_cst (TREE_TYPE (ll_arg), 0)); > > >> - > > >> - if (LOGICAL_OP_NON_SHORT_CIRCUIT) > > >> - { > > >> - if (code != orig_code || lhs != orig_lhs || rhs != > orig_rhs) > > >> - return build2_loc (loc, code, truth_type, lhs, rhs); > > >> - return NULL_TREE; > > >> - } > > > > > > Why do you remove this hunk? Shouldn't you instead move the hunk > you > > > added to fold_truth_andor() here. I realize this needs some TLC to > > > fold_truth_andor_1, because right now it early-outs for non- > comparisons, > > > but it seems the better place. I.e. somehow move the below code > into the > > > above branch, with the associated diddling on fold_truth_andor_1 > that it > > > gets called. > > > > This hunk is removed, as it is vain to do here. > > There is a fallthrough now, that wasn't there before. I don't know if > it's harmless, I just wanted to mention it. > Yes, this part introduced different behavior for this small case, int f(char *i, int j) { if (*i && j!=2) return *i; else return j; } Before the fix, we have D.4710 = *i; D.4711 = D.4710 != 0; D.4712 = j != 2; D.4713 = D.4711 & D.4712; if (D.4713 != 0) goto <D.4714>; else goto <D.4715>; <D.4714>: D.4710 = *i; D.4716 = (int) D.4710; return D.4716; <D.4715>: D.4716 = j; return D.4716; After the fix, we have D.4711 = *i; if (D.4711 != 0) goto <D.4712>; else goto <D.4710>; <D.4712>: if (j != 2) goto <D.4713>; else goto <D.4710>; <D.4713>: D.4711 = *i; D.4714 = (int) D.4711; return D.4714; <D.4710>: D.4714 = j; return D.4714; Does this meet the original expectation? I find the code below in function fold_truth_andor makes difference, /* Transform (A AND-IF B) into (A AND B), or (A OR-IF B) into (A OR B). For sequence point consistancy, we need to check for trapping, and side-effects. */ else if (code == icode && simple_operand_p_2 (arg0) && simple_operand_p_2 (arg1)) return fold_build2_loc (loc, ncode, type, arg0, arg1); for "*i != 0" simple_operand_p(*i) returns false. Originally this is not checked by the code. I don't see the patch originally wanted to cover this. Can this be explained reasonably? I'm not arguing this patch did worse thing, but only want to understand the rationale behind this and justify this patch doesn't hurt anything else. Actually on the contrary, I measured and this change accidently made some benchmarks significantly improved due to some other reasons. Thanks, -Jiangning > > Btw richi asked for it, and I agree that new TRUTH-AND/OR packing is > > better done at a single place in fold_truth_andor only. > > As fold_truthop is called twice by fold_truth_andor, the latter might > indeed be the better place. > > > Ciao, > Michael.
2011/10/26 Jiangning Liu <jiangning.liu@arm.com>: > > >> -----Original Message----- >> From: gcc-patches-owner@gcc.gnu.org [mailto:gcc-patches- >> owner@gcc.gnu.org] On Behalf Of Michael Matz >> Sent: Tuesday, October 11, 2011 10:45 PM >> To: Kai Tietz >> Cc: Richard Guenther; Kai Tietz; gcc-patches@gcc.gnu.org; Richard >> Henderson >> Subject: Re: [patch tree-optimization]: Improve handling of >> conditional-branches on targets with high branch costs >> >> Hi, >> >> On Tue, 11 Oct 2011, Kai Tietz wrote: >> >> > > Better make it a separate function the first tests your new >> > > conditions, and then calls simple_operand_p. >> > >> > Well, either I make it a new function and call it instead of >> > simple_operand_p, >> >> That's what I meant, yes. >> >> > >> @@ -5149,13 +5176,6 @@ fold_truthop (location_t loc, enum tree_ >> > >> build2 (BIT_IOR_EXPR, TREE_TYPE (ll_arg), >> > >> ll_arg, rl_arg), >> > >> build_int_cst (TREE_TYPE (ll_arg), 0)); >> > >> - >> > >> - if (LOGICAL_OP_NON_SHORT_CIRCUIT) >> > >> - { >> > >> - if (code != orig_code || lhs != orig_lhs || rhs != >> orig_rhs) >> > >> - return build2_loc (loc, code, truth_type, lhs, rhs); >> > >> - return NULL_TREE; >> > >> - } >> > > >> > > Why do you remove this hunk? Shouldn't you instead move the hunk >> you >> > > added to fold_truth_andor() here. I realize this needs some TLC to >> > > fold_truth_andor_1, because right now it early-outs for non- >> comparisons, >> > > but it seems the better place. I.e. somehow move the below code >> into the >> > > above branch, with the associated diddling on fold_truth_andor_1 >> that it >> > > gets called. >> > >> > This hunk is removed, as it is vain to do here. >> >> There is a fallthrough now, that wasn't there before. I don't know if >> it's harmless, I just wanted to mention it. >> > > Yes, this part introduced different behavior for this small case, > > int f(char *i, int j) > { > if (*i && j!=2) > return *i; > else > return j; > } > > Before the fix, we have > > D.4710 = *i; > D.4711 = D.4710 != 0; > D.4712 = j != 2; > D.4713 = D.4711 & D.4712; > if (D.4713 != 0) goto <D.4714>; else goto <D.4715>; > <D.4714>: > D.4710 = *i; > D.4716 = (int) D.4710; > return D.4716; > <D.4715>: > D.4716 = j; > return D.4716; > > After the fix, we have > > D.4711 = *i; > if (D.4711 != 0) goto <D.4712>; else goto <D.4710>; > <D.4712>: > if (j != 2) goto <D.4713>; else goto <D.4710>; > <D.4713>: > D.4711 = *i; > D.4714 = (int) D.4711; > return D.4714; > <D.4710>: > D.4714 = j; > return D.4714; > > Does this meet the original expectation? > > I find the code below in function fold_truth_andor makes difference, > > /* Transform (A AND-IF B) into (A AND B), or (A OR-IF B) into (A OR B). > For sequence point consistancy, we need to check for trapping, > and side-effects. */ > else if (code == icode && simple_operand_p_2 (arg0) > && simple_operand_p_2 (arg1)) > return fold_build2_loc (loc, ncode, type, arg0, arg1); > > for "*i != 0" simple_operand_p(*i) returns false. Originally this is not checked by the code. I don't see the patch originally wanted to cover this. Can this be explained reasonably? > > I'm not arguing this patch did worse thing, but only want to understand the rationale behind this and justify this patch doesn't hurt anything else. Actually on the contrary, I measured and this change accidently made some benchmarks significantly improved due to some other reasons. > > Thanks, > -Jiangning > >> > Btw richi asked for it, and I agree that new TRUTH-AND/OR packing is >> > better done at a single place in fold_truth_andor only. >> >> As fold_truthop is called twice by fold_truth_andor, the latter might >> indeed be the better place. >> >> >> Ciao, >> Michael. Well, as far as I understand C specification and sequence points, it makes indeed a difference to do branch-cost optimization on instructions might cause a signal traps. As signal could be handled in specifc handlers. You need to consider here that short-circuit optimization assumes associative law on operands. So right-hand side might be evaluaded before the left-hand-side. And this indeed breaks IMHO the sequences as mentioned in C specification about conditions. So a memory de-referencing (same as a floating-point trap) can never be treated as simple, as far as I understood this. So this patch - beside improving logic for branch-cost merging - fixes this latent issue. Regards, Kai
Hi, On Wed, 26 Oct 2011, Jiangning Liu wrote: > > > >> - > > > >> - if (LOGICAL_OP_NON_SHORT_CIRCUIT) > > > >> - { > > > >> - if (code != orig_code || lhs != orig_lhs || rhs != > > orig_rhs) > > > >> - return build2_loc (loc, code, truth_type, lhs, rhs); > > > >> - return NULL_TREE; > > > >> - } > > > > > > > > Why do you remove this hunk? Shouldn't you instead move the hunk > > you > > > > added to fold_truth_andor() here. I realize this needs some TLC to > > > > fold_truth_andor_1, because right now it early-outs for non- > > comparisons, > > > > but it seems the better place. I.e. somehow move the below code > > into the > > > > above branch, with the associated diddling on fold_truth_andor_1 > > that it > > > > gets called. > > > > > > This hunk is removed, as it is vain to do here. > > > > There is a fallthrough now, that wasn't there before. I don't know if > > it's harmless, I just wanted to mention it. > > > > Yes, this part introduced different behavior for this small case, > > D.4710 = *i; > D.4711 = D.4710 != 0; > D.4712 = j != 2; > D.4713 = D.4711 & D.4712; > if (D.4713 != 0) goto <D.4714>; else goto <D.4715>; > > After the fix, we have > > D.4711 = *i; > if (D.4711 != 0) goto <D.4712>; else goto <D.4710>; > <D.4712>: > if (j != 2) goto <D.4713>; else goto <D.4710>; So, we have one more jump than originally, when the point of the patch was to emit less on targets with high branch costs. So, as speculated, the hunk was not useless. (It's nice that it caused a benchmark to improve significantly, but that should be done via a proper analysis and patch, not as a side effect of a supposed non-change). Ciao, Michael.
Hi, On Wed, 26 Oct 2011, Kai Tietz wrote: > > Yes, this part introduced different behavior for this small case, > > > > int f(char *i, int j) > > { > > if (*i && j!=2) > > return *i; > > else > > return j; > > } > > Well, as far as I understand C specification and sequence points, it > makes indeed a difference to do branch-cost optimization on instructions > might cause a signal traps. As signal could be handled in specifc > handlers. You need to consider here that short-circuit optimization > assumes associative law on operands. So right-hand side might be > evaluaded before the left-hand-side. And this indeed breaks IMHO the > sequences as mentioned in C specification about conditions. I'm not sure in this specific case. If it had been: if (*a && *b) ... the you'd be right. The side-effect of dereferencing a must happen before *b, and hence transforming this into (*a!=0)&(*b!=0) would be wrong. But in the case at hand we only have one potentially problematic (aka detectable) side-effect (*i), so I don't think you could construct a program that detects the difference. It would entail detecting that "j!=2" was already evaluated, when (say) the segfault happens, but you can't as the variable is non-volatile. It also can't happen that the side-effect *i does not occur (which also would be a problem). So, I think there wasn't an actual problem before, and it had fewer jumps. Ciao, Michael.
2011/10/26 Michael Matz <matz@suse.de>: > Hi, > > On Wed, 26 Oct 2011, Kai Tietz wrote: > >> > Yes, this part introduced different behavior for this small case, >> > >> > int f(char *i, int j) >> > { >> > if (*i && j!=2) >> > return *i; >> > else >> > return j; >> > } >> >> Well, as far as I understand C specification and sequence points, it >> makes indeed a difference to do branch-cost optimization on instructions >> might cause a signal traps. As signal could be handled in specifc >> handlers. You need to consider here that short-circuit optimization >> assumes associative law on operands. So right-hand side might be >> evaluaded before the left-hand-side. And this indeed breaks IMHO the >> sequences as mentioned in C specification about conditions. > > I'm not sure in this specific case. If it had been: > > if (*a && *b) ... > > the you'd be right. The side-effect of dereferencing a must happen before > *b, and hence transforming this into (*a!=0)&(*b!=0) would be wrong. But > in the case at hand we only have one potentially problematic (aka > detectable) side-effect (*i), so I don't think you could construct a > program that detects the difference. It would entail detecting that > "j!=2" was already evaluated, when (say) the segfault happens, but you > can't as the variable is non-volatile. It also can't happen that the > side-effect *i does not occur (which also would be a problem). So, I > think there wasn't an actual problem before, and it had fewer jumps. > > > Ciao, > Michael. the case can be produced quite easily. Eg: extern int global = 0; .... if (*a && global) ... ... the issue is that by C-specification see here 5.1.2.2.3 about program-termination.The important point is here:: Evaluation of an expression may produce side effects. At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations shall be complete *** and no side effects of subsequent evaluations shall have taken place *** Annex C describes sequence-points as 1 The following are the sequence points described in 5.1.2.3: — The call to a function, after the arguments have been evaluated (6.5.2.2). — The end of the first operand of the following operators: logical AND && (6.5.13); logical OR || (6.5.14); conditional ? (6.5.15); comma , (6.5.17). ... Regards, Kai
I describe the sample more closely here extern int global = 0; extern int *a = NULL; void catchSigSegV( int sig ) { a = &global; } int foo (int j) { signal (SIGSEGV, catchSigSegV); if (*a && global) return 2; return 0; } I admit that in most cases such a scenario is not common. This sample seems to be a valid C program. So the conditions in IF shall be evaluted strict in order of sequence-points, as first argument might trap. It doesn't matter if second argument have side-effects or none. The point is the first and so it has to be separated from other conditions. Regards, Kai
Hi, On Wed, 26 Oct 2011, Kai Tietz wrote: > >> > int f(char *i, int j) > >> > { > >> > if (*i && j!=2) > >> > return *i; > >> > else > >> > return j; > >> > } > >> > > the case can be produced quite easily. > > extern int global = 0; > > .... > if (*a && global) ... See? You had to change the program to prove the transformation to be invalid. But my point was that the function we discuss about was exactly as above. It didn't have globals, or two loads, or a volatile, or anything else you can come up with where the transformation would be detectable (and hence invalid). I claim that you can't construct a program that can distinguish between this function: int f(char *i, int j) { if (*i && j!=2) return *i; else return j; } and this one: int f(char *i, int j) { if (*i & j!=2) return *i; else return j; } And if you can't construct such a program, then the initial transformation before the fold-const.c change _for this specific situation_ was correct. Ciao, Michael.
2011/10/26 Michael Matz <matz@suse.de>: > Hi, > > On Wed, 26 Oct 2011, Kai Tietz wrote: > >> >> > int f(char *i, int j) >> >> > { >> >> > if (*i && j!=2) >> >> > return *i; >> >> > else >> >> > return j; >> >> > } >> >> >> >> the case can be produced quite easily. >> >> extern int global = 0; >> >> .... >> if (*a && global) ... > > See? You had to change the program to prove the transformation to be > invalid. But my point was that the function we discuss about was exactly > as above. It didn't have globals, or two loads, or a volatile, or > anything else you can come up with where the transformation would be > detectable (and hence invalid). I claim that you can't construct a > program that can distinguish between this function: > > int f(char *i, int j) > { > if (*i && j!=2) > return *i; > else > return j; > } > > and this one: > > int f(char *i, int j) > { > if (*i & j!=2) > return *i; > else > return j; > } > > And if you can't construct such a program, then the initial transformation > before the fold-const.c change _for this specific situation_ was correct. > > > Ciao, > Michael. well, if such a function is used as inline and we know for it that j has value != 2, then we have here a big difference. For your first example, we still have to do the memory access to *i, even if we are not interested in result. See here point 4 of 5.1.2.3 of C-spec. For your second sample we don't need to do that, as the & itself is no sequence-point and so we can eliminate the *i access without breaking anything. Regards, Kai
Hi, On Wed, 26 Oct 2011, Kai Tietz wrote: > well, if such a function is used as inline and we know for it that j has > value != 2, then we have here a big difference. For your first example, > we still have to do the memory access to *i, even if we are not > interested in result. Actually we don't have to preserve memory accesses. The interesting case is if the pointer has an invalid value. The behaviour of the access then is undefined, and it's okay to not do it at all. In case the pointer does point to an object the access (if it's value isn't needed) also isn't necessary. IOW: in "void f(int *p) { int i = *p; }" we can always remove the pointer read. So, I still maintain that the transformation on the original example was okay. Ciao, Michael.
2011/10/26 Michael Matz <matz@suse.de>: > Hi, > > On Wed, 26 Oct 2011, Kai Tietz wrote: > >> well, if such a function is used as inline and we know for it that j has >> value != 2, then we have here a big difference. For your first example, >> we still have to do the memory access to *i, even if we are not >> interested in result. > > Actually we don't have to preserve memory accesses. The interesting case > is if the pointer has an invalid value. The behaviour of the access then > is undefined, and it's okay to not do it at all. In case the pointer does > point to an object the access (if it's value isn't needed) also isn't > necessary. IOW: in "void f(int *p) { int i = *p; }" we can always remove > the pointer read. So, I still maintain that the transformation on the > original example was okay. > > > Ciao, > Michael. So you would mean that memory dereferencing shouldn't be considered as side-effect at all? So we would happily cause by code 'if (i && *i != 0) an crash, as memory-dereference has for you no side-effect? In you special case it might be valid that, if first (and C-fold-const doesn't know if the side-effect condition is really the first, as it might be a sub-sequence of a condition) condition might trap or not, to combine it. But branching has to cover the general cases. If you find a way to determine that left-hand operand in fold_const's branching code is really the left-most condition in chain, then we can add such a special case, but I don't see here an easy way to determine it. Regards, Kai Hmm, not sure
Hi, On Wed, 26 Oct 2011, Kai Tietz wrote: > So you would mean that memory dereferencing shouldn't be considered as > side-effect at all? No. I haven't said this at all. Of course it's a side-effect, but we're allowed to remove existing ones (under some circumstances). We're not allowed to introduce new ones, which means that this ... > So we would happily cause by code 'if (i && *i != 0) an crash, as > memory-dereference has for you no side-effect? ... is not allowed. But in the original example the memread was on the left side, hence occured always, therefore we can move it to the right side, even though it might occur less often. > In you special case it might be valid that, if first (and C-fold-const > doesn't know if the side-effect condition is really the first, as it > might be a sub-sequence of a condition) condition might trap or not, to > combine it. But branching has to cover the general cases. If you find > a way to determine that left-hand operand in fold_const's branching code > is really the left-most condition in chain, then we can add such a > special case, but I don't see here an easy way to determine it. Hmm? I don't see why it's necessary to check if it's the left-most condition in a chain. If the left hand of '&&' is a memread it can always be moved to the right side (or the operator transformed into '&' which can have the same effect), of course only if the original rhs is free of side effects, but then independed if the && was part of a larger expression. The memread will possibly be done fewer times than originally, but as said, that's okay. Ciao, Michael.
> -----Original Message----- > From: Michael Matz [mailto:matz@suse.de] > Sent: Wednesday, October 26, 2011 11:47 PM > To: Kai Tietz > Cc: Jiangning Liu; Richard Guenther; Kai Tietz; gcc-patches@gcc.gnu.org; > Richard Henderson > Subject: Re: [patch tree-optimization]: Improve handling of > conditional-branches on targets with high branch costs > > Hi, > > On Wed, 26 Oct 2011, Kai Tietz wrote: > > > So you would mean that memory dereferencing shouldn't be considered > as > > side-effect at all? > > No. I haven't said this at all. Of course it's a side-effect, but > we're > allowed to remove existing ones (under some circumstances). We're not > allowed to introduce new ones, which means that this ... > > > So we would happily cause by code 'if (i && *i != 0) an crash, as > > memory-dereference has for you no side-effect? > > ... is not allowed. But in the original example the memread was on the > left side, hence occured always, therefore we can move it to the right > side, even though it might occur less often. > > > In you special case it might be valid that, if first (and C-fold- > const > > doesn't know if the side-effect condition is really the first, as it > > might be a sub-sequence of a condition) condition might trap or not, > to > > combine it. But branching has to cover the general cases. If you > find > > a way to determine that left-hand operand in fold_const's branching > code > > is really the left-most condition in chain, then we can add such a > > special case, but I don't see here an easy way to determine it. > > Hmm? I don't see why it's necessary to check if it's the left-most > condition in a chain. If the left hand of '&&' is a memread it can > always > be moved to the right side (or the operator transformed into '&' which > can > have the same effect), of course only if the original rhs is free of > side > effects, but then independed if the && was part of a larger expression. > The memread will possibly be done fewer times than originally, but as > said, that's okay. > Agree. The point is for the small case I gave RHS doesn't have side effect at all, so the optimization of changing it to AND doesn't violate C specification. We need to recover something for this case, although it did improve a lot for some particular benchmarks. Thanks, -Jiangning > > Ciao, > Michael.
2011/10/27 Jiangning Liu <jiangning.liu@arm.com>: > > >> -----Original Message----- >> From: Michael Matz [mailto:matz@suse.de] >> Sent: Wednesday, October 26, 2011 11:47 PM >> To: Kai Tietz >> Cc: Jiangning Liu; Richard Guenther; Kai Tietz; gcc-patches@gcc.gnu.org; >> Richard Henderson >> Subject: Re: [patch tree-optimization]: Improve handling of >> conditional-branches on targets with high branch costs >> >> Hi, >> >> On Wed, 26 Oct 2011, Kai Tietz wrote: >> >> > So you would mean that memory dereferencing shouldn't be considered >> as >> > side-effect at all? >> >> No. I haven't said this at all. Of course it's a side-effect, but >> we're >> allowed to remove existing ones (under some circumstances). We're not >> allowed to introduce new ones, which means that this ... >> >> > So we would happily cause by code 'if (i && *i != 0) an crash, as >> > memory-dereference has for you no side-effect? >> >> ... is not allowed. But in the original example the memread was on the >> left side, hence occured always, therefore we can move it to the right >> side, even though it might occur less often. >> >> > In you special case it might be valid that, if first (and C-fold- >> const >> > doesn't know if the side-effect condition is really the first, as it >> > might be a sub-sequence of a condition) condition might trap or not, >> to >> > combine it. But branching has to cover the general cases. If you >> find >> > a way to determine that left-hand operand in fold_const's branching >> code >> > is really the left-most condition in chain, then we can add such a >> > special case, but I don't see here an easy way to determine it. >> >> Hmm? I don't see why it's necessary to check if it's the left-most >> condition in a chain. If the left hand of '&&' is a memread it can >> always >> be moved to the right side (or the operator transformed into '&' which >> can >> have the same effect), of course only if the original rhs is free of >> side >> effects, but then independed if the && was part of a larger expression. >> The memread will possibly be done fewer times than originally, but as >> said, that's okay. >> > > Agree. The point is for the small case I gave RHS doesn't have side effect > at all, so the optimization of changing it to AND doesn't violate C > specification. We need to recover something for this case, although it did > improve a lot for some particular benchmarks. > > Thanks, > -Jiangning > >> >> Ciao, >> Michael. Hmm, so we can allow merging to AND, if the left-hand-side might trap but has no-side-effects and rhs has neither trapping nor side-effects. As for the case that left-hand side has side-effects but right-hand not, we aren't allowed to do this AND/OR merge. For example 'if ((f = foo ()) != 0 && f < 24)' we aren't allowed to make this transformation. This shouldn't be that hard. We need to provide to simple_operand_p_2 an additional argument for checking trapping or not. Regards, Kai
> -----Original Message----- > From: Kai Tietz [mailto:ktietz70@googlemail.com] > Sent: Thursday, October 27, 2011 5:36 PM > To: Jiangning Liu > Cc: Michael Matz; Richard Guenther; Kai Tietz; gcc-patches@gcc.gnu.org; > Richard Henderson > Subject: Re: [patch tree-optimization]: Improve handling of > conditional-branches on targets with high branch costs > > 2011/10/27 Jiangning Liu <jiangning.liu@arm.com>: > > > > > >> -----Original Message----- > >> From: Michael Matz [mailto:matz@suse.de] > >> Sent: Wednesday, October 26, 2011 11:47 PM > >> To: Kai Tietz > >> Cc: Jiangning Liu; Richard Guenther; Kai Tietz; gcc- > patches@gcc.gnu.org; > >> Richard Henderson > >> Subject: Re: [patch tree-optimization]: Improve handling of > >> conditional-branches on targets with high branch costs > >> > >> Hi, > >> > >> On Wed, 26 Oct 2011, Kai Tietz wrote: > >> > >> > So you would mean that memory dereferencing shouldn't be > considered > >> as > >> > side-effect at all? > >> > >> No. I haven't said this at all. Of course it's a side-effect, but > >> we're > >> allowed to remove existing ones (under some circumstances). We're > not > >> allowed to introduce new ones, which means that this ... > >> > >> > So we would happily cause by code 'if (i && *i != 0) an crash, as > >> > memory-dereference has for you no side-effect? > >> > >> ... is not allowed. But in the original example the memread was on > the > >> left side, hence occured always, therefore we can move it to the > right > >> side, even though it might occur less often. > >> > >> > In you special case it might be valid that, if first (and C-fold- > >> const > >> > doesn't know if the side-effect condition is really the first, as > it > >> > might be a sub-sequence of a condition) condition might trap or > not, > >> to > >> > combine it. But branching has to cover the general cases. If you > >> find > >> > a way to determine that left-hand operand in fold_const's > branching > >> code > >> > is really the left-most condition in chain, then we can add such a > >> > special case, but I don't see here an easy way to determine it. > >> > >> Hmm? I don't see why it's necessary to check if it's the left-most > >> condition in a chain. If the left hand of '&&' is a memread it can > >> always > >> be moved to the right side (or the operator transformed into '&' > which > >> can > >> have the same effect), of course only if the original rhs is free of > >> side > >> effects, but then independed if the && was part of a larger > expression. > >> The memread will possibly be done fewer times than originally, but > as > >> said, that's okay. > >> > > > > Agree. The point is for the small case I gave RHS doesn't have side > effect > > at all, so the optimization of changing it to AND doesn't violate C > > specification. We need to recover something for this case, although > it did > > improve a lot for some particular benchmarks. > > > > Thanks, > > -Jiangning > > > >> > >> Ciao, > >> Michael. > > Hmm, so we can allow merging to AND, if the left-hand-side might trap > but has no-side-effects and rhs has neither trapping nor side-effects. > As for the case that left-hand side has side-effects but right-hand > not, we aren't allowed to do this AND/OR merge. For example 'if ((f = > foo ()) != 0 && f < 24)' we aren't allowed to make this > transformation. > > This shouldn't be that hard. We need to provide to simple_operand_p_2 > an additional argument for checking trapping or not. > Would it be OK if I file a tracker in bugzilla against this? > Regards, > Kai
Index: gcc/gcc/fold-const.c =================================================================== --- gcc.orig/gcc/fold-const.c +++ gcc/gcc/fold-const.c @@ -111,14 +111,13 @@ static tree decode_field_reference (loca tree *, tree *); static int all_ones_mask_p (const_tree, int); static tree sign_bit_p (tree, const_tree); -static int simple_operand_p (const_tree); +static bool simple_operand_p (tree, bool); static tree range_binop (enum tree_code, tree, tree, int, tree, int); static tree range_predecessor (tree); static tree range_successor (tree); static tree fold_range_test (location_t, enum tree_code, tree, tree, tree); static tree fold_cond_expr_with_comparison (location_t, tree, tree, tree, tree); static tree unextend (tree, int, int, tree); -static tree fold_truthop (location_t, enum tree_code, tree, tree, tree); static tree optimize_minmax_comparison (location_t, enum tree_code, tree, tree, tree);