Message ID | CAEwic4b04+W5H75cQFLGCO2FJuR73bPiH-F05HsZzamm2+RRzA@mail.gmail.com |
---|---|
State | New |
Headers | show |
On Mon, Oct 10, 2011 at 4:06 PM, Kai Tietz <ktietz70@googlemail.com> wrote: > 2011/10/10 Richard Guenther <richard.guenther@gmail.com>: >> On Mon, Oct 10, 2011 at 2:29 PM, Kai Tietz <ktietz70@googlemail.com> wrote: >>> Recent patch had a thinko on rhs of inner lhs check for TRUTH-IF. It >>> has to be checked that the LHS code is same as outer CODE, as >>> otherwise we wouldn't apply different TRUTH-IF only on inner RHS of >>> LHS, which is of course wrong. >>> >>> Index: gcc/gcc/fold-const.c >>> =================================================================== >>> --- gcc.orig/gcc/fold-const.c >>> +++ gcc/gcc/fold-const.c >>> @@ -111,14 +111,13 @@ static tree decode_field_reference (loca >>> tree *, tree *); >>> static int all_ones_mask_p (const_tree, int); >>> static tree sign_bit_p (tree, const_tree); >>> -static int simple_operand_p (const_tree); >>> +static int simple_operand_p (tree); >>> static tree range_binop (enum tree_code, tree, tree, int, tree, int); >>> static tree range_predecessor (tree); >>> static tree range_successor (tree); >>> static tree fold_range_test (location_t, enum tree_code, tree, tree, tree); >>> static tree fold_cond_expr_with_comparison (location_t, tree, tree, >>> tree, tree); >>> static tree unextend (tree, int, int, tree); >>> -static tree fold_truthop (location_t, enum tree_code, tree, tree, tree); >>> static tree optimize_minmax_comparison (location_t, enum tree_code, >>> tree, tree, tree); >>> static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *); >>> @@ -3500,7 +3499,7 @@ optimize_bit_field_compare (location_t l >>> return lhs; >>> } >>> >>> -/* Subroutine for fold_truthop: decode a field reference. >>> +/* Subroutine for fold_truth_andor_1: decode a field reference. >>> >>> If EXP is a comparison reference, we return the innermost reference. >>> >>> @@ -3668,17 +3667,43 @@ sign_bit_p (tree exp, const_tree val) >>> return NULL_TREE; >>> } >>> >>> -/* Subroutine for fold_truthop: determine if an operand is simple enough >>> +/* Subroutine for fold_truth_andor_1: determine if an operand is simple enough >>> to be evaluated unconditionally. */ >>> >>> static int >>> -simple_operand_p (const_tree exp) >>> +simple_operand_p (tree exp) >>> { >>> + enum tree_code code; >>> /* Strip any conversions that don't change the machine mode. */ >>> STRIP_NOPS (exp); >>> >>> + code = TREE_CODE (exp); >>> + >>> + /* Handle some trivials */ >>> + if (TREE_CODE_CLASS (code) == tcc_comparison) >>> + return (tree_could_trap_p (exp) >>> + && simple_operand_p (TREE_OPERAND (exp, 0)) >>> + && simple_operand_p (TREE_OPERAND (exp, 1))); >> >> And that's still wrong. >> >> Stopped reading here. >> >> Richard. > > Oh, there is a not missing. I didn't spot that, sorry. > > To the point why we need to handle comparisons within simple_operand_p. > > If we reject comparisons and logical not here, we won't have any > branching optimization anymore, as this the patch moves into > fold_truthandor. > > The result with rejecting in simple_operand_p compares and logic-not > provides for the following example: But you change what simple_operand_p accepts and thus change what fold_truthop accepts as operands to its simplifications. Richard. > extern int doo (void); > extern int doo2 (void); > > int foo (int a, int b, int c, int d, int e, int f) > { > if (a && b && c && d && e && f) > return doo (); > return doo2 (); > } > > we get the following gimple dump (-fdump-tree-gimple): > > foo (int a, int b, int c, int d, int e, int f) > { > int D.2752; > > if (a != 0) goto <D.2740>; else goto <D.2741>; > <D.2740>: > if (b != 0) goto <D.2742>; else goto <D.2743>; > <D.2742>: > if (c != 0) goto <D.2744>; else goto <D.2745>; > <D.2744>: > if (d != 0) goto <D.2746>; else goto <D.2747>; > <D.2746>: > if (e != 0) goto <D.2748>; else goto <D.2749>; > <D.2748>: > if (f != 0) goto <D.2750>; else goto <D.2751>; > <D.2750>: > D.2752 = doo (); > return D.2752; > <D.2751>: > <D.2749>: > <D.2747>: > <D.2745>: > <D.2743>: > <D.2741>: > D.2752 = doo2 (); > return D.2752; > } > > So this result is caused by the fact that a logical-and/or is always a > comparison. As we are rejecting comparisons/logical-not, we end up > with a sequence of TRUTH_(AND|OR)_IFs. So the hole loop in > fold_truth_andor for trying to pack simple and side-effect-less > operands in tupels of two is useless. > > For the new patch (attached with proper ! fix for compares) we get for > the same sample gimple output: > > foo (int a, int b, int c, int d, int e, int f) > { > _Bool D.2740; > _Bool D.2741; > _Bool D.2742; > _Bool D.2745; > _Bool D.2746; > _Bool D.2747; > _Bool D.2750; > _Bool D.2751; > _Bool D.2752; > int D.2755; > > D.2740 = a != 0; > D.2741 = b != 0; > D.2742 = D.2740 & D.2741; > if (D.2742 != 0) goto <D.2743>; else goto <D.2744>; > <D.2743>: > D.2745 = c != 0; > D.2746 = d != 0; > D.2747 = D.2745 & D.2746; > if (D.2747 != 0) goto <D.2748>; else goto <D.2749>; > <D.2748>: > D.2750 = e != 0; > D.2751 = f != 0; > D.2752 = D.2750 & D.2751; > if (D.2752 != 0) goto <D.2753>; else goto <D.2754>; > <D.2753>: > D.2755 = doo (); > return D.2755; > <D.2754>: > <D.2749>: > <D.2744>: > D.2755 = doo2 (); > return D.2755; > } > > which is the proper output for high branching cost, as it tries to > avoid branches and not to tend them. > > -- > Kai > > Index: gcc/gcc/fold-const.c > =================================================================== > --- gcc.orig/gcc/fold-const.c > +++ gcc/gcc/fold-const.c > @@ -111,14 +111,13 @@ static tree decode_field_reference (loca > tree *, tree *); > static int all_ones_mask_p (const_tree, int); > static tree sign_bit_p (tree, const_tree); > -static int simple_operand_p (const_tree); > +static int simple_operand_p (tree); > static tree range_binop (enum tree_code, tree, tree, int, tree, int); > static tree range_predecessor (tree); > static tree range_successor (tree); > static tree fold_range_test (location_t, enum tree_code, tree, tree, tree); > static tree fold_cond_expr_with_comparison (location_t, tree, tree, > tree, tree); > static tree unextend (tree, int, int, tree); > -static tree fold_truthop (location_t, enum tree_code, tree, tree, tree); > static tree optimize_minmax_comparison (location_t, enum tree_code, > tree, tree, tree); > static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *); > @@ -3500,7 +3499,7 @@ optimize_bit_field_compare (location_t l > return lhs; > } > > -/* Subroutine for fold_truthop: decode a field reference. > +/* Subroutine for fold_truth_andor_1: decode a field reference. > > If EXP is a comparison reference, we return the innermost reference. > > @@ -3668,17 +3667,43 @@ sign_bit_p (tree exp, const_tree val) > return NULL_TREE; > } > > -/* Subroutine for fold_truthop: determine if an operand is simple enough > +/* Subroutine for fold_truth_andor_1: determine if an operand is simple enough > to be evaluated unconditionally. */ > > static int > -simple_operand_p (const_tree exp) > +simple_operand_p (tree exp) > { > + enum tree_code code; > /* Strip any conversions that don't change the machine mode. */ > STRIP_NOPS (exp); > > + code = TREE_CODE (exp); > + > + /* Handle some trivials */ > + if (TREE_CODE_CLASS (code) == tcc_comparison) > + return (!tree_could_trap_p (exp) > + && simple_operand_p (TREE_OPERAND (exp, 0)) > + && simple_operand_p (TREE_OPERAND (exp, 1))); > + > + if (FLOAT_TYPE_P (TREE_TYPE (exp)) > + && tree_could_trap_p (exp)) > + return false; > + > + switch (code) > + { > + case SSA_NAME: > + return true; > + case TRUTH_NOT_EXPR: > + return simple_operand_p (TREE_OPERAND (exp, 0)); > + case BIT_NOT_EXPR: > + if (TREE_CODE (TREE_TYPE (exp)) != BOOLEAN_TYPE) > + return false; > + return simple_operand_p (TREE_OPERAND (exp, 0)); > + default: > + break; > + } > + > return (CONSTANT_CLASS_P (exp) > - || TREE_CODE (exp) == SSA_NAME > || (DECL_P (exp) > && ! TREE_ADDRESSABLE (exp) > && ! TREE_THIS_VOLATILE (exp) > @@ -4888,7 +4913,7 @@ fold_range_test (location_t loc, enum tr > return 0; > } > > -/* Subroutine for fold_truthop: C is an INTEGER_CST interpreted as a P > +/* Subroutine for fold_truth_andor_1: C is an INTEGER_CST interpreted as a P > bit value. Arrange things so the extra bits will be set to zero if and > only if C is signed-extended to its full width. If MASK is nonzero, > it is an INTEGER_CST that should be AND'ed with the extra bits. */ > @@ -5025,8 +5050,8 @@ merge_truthop_with_opposite_arm (locatio > We return the simplified tree or 0 if no optimization is possible. */ > > static tree > -fold_truthop (location_t loc, enum tree_code code, tree truth_type, > - tree lhs, tree rhs) > +fold_truth_andor_1 (location_t loc, enum tree_code code, tree truth_type, > + tree lhs, tree rhs) > { > /* If this is the "or" of two comparisons, we can do something if > the comparisons are NE_EXPR. If this is the "and", we can do something > @@ -5054,8 +5079,6 @@ fold_truthop (location_t loc, enum tree_ > tree lntype, rntype, result; > HOST_WIDE_INT first_bit, end_bit; > int volatilep; > - tree orig_lhs = lhs, orig_rhs = rhs; > - enum tree_code orig_code = code; > > /* Start by getting the comparison codes. Fail if anything is volatile. > If one operand is a BIT_AND_EXPR with the constant one, treat it as if > @@ -5119,8 +5142,7 @@ fold_truthop (location_t loc, enum tree_ > /* If the RHS can be evaluated unconditionally and its operands are > simple, it wins to evaluate the RHS unconditionally on machines > with expensive branches. In this case, this isn't a comparison > - that can be merged. Avoid doing this if the RHS is a floating-point > - comparison since those can trap. */ > + that can be merged. */ > > if (BRANCH_COST (optimize_function_for_speed_p (cfun), > false) >= 2 > @@ -5149,13 +5171,6 @@ fold_truthop (location_t loc, enum tree_ > build2 (BIT_IOR_EXPR, TREE_TYPE (ll_arg), > ll_arg, rl_arg), > build_int_cst (TREE_TYPE (ll_arg), 0)); > - > - if (LOGICAL_OP_NON_SHORT_CIRCUIT) > - { > - if (code != orig_code || lhs != orig_lhs || rhs != orig_rhs) > - return build2_loc (loc, code, truth_type, lhs, rhs); > - return NULL_TREE; > - } > } > > /* See if the comparisons can be merged. Then get all the parameters for > @@ -8380,13 +8395,46 @@ fold_truth_andor (location_t loc, enum t > lhs is another similar operation, try to merge its rhs with our > rhs. Then try to merge our lhs and rhs. */ > if (TREE_CODE (arg0) == code > - && 0 != (tem = fold_truthop (loc, code, type, > - TREE_OPERAND (arg0, 1), arg1))) > + && 0 != (tem = fold_truth_andor_1 (loc, code, type, > + TREE_OPERAND (arg0, 1), arg1))) > return fold_build2_loc (loc, code, type, TREE_OPERAND (arg0, 0), tem); > > - if ((tem = fold_truthop (loc, code, type, arg0, arg1)) != 0) > + if ((tem = fold_truth_andor_1 (loc, code, type, arg0, arg1)) != 0) > return tem; > > + if ((code == TRUTH_ANDIF_EXPR || code == TRUTH_ORIF_EXPR) > + && (BRANCH_COST (optimize_function_for_speed_p (cfun), > + false) >= 2) > + && !TREE_SIDE_EFFECTS (arg1) > + && LOGICAL_OP_NON_SHORT_CIRCUIT > + && simple_operand_p (arg1)) > + { > + enum tree_code ncode = (code == TRUTH_ANDIF_EXPR ? TRUTH_AND_EXPR > + : TRUTH_OR_EXPR); > + > + /* We don't want to pack more then two leafs to an non-IF > + If tree-code of left-hand operand isn't an AND/OR-IF code and not > + equal to CODE, then we don't want to add right-hand operand. > + If the inner right-hand side of left-hand operand has side-effects, > + or isn't simple, then we can't add to it, as otherwise we might > + destroy if-sequence. */ > + if (TREE_CODE (arg0) == code > + /* Needed for sequence points to handle trappings, and > + side-effects. */ > + && !TREE_SIDE_EFFECTS (TREE_OPERAND (arg0, 1)) > + && simple_operand_p (TREE_OPERAND (arg0, 1))) > + { > + tem = fold_build2_loc (loc, ncode, type, TREE_OPERAND (arg0, 1), > + arg1); > + return fold_build2_loc (loc, code, type, TREE_OPERAND (arg0, 0), > + tem); > + } > + /* Needed for sequence points to handle trappings, and side-effects. */ > + else if (!TREE_SIDE_EFFECTS (arg0) > + && simple_operand_p (arg0)) > + return fold_build2_loc (loc, ncode, type, arg0, arg1); > + } > + > return NULL_TREE; > } >
2011/10/10 Richard Guenther <richard.guenther@gmail.com>: > On Mon, Oct 10, 2011 at 4:06 PM, Kai Tietz <ktietz70@googlemail.com> wrote: >> 2011/10/10 Richard Guenther <richard.guenther@gmail.com>: >>> On Mon, Oct 10, 2011 at 2:29 PM, Kai Tietz <ktietz70@googlemail.com> wrote: >>>> Recent patch had a thinko on rhs of inner lhs check for TRUTH-IF. It >>>> has to be checked that the LHS code is same as outer CODE, as >>>> otherwise we wouldn't apply different TRUTH-IF only on inner RHS of >>>> LHS, which is of course wrong. >>>> >>>> Index: gcc/gcc/fold-const.c >>>> =================================================================== >>>> --- gcc.orig/gcc/fold-const.c >>>> +++ gcc/gcc/fold-const.c >>>> @@ -111,14 +111,13 @@ static tree decode_field_reference (loca >>>> tree *, tree *); >>>> static int all_ones_mask_p (const_tree, int); >>>> static tree sign_bit_p (tree, const_tree); >>>> -static int simple_operand_p (const_tree); >>>> +static int simple_operand_p (tree); >>>> static tree range_binop (enum tree_code, tree, tree, int, tree, int); >>>> static tree range_predecessor (tree); >>>> static tree range_successor (tree); >>>> static tree fold_range_test (location_t, enum tree_code, tree, tree, tree); >>>> static tree fold_cond_expr_with_comparison (location_t, tree, tree, >>>> tree, tree); >>>> static tree unextend (tree, int, int, tree); >>>> -static tree fold_truthop (location_t, enum tree_code, tree, tree, tree); >>>> static tree optimize_minmax_comparison (location_t, enum tree_code, >>>> tree, tree, tree); >>>> static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *); >>>> @@ -3500,7 +3499,7 @@ optimize_bit_field_compare (location_t l >>>> return lhs; >>>> } >>>> >>>> -/* Subroutine for fold_truthop: decode a field reference. >>>> +/* Subroutine for fold_truth_andor_1: decode a field reference. >>>> >>>> If EXP is a comparison reference, we return the innermost reference. >>>> >>>> @@ -3668,17 +3667,43 @@ sign_bit_p (tree exp, const_tree val) >>>> return NULL_TREE; >>>> } >>>> >>>> -/* Subroutine for fold_truthop: determine if an operand is simple enough >>>> +/* Subroutine for fold_truth_andor_1: determine if an operand is simple enough >>>> to be evaluated unconditionally. */ >>>> >>>> static int >>>> -simple_operand_p (const_tree exp) >>>> +simple_operand_p (tree exp) >>>> { >>>> + enum tree_code code; >>>> /* Strip any conversions that don't change the machine mode. */ >>>> STRIP_NOPS (exp); >>>> >>>> + code = TREE_CODE (exp); >>>> + >>>> + /* Handle some trivials */ >>>> + if (TREE_CODE_CLASS (code) == tcc_comparison) >>>> + return (tree_could_trap_p (exp) >>>> + && simple_operand_p (TREE_OPERAND (exp, 0)) >>>> + && simple_operand_p (TREE_OPERAND (exp, 1))); >>> >>> And that's still wrong. >>> >>> Stopped reading here. >>> >>> Richard. >> >> Oh, there is a not missing. I didn't spot that, sorry. >> >> To the point why we need to handle comparisons within simple_operand_p. >> >> If we reject comparisons and logical not here, we won't have any >> branching optimization anymore, as this the patch moves into >> fold_truthandor. >> >> The result with rejecting in simple_operand_p compares and logic-not >> provides for the following example: > > But you change what simple_operand_p accepts and thus change what > fold_truthop accepts as operands to its simplifications. > > Richard. Well, not really. I assume you mean fold_truth_andor_1 (aka fold_truthop). It checks for ... if (TREE_CODE_CLASS (lcode) != tcc_comparison || TREE_CODE_CLASS (rcode) != tcc_comparison) return 0; ... before checking for simple_operand_p. So there is actual no change. It might be of some interest here to add in a different patch support for logic-not, but well, this is would be material for a different patch. So, it won't operate on anything else then comparisons as before. Kai
On Mon, Oct 10, 2011 at 5:07 PM, Kai Tietz <ktietz70@googlemail.com> wrote: > 2011/10/10 Richard Guenther <richard.guenther@gmail.com>: >> On Mon, Oct 10, 2011 at 4:06 PM, Kai Tietz <ktietz70@googlemail.com> wrote: >>> 2011/10/10 Richard Guenther <richard.guenther@gmail.com>: >>>> On Mon, Oct 10, 2011 at 2:29 PM, Kai Tietz <ktietz70@googlemail.com> wrote: >>>>> Recent patch had a thinko on rhs of inner lhs check for TRUTH-IF. It >>>>> has to be checked that the LHS code is same as outer CODE, as >>>>> otherwise we wouldn't apply different TRUTH-IF only on inner RHS of >>>>> LHS, which is of course wrong. >>>>> >>>>> Index: gcc/gcc/fold-const.c >>>>> =================================================================== >>>>> --- gcc.orig/gcc/fold-const.c >>>>> +++ gcc/gcc/fold-const.c >>>>> @@ -111,14 +111,13 @@ static tree decode_field_reference (loca >>>>> tree *, tree *); >>>>> static int all_ones_mask_p (const_tree, int); >>>>> static tree sign_bit_p (tree, const_tree); >>>>> -static int simple_operand_p (const_tree); >>>>> +static int simple_operand_p (tree); >>>>> static tree range_binop (enum tree_code, tree, tree, int, tree, int); >>>>> static tree range_predecessor (tree); >>>>> static tree range_successor (tree); >>>>> static tree fold_range_test (location_t, enum tree_code, tree, tree, tree); >>>>> static tree fold_cond_expr_with_comparison (location_t, tree, tree, >>>>> tree, tree); >>>>> static tree unextend (tree, int, int, tree); >>>>> -static tree fold_truthop (location_t, enum tree_code, tree, tree, tree); >>>>> static tree optimize_minmax_comparison (location_t, enum tree_code, >>>>> tree, tree, tree); >>>>> static tree extract_muldiv (tree, tree, enum tree_code, tree, bool *); >>>>> @@ -3500,7 +3499,7 @@ optimize_bit_field_compare (location_t l >>>>> return lhs; >>>>> } >>>>> >>>>> -/* Subroutine for fold_truthop: decode a field reference. >>>>> +/* Subroutine for fold_truth_andor_1: decode a field reference. >>>>> >>>>> If EXP is a comparison reference, we return the innermost reference. >>>>> >>>>> @@ -3668,17 +3667,43 @@ sign_bit_p (tree exp, const_tree val) >>>>> return NULL_TREE; >>>>> } >>>>> >>>>> -/* Subroutine for fold_truthop: determine if an operand is simple enough >>>>> +/* Subroutine for fold_truth_andor_1: determine if an operand is simple enough >>>>> to be evaluated unconditionally. */ >>>>> >>>>> static int >>>>> -simple_operand_p (const_tree exp) >>>>> +simple_operand_p (tree exp) >>>>> { >>>>> + enum tree_code code; >>>>> /* Strip any conversions that don't change the machine mode. */ >>>>> STRIP_NOPS (exp); >>>>> >>>>> + code = TREE_CODE (exp); >>>>> + >>>>> + /* Handle some trivials */ >>>>> + if (TREE_CODE_CLASS (code) == tcc_comparison) >>>>> + return (tree_could_trap_p (exp) >>>>> + && simple_operand_p (TREE_OPERAND (exp, 0)) >>>>> + && simple_operand_p (TREE_OPERAND (exp, 1))); >>>> >>>> And that's still wrong. >>>> >>>> Stopped reading here. >>>> >>>> Richard. >>> >>> Oh, there is a not missing. I didn't spot that, sorry. >>> >>> To the point why we need to handle comparisons within simple_operand_p. >>> >>> If we reject comparisons and logical not here, we won't have any >>> branching optimization anymore, as this the patch moves into >>> fold_truthandor. >>> >>> The result with rejecting in simple_operand_p compares and logic-not >>> provides for the following example: >> >> But you change what simple_operand_p accepts and thus change what >> fold_truthop accepts as operands to its simplifications. >> >> Richard. > > Well, not really. I assume you mean fold_truth_andor_1 (aka fold_truthop). > > It checks for > ... > if (TREE_CODE_CLASS (lcode) != tcc_comparison > || TREE_CODE_CLASS (rcode) != tcc_comparison) > return 0; > ... > before checking for simple_operand_p. So there is actual no change. > It might be of some interest here to add in a different patch support > for logic-not, but well, this is would be material for a different > patch. > So, it won't operate on anything else then comparisons as before. Sure, because simple_operand_p is checked on the comparison operands, not the comparison itself. Richard. > Kai >
Index: gcc/gcc/fold-const.c =================================================================== --- gcc.orig/gcc/fold-const.c +++ gcc/gcc/fold-const.c @@ -111,14 +111,13 @@ static tree decode_field_reference (loca tree *, tree *); static int all_ones_mask_p (const_tree, int); static tree sign_bit_p (tree, const_tree); -static int simple_operand_p (const_tree); +static int simple_operand_p (tree); static tree range_binop (enum tree_code, tree, tree, int, tree, int); static tree range_predecessor (tree); static tree range_successor (tree); static tree fold_range_test (location_t, enum tree_code, tree, tree, tree); static tree fold_cond_expr_with_comparison (location_t, tree, tree, tree, tree); static tree unextend (tree, int, int, tree); -static tree fold_truthop (location_t, enum tree_code, tree, tree, tree); static tree optimize_minmax_comparison (location_t, enum tree_code, tree, tree, tree);