Message ID: 56D5F4AC.9030504@redhat.com
State: New
On Tue, 1 Mar 2016, Richard Henderson wrote:
> This is similar to Marc Glisse's patch in the OP, except that it ensures
> that the expression will fold back to a single condition.  I did include
> Richi's patch from #c6 to make it more likely to trigger asap.
>
> I'd appreciate feedback on the match.pd changes; it's my first time
> looking into this new(ish) mini-language.

The recalculate_side_effects call in the gimplify.c hunk is indented
oddly (spaces, no tabs).

Note that nothing guarantees we've canonicalized the cond-exprs to have
all-ones in the true path and zero in the false path, so the pattern will
miss three of four cases ... also I don't see why the pattern would need
to be restricted to those - shouldn't it simply work for

+  (bcode (vec_cond COMPARISON_CLASS_P@0 @2 @3)
+         (vec_cond COMPARISON_CLASS_P@1 @2 @3))

?  That would catch two of four cases ;)

+     (if (COMPARISON_CLASS_P (folded))
+      { build3 (VEC_COND_EXPR, TREE_TYPE (@2), folded, @2, @3); })))

this is wrong - you may not build GENERIC trees here.  Instead you want

     (if (COMPARISON_CLASS_P (folded))
      (vec_cond { folded; } @2 @3))))

though even that might be bogus if the operands of the comparison in
'folded' are not is_gimple_val.

Now, building GENERIC via combine_comparisons is somewhat gross in
match.pd; simple helpers that can combine two comparison codes would be
nicer, though the match.pd language doesn't allow easy use of that.
For example handling lt/eq and lt/le combinations requires sth like

(for cnd (cond vec_cond)
 (for cmp1 (lt eq)
      cmp2 (lt lt eq eq)
      cmp  (lt le le eq)
  (simplify
   (bit_ior (cnd (cmp1 @0 @1) @2 @3)
            (cnd (cmp2 @0 @1) @2 @3))
   (cnd (cmp @0 @1) @2 @3)))
 (for cmp1 (lt le)
      cmp2 (lt lt le le)
      cmp  (lt lt lt le)
  (simplify
   (bit_and (cnd (cmp1 @0 @1) @2 @3)
            (cnd (cmp2 @0 @1) @2 @3))
   (cnd (cmp @0 @1) @2 @3))))

and it will also code-generate the comparison outside of the
[VEC_]COND_EXPR stmt, thus

  _1 = @0 cmp @1;
  _2 = _1 ? @2 : @3;

that's something we can eventually fix though.
At least it handles both cases in pattern recognition (embedded GENERIC
or SSA name).

Note the above still misses the case where @2 and @3 are swapped, so
you'd have to duplicate things once more.  You also have to be careful
to omit comparison codes that will not yield a combination that can be
expressed in a single comparison.

I also noticed the comment I added:

  /* ??? This matches embedded conditions open-coded because genmatch
     would generate matching code for conditions in separate stmts only.
     The following is still important to merge then and else arm cases
     from if-conversion.  */

which is technically wrong - the matching code is just unfortunate
(eh... I'll see to fix that):

  switch (gimple_assign_rhs_code (def))
    {
    case LT_EXPR:
      {
        ... SSA name condition handling ...
        break;
      }
    case LT_EXPR:  <--- oops ;)
      {
        ... GENERIC condition handling ...
      }
      break;

Stupid GIMPLE IL ...

As a general remark I think handling of this simplification is better
done in the reassoc pass (see Jakub's comment #4) given || and &&
associate.  So I'd rather go down that route if possible.

Thanks,
Richard.
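[Editorial aside: the comparison-code combining that combine_comparisons performs can be modeled outside of GCC by treating each comparison code as a bitmask over the three possible orderings (lt, eq, gt): && intersects the masks, || unions them. The following Python sketch is purely illustrative, is not GCC code, and ignores floating-point unordered (NaN) results.]

```python
# Illustrative model only (not GCC code): each comparison code is a
# bitmask over the three orderings (lt, eq, gt) of its operands.
CODES = {"lt": 0b100, "eq": 0b010, "gt": 0b001,
         "le": 0b110, "ge": 0b011, "ne": 0b101}
NAMES = {v: k for k, v in CODES.items()}

def combine(op, c1, c2):
    """Combine two comparison codes under 'and'/'or'.  Returns the
    resulting code name, 'true'/'false' for degenerate results, or
    None when no single comparison expresses the combination."""
    m = CODES[c1] & CODES[c2] if op == "and" else CODES[c1] | CODES[c2]
    if m == 0b000:
        return "false"
    if m == 0b111:
        return "true"
    return NAMES.get(m)

# The lt/eq and lt/le rows from the match.pd iterators discussed above:
assert combine("or", "lt", "eq") == "le"
assert combine("and", "lt", "le") == "lt"
assert combine("and", "lt", "gt") == "false"   # contradiction folds away
```

This also shows why some code pairs must be omitted from the iterators: with unordered results added as a fourth bit, not every mask maps back to a comparison code.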
On Wed, 2 Mar 2016, Richard Biener wrote:
> On Tue, 1 Mar 2016, Richard Henderson wrote:
>
>> This is similar to Marc Glisse's patch in the OP, except that it
>> ensures that the expression will fold back to a single condition.  I
>> did include Richi's patch from #c6 to make it more likely to trigger
>> asap.
>>
>> I'd appreciate feedback on the match.pd changes; it's my first time
>> looking into this new(ish) mini-language.
>
> The recalculate_side_effects call in the gimplify.c hunk is indented
> oddly (spaces, no tabs).
>
> Note that nothing guarantees we've canonicalized the cond-exprs to have
> all-ones in the true path and zero in the false path, so the pattern
> will miss three of four cases ... also I don't see why the pattern
> would need to be restricted to those - shouldn't it simply work for
>
> +  (bcode (vec_cond COMPARISON_CLASS_P@0 @2 @3)
> +         (vec_cond COMPARISON_CLASS_P@1 @2 @3))
>
> ?  That would catch two of four cases ;)

I don't think that's always valid, you would need that @2 | @3 is the
same as @2 for the simplification to work for bit_ior (we could use
operand_equal_p and fold_binary to ask for that, not sure if there is a
cleaner way).

> +     (if (COMPARISON_CLASS_P (folded))
> +      { build3 (VEC_COND_EXPR, TREE_TYPE (@2), folded, @2, @3); })))
>
> this is wrong - you may not build GENERIC trees here.  Instead you want
>
>      (if (COMPARISON_CLASS_P (folded))
>       (vec_cond { folded; } @2 @3))))
>
> though even that might be bogus if the operands of the comparison in
> 'folded' are not is_gimple_val.
>
> Now, building GENERIC via combine_comparisons is somewhat gross in
> match.pd; simple helpers that can combine two comparison codes would be
> nicer, though the match.pd language doesn't allow easy use of that.
> For example handling lt/eq and lt/le combinations requires sth like
>
> (for cnd (cond vec_cond)
>  (for cmp1 (lt eq)
>       cmp2 (lt lt eq eq)
>       cmp  (lt le le eq)
>   (simplify
>    (bit_ior (cnd (cmp1 @0 @1) @2 @3)
>             (cnd (cmp2 @0 @1) @2 @3))
>    (cnd (cmp @0 @1) @2 @3)))
>  (for cmp1 (lt le)
>       cmp2 (lt lt le le)
>       cmp  (lt lt lt le)
>   (simplify
>    (bit_and (cnd (cmp1 @0 @1) @2 @3)
>             (cnd (cmp2 @0 @1) @2 @3))
>    (cnd (cmp @0 @1) @2 @3))))
>
> and it will also code-generate the comparison outside of the
> [VEC_]COND_EXPR stmt, thus
>
>   _1 = @0 cmp @1;
>   _2 = _1 ? @2 : @3;
>
> that's something we can eventually fix though.  At least it handles
> both cases in pattern recognition (embedded GENERIC or SSA name).
>
> Note the above still misses the case where @2 and @3 are swapped, so
> you'd have to duplicate things once more.  You also have to be careful
> to omit comparison codes that will not yield a combination that can be
> expressed in a single comparison.
>
> I also noticed the comment I added:
>
>   /* ??? This matches embedded conditions open-coded because genmatch
>      would generate matching code for conditions in separate stmts
>      only.  The following is still important to merge then and else
>      arm cases from if-conversion.  */
>
> which is technically wrong - the matching code is just unfortunate
> (eh... I'll see to fix that):
>
>   switch (gimple_assign_rhs_code (def))
>     {
>     case LT_EXPR:
>       {
>         ... SSA name condition handling ...
>         break;
>       }
>     case LT_EXPR:  <--- oops ;)
>       {
>         ... GENERIC condition handling ...
>       }
>       break;
>
> Stupid GIMPLE IL ...
>
> As a general remark I think handling of this simplification is better
> done in the reassoc pass (see Jakub's comment #4) given || and &&
> associate.  So I'd rather go down that route if possible.

Still, if there are some easy cases that can be handled early in
match.pd, that can't hurt...  (If there aren't, that's fine.)  Just like
x+y+x -> 2*x+y is for reassoc, but x+x -> 2*x can be done in match.pd.
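[Editorial aside: Marc's objection to the unrestricted bit_ior pattern can be checked on scalars. In the sketch below, which is not GCC code, each sel call models one lane of a VEC_COND_EXPR.]

```python
# Minimal lane-wise model of (c1 ? a : b) | (c2 ? a : b), not GCC code.
def sel(c, a, b):
    """One lane of a vec_cond: pick a when the condition holds."""
    return a if c else b

# With the canonical all-ones/zero arms the fold to (c1 or c2) is exact.
a, b = -1, 0   # -1 plays the role of the all-ones mask
for c1 in (False, True):
    for c2 in (False, True):
        assert sel(c1, a, b) | sel(c2, a, b) == sel(c1 or c2, a, b)

# With arbitrary arms it is not: here a | b != a, so folding would
# change the result for c1=True, c2=False.
a, b = 0b1010, 0b0110
assert sel(True, a, b) | sel(False, a, b) == 0b1110   # a | b
assert sel(True or False, a, b) == 0b1010             # a
```

This matches the condition Marc states: the unrestricted pattern is only sound for bit_ior when @2 | @3 equals @2 (and dually for bit_and when @2 & @3 equals @2).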
On Wed, 2 Mar 2016, Marc Glisse wrote:
> On Wed, 2 Mar 2016, Richard Biener wrote:
>
>> On Tue, 1 Mar 2016, Richard Henderson wrote:
>>
>>> This is similar to Marc Glisse's patch in the OP, except that it
>>> ensures that the expression will fold back to a single condition.
>>> I did include Richi's patch from #c6 to make it more likely to
>>> trigger asap.
>>>
>>> I'd appreciate feedback on the match.pd changes; it's my first time
>>> looking into this new(ish) mini-language.
>>
>> The recalculate_side_effects call in the gimplify.c hunk is indented
>> oddly (spaces, no tabs).
>>
>> Note that nothing guarantees we've canonicalized the cond-exprs to
>> have all-ones in the true path and zero in the false path, so the
>> pattern will miss three of four cases ... also I don't see why the
>> pattern would need to be restricted to those - shouldn't it simply
>> work for
>>
>> +  (bcode (vec_cond COMPARISON_CLASS_P@0 @2 @3)
>> +         (vec_cond COMPARISON_CLASS_P@1 @2 @3))
>>
>> ?  That would catch two of four cases ;)
>
> I don't think that's always valid, you would need that @2 | @3 is the
> same as @2 for the simplification to work for bit_ior (we could use
> operand_equal_p and fold_binary to ask for that, not sure if there is
> a cleaner way).

Ah, indeed ... :/

>> +     (if (COMPARISON_CLASS_P (folded))
>> +      { build3 (VEC_COND_EXPR, TREE_TYPE (@2), folded, @2, @3); })))
>>
>> this is wrong - you may not build GENERIC trees here.  Instead you
>> want
>>
>>      (if (COMPARISON_CLASS_P (folded))
>>       (vec_cond { folded; } @2 @3))))
>>
>> though even that might be bogus if the operands of the comparison in
>> 'folded' are not is_gimple_val.
>>
>> Now, building GENERIC via combine_comparisons is somewhat gross in
>> match.pd; simple helpers that can combine two comparison codes would
>> be nicer, though the match.pd language doesn't allow easy use of that.
>> For example handling lt/eq and lt/le combinations requires sth like
>>
>> (for cnd (cond vec_cond)
>>  (for cmp1 (lt eq)
>>       cmp2 (lt lt eq eq)
>>       cmp  (lt le le eq)
>>   (simplify
>>    (bit_ior (cnd (cmp1 @0 @1) @2 @3)
>>             (cnd (cmp2 @0 @1) @2 @3))
>>    (cnd (cmp @0 @1) @2 @3)))
>>  (for cmp1 (lt le)
>>       cmp2 (lt lt le le)
>>       cmp  (lt lt lt le)
>>   (simplify
>>    (bit_and (cnd (cmp1 @0 @1) @2 @3)
>>             (cnd (cmp2 @0 @1) @2 @3))
>>    (cnd (cmp @0 @1) @2 @3))))
>>
>> and it will also code-generate the comparison outside of the
>> [VEC_]COND_EXPR stmt, thus
>>
>>   _1 = @0 cmp @1;
>>   _2 = _1 ? @2 : @3;
>>
>> that's something we can eventually fix though.  At least it handles
>> both cases in pattern recognition (embedded GENERIC or SSA name).
>>
>> Note the above still misses the case where @2 and @3 are swapped, so
>> you'd have to duplicate things once more.  You also have to be
>> careful to omit comparison codes that will not yield a combination
>> that can be expressed in a single comparison.
>>
>> I also noticed the comment I added:
>>
>>   /* ??? This matches embedded conditions open-coded because genmatch
>>      would generate matching code for conditions in separate stmts
>>      only.  The following is still important to merge then and else
>>      arm cases from if-conversion.  */
>>
>> which is technically wrong - the matching code is just unfortunate
>> (eh... I'll see to fix that):
>>
>>   switch (gimple_assign_rhs_code (def))
>>     {
>>     case LT_EXPR:
>>       {
>>         ... SSA name condition handling ...
>>         break;
>>       }
>>     case LT_EXPR:  <--- oops ;)
>>       {
>>         ... GENERIC condition handling ...
>>       }
>>       break;
>>
>> Stupid GIMPLE IL ...
>>
>> As a general remark I think handling of this simplification is better
>> done in the reassoc pass (see Jakub's comment #4) given || and &&
>> associate.  So I'd rather go down that route if possible.
>
> Still, if there are some easy cases that can be handled early in
> match.pd, that can't hurt...
> (If there aren't, that's fine.)  Just like x+y+x -> 2*x+y is for
> reassoc, but x+x -> 2*x can be done in match.pd.

True, I was merely looking at the ugliness and incompleteness of the
pattern and wondered if it's really worth doing here, as I believe we
need the reassoc code anyway to fix the regression fully.

Richard.
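[Editorial aside: the end-to-end transformation the patch aims at can be sanity-checked with a small lane-wise model on Python lists; this is a sketch of the semantics, not GCC code.]

```python
# Lane-wise model of VEC_COND_EXPR: all-ones (-1) on true, 0 on false.
# Not GCC code; just a sanity check of the intended fold.
def vec_cond(cmp, x, y):
    return [-1 if cmp(a, b) else 0 for a, b in zip(x, y)]

x, y = [1, 2, 3, 4], [2, 2, 1, 5]
lt = vec_cond(lambda a, b: a < b, x, y)
eq = vec_cond(lambda a, b: a == b, x, y)
le = vec_cond(lambda a, b: a <= b, x, y)

# (x < y ? -1 : 0) | (x == y ? -1 : 0) equals (x <= y ? -1 : 0) lane-wise,
# which is what folding the two comparisons into one would produce.
assert [l | e for l, e in zip(lt, eq)] == le
```

This is the shape of code if-conversion and the vectorizer emit for merged then/else arms, and why collapsing the two vec_conds into one saves both a comparison and a blend.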
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 7be6bd7..c434e55 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -10773,8 +10773,21 @@ gimplify_expr (tree *expr_p, gimple_seq *pre_p, gimple_seq *post_p,
 	  goto expr_2;
 	}
 
-    case FMA_EXPR:
     case VEC_COND_EXPR:
+      {
+	gimplify_status r0, r1, r2;
+	r0 = gimplify_expr (&TREE_OPERAND (*expr_p, 0), pre_p,
+			    post_p, is_gimple_condexpr, fb_rvalue);
+	r1 = gimplify_expr (&TREE_OPERAND (*expr_p, 1), pre_p,
+			    post_p, is_gimple_val, fb_rvalue);
+	r2 = gimplify_expr (&TREE_OPERAND (*expr_p, 2), pre_p,
+			    post_p, is_gimple_val, fb_rvalue);
+	ret = MIN (MIN (r0, r1), r2);
+	recalculate_side_effects (*expr_p);
+	break;
+      }
+
+    case FMA_EXPR:
     case VEC_PERM_EXPR:
       /* Classified as tcc_expression.  */
       goto expr_3;
diff --git a/gcc/match.pd b/gcc/match.pd
index 5903782..4192d29 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -30,6 +30,7 @@ along with GCC; see the file COPYING3.  If not see
    real_zerop real_onep real_minus_onep
    zerop
    CONSTANT_CLASS_P
+   COMPARISON_CLASS_P
    tree_expr_nonnegative_p
    integer_valued_real_p
    integer_pow2p
@@ -2452,6 +2453,35 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
    replace if (x == 0) with tem = ~x; if (tem != 0) which is
    clearly less optimal and which we'll transform again in forwprop.  */
 
+/* Simplification of vec_cond.  */
+
+(for bcode (bit_and bit_ior)
+ (simplify
+  (bcode (vec_cond COMPARISON_CLASS_P@0 integer_all_onesp@2 integer_zerop@3)
+	 (vec_cond COMPARISON_CLASS_P@1 @2 @3))
+  (with
+   {
+     tree al = TREE_OPERAND (@0, 0);
+     tree ar = TREE_OPERAND (@0, 1);
+     tree bl = TREE_OPERAND (@1, 0);
+     tree br = TREE_OPERAND (@1, 1);
+   }
+   (if (operand_equal_p (al, bl, 0) && operand_equal_p (ar, br, 0))
+    (with
+     {
+       tree_code tcode
+	 = bcode == BIT_AND_EXPR ? TRUTH_AND_EXPR : TRUTH_OR_EXPR;
+       tree folded
+	 = combine_comparisons (UNKNOWN_LOCATION, tcode, TREE_CODE (@0),
+				TREE_CODE (@1), TREE_TYPE (@0), al, ar);
+     }
+     (if (folded)
+      (switch
+       (if (integer_truep (folded)) @2)
+       (if (integer_zerop (folded)) @3)
+       (if (COMPARISON_CLASS_P (folded))
+	{ build3 (VEC_COND_EXPR, TREE_TYPE (@2), folded, @2, @3); })))
+     )))))
+
 /* Simplification of math builtins.  These rules must all be
    optimizations as well as IL simplifications.  If there is a
    possibility that the new