Message ID | 20240229083505.9ACA41329E@imap2.dmz-prg2.suse.org |
---|---|
State | New |
Series | middle-end/114070 - VEC_COND_EXPR folding |
On 2/29/24 01:35, Richard Biener wrote:
> The following amends the PR114070 fix to optimistically allow
> the folding when we cannot expand the current vec_cond using
> vcond_mask and we're still before vector lowering. This leaves
> a small window between vectorization and lowering where we could
> break vec_conds that can be expanded via vcond{,u,eq}, most
> susceptible is the loop unrolling pass which applies VN and thus
> possibly folding to the unrolled body of a vectorized loop.
>
> This gets back the folding for targets that cannot do vectorization.
> It doesn't get back the folding for x86 with AVX512 for example
> since that can handle the original IL but not the folded since
> it misses some vcond_mask expanders.
>
> Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.
>
> As said for stage1 I want to move vector lowering before vectorization.
> While I'm not entirely happy with this patch it forces us into the
> correct direction, getting vcond_mask and vcmp{,u,eq} patterns
> implemented. We could use canonicalize_math_p () to close the
> vectorizer -> vector lowering gap but this only works when that
> pass is run (not with -Og or when disabled). We could add a new
> PROP_vectorizer_il and disable the folding if the vectorizer ran.
>
> Or we could simply live with the regression.
>
> Any preferences?

Not really. As I think I said, I consider the regression insignificant
and I could certainly live with it.

jeff
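The condition described in the quoted message can be read as a plain boolean predicate. The following is only a sketch of that logic under stated assumptions: `FoldQuery`, its field names, and `allow_fold` are hypothetical names invented for illustration and are not GCC identifiers; each field stands in for one of the checks named in the patch.

```cpp
#include <cassert>

// Hypothetical model of the guard; none of these names exist in GCC.
struct FoldQuery {
  bool op_is_comparison;     // stands in for TREE_CODE_CLASS (op) == tcc_comparison
  bool types_match;          // result type equals the type of the vec_cond arms
  bool folded_expandable;    // target can expand the vec_cond the fold would create
  bool before_lowering;      // vector lowering has not run yet
  bool original_expandable;  // target can expand the vec_cond being folded
};

// The fold is allowed when any of the earlier conditions holds, or when we are
// still before lowering and the original vec_cond could not be expanded anyway.
constexpr bool allow_fold(const FoldQuery &q) {
  return !q.op_is_comparison
         || q.types_match
         || q.folded_expandable
         || (q.before_lowering && !q.original_expandable);
}

void self_check_guard() {
  // Target that cannot expand either form, before lowering: fold is allowed.
  assert(allow_fold({true, false, false, true, false}));
  // AVX512-like case from the message: original expands, folded form does not.
  assert(!allow_fold({true, false, false, true, true}));
  // After lowering the optimistic clause no longer applies.
  assert(!allow_fold({true, false, false, false, false}));
}
```

The second assertion mirrors the AVX512 remark in the message: the target handles the original IL but not the folded form, so the fold stays disabled.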
diff --git a/gcc/match.pd b/gcc/match.pd
index f3fffd8dec2..4edba7c84fb 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -5153,7 +5153,13 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (op (vec_cond:s @0 @1 @2) (vec_cond:s @0 @3 @4))
   (if (TREE_CODE_CLASS (op) != tcc_comparison
        || types_match (type, TREE_TYPE (@1))
-       || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK))
+       || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK)
+       || (optimize_vectors_before_lowering_p ()
+           /* The following is optimistic on the side of non-support, we are
+              missing the legacy vcond{,u,eq} cases. Do this only when
+              lowering will be able to fixup.. */
+           && !expand_vec_cond_expr_p (TREE_TYPE (@1),
+                                       TREE_TYPE (@0), ERROR_MARK)))
    (vec_cond @0 (op! @1 @3) (op! @2 @4))))
 
 /* (c ? a : b) op d --> c ? (a op d) : (b op d) */
@@ -5161,13 +5167,19 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
   (op (vec_cond:s @0 @1 @2) @3)
   (if (TREE_CODE_CLASS (op) != tcc_comparison
        || types_match (type, TREE_TYPE (@1))
-       || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK))
+       || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK)
+       || (optimize_vectors_before_lowering_p ()
+           && !expand_vec_cond_expr_p (TREE_TYPE (@1),
+                                       TREE_TYPE (@0), ERROR_MARK)))
    (vec_cond @0 (op! @1 @3) (op! @2 @3))))
  (simplify
   (op @3 (vec_cond:s @0 @1 @2))
   (if (TREE_CODE_CLASS (op) != tcc_comparison
        || types_match (type, TREE_TYPE (@1))
-       || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK))
+       || expand_vec_cond_expr_p (type, TREE_TYPE (@0), ERROR_MARK)
+       || (optimize_vectors_before_lowering_p ()
+           && !expand_vec_cond_expr_p (TREE_TYPE (@1),
+                                       TREE_TYPE (@0), ERROR_MARK)))
    (vec_cond @0 (op! @3 @1) (op! @3 @2)))))
 
 #if GIMPLE
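For readers less familiar with the pattern being guarded, the transform named in the match.pd comment, (c ? a : b) op d --> c ? (a op d) : (b op d), can be modeled lane by lane in ordinary C++. This is a minimal sketch and not GCC code: `kLanes`, `IntVec`, `Mask`, `less_than`, `unfolded` and `folded` are illustrative names, and the local `vec_cond` helper only imitates the select semantics. With a comparison as `op`, the folded form selects between masks rather than between integer vectors, which is why the patch checks whether a vec_cond of the result type can be expanded.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kLanes = 4;            // illustrative lane count
using IntVec = std::array<int, kLanes>;
using Mask = std::array<bool, kLanes>;

// Lane-wise select: c ? a : b.
template <typename T>
std::array<T, kLanes> vec_cond(const Mask &c,
                               const std::array<T, kLanes> &a,
                               const std::array<T, kLanes> &b) {
  std::array<T, kLanes> r{};
  for (std::size_t i = 0; i < kLanes; ++i)
    r[i] = c[i] ? a[i] : b[i];
  return r;
}

// Lane-wise comparison a < d; the result is a mask, not an IntVec.
Mask less_than(const IntVec &a, const IntVec &d) {
  Mask r{};
  for (std::size_t i = 0; i < kLanes; ++i)
    r[i] = a[i] < d[i];
  return r;
}

// Original form: (c ? a : b) < d. The select operates on IntVec.
Mask unfolded(const Mask &c, const IntVec &a, const IntVec &b, const IntVec &d) {
  return less_than(vec_cond(c, a, b), d);
}

// Folded form: c ? (a < d) : (b < d). The select now operates on Mask.
Mask folded(const Mask &c, const IntVec &a, const IntVec &b, const IntVec &d) {
  return vec_cond(c, less_than(a, d), less_than(b, d));
}

void self_check_fold() {
  const Mask c{true, false, true, false};
  const IntVec a{1, 5, 3, 9}, b{7, 6, 8, 0}, d{4, 4, 4, 4};
  // Both forms compute the same mask; only the type of the select differs.
  assert(unfolded(c, a, b, d) == folded(c, a, b, d));
}
```

The two forms agree on every lane, so the fold is a question of whether the target can expand the resulting select, not of correctness.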