Message ID | 4b77e155-0936-67d6-ab2d-ae7ef49bfde0@gmail.com |
---|---|
State | New |
Headers | show |
Series | gimple-match: Do not try UNCOND optimization with COND_LEN. | expand |
Hi, as Juzhe noticed in gcc.dg/pr92301.c there was still something missing in the last patch. The attached v2 makes sure we always have a COND_LEN operation before returning true and initializes len and bias even if they are unused. Bootstrapped and regtested on aarch64 and x86. Regards Robin Subject: [PATCH v2] gimple-match: Do not try UNCOND optimization with COND_LEN. On riscv we mis-optimize conditional (length) operations into unconditional operations e.g. in slp-reduc-7.c and gcc.dg/pr92301.c. This patch prevents optimizing e.g. COND_LEN_ADD ({-1, ... }, a, 0, c, len, bias) unconditionally into just "a". Currently, we assume that COND_LEN operations can be optimized similarly to COND operations. As the length is part of the mask (and usually not compile-time constant), we must not perform any optimization that relies on just the mask being "true". This patch ensures that we still have a COND_LEN pattern after optimization. gcc/ChangeLog: PR target/111311 * gimple-match-exports.cc (maybe_resimplify_conditional_op): Check for length masking. (try_conditional_simplification): Check that the result is still length masked. --- gcc/gimple-match-exports.cc | 38 ++++++++++++++++++++++++++++++------- gcc/gimple-match.h | 3 ++- 2 files changed, 33 insertions(+), 8 deletions(-) diff --git a/gcc/gimple-match-exports.cc b/gcc/gimple-match-exports.cc index b36027b0bad..d41de98a3d3 100644 --- a/gcc/gimple-match-exports.cc +++ b/gcc/gimple-match-exports.cc @@ -262,7 +262,8 @@ maybe_resimplify_conditional_op (gimple_seq *seq, gimple_match_op *res_op, if (!res_op->cond.cond) return false; - if (!res_op->cond.else_value + if (!res_op->cond.len + && !res_op->cond.else_value && res_op->code.is_tree_code ()) { /* The "else" value doesn't matter. If the "then" value is a @@ -301,9 +302,12 @@ maybe_resimplify_conditional_op (gimple_seq *seq, gimple_match_op *res_op, /* If the "then" value is a gimple value and the "else" value matters, create a VEC_COND_EXPR between them, then see if it can be further - simplified. */ + simplified. + Don't do this if we have a COND_LEN_ as that would make us lose the + length masking. */ gimple_match_op new_op; - if (res_op->cond.else_value + if (!res_op->cond.len + && res_op->cond.else_value && VECTOR_TYPE_P (res_op->type) && gimple_simplified_result_is_gimple_val (res_op)) { @@ -314,7 +318,7 @@ maybe_resimplify_conditional_op (gimple_seq *seq, gimple_match_op *res_op, return gimple_resimplify3 (seq, res_op, valueize); } - /* Otherwise try rewriting the operation as an IFN_COND_* call. + /* Otherwise try rewriting the operation as an IFN_COND_(LEN_)* call. Again, this isn't a simplification in itself, since it's what RES_OP already described. */ if (convert_conditional_op (res_op, &new_op)) @@ -386,9 +390,29 @@ try_conditional_simplification (internal_fn ifn, gimple_match_op *res_op, default: gcc_unreachable (); } - *res_op = cond_op; - maybe_resimplify_conditional_op (seq, res_op, valueize); - return true; + + if (len) + { + /* If we had a COND_LEN before we need to ensure that it stays that + way. */ + gimple_match_op old_op = *res_op; + *res_op = cond_op; + maybe_resimplify_conditional_op (seq, res_op, valueize); + + auto cfn = combined_fn (res_op->code); + if (internal_fn_p (cfn) + && internal_fn_len_index (as_internal_fn (cfn)) != -1) + return true; + + *res_op = old_op; + return false; + } + else + { + *res_op = cond_op; + maybe_resimplify_conditional_op (seq, res_op, valueize); + return true; + } } /* Helper for the autogenerated code, valueize OP. */ diff --git a/gcc/gimple-match.h b/gcc/gimple-match.h index bec3ff42e3e..d192b7dae3e 100644 --- a/gcc/gimple-match.h +++ b/gcc/gimple-match.h @@ -56,7 +56,8 @@ public: inline gimple_match_cond::gimple_match_cond (tree cond_in, tree else_value_in) - : cond (cond_in), else_value (else_value_in) + : cond (cond_in), else_value (else_value_in), len (NULL_TREE), + bias (NULL_TREE) { }
Ping. Regards Robin
Ping^2. I realize it's not very elegant as of now. If there's a better/shorter way to solve this feel free to suggest :) Regards Robin
Robin Dapp via Gcc-patches <gcc-patches@gcc.gnu.org> writes: > Hi, > > as Juzhe noticed in gcc.dg/pr92301.c there was still something missing in > the last patch. The attached v2 makes sure we always have a COND_LEN operation > before returning true and initializes len and bias even if they are unused. > > Bootstrapped and regtested on aarch64 and x86. Sorry for the slow review. I was hoping Richi would take it, but I see he was hoping the same from me. > Regards > Robin > > Subject: [PATCH v2] gimple-match: Do not try UNCOND optimization with > COND_LEN. > > On riscv we mis-optimize conditional (length) operations into > unconditional operations e.g. in slp-reduc-7.c and > gcc.dg/pr92301.c. > > This patch prevents optimizing e.g. > COND_LEN_ADD ({-1, ... }, a, 0, c, len, bias) > unconditionally into just "a". > > Currently, we assume that COND_LEN operations can be optimized similarly > to COND operations. As the length is part of the mask (and usually not > compile-time constant), we must not perform any optimization that relies > on just the mask being "true". This patch ensures that we still have a > COND_LEN pattern after optimization. > > gcc/ChangeLog: > > PR target/111311 > * gimple-match-exports.cc (maybe_resimplify_conditional_op): > Check for length masking. > (try_conditional_simplification): Check that the result is still > length masked. > --- > gcc/gimple-match-exports.cc | 38 ++++++++++++++++++++++++++++++------- > gcc/gimple-match.h | 3 ++- > 2 files changed, 33 insertions(+), 8 deletions(-) > > diff --git a/gcc/gimple-match-exports.cc b/gcc/gimple-match-exports.cc > index b36027b0bad..d41de98a3d3 100644 > --- a/gcc/gimple-match-exports.cc > +++ b/gcc/gimple-match-exports.cc > @@ -262,7 +262,8 @@ maybe_resimplify_conditional_op (gimple_seq *seq, gimple_match_op *res_op, > if (!res_op->cond.cond) > return false; > > - if (!res_op->cond.else_value > + if (!res_op->cond.len > + && !res_op->cond.else_value > && res_op->code.is_tree_code ()) > { > /* The "else" value doesn't matter. If the "then" value is a Why are the contents of this if statement wrong for COND_LEN? If the "else" value doesn't matter, then the masked form can use the "then" value for all elements. I would have expected the same thing to be true of COND_LEN. > @@ -301,9 +302,12 @@ maybe_resimplify_conditional_op (gimple_seq *seq, gimple_match_op *res_op, > > /* If the "then" value is a gimple value and the "else" value matters, > create a VEC_COND_EXPR between them, then see if it can be further > - simplified. */ > + simplified. > + Don't do this if we have a COND_LEN_ as that would make us lose the > + length masking. */ > gimple_match_op new_op; > - if (res_op->cond.else_value > + if (!res_op->cond.len > + && res_op->cond.else_value > && VECTOR_TYPE_P (res_op->type) > && gimple_simplified_result_is_gimple_val (res_op)) > { The change LGTM, but it would be nice to phrase the comment to avoid the "Do A. Don't do A if B" pattern. Maybe: /* If the condition represents MASK ? THEN : ELSE, where THEN is a gimple value and ELSE matters, create a VEC_COND_EXPR between them, then see if it can be further simplified. */ > @@ -314,7 +318,7 @@ maybe_resimplify_conditional_op (gimple_seq *seq, gimple_match_op *res_op, > return gimple_resimplify3 (seq, res_op, valueize); > } > > - /* Otherwise try rewriting the operation as an IFN_COND_* call. > + /* Otherwise try rewriting the operation as an IFN_COND_(LEN_)* call. > Again, this isn't a simplification in itself, since it's what > RES_OP already described. */ > if (convert_conditional_op (res_op, &new_op)) > @@ -386,9 +390,29 @@ try_conditional_simplification (internal_fn ifn, gimple_match_op *res_op, > default: > gcc_unreachable (); > } > - *res_op = cond_op; > - maybe_resimplify_conditional_op (seq, res_op, valueize); > - return true; > + > + if (len) > + { > + /* If we had a COND_LEN before we need to ensure that it stays that > + way. */ > + gimple_match_op old_op = *res_op; > + *res_op = cond_op; > + maybe_resimplify_conditional_op (seq, res_op, valueize); > + > + auto cfn = combined_fn (res_op->code); > + if (internal_fn_p (cfn) > + && internal_fn_len_index (as_internal_fn (cfn)) != -1) > + return true; Why isn't it enough to check the result of maybe_resimplify_conditional_op? Thanks, Richard > + > + *res_op = old_op; > + return false; > + } > + else > + { > + *res_op = cond_op; > + maybe_resimplify_conditional_op (seq, res_op, valueize); > + return true; > + } > } > > /* Helper for the autogenerated code, valueize OP. */ > diff --git a/gcc/gimple-match.h b/gcc/gimple-match.h > index bec3ff42e3e..d192b7dae3e 100644 > --- a/gcc/gimple-match.h > +++ b/gcc/gimple-match.h > @@ -56,7 +56,8 @@ public: > > inline > gimple_match_cond::gimple_match_cond (tree cond_in, tree else_value_in) > - : cond (cond_in), else_value (else_value_in) > + : cond (cond_in), else_value (else_value_in), len (NULL_TREE), > + bias (NULL_TREE) > { > }
Richard Sandiford <richard.sandiford@arm.com> writes: > Robin Dapp via Gcc-patches <gcc-patches@gcc.gnu.org> writes: >> [...] >> @@ -386,9 +390,29 @@ try_conditional_simplification (internal_fn ifn, gimple_match_op *res_op, >> default: >> gcc_unreachable (); >> } >> - *res_op = cond_op; >> - maybe_resimplify_conditional_op (seq, res_op, valueize); >> - return true; >> + >> + if (len) >> + { >> + /* If we had a COND_LEN before we need to ensure that it stays that >> + way. */ >> + gimple_match_op old_op = *res_op; >> + *res_op = cond_op; >> + maybe_resimplify_conditional_op (seq, res_op, valueize); >> + >> + auto cfn = combined_fn (res_op->code); >> + if (internal_fn_p (cfn) >> + && internal_fn_len_index (as_internal_fn (cfn)) != -1) >> + return true; > > Why isn't it enough to check the result of maybe_resimplify_conditional_op? Sorry, ignore that part. I get it now. But isn't the test whether res_op->code itself is an internal_function? In other words, shouldn't it just be: if (internal_fn_p (res_op->code) && internal_fn_len_index (as_internal_fn (res_op->code)) != -1) return true; maybe_resimplify_conditional_op should already have converted to an internal function where possible, and if combined_fn (res_op->code) does any extra conversion on the fly, that conversion won't be reflected in res_op. Thanks, Richard
> Why are the contents of this if statement wrong for COND_LEN? > If the "else" value doesn't matter, then the masked form can use > the "then" value for all elements. I would have expected the same > thing to be true of COND_LEN. Right, that one was overly pessimistic. Removed. > But isn't the test whether res_op->code itself is an internal_function? > In other words, shouldn't it just be: > > if (internal_fn_p (res_op->code) > && internal_fn_len_index (as_internal_fn (res_op->code)) != -1) > return true; > > maybe_resimplify_conditional_op should already have converted to an > internal function where possible, and if combined_fn (res_op->code) > does any extra conversion on the fly, that conversion won't be reflected > in res_op. I went through some of our test cases and believe most of the problems are due to situations like the following: In vect-cond-arith-2.c we have (on riscv) vect_neg_xi_14.4_23 = -vect_xi_13.3_22; vect_res_2.5_24 = .COND_LEN_ADD ({ -1, ... }, vect_res_1.0_17, vect_neg_xi_14.4_23, vect_res_1.0_17, _29, 0); On aarch64 this is a situation that matches the VEC_COND_EXPR simplification that I disabled with this patch. We valueized to _26 = vect_res_1.0_17 - vect_xi_13.3_22 and then create vect_res_2.5_24 = VEC_COND_EXPR <loop_mask_22, _26, vect_res_1.0_19>; This is later re-assembled into a COND_SUB. As we have two masks or COND_LEN we cannot use a VEC_COND_EXPR to achieve the same thing. Would it be possible to create a COND_OP directly instead, though? I tried the following (not very polished obviously): - new_op.set_op (VEC_COND_EXPR, res_op->type, - res_op->cond.cond, res_op->ops[0], - res_op->cond.else_value); - *res_op = new_op; - return gimple_resimplify3 (seq, res_op, valueize); + if (!res_op->cond.len) + { + new_op.set_op (VEC_COND_EXPR, res_op->type, + res_op->cond.cond, res_op->ops[0], + res_op->cond.else_value); + *res_op = new_op; + return gimple_resimplify3 (seq, res_op, valueize); + } + else if (seq && *seq && is_gimple_assign (*seq)) + { + new_op.code = gimple_assign_rhs_code (*seq); + new_op.type = res_op->type; + new_op.num_ops = gimple_num_ops (*seq) - 1; + new_op.ops[0] = gimple_assign_rhs1 (*seq); + if (new_op.num_ops > 1) + new_op.ops[1] = gimple_assign_rhs2 (*seq); + if (new_op.num_ops > 2) + new_op.ops[2] = gimple_assign_rhs2 (*seq); + + new_op.cond = res_op->cond; + + gimple_match_op bla2; + if (convert_conditional_op (&new_op, &bla2)) + { + *res_op = bla2; + // SEQ should now be dead. + return true; + } + } This would make the other hunk (check whether it was a LEN and try to recreate it) redundant I hope. I don't know enough about valueization, whether it's always safe to do that and other implications. On riscv this seems to work, though and the other backends never go through the LEN path. If, however, this is a feasible direction it could also be done for the non-LEN targets? Regards Robin
Robin Dapp <rdapp.gcc@gmail.com> writes: >> Why are the contents of this if statement wrong for COND_LEN? >> If the "else" value doesn't matter, then the masked form can use >> the "then" value for all elements. I would have expected the same >> thing to be true of COND_LEN. > > Right, that one was overly pessimistic. Removed. > >> But isn't the test whether res_op->code itself is an internal_function? >> In other words, shouldn't it just be: >> >> if (internal_fn_p (res_op->code) >> && internal_fn_len_index (as_internal_fn (res_op->code)) != -1) >> return true; >> >> maybe_resimplify_conditional_op should already have converted to an >> internal function where possible, and if combined_fn (res_op->code) >> does any extra conversion on the fly, that conversion won't be reflected >> in res_op. > > I went through some of our test cases and believe most of the problems > are due to situations like the following: > > In vect-cond-arith-2.c we have (on riscv) > vect_neg_xi_14.4_23 = -vect_xi_13.3_22; > vect_res_2.5_24 = .COND_LEN_ADD ({ -1, ... }, vect_res_1.0_17, vect_neg_xi_14.4_23, vect_res_1.0_17, _29, 0); > > On aarch64 this is a situation that matches the VEC_COND_EXPR > simplification that I disabled with this patch. We valueized > to _26 = vect_res_1.0_17 - vect_xi_13.3_22 and then create > vect_res_2.5_24 = VEC_COND_EXPR <loop_mask_22, _26, vect_res_1.0_19>; > This is later re-assembled into a COND_SUB. > > As we have two masks or COND_LEN we cannot use a VEC_COND_EXPR to > achieve the same thing. Would it be possible to create a COND_OP > directly instead, though? I tried the following (not very polished > obviously): > > - new_op.set_op (VEC_COND_EXPR, res_op->type, > - res_op->cond.cond, res_op->ops[0], > - res_op->cond.else_value); > - *res_op = new_op; > - return gimple_resimplify3 (seq, res_op, valueize); > + if (!res_op->cond.len) > + { > + new_op.set_op (VEC_COND_EXPR, res_op->type, > + res_op->cond.cond, res_op->ops[0], > + res_op->cond.else_value); > + *res_op = new_op; > + return gimple_resimplify3 (seq, res_op, valueize); > + } > + else if (seq && *seq && is_gimple_assign (*seq)) > + { > + new_op.code = gimple_assign_rhs_code (*seq); > + new_op.type = res_op->type; > + new_op.num_ops = gimple_num_ops (*seq) - 1; > + new_op.ops[0] = gimple_assign_rhs1 (*seq); > + if (new_op.num_ops > 1) > + new_op.ops[1] = gimple_assign_rhs2 (*seq); > + if (new_op.num_ops > 2) > + new_op.ops[2] = gimple_assign_rhs2 (*seq); > + > + new_op.cond = res_op->cond; > + > + gimple_match_op bla2; > + if (convert_conditional_op (&new_op, &bla2)) > + { > + *res_op = bla2; > + // SEQ should now be dead. > + return true; > + } > + } > > This would make the other hunk (check whether it was a LEN > and try to recreate it) redundant I hope. > > I don't know enough about valueization, whether it's always > safe to do that and other implications. On riscv this seems > to work, though and the other backends never go through the LEN > path. If, however, this is a feasible direction it could also > be done for the non-LEN targets? I don't know much about valueisation either :) But it does feel like we're working around the lack of a LEN form of COND_EXPR. In other words, it seems odd that we can do: IFN_COND_LEN_ADD (mask, a, 0, b, len, bias) but we can't do: IFN_COND_LEN (mask, a, b, len, bias) There seems to be no way of applying a length without also finding an operation to perform. Does IFN_COND_LEN make conceptual sense on RVV? If so, would defining it solve some of these problems? I suppose in the worst case, IFN_COND_LEN is equivalent to IFN_COND_LEN_IOR with a zero input (and extended to floats). So if the target can do IFN_COND_LEN_IOR, it could implement IFN_COND_LEN using the same instruction. Thanks, Richard
On Mon, Oct 16, 2023 at 11:59 PM Richard Sandiford <richard.sandiford@arm.com> wrote: > > Robin Dapp <rdapp.gcc@gmail.com> writes: > >> Why are the contents of this if statement wrong for COND_LEN? > >> If the "else" value doesn't matter, then the masked form can use > >> the "then" value for all elements. I would have expected the same > >> thing to be true of COND_LEN. > > > > Right, that one was overly pessimistic. Removed. > > > >> But isn't the test whether res_op->code itself is an internal_function? > >> In other words, shouldn't it just be: > >> > >> if (internal_fn_p (res_op->code) > >> && internal_fn_len_index (as_internal_fn (res_op->code)) != -1) > >> return true; > >> > >> maybe_resimplify_conditional_op should already have converted to an > >> internal function where possible, and if combined_fn (res_op->code) > >> does any extra conversion on the fly, that conversion won't be reflected > >> in res_op. > > > > I went through some of our test cases and believe most of the problems > > are due to situations like the following: > > > > In vect-cond-arith-2.c we have (on riscv) > > vect_neg_xi_14.4_23 = -vect_xi_13.3_22; > > vect_res_2.5_24 = .COND_LEN_ADD ({ -1, ... }, vect_res_1.0_17, vect_neg_xi_14.4_23, vect_res_1.0_17, _29, 0); > > > > On aarch64 this is a situation that matches the VEC_COND_EXPR > > simplification that I disabled with this patch. We valueized > > to _26 = vect_res_1.0_17 - vect_xi_13.3_22 and then create > > vect_res_2.5_24 = VEC_COND_EXPR <loop_mask_22, _26, vect_res_1.0_19>; > > This is later re-assembled into a COND_SUB. > > > > As we have two masks or COND_LEN we cannot use a VEC_COND_EXPR to > > achieve the same thing. Would it be possible to create a COND_OP > > directly instead, though? I tried the following (not very polished > > obviously): > > > > - new_op.set_op (VEC_COND_EXPR, res_op->type, > > - res_op->cond.cond, res_op->ops[0], > > - res_op->cond.else_value); > > - *res_op = new_op; > > - return gimple_resimplify3 (seq, res_op, valueize); > > + if (!res_op->cond.len) > > + { > > + new_op.set_op (VEC_COND_EXPR, res_op->type, > > + res_op->cond.cond, res_op->ops[0], > > + res_op->cond.else_value); > > + *res_op = new_op; > > + return gimple_resimplify3 (seq, res_op, valueize); > > + } > > + else if (seq && *seq && is_gimple_assign (*seq)) > > + { > > + new_op.code = gimple_assign_rhs_code (*seq); > > + new_op.type = res_op->type; > > + new_op.num_ops = gimple_num_ops (*seq) - 1; > > + new_op.ops[0] = gimple_assign_rhs1 (*seq); > > + if (new_op.num_ops > 1) > > + new_op.ops[1] = gimple_assign_rhs2 (*seq); > > + if (new_op.num_ops > 2) > > + new_op.ops[2] = gimple_assign_rhs2 (*seq); > > + > > + new_op.cond = res_op->cond; > > + > > + gimple_match_op bla2; > > + if (convert_conditional_op (&new_op, &bla2)) > > + { > > + *res_op = bla2; > > + // SEQ should now be dead. > > + return true; > > + } > > + } > > > > This would make the other hunk (check whether it was a LEN > > and try to recreate it) redundant I hope. > > > > I don't know enough about valueization, whether it's always > > safe to do that and other implications. On riscv this seems > > to work, though and the other backends never go through the LEN > > path. If, however, this is a feasible direction it could also > > be done for the non-LEN targets? > > I don't know much about valueisation either :) But it does feel > like we're working around the lack of a LEN form of COND_EXPR. > In other words, it seems odd that we can do: > > IFN_COND_LEN_ADD (mask, a, 0, b, len, bias) > > but we can't do: > > IFN_COND_LEN (mask, a, b, len, bias) > > There seems to be no way of applying a length without also finding an > operation to perform. Indeed .. maybe - _maybe_ we want to scrap VEC_COND_EXPR for IFN_COND{,_LEN} to be more consistent here? > Does IFN_COND_LEN make conceptual sense on RVV? If so, would defining > it solve some of these problems? > > I suppose in the worst case, IFN_COND_LEN is equivalent to IFN_COND_LEN_IOR > with a zero input (and extended to floats). So if the target can do > IFN_COND_LEN_IOR, it could implement IFN_COND_LEN using the same instruction. In principle one can construct a mask from the length via {0, 1, ... } < len and then AND that to the mask in a VEC_COND_EXPR but that's of course super ugly and likely inefficient (or hard to match back on RTL land). Richard. > Thanks, > Richard >
>> I don't know much about valueisation either :) But it does feel >> like we're working around the lack of a LEN form of COND_EXPR. >> In other words, it seems odd that we can do: >> >> IFN_COND_LEN_ADD (mask, a, 0, b, len, bias) >> >> but we can't do: >> >> IFN_COND_LEN (mask, a, b, len, bias) >> >> There seems to be no way of applying a length without also finding an >> operation to perform. > > Indeed .. maybe - _maybe_ we want to scrap VEC_COND_EXPR for > IFN_COND{,_LEN} to be more consistent here? So, yes we could define IFN_COND_LEN (or VCOND_MASK_LEN) but I'd assume that there would be a whole lot of follow-up things to consider. I'm wondering if we really gain something from the the round-trip via VEC_COND_EXPR when we eventually create a COND_(LEN_)_OP anyway? Sure, if the target doesn't have the particular operation we would want a VEC_COND_EXPR. Same if SEQ is somehow more complicated. So the IFN_COND(_LEN) =? VCOND_MASK(_LEN) discussion notwithstanding, couldn't what I naively proposed be helpful as well? Or do we potentially lose optimizations during the time where e.g. a _foo = a BINOP b VEC_COND_EXPR (cond, foo, else) has not yet been converted into a COND_OP? We already create COND_OPs for the other paths (via convert_conditional_op) so why not for this one? Or am I missing some interdependence with SEQ? FWIW I did a full bootstrap and testsuite run on the usual architectures showing no changes with the attached patch. Regards Robin Subject: [PATCH] gimple-match: Create COND_OP directly if possible. This patch converts simplified sequences into conditional operations instead of VEC_COND_EXPRs if the target supports them. This helps for len-masked targets which cannot directly use a VEC_COND_EXPR in the presence of length masking. gcc/ChangeLog: * gimple-match-exports.cc (directly_supported_p): Define. (maybe_resimplify_conditional_op): Create COND_OP directly. * gimple-match.h (gimple_match_cond::gimple_match_cond): Initialize length and bias. --- gcc/gimple-match-exports.cc | 40 ++++++++++++++++++++++++++++--------- gcc/gimple-match.h | 7 +++++-- 2 files changed, 36 insertions(+), 11 deletions(-) diff --git a/gcc/gimple-match-exports.cc b/gcc/gimple-match-exports.cc index b36027b0bad..ba3bd1450db 100644 --- a/gcc/gimple-match-exports.cc +++ b/gcc/gimple-match-exports.cc @@ -98,6 +98,8 @@ static bool gimple_resimplify5 (gimple_seq *, gimple_match_op *, tree (*)(tree)) static bool gimple_resimplify6 (gimple_seq *, gimple_match_op *, tree (*)(tree)); static bool gimple_resimplify7 (gimple_seq *, gimple_match_op *, tree (*)(tree)); +bool directly_supported_p (code_helper, tree, optab_subtype); + /* Match and simplify the toplevel valueized operation THIS. Replaces THIS with a simplified and/or canonicalized result and returns whether any change was made. */ @@ -299,22 +301,42 @@ maybe_resimplify_conditional_op (gimple_seq *seq, gimple_match_op *res_op, } } - /* If the "then" value is a gimple value and the "else" value matters, - create a VEC_COND_EXPR between them, then see if it can be further - simplified. */ + /* If the condition represents MASK ? THEN : ELSE, where THEN is a gimple + value and ELSE matters, create a VEC_COND_EXPR between them, then see + if it can be further simplified. + For COND_LEN masking, try to create a COND_LEN_OP directly in case + SEQ contains a supportable operation. */ gimple_match_op new_op; if (res_op->cond.else_value && VECTOR_TYPE_P (res_op->type) && gimple_simplified_result_is_gimple_val (res_op)) { - new_op.set_op (VEC_COND_EXPR, res_op->type, - res_op->cond.cond, res_op->ops[0], - res_op->cond.else_value); - *res_op = new_op; - return gimple_resimplify3 (seq, res_op, valueize); + /* If a previous simplification was pushed to SEQ + and we can convert it to a COND_OP directly, do so + in order to save a round-trip via VEC_COND_EXPR -> COND_OP. */ + if (seq && *seq && is_gimple_assign (*seq) + && directly_supported_p (gimple_assign_rhs_code (*seq), res_op->type, + optab_scalar)) + { + res_op->code = gimple_assign_rhs_code (*seq); + res_op->num_ops = gimple_num_ops (*seq) - 1; + res_op->ops[0] = gimple_assign_rhs1 (*seq); + if (res_op->num_ops > 1) + res_op->ops[1] = gimple_assign_rhs2 (*seq); + if (res_op->num_ops > 2) + res_op->ops[2] = gimple_assign_rhs2 (*seq); + } + else if (!res_op->cond.len) + { + new_op.set_op (VEC_COND_EXPR, res_op->type, + res_op->cond.cond, res_op->ops[0], + res_op->cond.else_value); + *res_op = new_op; + return gimple_resimplify3 (seq, res_op, valueize); + } } - /* Otherwise try rewriting the operation as an IFN_COND_* call. + /* Otherwise try rewriting the operation as an IFN_COND_(LEN_)* call. Again, this isn't a simplification in itself, since it's what RES_OP already described. */ if (convert_conditional_op (res_op, &new_op)) diff --git a/gcc/gimple-match.h b/gcc/gimple-match.h index bec3ff42e3e..55c771d560f 100644 --- a/gcc/gimple-match.h +++ b/gcc/gimple-match.h @@ -32,7 +32,9 @@ public: enum uncond { UNCOND }; /* Build an unconditional op. */ - gimple_match_cond (uncond) : cond (NULL_TREE), else_value (NULL_TREE) {} + gimple_match_cond (uncond) + : cond (NULL_TREE), else_value (NULL_TREE), len (NULL_TREE), + bias (NULL_TREE) {} gimple_match_cond (tree, tree); gimple_match_cond (tree, tree, tree, tree); @@ -56,7 +58,8 @@ public: inline gimple_match_cond::gimple_match_cond (tree cond_in, tree else_value_in) - : cond (cond_in), else_value (else_value_in) + : cond (cond_in), else_value (else_value_in), len (NULL_TREE), + bias (NULL_TREE) { }
Robin Dapp <rdapp.gcc@gmail.com> writes: >>> I don't know much about valueisation either :) But it does feel >>> like we're working around the lack of a LEN form of COND_EXPR. >>> In other words, it seems odd that we can do: >>> >>> IFN_COND_LEN_ADD (mask, a, 0, b, len, bias) >>> >>> but we can't do: >>> >>> IFN_COND_LEN (mask, a, b, len, bias) >>> >>> There seems to be no way of applying a length without also finding an >>> operation to perform. >> >> Indeed .. maybe - _maybe_ we want to scrap VEC_COND_EXPR for >> IFN_COND{,_LEN} to be more consistent here? > > So, yes we could define IFN_COND_LEN (or VCOND_MASK_LEN) but I'd > assume that there would be a whole lot of follow-up things to > consider. > > I'm wondering if we really gain something from the the round-trip > via VEC_COND_EXPR when we eventually create a COND_(LEN_)_OP anyway? The main purpose of the VEC_COND_EXPR isn't as an intermediate step, but as an end in its own right. E.g. it allows: IFN_COND_ADD (mask, cst1, cst2, else) to be folded to: VEC_COND_EXPR <mask, cst1 + cst, else> This is especially useful when vectorisation has the effect of completely unrolling a loop. The VEC_COND_EXPR is only used if the equivalent unconditional rule folds to a gimple value. > Sure, if the target doesn't have the particular operation we would > want a VEC_COND_EXPR. Same if SEQ is somehow more complicated. > > So the IFN_COND(_LEN) =? VCOND_MASK(_LEN) discussion notwithstanding, > couldn't what I naively proposed be helpful as well? I don't think it's independently useful, since the fold that it's attempting is one that match.pd should be able to do. match.pd can also do it in a more general way, since it isn't restricted to looking at the currenct sequence. > Or do we > potentially lose optimizations during the time where e.g. a > _foo = a BINOP b > VEC_COND_EXPR (cond, foo, else) > has not yet been converted into a > COND_OP? Yeah, it would miss out on that too. > We already create COND_OPs for the other paths > (via convert_conditional_op) so why not for this one? Or am I missing > some interdependence with SEQ? The purpose of this code is to see what happens if we apply the usual folds for unconditional ops to the corresponding conditional forms. E.g. for IFN_COND_ADD (mask, a, b, c) it sees what a + b would fold to, then tries to reapply the VEC_DOND_EXPR (mask, ..., c) to the result. If a + b folds to a gimple value, we can fold to a VEC_COND_EXPR involving that gimple value, as discussed above. This could happen if a + b folds to a constant, or for things like a + 0 -> a. If instead a + b folds to a new operation (say a + b' or a - b'), we need to construct the equivalent conditional form of that operation, with the same mask and else values. This is a correctness issue rather than an optimisation. As the comment in: /* Otherwise try rewriting the operation as an IFN_COND_* call. Again, this isn't a simplification in itself, since it's what RES_OP already described. */ if (convert_conditional_op (res_op, &new_op)) *res_op = new_op; says, it's just reconstituting what RES_OP describes in gimple form. If that isn't possible then the simplification must fail. In some cases we could, as a follow-on, try to make a a' op b' fold result fall back to an unconditional a' op b' followed by a VEC_COND_EXPR. But we don't do that currently. It isn't safe in all cases, since IFN_COND_ADD only adds active elements, whereas an unconditional a' op b' would operate on all elements. I also don't know of any specific example where this would be useful on SVE. Thanks, Richard > > FWIW I did a full bootstrap and testsuite run on the usual architectures > showing no changes with the attached patch. > > Regards > Robin > > Subject: [PATCH] gimple-match: Create COND_OP directly if possible. > > This patch converts simplified sequences into conditional operations > instead of VEC_COND_EXPRs if the target supports them. > This helps for len-masked targets which cannot directly use a > VEC_COND_EXPR in the presence of length masking. > > gcc/ChangeLog: > > * gimple-match-exports.cc (directly_supported_p): Define. > (maybe_resimplify_conditional_op): Create COND_OP directly. > * gimple-match.h (gimple_match_cond::gimple_match_cond): > Initialize length and bias. > --- > gcc/gimple-match-exports.cc | 40 ++++++++++++++++++++++++++++--------- > gcc/gimple-match.h | 7 +++++-- > 2 files changed, 36 insertions(+), 11 deletions(-) > > diff --git a/gcc/gimple-match-exports.cc b/gcc/gimple-match-exports.cc > index b36027b0bad..ba3bd1450db 100644 > --- a/gcc/gimple-match-exports.cc > +++ b/gcc/gimple-match-exports.cc > @@ -98,6 +98,8 @@ static bool gimple_resimplify5 (gimple_seq *, gimple_match_op *, tree (*)(tree)) > static bool gimple_resimplify6 (gimple_seq *, gimple_match_op *, tree (*)(tree)); > static bool gimple_resimplify7 (gimple_seq *, gimple_match_op *, tree (*)(tree)); > > +bool directly_supported_p (code_helper, tree, optab_subtype); > + > /* Match and simplify the toplevel valueized operation THIS. > Replaces THIS with a simplified and/or canonicalized result and > returns whether any change was made. */ > @@ -299,22 +301,42 @@ maybe_resimplify_conditional_op (gimple_seq *seq, gimple_match_op *res_op, > } > } > > - /* If the "then" value is a gimple value and the "else" value matters, > - create a VEC_COND_EXPR between them, then see if it can be further > - simplified. */ > + /* If the condition represents MASK ? THEN : ELSE, where THEN is a gimple > + value and ELSE matters, create a VEC_COND_EXPR between them, then see > + if it can be further simplified. > + For COND_LEN masking, try to create a COND_LEN_OP directly in case > + SEQ contains a supportable operation. */ > gimple_match_op new_op; > if (res_op->cond.else_value > && VECTOR_TYPE_P (res_op->type) > && gimple_simplified_result_is_gimple_val (res_op)) > { > - new_op.set_op (VEC_COND_EXPR, res_op->type, > - res_op->cond.cond, res_op->ops[0], > - res_op->cond.else_value); > - *res_op = new_op; > - return gimple_resimplify3 (seq, res_op, valueize); > + /* If a previous simplification was pushed to SEQ > + and we can convert it to a COND_OP directly, do so > + in order to save a round-trip via VEC_COND_EXPR -> COND_OP. */ > + if (seq && *seq && is_gimple_assign (*seq) > + && directly_supported_p (gimple_assign_rhs_code (*seq), res_op->type, > + optab_scalar)) > + { > + res_op->code = gimple_assign_rhs_code (*seq); > + res_op->num_ops = gimple_num_ops (*seq) - 1; > + res_op->ops[0] = gimple_assign_rhs1 (*seq); > + if (res_op->num_ops > 1) > + res_op->ops[1] = gimple_assign_rhs2 (*seq); > + if (res_op->num_ops > 2) > + res_op->ops[2] = gimple_assign_rhs2 (*seq); > + } > + else if (!res_op->cond.len) > + { > + new_op.set_op (VEC_COND_EXPR, res_op->type, > + res_op->cond.cond, res_op->ops[0], > + res_op->cond.else_value); > + *res_op = new_op; > + return gimple_resimplify3 (seq, res_op, valueize); > + } > } > > - /* Otherwise try rewriting the operation as an IFN_COND_* call. > + /* Otherwise try rewriting the operation as an IFN_COND_(LEN_)* call. > Again, this isn't a simplification in itself, since it's what > RES_OP already described. */ > if (convert_conditional_op (res_op, &new_op)) > diff --git a/gcc/gimple-match.h b/gcc/gimple-match.h > index bec3ff42e3e..55c771d560f 100644 > --- a/gcc/gimple-match.h > +++ b/gcc/gimple-match.h > @@ -32,7 +32,9 @@ public: > enum uncond { UNCOND }; > > /* Build an unconditional op. */ > - gimple_match_cond (uncond) : cond (NULL_TREE), else_value (NULL_TREE) {} > + gimple_match_cond (uncond) > + : cond (NULL_TREE), else_value (NULL_TREE), len (NULL_TREE), > + bias (NULL_TREE) {} > gimple_match_cond (tree, tree); > gimple_match_cond (tree, tree, tree, tree); > > @@ -56,7 +58,8 @@ public: > > inline > gimple_match_cond::gimple_match_cond (tree cond_in, tree else_value_in) > - : cond (cond_in), else_value (else_value_in) > + : cond (cond_in), else_value (else_value_in), len (NULL_TREE), > + bias (NULL_TREE) > { > }
Thank you for the explanation. So, assuming I added an IFN_VCOND_MASK and IFN_VCOND_MASK_LEN along with the respective helper and expand functions, what would be the way forward? Generate an IFN_VCOND_MASK(_LEN) here instead of a VEC_COND_EXPR? How would I make sure all of match.pd's vec_cond optimizations applied to it as well? Right now AFAIK IFN_VCOND_MASK only gets created in isel and everything is just a VEC_COND before. But that does not provide length masking so is not the way to go? Thanks. Regards Robin
Richard Biener <richard.guenther@gmail.com> writes: > On Mon, Oct 16, 2023 at 11:59 PM Richard Sandiford > <richard.sandiford@arm.com> wrote: >> >> Robin Dapp <rdapp.gcc@gmail.com> writes: >> >> Why are the contents of this if statement wrong for COND_LEN? >> >> If the "else" value doesn't matter, then the masked form can use >> >> the "then" value for all elements. I would have expected the same >> >> thing to be true of COND_LEN. >> > >> > Right, that one was overly pessimistic. Removed. >> > >> >> But isn't the test whether res_op->code itself is an internal_function? >> >> In other words, shouldn't it just be: >> >> >> >> if (internal_fn_p (res_op->code) >> >> && internal_fn_len_index (as_internal_fn (res_op->code)) != -1) >> >> return true; >> >> >> >> maybe_resimplify_conditional_op should already have converted to an >> >> internal function where possible, and if combined_fn (res_op->code) >> >> does any extra conversion on the fly, that conversion won't be reflected >> >> in res_op. >> > >> > I went through some of our test cases and believe most of the problems >> > are due to situations like the following: >> > >> > In vect-cond-arith-2.c we have (on riscv) >> > vect_neg_xi_14.4_23 = -vect_xi_13.3_22; >> > vect_res_2.5_24 = .COND_LEN_ADD ({ -1, ... }, vect_res_1.0_17, vect_neg_xi_14.4_23, vect_res_1.0_17, _29, 0); >> > >> > On aarch64 this is a situation that matches the VEC_COND_EXPR >> > simplification that I disabled with this patch. We valueized >> > to _26 = vect_res_1.0_17 - vect_xi_13.3_22 and then create >> > vect_res_2.5_24 = VEC_COND_EXPR <loop_mask_22, _26, vect_res_1.0_19>; >> > This is later re-assembled into a COND_SUB. >> > >> > As we have two masks or COND_LEN we cannot use a VEC_COND_EXPR to >> > achieve the same thing. Would it be possible to create a COND_OP >> > directly instead, though? I tried the following (not very polished >> > obviously): >> > >> > - new_op.set_op (VEC_COND_EXPR, res_op->type, >> > - res_op->cond.cond, res_op->ops[0], >> > - res_op->cond.else_value); >> > - *res_op = new_op; >> > - return gimple_resimplify3 (seq, res_op, valueize); >> > + if (!res_op->cond.len) >> > + { >> > + new_op.set_op (VEC_COND_EXPR, res_op->type, >> > + res_op->cond.cond, res_op->ops[0], >> > + res_op->cond.else_value); >> > + *res_op = new_op; >> > + return gimple_resimplify3 (seq, res_op, valueize); >> > + } >> > + else if (seq && *seq && is_gimple_assign (*seq)) >> > + { >> > + new_op.code = gimple_assign_rhs_code (*seq); >> > + new_op.type = res_op->type; >> > + new_op.num_ops = gimple_num_ops (*seq) - 1; >> > + new_op.ops[0] = gimple_assign_rhs1 (*seq); >> > + if (new_op.num_ops > 1) >> > + new_op.ops[1] = gimple_assign_rhs2 (*seq); >> > + if (new_op.num_ops > 2) >> > + new_op.ops[2] = gimple_assign_rhs2 (*seq); >> > + >> > + new_op.cond = res_op->cond; >> > + >> > + gimple_match_op bla2; >> > + if (convert_conditional_op (&new_op, &bla2)) >> > + { >> > + *res_op = bla2; >> > + // SEQ should now be dead. >> > + return true; >> > + } >> > + } >> > >> > This would make the other hunk (check whether it was a LEN >> > and try to recreate it) redundant I hope. >> > >> > I don't know enough about valueization, whether it's always >> > safe to do that and other implications. On riscv this seems >> > to work, though and the other backends never go through the LEN >> > path. If, however, this is a feasible direction it could also >> > be done for the non-LEN targets? >> >> I don't know much about valueisation either :) But it does feel >> like we're working around the lack of a LEN form of COND_EXPR. >> In other words, it seems odd that we can do: >> >> IFN_COND_LEN_ADD (mask, a, 0, b, len, bias) >> >> but we can't do: >> >> IFN_COND_LEN (mask, a, b, len, bias) >> >> There seems to be no way of applying a length without also finding an >> operation to perform. > > Indeed .. maybe - _maybe_ we want to scrap VEC_COND_EXPR for > IFN_COND{,_LEN} to be more consistent here? Yeah, sounds like it could be worthwhile. But I suppose we still need VEC_COND_EXPR itself because it's a generic front-end operation that needs to be lowered. So it might be worth starting with an ifn for the LEN form and seeing whether the non-LEN form should switch over. Thanks, Richard
Robin Dapp <rdapp.gcc@gmail.com> writes: > Thank you for the explanation. > > So, assuming I added an IFN_VCOND_MASK and IFN_VCOND_MASK_LEN along > with the respective helper and expand functions, what would be the > way forward? IMO it'd be worth starting with the _LEN form only. > Generate an IFN_VCOND_MASK(_LEN) here instead of a VEC_COND_EXPR? > How would I make sure all of match.pd's vec_cond optimizations > applied to it as well? I think the most important ones are: /* Simplify: a = a1 op a2 r = c ? a : b; to: r = c ? a1 op a2 : b; if the target can do it in one go. This makes the operation conditional on c, so could drop potentially-trapping arithmetic, but that's a valid simplification if the result of the operation isn't needed. Avoid speculatively generating a stand-alone vector comparison on targets that might not support them. Any target implementing conditional internal functions must support the same comparisons inside and outside a VEC_COND_EXPR. */ It would be nice if there was some match.pd syntax that automatically extended these rules to IFN_VCOND_MASK_LEN, but I don't know how easy that would be, due to the extra two parameters. Perhaps that itself could be done in gimple-match-exports.cc, in a similar way to the current conditional stuff. That is: - for IFN_VCOND_MASK_LEN, try folding as a VEC_COND_EXPR and then "adding the length back" - for IFN_COND_LEN_FOO, try folding as an IFN_COND_FOO and then "add the length back" Not sure how important the second one is. Thanks, Richard > Right now AFAIK IFN_VCOND_MASK only gets created in isel and > everything is just a VEC_COND before. But that does not provide > length masking so is not the way to go? > > Thanks. > > Regards > Robin
diff --git a/gcc/gimple-match-exports.cc b/gcc/gimple-match-exports.cc index b36027b0bad..73be9f4f4c3 100644 --- a/gcc/gimple-match-exports.cc +++ b/gcc/gimple-match-exports.cc @@ -262,7 +262,8 @@ maybe_resimplify_conditional_op (gimple_seq *seq, gimple_match_op *res_op, if (!res_op->cond.cond) return false; - if (!res_op->cond.else_value + if (!res_op->cond.len + && !res_op->cond.else_value && res_op->code.is_tree_code ()) { /* The "else" value doesn't matter. If the "then" value is a @@ -301,9 +302,12 @@ maybe_resimplify_conditional_op (gimple_seq *seq, gimple_match_op *res_op, /* If the "then" value is a gimple value and the "else" value matters, create a VEC_COND_EXPR between them, then see if it can be further - simplified. */ + simplified. + Don't do this if we have a COND_LEN_ as that would make us lose the + length masking. */ gimple_match_op new_op; - if (res_op->cond.else_value + if (!res_op->cond.len + && res_op->cond.else_value && VECTOR_TYPE_P (res_op->type) && gimple_simplified_result_is_gimple_val (res_op)) { @@ -314,7 +318,7 @@ maybe_resimplify_conditional_op (gimple_seq *seq, gimple_match_op *res_op, return gimple_resimplify3 (seq, res_op, valueize); } - /* Otherwise try rewriting the operation as an IFN_COND_* call. + /* Otherwise try rewriting the operation as an IFN_COND_(LEN_)* call. Again, this isn't a simplification in itself, since it's what RES_OP already described. */ if (convert_conditional_op (res_op, &new_op))