Message ID | 000f01d938d8$00cdf7d0$0269e770$@nextmovesoftware.com |
---|---|
State | New |
Series | [DOC] Document the VEC_PERM_EXPR tree code (and minor clean-ups). |
On Sat, Feb 4, 2023 at 9:35 PM Roger Sayle <roger@nextmovesoftware.com> wrote:
>
>
> This patch (primarily) documents the VEC_PERM_EXPR tree code in
> generic.texi. For ease of review, it is provided below as a pair
> of diffs. The first contains just the new text added to describe
> VEC_PERM_EXPR, the second tidies up this part of the documentation
> by sorting the tree codes into alphabetical order, and providing
> consistent section naming/capitalization, so changing this section
> from "Vectors" to "Vector Expressions" (matching the nearby
> "Unary and Binary Expressions").
>
> Tested with make pdf and make html on x86_64-pc-linux-gnu.
> The reviewer(s) can decide whether to approve just the new content,
> or the content+clean-up. Ok for mainline?

+@item VEC_PERM_EXPR
+This node represents a vector permute/blend operation. The three operands
+must be vectors of the same number of elements. The first and second
+operands must be vectors of the same type as the entire expression,

this was recently relaxed for the case of constant permutes in which case
the first and second operands only have to have the same element type
as the result. See tree-cfg.cc:verify_gimple_assign_ternary.

The following description will become a bit more awkward here and
for rhs1/rhs2 with different number of elements the modulo interpretation
doesn't hold - I believe we require in-bounds elements for constant
permutes. Richard can probably clarify things here.

Thanks,
Richard.

>
>
> 2023-02-04 Roger Sayle <roger@nextmovesoftware.com>
>
> gcc/ChangeLog
>         * doc/generic.texi <Expression Trees>: Standardize capitalization
>         of section titles from "Expression trees".
>         <Language-dependent Trees>: Likewise standardize capitalization
>         from "Language-dependent trees".
>         <Constant expressions>: Capitalized from "Constant Expressions".
>         <Vector Expressions>: Standardized section name from "Vectors".
>         Document VEC_PERM_EXPR tree code. Sort tree codes alphabetically.
>
>
> Thanks in advance,
> Roger
> --
>
Richard Biener <richard.guenther@gmail.com> writes:
> On Sat, Feb 4, 2023 at 9:35 PM Roger Sayle <roger@nextmovesoftware.com> wrote:
>>
>>
>> This patch (primarily) documents the VEC_PERM_EXPR tree code in
>> generic.texi. For ease of review, it is provided below as a pair
>> of diffs. The first contains just the new text added to describe
>> VEC_PERM_EXPR, the second tidies up this part of the documentation
>> by sorting the tree codes into alphabetical order, and providing
>> consistent section naming/capitalization, so changing this section
>> from "Vectors" to "Vector Expressions" (matching the nearby
>> "Unary and Binary Expressions").
>>
>> Tested with make pdf and make html on x86_64-pc-linux-gnu.
>> The reviewer(s) can decide whether to approve just the new content,
>> or the content+clean-up. Ok for mainline?
>
> +@item VEC_PERM_EXPR
> +This node represents a vector permute/blend operation. The three operands
> +must be vectors of the same number of elements. The first and second
> +operands must be vectors of the same type as the entire expression,
>
> this was recently relaxed for the case of constant permutes in which case
> the first and second operands only have to have the same element type
> as the result. See tree-cfg.cc:verify_gimple_assign_ternary.
>
> The following description will become a bit more awkward here and
> for rhs1/rhs2 with different number of elements the modulo interpretation
> doesn't hold - I believe we require in-bounds elements for constant
> permutes. Richard can probably clarify things here.

I thought that the modulo behaviour still applies when the node has a
constant selector, it's just that the in-range form is the canonical one.

With variable-length vectors, I think it's in principle possible to have
a stepped constant selector whose start elements are in-range but whose
final elements aren't (and instead wrap around when applied).
E.g. the selector could zip the last quarter of the inputs followed by
the first quarter.

Thanks,
Richard
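The wrap-around behaviour described in this reply can be illustrated with a small C model. This is only a sketch under assumptions: plain `int` arrays stand in for the vector operands, the length is a fixed runtime parameter rather than a variable-length vector, and `canonical_index` and `vec_perm_model` are hypothetical helper names, not GCC functions. It shows a stepped selector whose later elements run past 2*N-1 giving the same result as its in-range form.

```c
#include <assert.h>
#include <stddef.h>

/* Reduce one selector element modulo 2*n, giving a value in [0, 2*n).
   Hypothetical helper; negative inputs are mapped into range as well.  */
static long canonical_index(long sel, size_t n)
{
    assert(n > 0);
    long m = (long)(2 * n);
    long j = sel % m;
    return j < 0 ? j + m : j;
}

/* Model of out = VEC_PERM_EXPR <v0, v1, sel> for n-element int vectors,
   with every selector element interpreted modulo 2*n.  */
static void vec_perm_model(const int *v0, const int *v1, const long *sel,
                           int *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        size_t j = (size_t)canonical_index(sel[i], n);
        out[i] = j < n ? v0[j] : v1[j - n];
    }
}
```

With v0 = {10, 11, 12, 13} and v1 = {20, 21, 22, 23}, the stepped selector {6, 7, 8, 9} and its reduced form {6, 7, 0, 1} both select {22, 23, 10, 11} under this model.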
Perhaps I'm missing something (I'm not too familiar with SVE semantics), but
is there a reason that the solution for PR96473 uses a VEC_PERM_EXPR and not
just a VEC_DUPLICATE_EXPR? The folding of svld1rq (svptrue_..., ...) doesn't
seem to require either the blending or the permutation functionality of a
VEC_PERM_EXPR. Instead, it seems to be misusing (the modified) VEC_PERM_EXPR
as a form of VIEW_CONVERT_EXPR that allows us to convert/mismatch the type of
the operands to the type of the result.

Conceptually, (as in Richard's original motivation for the PR),
svint32_t foo (int32x4_t x) { return svld1rq (svptrue_b8 (), &x[0]); }
can be optimized to (something like)
svint32_t foo (int32x4_t x) { return svdup_32 (x[0]); } // or dup z0.q, z0.q[0] equivalent
hence it makes sense for fold to transform the gimple form of the first,
into the gimple form of the second(?)

Just curious.
Roger
--
On Mon, 6 Feb 2023 at 20:14, Roger Sayle <roger@nextmovesoftware.com> wrote:
>
>
> Perhaps I'm missing something (I'm not too familiar with SVE semantics), but
> is there a reason that the solution for PR96473 uses a VEC_PERM_EXPR and not
> just a VEC_DUPLICATE_EXPR? The folding of svld1rq (svptrue_..., ...) doesn't
> seem to require either the blending or the permutation functionality of a
> VEC_PERM_EXPR. Instead, it seems to be misusing (the modified) VEC_PERM_EXPR
> as a form of VIEW_CONVERT_EXPR that allows us to convert/mismatch the type
> of the operands to the type of the result.

Hi,
I am not sure if we could use VEC_DUPLICATE_EXPR for PR96463 case as-is.
Perhaps we could extend VEC_DUPLICATE_EXPR to take N operands,
so the resulting vector has npatterns = N, nelts_per_pattern = 1 ?
AFAIU, extending VEC_PERM_EXPR to handle vectors with different lengths,
would allow for more optimization opportunities besides PR96463.

>
> Conceptually, (as in Richard's original motivation for the PR),
> svint32_t foo (int32x4_t x) { return svld1rq (svptrue_b8 (), &x[0]); }
> can be optimized to (something like)
> svint32_t foo (int32x4_t x) { return svdup_32 (x[0]); } // or dup z0.q, z0.q[0] equivalent

I guess that should be equivalent to svdupq_s32 (x[0], x[1], x[2], x[3]) ?

Thanks,
Prathamesh

> hence it makes sense for fold to transform the gimple form of the first,
> into the gimple form of the second(?)
>
> Just curious.
> Roger
> --
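The distinction raised in this exchange, broadcasting a single element versus replicating a four-element block, can be modelled in plain C. This is a rough sketch under assumptions: ordinary arrays stand in for the SVE and Advanced SIMD types, the result length is passed explicitly because real SVE vector lengths are implementation-defined, and `model_dup_element` and `model_dup_quad` are hypothetical names rather than intrinsics. The second function corresponds to a permute whose selector repeats {0, 1, ..., quad-1}, i.e. `quad` patterns of one element each.

```c
#include <assert.h>
#include <stddef.h>

/* Broadcast x[0] into every lane of an n-element result.  */
static void model_dup_element(const int *x, int *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = x[0];
}

/* Replicate the first `quad` elements of x across an n-element result.
   Lane i takes x[i % quad], so the implied selector is
   {0, 1, ..., quad-1, 0, 1, ...}.  */
static void model_dup_quad(const int *x, size_t quad, int *out, size_t n)
{
    assert(quad > 0);
    for (size_t i = 0; i < n; i++)
        out[i] = x[i % quad];
}
```

For x = {1, 2, 3, 4} and an assumed 8-element result, the first model gives {1, 1, 1, 1, 1, 1, 1, 1} while the second gives {1, 2, 3, 4, 1, 2, 3, 4}, which is why a single-operand duplicate does not capture the replicated-block case as-is.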
On 2/4/23 13:33, Roger Sayle wrote:
>
> This patch (primarily) documents the VEC_PERM_EXPR tree code in
> generic.texi. For ease of review, it is provided below as a pair
> of diffs. The first contains just the new text added to describe
> VEC_PERM_EXPR, the second tidies up this part of the documentation
> by sorting the tree codes into alphabetical order, and providing
> consistent section naming/capitalization, so changing this section
> from "Vectors" to "Vector Expressions" (matching the nearby
> "Unary and Binary Expressions").
>
> Tested with make pdf and make html on x86_64-pc-linux-gnu.
> The reviewer(s) can decide whether to approve just the new content,
> or the content+clean-up. Ok for mainline?
>
>
> 2023-02-04 Roger Sayle <roger@nextmovesoftware.com>
>
> gcc/ChangeLog
>         * doc/generic.texi <Expression Trees>: Standardize capitalization
>         of section titles from "Expression trees".
>         <Language-dependent Trees>: Likewise standardize capitalization
>         from "Language-dependent trees".
>         <Constant expressions>: Capitalized from "Constant Expressions".
>         <Vector Expressions>: Standardized section name from "Vectors".
>         Document VEC_PERM_EXPR tree code. Sort tree codes alphabetically.

Trying to catch up on old mail here....

IIUC the proposed VEC_PERM_EXPR wording was rejected on technical
grounds. I confess I know nothing about this, so I can't usefully
suggest alternate wording myself. :-(

The other changes look OK except that the correct capitalization would
be "Language-Dependent Trees".

-Sandra
diff --git a/gcc/doc/generic.texi b/gcc/doc/generic.texi
index 3f52d30..4e8f131 100644
--- a/gcc/doc/generic.texi
+++ b/gcc/doc/generic.texi
@@ -1826,6 +1826,7 @@ a value from @code{enum annot_expr_kind}, the third is an @code{INTEGER_CST}.
 @tindex VEC_PACK_FIX_TRUNC_EXPR
 @tindex VEC_PACK_FLOAT_EXPR
 @tindex VEC_COND_EXPR
+@tindex VEC_PERM_EXPR
 @tindex SAD_EXPR
 
 @table @code
@@ -1967,6 +1968,27 @@ any other value currently, but optimizations should not rely on that
 property. In contrast with a @code{COND_EXPR}, all operands are always
 evaluated.
 
+@item VEC_PERM_EXPR
+This node represents a vector permute/blend operation. The three operands
+must be vectors of the same number of elements. The first and second
+operands must be vectors of the same type as the entire expression, and
+the third operand, @dfn{selector}, must be an integral vector type.
+
+The input elements are numbered from 0 in operand 1 through
+@math{2*@var{N}-1} in operand 2. The elements of the selector are
+interpreted modulo @math{2*@var{N}}.
+
+The expression
+@code{@var{out} = VEC_PERM_EXPR<@var{v0}, @var{v1}, @var{selector}>},
+where @var{v0}, @var{v1} and @var{selector} have @var{N} elements, means
+@smallexample
+  for (int i = 0; i < N; i++)
+    @{
+      int j = selector[i] % (2*N);
+      out[i] = j < N ? v0[j] : v1[j-N];
+    @}
+@end smallexample
+
 @item SAD_EXPR
 This node represents the Sum of Absolute Differences operation. The three
 operands must be vectors of integral types. The first and second operand
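
The loop in the proposed @smallexample can be exercised as ordinary C to see what typical selectors do. The following is a minimal sketch, assuming equal-length operands as the proposed wording states (it does not cover the relaxed constant-permute case raised in review). The names `vec_perm`, `interleave_low_selector` and `blend_selector` are hypothetical, and the selector is modelled as unsigned so that `%` cannot produce a negative index.

```c
#include <assert.h>
#include <stddef.h>

/* out = VEC_PERM_EXPR <v0, v1, sel> for n-element int vectors,
   following the loop in the patch text.  */
static void vec_perm(const int *v0, const int *v1, const unsigned *sel,
                     int *out, size_t n)
{
    assert(n > 0);
    for (size_t i = 0; i < n; i++) {
        size_t j = sel[i] % (2 * n);
        out[i] = j < n ? v0[j] : v1[j - n];
    }
}

/* Permute selector interleaving the low halves: {0, n, 1, n+1, ...}.  */
static void interleave_low_selector(unsigned *sel, size_t n)
{
    for (size_t i = 0; i < n; i++)
        sel[i] = (unsigned)(i / 2 + (i % 2 ? n : 0));
}

/* Blend selector: even lanes from v0, odd lanes from v1.  */
static void blend_selector(unsigned *sel, size_t n)
{
    for (size_t i = 0; i < n; i++)
        sel[i] = (unsigned)(i % 2 ? i + n : i);
}
```

With N = 4, v0 = {0, 1, 2, 3} and v1 = {10, 11, 12, 13}, the interleave selector is {0, 4, 1, 5} and yields {0, 10, 1, 11}; the blend selector is {0, 5, 2, 7} and yields {0, 11, 2, 13}. The blend keeps every element in its own lane, which is the sense in which the node covers both permutes and blends.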