Message ID | 6797c56b-bc5a-2410-4a90-424cda38abaf@linux.ibm.com |
---|---|
State | New |
Headers | show |
Series | rs6000: Add MMA __builtin_vsx_lxvp and __builtin_vsx_stxvp built-ins | expand |
On 6/30/21 3:56 PM, Peter Bergner wrote: > The following patch is bootstrapping and regtesting on powerpc64le-linux. > Ok for trunk if there are no regressions? Ok to backport to GCC 11 and > GCC 10 after baking on trunk for a while? Bootstrap and regression testing came back clean. Peter
Hi! On Wed, Jun 30, 2021 at 04:56:04PM -0500, Peter Bergner wrote: > LLVM added the __builtin_vsx_lxvp and __builtin_vsx_stxvp built-ins. > The following patch adds support for them to GCC so that we stay in sync > with LLVM. This should be documented somewhere. > + else if (fncode == VSX_BUILTIN_LXVP) > + { > + push_gimplify_context (true); > + tree offset = gimple_call_arg (stmt, 0); > + tree ptr = gimple_call_arg (stmt, 1); > + tree lhs = gimple_call_lhs (stmt); > + tree mem = build_simple_mem_ref (build2 (POINTER_PLUS_EXPR, TREE_TYPE (ptr), ptr, offset)); Line (much) too long. > + else if (fncode == VSX_BUILTIN_STXVP) > + { > + push_gimplify_context (true); > + tree src = gimple_call_arg (stmt, 0); > + tree offset = gimple_call_arg (stmt, 1); > + tree ptr = gimple_call_arg (stmt, 2); > + tree mem = build_simple_mem_ref (build2 (POINTER_PLUS_EXPR, TREE_TYPE (ptr), ptr, offset)); Same. Can BU_MMA_LD be used only for lxvp? Name it BU_MMA_PAIR_LD then? Same for _ST ofc. The patch is okay for trunk. For the backports it is okay if Bill has looked at this patch as well. Thanks! Segher
On 7/1/21 1:01 PM, Segher Boessenkool wrote: > On Wed, Jun 30, 2021 at 04:56:04PM -0500, Peter Bergner wrote: >> LLVM added the __builtin_vsx_lxvp and __builtin_vsx_stxvp built-ins. >> The following patch adds support for them to GCC so that we stay in sync >> with LLVM. > > This should be documented somewhere. Hmmm, I had that change in one of my trees, but must have dropped it when moving to another system/tree and didn't notice! Doh! Let me add that back and make the other changes you suggested and then I'll repost and test here. Thanks for catching that! Peter
On 7/1/21 1:01 PM, Segher Boessenkool wrote: > On Wed, Jun 30, 2021 at 04:56:04PM -0500, Peter Bergner wrote: >> LLVM added the __builtin_vsx_lxvp and __builtin_vsx_stxvp built-ins. >> The following patch adds support for them to GCC so that we stay in sync >> with LLVM. > > This should be documented somewhere. Done. >> + tree mem = build_simple_mem_ref (build2 (POINTER_PLUS_EXPR, TREE_TYPE (ptr), ptr, offset)); > > Line (much) too long. Fixed. >> + tree mem = build_simple_mem_ref (build2 (POINTER_PLUS_EXPR, TREE_TYPE (ptr), ptr, offset)); > > Same. Fixed. > Can BU_MMA_LD be used only for lxvp? Name it BU_MMA_PAIR_LD then? Same > for _ST ofc. Yes, it's used only for lxvp and stxvp and uses our MMA specific __vector_pair types too. Macro name changed. > The patch is okay for trunk. Below is the updated patch which is bootstrapping now. I'll commit it if it shows no regressions. > For the backports it is okay if Bill has looked at this patch as well. Bill has not seen the patch. I'm not sure when/if he'll get a chance to either. Peter gcc/ * config/rs6000/rs6000-builtin.def (BU_MMA_PAIR_LD, BU_MMA_PAIR_ST): New macros. (__builtin_vsx_lxvp, __builtin_vsx_stxvp): New built-ins. * config/rs6000/rs6000-call.c (rs6000_gimple_fold_mma_builtin): Expand lxvp and stxvp built-ins. (mma_init_builtins): Handle lxvp and stxvp built-ins. (builtin_function_type): Likewise. * doc/extend.texi (__builtin_vsx_lxvp, __builtin_mma_stxvp): Document. gcc/testsuite/ * gcc.target/powerpc/mma-builtin-7.c: New test. * gcc.target/powerpc/mma-builtin-8.c: New test. diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def index d7ce4de421e..6270444ef70 100644 --- a/gcc/config/rs6000/rs6000-builtin.def +++ b/gcc/config/rs6000/rs6000-builtin.def @@ -484,6 +484,25 @@ | RS6000_BTC_SENARY), \ CODE_FOR_ ## ICODE) /* ICODE */ +#define BU_MMA_PAIR_LD(ENUM, NAME, ATTR) \ + RS6000_BUILTIN_M (VSX_BUILTIN_ ## ENUM, /* ENUM */ \ + "__builtin_vsx_" NAME, /* NAME */ \ + RS6000_BTM_MMA, /* MASK */ \ + (RS6000_BTC_ ## ATTR /* ATTR */ \ + | RS6000_BTC_BINARY \ + | RS6000_BTC_GIMPLE), \ + CODE_FOR_nothing) /* ICODE */ + +#define BU_MMA_PAIR_ST(ENUM, NAME, ATTR) \ + RS6000_BUILTIN_M (VSX_BUILTIN_ ## ENUM, /* ENUM */ \ + "__builtin_vsx_" NAME, /* NAME */ \ + RS6000_BTM_MMA, /* MASK */ \ + (RS6000_BTC_ ## ATTR /* ATTR */ \ + | RS6000_BTC_TERNARY \ + | RS6000_BTC_VOID \ + | RS6000_BTC_GIMPLE), \ + CODE_FOR_nothing) /* ICODE */ + /* ISA 2.05 (power6) convenience macros. */ /* For functions that depend on the CMPB instruction */ #define BU_P6_2(ENUM, NAME, ATTR, ICODE) \ @@ -3253,6 +3272,9 @@ BU_SPECIAL_X (RS6000_BUILTIN_CFSTRING, "__builtin_cfstring", RS6000_BTM_ALWAYS, BU_P10V_VSX_1 (XVCVBF16SPN, "xvcvbf16spn", MISC, vsx_xvcvbf16spn) BU_P10V_VSX_1 (XVCVSPBF16, "xvcvspbf16", MISC, vsx_xvcvspbf16) +BU_MMA_PAIR_LD (LXVP, "lxvp", MISC) +BU_MMA_PAIR_ST (STXVP, "stxvp", PAIR) + BU_MMA_1 (XXMFACC, "xxmfacc", QUAD, mma_xxmfacc) BU_MMA_1 (XXMTACC, "xxmtacc", QUAD, mma_xxmtacc) BU_MMA_1 (XXSETACCZ, "xxsetaccz", MISC, mma_xxsetaccz) diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c index b67789845a5..6115e3b34d9 100644 --- a/gcc/config/rs6000/rs6000-call.c +++ b/gcc/config/rs6000/rs6000-call.c @@ -11913,6 +11913,32 @@ rs6000_gimple_fold_mma_builtin (gimple_stmt_iterator *gsi) gsi_replace_with_seq (gsi, new_seq, true); return true; } + else if (fncode == VSX_BUILTIN_LXVP) + { + push_gimplify_context (true); + tree offset = gimple_call_arg (stmt, 0); + tree ptr = gimple_call_arg (stmt, 1); + tree lhs = gimple_call_lhs (stmt); + tree mem = build_simple_mem_ref (build2 (POINTER_PLUS_EXPR, + TREE_TYPE (ptr), ptr, offset)); + gimplify_assign (lhs, mem, &new_seq); + pop_gimplify_context (NULL); + gsi_replace_with_seq (gsi, new_seq, true); + return true; + } + else if (fncode == VSX_BUILTIN_STXVP) + { + push_gimplify_context (true); + tree src = gimple_call_arg (stmt, 0); + tree offset = gimple_call_arg (stmt, 1); + tree ptr = gimple_call_arg (stmt, 2); + tree mem = build_simple_mem_ref (build2 (POINTER_PLUS_EXPR, + TREE_TYPE (ptr), ptr, offset)); + gimplify_assign (mem, src, &new_seq); + pop_gimplify_context (NULL); + gsi_replace_with_seq (gsi, new_seq, true); + return true; + } /* Convert this built-in into an internal version that uses pass-by-value arguments. The internal built-in follows immediately after this one. */ @@ -14264,11 +14290,14 @@ mma_init_builtins (void) if (gimple_func) { gcc_assert (icode == CODE_FOR_nothing); - op[nopnds++] = void_type_node; /* Some MMA built-ins that are expanded into gimple are converted into internal MMA built-ins that are expanded into rtl. The internal built-in follows immediately after this built-in. */ - icode = d[1].icode; + if (d[1].icode != CODE_FOR_nothing) + { + op[nopnds++] = void_type_node; + icode = d[1].icode; + } } else { @@ -14291,6 +14320,19 @@ mma_init_builtins (void) else op[nopnds++] = build_pointer_type (vector_pair_type_node); } + else if (d->code == VSX_BUILTIN_LXVP) + { + op[nopnds++] = vector_pair_type_node; + op[nopnds++] = sizetype; + op[nopnds++] = build_pointer_type (vector_pair_type_node); + } + else if (d->code == VSX_BUILTIN_STXVP) + { + op[nopnds++] = void_type_node; + op[nopnds++] = vector_pair_type_node; + op[nopnds++] = sizetype; + op[nopnds++] = build_pointer_type (vector_pair_type_node); + } else { /* This is a normal MMA built-in function. */ @@ -14781,6 +14823,16 @@ builtin_function_type (machine_mode mode_ret, machine_mode mode_arg0, h.uns_p[2] = 1; break; + case VSX_BUILTIN_LXVP: + h.uns_p[0] = 1; + h.uns_p[2] = 1; + break; + + case VSX_BUILTIN_STXVP: + h.uns_p[1] = 1; + h.uns_p[3] = 1; + break; + default: break; } diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi index 8fc66d626d8..b83cd4919bb 100644 --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -20731,6 +20731,9 @@ void __builtin_vsx_disassemble_pair (void *, __vector_pair *); vec_t __builtin_vsx_xvcvspbf16 (vec_t); vec_t __builtin_vsx_xvcvbf16spn (vec_t); + +__vector_pair __builtin_vsx_lxvp (size_t, __vector_pair *); +void __builtin_vsx_stxvp (__vector_pair, size_t, __vector_pair *); @end smallexample @node PRU Built-in Functions diff --git a/gcc/testsuite/gcc.target/powerpc/mma-builtin-7.c b/gcc/testsuite/gcc.target/powerpc/mma-builtin-7.c new file mode 100644 index 00000000000..c661a4b84bc --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/mma-builtin-7.c @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target power10_ok } */ +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */ + +void +foo (__vector_pair *dst, __vector_pair *src, long idx) +{ + dst[0] = __builtin_vsx_lxvp (0, src); + dst[2] = __builtin_vsx_lxvp (32, src); + dst[4] = __builtin_vsx_lxvp (64, src); + /* Non-constant offset should generate a lxvpx. */ + dst[6] = __builtin_vsx_lxvp (idx, src); + /* Non-aligned offset should generate a plxvp. */ + dst[8] = __builtin_vsx_lxvp (257, src); +} + +#if !__has_builtin (__builtin_vsx_lxvp) +# error "__has_builtin (__builtin_vsx_lxvp) failed" +#endif + +/* { dg-final { scan-assembler-not {\mlxv\M} } } */ +/* { dg-final { scan-assembler-not {\mstxv\M} } } */ +/* { dg-final { scan-assembler-times {\mlxvp\M} 3 } } */ +/* { dg-final { scan-assembler-times {\mlxvpx\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mplxvp\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mstxvp\M} 5 } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/mma-builtin-8.c b/gcc/testsuite/gcc.target/powerpc/mma-builtin-8.c new file mode 100644 index 00000000000..af29e479f83 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/mma-builtin-8.c @@ -0,0 +1,27 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target power10_ok } */ +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */ + +void +foo (__vector_pair *dst, __vector_pair *src, long idx) +{ + __vector_pair pair = *src; + __builtin_vsx_stxvp (pair, 0, dst); + __builtin_vsx_stxvp (pair, 32, dst); + __builtin_vsx_stxvp (pair, 64, dst); + /* Non-constant offset should generate a stxvpx. */ + __builtin_vsx_stxvp (pair, idx, dst); + /* Non-aligned offset should generate a pstxvp. */ + __builtin_vsx_stxvp (pair, 257, dst); +} + +#if !__has_builtin (__builtin_vsx_stxvp) +# error "__has_builtin (__builtin_vsx_stxvp) failed" +#endif + +/* { dg-final { scan-assembler-not {\mlxv\M} } } */ +/* { dg-final { scan-assembler-not {\mstxv\M} } } */ +/* { dg-final { scan-assembler-times {\mlxvp\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mstxvp\M} 3 } } */ +/* { dg-final { scan-assembler-times {\mstxvpx\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mpstxvp\M} 1 } } */
On 7/1/21 2:48 PM, Peter Bergner wrote: > On 7/1/21 1:01 PM, Segher Boessenkool wrote: >> The patch is okay for trunk. > > Below is the updated patch which is bootstrapping now. I'll commit it > if it shows no regressions. Testing was clean so I pushed it to trunk. >> For the backports it is okay if Bill has looked at this patch as well. > > Bill has not seen the patch. I'm not sure when/if he'll get a chance > to either. Since Bill won't have a chance to look at this, ok for the backport after a couple of days on trunk? Bill's builtin rewrite is only targeted for trunk, so it doesn't affect the backport at all. Peter
On Fri, Jul 02, 2021 at 01:32:45PM -0500, Peter Bergner wrote: > On 7/1/21 2:48 PM, Peter Bergner wrote: > > On 7/1/21 1:01 PM, Segher Boessenkool wrote: > >> The patch is okay for trunk. > > > > Below is the updated patch which is bootstrapping now. I'll commit it > > if it shows no regressions. > > Testing was clean so I pushed it to trunk. > > > > >> For the backports it is okay if Bill has looked at this patch as well. > > > > Bill has not seen the patch. I'm not sure when/if he'll get a chance > > to either. > > Since Bill won't have a chance to look at this, ok for the backport > after a couple of days on trunk? Bill's builtin rewrite is only targeted > for trunk, so it doesn't affect the backport at all. Okay. Thanks! Segher
On 7/3/21 10:56 AM, Segher Boessenkool wrote: > On Fri, Jul 02, 2021 at 01:32:45PM -0500, Peter Bergner wrote: >> On 7/1/21 2:48 PM, Peter Bergner wrote: >>> On 7/1/21 1:01 PM, Segher Boessenkool wrote: >>>> The patch is okay for trunk. >>> Below is the updated patch which is bootstrapping now. I'll commit it >>> if it shows no regressions. >> Testing was clean so I pushed it to trunk. >> >> >> >>>> For the backports it is okay if Bill has looked at this patch as well. >>> Bill has not seen the patch. I'm not sure when/if he'll get a chance >>> to either. >> Since Bill won't have a chance to look at this, ok for the backport >> after a couple of days on trunk? Bill's builtin rewrite is only targeted >> for trunk, so it doesn't affect the backport at all. > Okay. Thanks! I will take a look hopefully today. Mostly looks good but I want to check a couple of things before signing off. Bill > > > Segher
Hi Peter, On 7/1/21 2:48 PM, Peter Bergner via Gcc-patches wrote: > gcc/ > * config/rs6000/rs6000-builtin.def (BU_MMA_PAIR_LD, BU_MMA_PAIR_ST): > New macros. > (__builtin_vsx_lxvp, __builtin_vsx_stxvp): New built-ins. > * config/rs6000/rs6000-call.c (rs6000_gimple_fold_mma_builtin): Expand > lxvp and stxvp built-ins. > (mma_init_builtins): Handle lxvp and stxvp built-ins. > (builtin_function_type): Likewise. > * doc/extend.texi (__builtin_vsx_lxvp, __builtin_mma_stxvp): Document. > > gcc/testsuite/ > * gcc.target/powerpc/mma-builtin-7.c: New test. > * gcc.target/powerpc/mma-builtin-8.c: New test. > > > diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def > index d7ce4de421e..6270444ef70 100644 > --- a/gcc/config/rs6000/rs6000-builtin.def > +++ b/gcc/config/rs6000/rs6000-builtin.def > @@ -484,6 +484,25 @@ > | RS6000_BTC_SENARY), \ > CODE_FOR_ ## ICODE) /* ICODE */ > > +#define BU_MMA_PAIR_LD(ENUM, NAME, ATTR) \ > + RS6000_BUILTIN_M (VSX_BUILTIN_ ## ENUM, /* ENUM */ \ > + "__builtin_vsx_" NAME, /* NAME */ \ > + RS6000_BTM_MMA, /* MASK */ \ > + (RS6000_BTC_ ## ATTR /* ATTR */ \ > + | RS6000_BTC_BINARY \ > + | RS6000_BTC_GIMPLE), \ > + CODE_FOR_nothing) /* ICODE */ > + > +#define BU_MMA_PAIR_ST(ENUM, NAME, ATTR) \ > + RS6000_BUILTIN_M (VSX_BUILTIN_ ## ENUM, /* ENUM */ \ > + "__builtin_vsx_" NAME, /* NAME */ \ > + RS6000_BTM_MMA, /* MASK */ \ > + (RS6000_BTC_ ## ATTR /* ATTR */ \ > + | RS6000_BTC_TERNARY \ > + | RS6000_BTC_VOID \ > + | RS6000_BTC_GIMPLE), \ > + CODE_FOR_nothing) /* ICODE */ > + > /* ISA 2.05 (power6) convenience macros. */ > /* For functions that depend on the CMPB instruction */ > #define BU_P6_2(ENUM, NAME, ATTR, ICODE) \ > @@ -3253,6 +3272,9 @@ BU_SPECIAL_X (RS6000_BUILTIN_CFSTRING, "__builtin_cfstring", RS6000_BTM_ALWAYS, > BU_P10V_VSX_1 (XVCVBF16SPN, "xvcvbf16spn", MISC, vsx_xvcvbf16spn) > BU_P10V_VSX_1 (XVCVSPBF16, "xvcvspbf16", MISC, vsx_xvcvspbf16) > > +BU_MMA_PAIR_LD (LXVP, "lxvp", MISC) > +BU_MMA_PAIR_ST (STXVP, "stxvp", PAIR) > + > BU_MMA_1 (XXMFACC, "xxmfacc", QUAD, mma_xxmfacc) > BU_MMA_1 (XXMTACC, "xxmtacc", QUAD, mma_xxmtacc) > BU_MMA_1 (XXSETACCZ, "xxsetaccz", MISC, mma_xxsetaccz) > diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c > index b67789845a5..6115e3b34d9 100644 > --- a/gcc/config/rs6000/rs6000-call.c > +++ b/gcc/config/rs6000/rs6000-call.c > @@ -11913,6 +11913,32 @@ rs6000_gimple_fold_mma_builtin (gimple_stmt_iterator *gsi) > gsi_replace_with_seq (gsi, new_seq, true); > return true; > } > + else if (fncode == VSX_BUILTIN_LXVP) > + { > + push_gimplify_context (true); > + tree offset = gimple_call_arg (stmt, 0); > + tree ptr = gimple_call_arg (stmt, 1); > + tree lhs = gimple_call_lhs (stmt); > + tree mem = build_simple_mem_ref (build2 (POINTER_PLUS_EXPR, > + TREE_TYPE (ptr), ptr, offset)); > + gimplify_assign (lhs, mem, &new_seq); > + pop_gimplify_context (NULL); > + gsi_replace_with_seq (gsi, new_seq, true); > + return true; > + } > + else if (fncode == VSX_BUILTIN_STXVP) > + { > + push_gimplify_context (true); > + tree src = gimple_call_arg (stmt, 0); > + tree offset = gimple_call_arg (stmt, 1); > + tree ptr = gimple_call_arg (stmt, 2); > + tree mem = build_simple_mem_ref (build2 (POINTER_PLUS_EXPR, > + TREE_TYPE (ptr), ptr, offset)); > + gimplify_assign (mem, src, &new_seq); > + pop_gimplify_context (NULL); > + gsi_replace_with_seq (gsi, new_seq, true); > + return true; > + } > > /* Convert this built-in into an internal version that uses pass-by-value > arguments. The internal built-in follows immediately after this one. */ > @@ -14264,11 +14290,14 @@ mma_init_builtins (void) > if (gimple_func) > { > gcc_assert (icode == CODE_FOR_nothing); > - op[nopnds++] = void_type_node; > /* Some MMA built-ins that are expanded into gimple are converted > into internal MMA built-ins that are expanded into rtl. > The internal built-in follows immediately after this built-in. */ > - icode = d[1].icode; > + if (d[1].icode != CODE_FOR_nothing) > + { > + op[nopnds++] = void_type_node; > + icode = d[1].icode; > + } This hunk bothers me. The new macros BU_MMA_PAIR_LD and BU_MMA_PAIR_ST define only one builtin, not two. They are both flagged as RS6000_BTC_GIMPLE. The use of d[1] in this case is suspect and fragile. It appears you're relying on the built-in following each of __builtin_vsx_lxvp and __builtin_vsx_stxvp to exist, as otherwise d[1] will be out of bounds. It's otherwise rather meaningless because you later handle VSX_BUILTIN_LXVP and VSX_BUILTIN_STXVP separately. Can you please replace this with something less fragile? Perhaps leave the gimple_func leg alone, but first test for these two builtins and do the right thing for them instead. Otherwise this LGTM. Thanks, Bill > } > else > { > @@ -14291,6 +14320,19 @@ mma_init_builtins (void) > else > op[nopnds++] = build_pointer_type (vector_pair_type_node); > } > + else if (d->code == VSX_BUILTIN_LXVP) > + { > + op[nopnds++] = vector_pair_type_node; > + op[nopnds++] = sizetype; > + op[nopnds++] = build_pointer_type (vector_pair_type_node); > + } > + else if (d->code == VSX_BUILTIN_STXVP) > + { > + op[nopnds++] = void_type_node; > + op[nopnds++] = vector_pair_type_node; > + op[nopnds++] = sizetype; > + op[nopnds++] = build_pointer_type (vector_pair_type_node); > + } > else > { > /* This is a normal MMA built-in function. */ > @@ -14781,6 +14823,16 @@ builtin_function_type (machine_mode mode_ret, machine_mode mode_arg0, > h.uns_p[2] = 1; > break; > > + case VSX_BUILTIN_LXVP: > + h.uns_p[0] = 1; > + h.uns_p[2] = 1; > + break; > + > + case VSX_BUILTIN_STXVP: > + h.uns_p[1] = 1; > + h.uns_p[3] = 1; > + break; > + > default: > break; > } > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi > index 8fc66d626d8..b83cd4919bb 100644 > --- a/gcc/doc/extend.texi > +++ b/gcc/doc/extend.texi > @@ -20731,6 +20731,9 @@ void __builtin_vsx_disassemble_pair (void *, __vector_pair *); > > vec_t __builtin_vsx_xvcvspbf16 (vec_t); > vec_t __builtin_vsx_xvcvbf16spn (vec_t); > + > +__vector_pair __builtin_vsx_lxvp (size_t, __vector_pair *); > +void __builtin_vsx_stxvp (__vector_pair, size_t, __vector_pair *); > @end smallexample > > @node PRU Built-in Functions > diff --git a/gcc/testsuite/gcc.target/powerpc/mma-builtin-7.c b/gcc/testsuite/gcc.target/powerpc/mma-builtin-7.c > new file mode 100644 > index 00000000000..c661a4b84bc > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/mma-builtin-7.c > @@ -0,0 +1,26 @@ > +/* { dg-do compile } */ > +/* { dg-require-effective-target power10_ok } */ > +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */ > + > +void > +foo (__vector_pair *dst, __vector_pair *src, long idx) > +{ > + dst[0] = __builtin_vsx_lxvp (0, src); > + dst[2] = __builtin_vsx_lxvp (32, src); > + dst[4] = __builtin_vsx_lxvp (64, src); > + /* Non-constant offset should generate a lxvpx. */ > + dst[6] = __builtin_vsx_lxvp (idx, src); > + /* Non-aligned offset should generate a plxvp. */ > + dst[8] = __builtin_vsx_lxvp (257, src); > +} > + > +#if !__has_builtin (__builtin_vsx_lxvp) > +# error "__has_builtin (__builtin_vsx_lxvp) failed" > +#endif > + > +/* { dg-final { scan-assembler-not {\mlxv\M} } } */ > +/* { dg-final { scan-assembler-not {\mstxv\M} } } */ > +/* { dg-final { scan-assembler-times {\mlxvp\M} 3 } } */ > +/* { dg-final { scan-assembler-times {\mlxvpx\M} 1 } } */ > +/* { dg-final { scan-assembler-times {\mplxvp\M} 1 } } */ > +/* { dg-final { scan-assembler-times {\mstxvp\M} 5 } } */ > diff --git a/gcc/testsuite/gcc.target/powerpc/mma-builtin-8.c b/gcc/testsuite/gcc.target/powerpc/mma-builtin-8.c > new file mode 100644 > index 00000000000..af29e479f83 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/mma-builtin-8.c > @@ -0,0 +1,27 @@ > +/* { dg-do compile } */ > +/* { dg-require-effective-target power10_ok } */ > +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */ > + > +void > +foo (__vector_pair *dst, __vector_pair *src, long idx) > +{ > + __vector_pair pair = *src; > + __builtin_vsx_stxvp (pair, 0, dst); > + __builtin_vsx_stxvp (pair, 32, dst); > + __builtin_vsx_stxvp (pair, 64, dst); > + /* Non-constant offset should generate a stxvpx. */ > + __builtin_vsx_stxvp (pair, idx, dst); > + /* Non-aligned offset should generate a pstxvp. */ > + __builtin_vsx_stxvp (pair, 257, dst); > +} > + > +#if !__has_builtin (__builtin_vsx_stxvp) > +# error "__has_builtin (__builtin_vsx_stxvp) failed" > +#endif > + > +/* { dg-final { scan-assembler-not {\mlxv\M} } } */ > +/* { dg-final { scan-assembler-not {\mstxv\M} } } */ > +/* { dg-final { scan-assembler-times {\mlxvp\M} 1 } } */ > +/* { dg-final { scan-assembler-times {\mstxvp\M} 3 } } */ > +/* { dg-final { scan-assembler-times {\mstxvpx\M} 1 } } */ > +/* { dg-final { scan-assembler-times {\mpstxvp\M} 1 } } */
On 7/4/21 11:32 AM, Bill Schmidt wrote > On 7/1/21 2:48 PM, Peter Bergner via Gcc-patches wrote: >> + if (d[1].icode != CODE_FOR_nothing) >> + { >> + op[nopnds++] = void_type_node; >> + icode = d[1].icode; >> + } > This hunk bothers me. The new macros BU_MMA_PAIR_LD and BU_MMA_PAIR_ST > define only one builtin, not two. Correct, unlike the other MMA builtins, they do not expand from gimple into internal builtins that are expanded at rtl time. They instead are expanded directly into gimple memory references. > They are both flagged as RS6000_BTC_GIMPLE. Yes, because they are supposed to be expanded at gimple time. Note, other non-MMA builtins that expand at gimple time should probably also set this flag. That would allow an early exit to be added to rs6000_gimple_fold_builtin(). I can add that as a TODO. > The use of d[1] in this case is suspect and fragile. It appears you're relying > on the built-in following each of __builtin_vsx_lxvp and __builtin_vsx_stxvp > to exist, as otherwise d[1] will be out of bounds. Ok, if they had been added as the last builtins in the array (they aren't), that would be a problem. > Can you please replace this with something less fragile? > Perhaps leave the gimple_func leg alone, but first test for these two > builtins and do the right thing for them instead. These are RS6000_BTC_GIMPLE, so I think they should be handled within the "if (gimple_func) ..." leg. That said, how about the following change to resolve the issue you have? I'll kick off a bootstrap and regtest for this change. Peter * config/rs6000/rs6000-call.c (mma_init_builtins): Use VSX_BUILTIN_LXVP and VSX_BUILTIN_STXVP. diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c index 6115e3b34d9..904e104c058 100644 --- a/gcc/config/rs6000/rs6000-call.c +++ b/gcc/config/rs6000/rs6000-call.c @@ -14293,7 +14293,8 @@ mma_init_builtins (void) /* Some MMA built-ins that are expanded into gimple are converted into internal MMA built-ins that are expanded into rtl. The internal built-in follows immediately after this built-in. */ - if (d[1].icode != CODE_FOR_nothing) + if (d->code != VSX_BUILTIN_LXVP + && d->code != VSX_BUILTIN_STXVP) { op[nopnds++] = void_type_node; icode = d[1].icode;
On 7/6/21 2:29 PM, Peter Bergner wrote: > On 7/4/21 11:32 AM, Bill Schmidt wrote >> On 7/1/21 2:48 PM, Peter Bergner via Gcc-patches wrote: >>> + if (d[1].icode != CODE_FOR_nothing) >>> + { >>> + op[nopnds++] = void_type_node; >>> + icode = d[1].icode; >>> + } > >> This hunk bothers me. The new macros BU_MMA_PAIR_LD and BU_MMA_PAIR_ST >> define only one builtin, not two. > Correct, unlike the other MMA builtins, they do not expand from gimple > into internal builtins that are expanded at rtl time. They instead are > expanded directly into gimple memory references. > > >> They are both flagged as RS6000_BTC_GIMPLE. > Yes, because they are supposed to be expanded at gimple time. > > Note, other non-MMA builtins that expand at gimple time should > probably also set this flag. That would allow an early exit > to be added to rs6000_gimple_fold_builtin(). I can add that > as a TODO. > > > > >> The use of d[1] in this case is suspect and fragile. It appears you're relying >> on the built-in following each of __builtin_vsx_lxvp and __builtin_vsx_stxvp >> to exist, as otherwise d[1] will be out of bounds. > Ok, if they had been added as the last builtins in the array (they aren't), > that would be a problem. > > >> Can you please replace this with something less fragile? >> Perhaps leave the gimple_func leg alone, but first test for these two >> builtins and do the right thing for them instead. > These are RS6000_BTC_GIMPLE, so I think they should be handled within > the "if (gimple_func) ..." leg. That said, how about the following > change to resolve the issue you have? I'll kick off a bootstrap and > regtest for this change. Thanks, yes, that works for me! Bill > > Peter > > > * config/rs6000/rs6000-call.c (mma_init_builtins): Use VSX_BUILTIN_LXVP > and VSX_BUILTIN_STXVP. > > diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c > index 6115e3b34d9..904e104c058 100644 > --- a/gcc/config/rs6000/rs6000-call.c > +++ b/gcc/config/rs6000/rs6000-call.c > @@ -14293,7 +14293,8 @@ mma_init_builtins (void) > /* Some MMA built-ins that are expanded into gimple are converted > into internal MMA built-ins that are expanded into rtl. > The internal built-in follows immediately after this built-in. */ > - if (d[1].icode != CODE_FOR_nothing) > + if (d->code != VSX_BUILTIN_LXVP > + && d->code != VSX_BUILTIN_STXVP) > { > op[nopnds++] = void_type_node; > icode = d[1].icode; > > >
On 7/6/21 3:08 PM, Bill Schmidt wrote: > On 7/6/21 2:29 PM, Peter Bergner wrote: >> These are RS6000_BTC_GIMPLE, so I think they should be handled within >> the "if (gimple_func) ..." leg. That said, how about the following >> change to resolve the issue you have? I'll kick off a bootstrap and >> regtest for this change. > > > Thanks, yes, that works for me! Great, thanks! Segher, the patch was clean on testing. Ok with you too? Peter
On Tue, Jul 06, 2021 at 04:13:06PM -0500, Peter Bergner wrote: > On 7/6/21 3:08 PM, Bill Schmidt wrote: > > On 7/6/21 2:29 PM, Peter Bergner wrote: > >> These are RS6000_BTC_GIMPLE, so I think they should be handled within > >> the "if (gimple_func) ..." leg. That said, how about the following > >> change to resolve the issue you have? I'll kick off a bootstrap and > >> regtest for this change. > > > > > > Thanks, yes, that works for me! > > Great, thanks! > > > Segher, the patch was clean on testing. Ok with you too? Of course. Okay for trunk and backports (you might want to hurry it for 11.2). Thanks to you both! Segher
On 7/6/21 5:05 PM, Segher Boessenkool wrote: > On Tue, Jul 06, 2021 at 04:13:06PM -0500, Peter Bergner wrote: >> On 7/6/21 3:08 PM, Bill Schmidt wrote: >>> On 7/6/21 2:29 PM, Peter Bergner wrote: >>>> These are RS6000_BTC_GIMPLE, so I think they should be handled within >>>> the "if (gimple_func) ..." leg. That said, how about the following >>>> change to resolve the issue you have? I'll kick off a bootstrap and >>>> regtest for this change. >>> >>> >>> Thanks, yes, that works for me! >> >> Great, thanks! >> >> >> Segher, the patch was clean on testing. Ok with you too? > > Of course. Okay for trunk and backports (you might want to hurry it > for 11.2). Thanks to you both! Ok, committed to trunk. I'm testing the GCC 11 backport now and will commit if clean. Thanks. Peter
On 7/7/21 11:55 AM, Peter Bergner wrote: > On 7/6/21 5:05 PM, Segher Boessenkool wrote: >> On Tue, Jul 06, 2021 at 04:13:06PM -0500, Peter Bergner wrote: >>> On 7/6/21 3:08 PM, Bill Schmidt wrote: >>>> On 7/6/21 2:29 PM, Peter Bergner wrote: >>>>> These are RS6000_BTC_GIMPLE, so I think they should be handled within >>>>> the "if (gimple_func) ..." leg. That said, how about the following >>>>> change to resolve the issue you have? I'll kick off a bootstrap and >>>>> regtest for this change. >>>> >>>> >>>> Thanks, yes, that works for me! >>> >>> Great, thanks! >>> >>> >>> Segher, the patch was clean on testing. Ok with you too? >> >> Of course. Okay for trunk and backports (you might want to hurry it >> for 11.2). Thanks to you both! > > Ok, committed to trunk. I'm testing the GCC 11 backport now and will > commit if clean. Thanks. Backport testing was clean, so pushed to the GCC 11 branch. I'll work on the GCC 10 backport in a day or two. Peter
diff --git a/gcc/config/rs6000/rs6000-builtin.def b/gcc/config/rs6000/rs6000-builtin.def index d7ce4de421e..a42c3cbd55b 100644 --- a/gcc/config/rs6000/rs6000-builtin.def +++ b/gcc/config/rs6000/rs6000-builtin.def @@ -484,6 +484,25 @@ | RS6000_BTC_SENARY), \ CODE_FOR_ ## ICODE) /* ICODE */ +#define BU_MMA_LD(ENUM, NAME, ATTR) \ + RS6000_BUILTIN_M (VSX_BUILTIN_ ## ENUM, /* ENUM */ \ + "__builtin_vsx_" NAME, /* NAME */ \ + RS6000_BTM_MMA, /* MASK */ \ + (RS6000_BTC_ ## ATTR /* ATTR */ \ + | RS6000_BTC_BINARY \ + | RS6000_BTC_GIMPLE), \ + CODE_FOR_nothing) /* ICODE */ + +#define BU_MMA_ST(ENUM, NAME, ATTR) \ + RS6000_BUILTIN_M (VSX_BUILTIN_ ## ENUM, /* ENUM */ \ + "__builtin_vsx_" NAME, /* NAME */ \ + RS6000_BTM_MMA, /* MASK */ \ + (RS6000_BTC_ ## ATTR /* ATTR */ \ + | RS6000_BTC_TERNARY \ + | RS6000_BTC_VOID \ + | RS6000_BTC_GIMPLE), \ + CODE_FOR_nothing) /* ICODE */ + /* ISA 2.05 (power6) convenience macros. */ /* For functions that depend on the CMPB instruction */ #define BU_P6_2(ENUM, NAME, ATTR, ICODE) \ @@ -3253,6 +3272,9 @@ BU_SPECIAL_X (RS6000_BUILTIN_CFSTRING, "__builtin_cfstring", RS6000_BTM_ALWAYS, BU_P10V_VSX_1 (XVCVBF16SPN, "xvcvbf16spn", MISC, vsx_xvcvbf16spn) BU_P10V_VSX_1 (XVCVSPBF16, "xvcvspbf16", MISC, vsx_xvcvspbf16) +BU_MMA_LD (LXVP, "lxvp", MISC) +BU_MMA_ST (STXVP, "stxvp", PAIR) + BU_MMA_1 (XXMFACC, "xxmfacc", QUAD, mma_xxmfacc) BU_MMA_1 (XXMTACC, "xxmtacc", QUAD, mma_xxmtacc) BU_MMA_1 (XXSETACCZ, "xxsetaccz", MISC, mma_xxsetaccz) diff --git a/gcc/config/rs6000/rs6000-call.c b/gcc/config/rs6000/rs6000-call.c index b67789845a5..ebc03da9b24 100644 --- a/gcc/config/rs6000/rs6000-call.c +++ b/gcc/config/rs6000/rs6000-call.c @@ -11913,6 +11913,30 @@ rs6000_gimple_fold_mma_builtin (gimple_stmt_iterator *gsi) gsi_replace_with_seq (gsi, new_seq, true); return true; } + else if (fncode == VSX_BUILTIN_LXVP) + { + push_gimplify_context (true); + tree offset = gimple_call_arg (stmt, 0); + tree ptr = gimple_call_arg (stmt, 1); + tree lhs = gimple_call_lhs (stmt); + tree mem = build_simple_mem_ref (build2 (POINTER_PLUS_EXPR, TREE_TYPE (ptr), ptr, offset)); + gimplify_assign (lhs, mem, &new_seq); + pop_gimplify_context (NULL); + gsi_replace_with_seq (gsi, new_seq, true); + return true; + } + else if (fncode == VSX_BUILTIN_STXVP) + { + push_gimplify_context (true); + tree src = gimple_call_arg (stmt, 0); + tree offset = gimple_call_arg (stmt, 1); + tree ptr = gimple_call_arg (stmt, 2); + tree mem = build_simple_mem_ref (build2 (POINTER_PLUS_EXPR, TREE_TYPE (ptr), ptr, offset)); + gimplify_assign (mem, src, &new_seq); + pop_gimplify_context (NULL); + gsi_replace_with_seq (gsi, new_seq, true); + return true; + } /* Convert this built-in into an internal version that uses pass-by-value arguments. The internal built-in follows immediately after this one. */ @@ -14264,11 +14288,14 @@ mma_init_builtins (void) if (gimple_func) { gcc_assert (icode == CODE_FOR_nothing); - op[nopnds++] = void_type_node; /* Some MMA built-ins that are expanded into gimple are converted into internal MMA built-ins that are expanded into rtl. The internal built-in follows immediately after this built-in. */ - icode = d[1].icode; + if (d[1].icode != CODE_FOR_nothing) + { + op[nopnds++] = void_type_node; + icode = d[1].icode; + } } else { @@ -14291,6 +14318,19 @@ mma_init_builtins (void) else op[nopnds++] = build_pointer_type (vector_pair_type_node); } + else if (d->code == VSX_BUILTIN_LXVP) + { + op[nopnds++] = vector_pair_type_node; + op[nopnds++] = sizetype; + op[nopnds++] = build_pointer_type (vector_pair_type_node); + } + else if (d->code == VSX_BUILTIN_STXVP) + { + op[nopnds++] = void_type_node; + op[nopnds++] = vector_pair_type_node; + op[nopnds++] = sizetype; + op[nopnds++] = build_pointer_type (vector_pair_type_node); + } else { /* This is a normal MMA built-in function. */ @@ -14781,6 +14821,16 @@ builtin_function_type (machine_mode mode_ret, machine_mode mode_arg0, h.uns_p[2] = 1; break; + case VSX_BUILTIN_LXVP: + h.uns_p[0] = 1; + h.uns_p[2] = 1; + break; + + case VSX_BUILTIN_STXVP: + h.uns_p[1] = 1; + h.uns_p[3] = 1; + break; + default: break; } diff --git a/gcc/testsuite/gcc.target/powerpc/mma-builtin-7.c b/gcc/testsuite/gcc.target/powerpc/mma-builtin-7.c new file mode 100644 index 00000000000..c661a4b84bc --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/mma-builtin-7.c @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target power10_ok } */ +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */ + +void +foo (__vector_pair *dst, __vector_pair *src, long idx) +{ + dst[0] = __builtin_vsx_lxvp (0, src); + dst[2] = __builtin_vsx_lxvp (32, src); + dst[4] = __builtin_vsx_lxvp (64, src); + /* Non-constant offset should generate a lxvpx. */ + dst[6] = __builtin_vsx_lxvp (idx, src); + /* Non-aligned offset should generate a plxvp. */ + dst[8] = __builtin_vsx_lxvp (257, src); +} + +#if !__has_builtin (__builtin_vsx_lxvp) +# error "__has_builtin (__builtin_vsx_lxvp) failed" +#endif + +/* { dg-final { scan-assembler-not {\mlxv\M} } } */ +/* { dg-final { scan-assembler-not {\mstxv\M} } } */ +/* { dg-final { scan-assembler-times {\mlxvp\M} 3 } } */ +/* { dg-final { scan-assembler-times {\mlxvpx\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mplxvp\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mstxvp\M} 5 } } */ diff --git a/gcc/testsuite/gcc.target/powerpc/mma-builtin-8.c b/gcc/testsuite/gcc.target/powerpc/mma-builtin-8.c new file mode 100644 index 00000000000..af29e479f83 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/mma-builtin-8.c @@ -0,0 +1,27 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target power10_ok } */ +/* { dg-options "-mdejagnu-cpu=power10 -O2" } */ + +void +foo (__vector_pair *dst, __vector_pair *src, long idx) +{ + __vector_pair pair = *src; + __builtin_vsx_stxvp (pair, 0, dst); + __builtin_vsx_stxvp (pair, 32, dst); + __builtin_vsx_stxvp (pair, 64, dst); + /* Non-constant offset should generate a stxvpx. */ + __builtin_vsx_stxvp (pair, idx, dst); + /* Non-aligned offset should generate a pstxvp. */ + __builtin_vsx_stxvp (pair, 257, dst); +} + +#if !__has_builtin (__builtin_vsx_stxvp) +# error "__has_builtin (__builtin_vsx_stxvp) failed" +#endif + +/* { dg-final { scan-assembler-not {\mlxv\M} } } */ +/* { dg-final { scan-assembler-not {\mstxv\M} } } */ +/* { dg-final { scan-assembler-times {\mlxvp\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mstxvp\M} 3 } } */ +/* { dg-final { scan-assembler-times {\mstxvpx\M} 1 } } */ +/* { dg-final { scan-assembler-times {\mpstxvp\M} 1 } } */