Message ID | 20210308105710.5o5yl52ld6d2b7rq@arm.com |
---|---|
State | New |
Headers | show |
Series | aarch64: Fix SVE ACLE builtins with LTO [PR99216] | expand |
On Mon, Mar 8, 2021 at 12:44 PM Alex Coplan via Gcc-patches <gcc-patches@gcc.gnu.org> wrote: > > Hi all, > > As discussed in the PR, we currently have two different numbering > schemes for SVE builtins: one for C, and one for C++. This is > problematic for LTO, where we end up getting confused about which > intrinsic we're talking about. This patch inserts placeholders into the > registered_functions vector to ensure that there is a consistent > numbering scheme for both C and C++. > > Initially I tried using a null decl for these placeholders, but this > trips up some validation when the builtin trees are streamed into lto1, Care to elaborate? Note we stream builtins by their function code and thus expect to be able to materialize them later - materializing to NULL obviously fails but then the issue is the proper decl isn't there in that case. Materializing a "dummy" decl just will delay the inevitable breakage. > so I ended up building dummy FUNCTION_DECLs as placeholders. > > To get better test coverage here, I'm wondering if we could make some of > the existing ACLE tests also run with LTO? Perhaps as a follow-on patch? > For now I've just added the specific testcase from the PR. > > Bootstrapped and regtested on aarch64-linux-gnu, OK for trunk? > > Thanks, > Alex > > --- > > gcc/ChangeLog: > > PR target/99216 > * config/aarch64/aarch64-sve-builtins.cc > (function_builder::add_function): Add placeholder_p argument, build > placeholder decls if this is set. > (function_builder::add_unique_function): Instead of conditionally adding > direct overloads, unconditionally add either a direct overload or a > placeholder. > (function_builder::add_overloaded_function): Set placeholder_p if we're > using C++ overloads. > (function_builder::add_overloaded_functions): Don't return early for > m_direct_overloads: we need to add placeholders. > * config/aarch64/aarch64-sve-builtins.h > (function_builder::add_function): Add placeholder_p argument. > > gcc/testsuite/ChangeLog: > > PR target/99216 > * g++.target/aarch64/sve/pr99216.C: New test.
Alex Coplan <alex.coplan@arm.com> writes: > Hi all, > > As discussed in the PR, we currently have two different numbering > schemes for SVE builtins: one for C, and one for C++. This is > problematic for LTO, where we end up getting confused about which > intrinsic we're talking about. This patch inserts placeholders into the > registered_functions vector to ensure that there is a consistent > numbering scheme for both C and C++. > > Initially I tried using a null decl for these placeholders, but this > trips up some validation when the builtin trees are streamed into lto1, > so I ended up building dummy FUNCTION_DECLs as placeholders. > > To get better test coverage here, I'm wondering if we could make some of > the existing ACLE tests also run with LTO? Perhaps as a follow-on patch? > For now I've just added the specific testcase from the PR. > > Bootstrapped and regtested on aarch64-linux-gnu, OK for trunk? OK, thanks. I think we should backport this to GCC 10, but as Richi says, that would need a bump in lto-streamer.h:LTO_minor_version. Thanks, Richard > > Thanks, > Alex > > --- > > gcc/ChangeLog: > > PR target/99216 > * config/aarch64/aarch64-sve-builtins.cc > (function_builder::add_function): Add placeholder_p argument, build > placeholder decls if this is set. > (function_builder::add_unique_function): Instead of conditionally adding > direct overloads, unconditionally add either a direct overload or a > placeholder. > (function_builder::add_overloaded_function): Set placeholder_p if we're > using C++ overloads. > (function_builder::add_overloaded_functions): Don't return early for > m_direct_overloads: we need to add placeholders. > * config/aarch64/aarch64-sve-builtins.h > (function_builder::add_function): Add placeholder_p argument. > > gcc/testsuite/ChangeLog: > > PR target/99216 > * g++.target/aarch64/sve/pr99216.C: New test. > > diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc > index 25612d2ea52..c0ba352599c 100644 > --- a/gcc/config/aarch64/aarch64-sve-builtins.cc > +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc > @@ -999,12 +999,25 @@ registered_function & > function_builder::add_function (const function_instance &instance, > const char *name, tree fntype, tree attrs, > uint64_t required_extensions, > - bool overloaded_p) > + bool overloaded_p, > + bool placeholder_p) > { > unsigned int code = vec_safe_length (registered_functions); > code = (code << AARCH64_BUILTIN_SHIFT) | AARCH64_BUILTIN_SVE; > - tree decl = simulate_builtin_function_decl (input_location, name, fntype, > - code, NULL, attrs); > + > + /* We need to be able to generate placeholders to enusre that we have a > + consistent numbering scheme for function codes between the C and C++ > + frontends, so that everything ties up in LTO. > + > + The placeholder needs to be non-NULL and some node other than > + error_mark_node to pass validation when the builtin is streamed in to lto1. > + > + It's also convenient if the placeholder can store the name, so build a > + FUNCTION_DECL, but don't register it with the frontend. */ > + tree decl = placeholder_p > + ? build_decl (input_location, FUNCTION_DECL, get_identifier (name), fntype) > + : simulate_builtin_function_decl (input_location, name, fntype, > + code, NULL, attrs); > > registered_function &rfn = *ggc_alloc <registered_function> (); > rfn.instance = instance; > @@ -1036,7 +1049,7 @@ function_builder::add_unique_function (const function_instance &instance, > argument_types.address ()); > tree attrs = get_attributes (instance); > registered_function &rfn = add_function (instance, name, fntype, attrs, > - required_extensions, false); > + required_extensions, false, false); > > /* Enter the function into the hash table. */ > hashval_t hash = instance.hash (); > @@ -1047,16 +1060,14 @@ function_builder::add_unique_function (const function_instance &instance, > > /* Also add the function under its overloaded alias, if we want > a separate decl for each instance of an overloaded function. */ > - if (m_direct_overloads || force_direct_overloads) > + char *overload_name = get_name (instance, true); > + if (strcmp (name, overload_name) != 0) > { > - char *overload_name = get_name (instance, true); > - if (strcmp (name, overload_name) != 0) > - { > - /* Attribute lists shouldn't be shared. */ > - tree attrs = get_attributes (instance); > - add_function (instance, overload_name, fntype, attrs, > - required_extensions, false); > - } > + /* Attribute lists shouldn't be shared. */ > + tree attrs = get_attributes (instance); > + bool placeholder_p = !(m_direct_overloads || force_direct_overloads); > + add_function (instance, overload_name, fntype, attrs, > + required_extensions, false, placeholder_p); > } > > obstack_free (&m_string_obstack, name); > @@ -1084,7 +1095,7 @@ function_builder::add_overloaded_function (const function_instance &instance, > { > registered_function &rfn > = add_function (instance, name, m_overload_type, NULL_TREE, > - required_extensions, true); > + required_extensions, true, m_direct_overloads); > const char *permanent_name = IDENTIFIER_POINTER (DECL_NAME (rfn.decl)); > m_overload_names.put (permanent_name, &rfn); > } > @@ -1098,9 +1109,6 @@ void > function_builder::add_overloaded_functions (const function_group_info &group, > mode_suffix_index mode) > { > - if (m_direct_overloads) > - return; > - > unsigned int explicit_type0 = (*group.shape)->explicit_type_suffix_p (0); > unsigned int explicit_type1 = (*group.shape)->explicit_type_suffix_p (1); > for (unsigned int pi = 0; group.preds[pi] != NUM_PREDS; ++pi) > diff --git a/gcc/config/aarch64/aarch64-sve-builtins.h b/gcc/config/aarch64/aarch64-sve-builtins.h > index 620e18841b2..b701f90ac1a 100644 > --- a/gcc/config/aarch64/aarch64-sve-builtins.h > +++ b/gcc/config/aarch64/aarch64-sve-builtins.h > @@ -337,7 +337,8 @@ private: > tree get_attributes (const function_instance &); > > registered_function &add_function (const function_instance &, > - const char *, tree, tree, uint64_t, bool); > + const char *, tree, tree, > + uint64_t, bool, bool); > > /* The function type to use for functions that are resolved by > function_resolver. */ > diff --git a/gcc/testsuite/g++.target/aarch64/sve/pr99216.C b/gcc/testsuite/g++.target/aarch64/sve/pr99216.C > new file mode 100644 > index 00000000000..61a58a7fcf4 > --- /dev/null > +++ b/gcc/testsuite/g++.target/aarch64/sve/pr99216.C > @@ -0,0 +1,5 @@ > +/* { dg-do link { target aarch64_asm_sve_ok } } */ > +/* { dg-additional-options "-flto" } */ > +#include <arm_sve.h> > +bool a; > +int main() { a = svaddv(svptrue_b8(), svdup_s8(0)); }
On 08/03/2021 14:57, Richard Biener wrote: > On Mon, Mar 8, 2021 at 12:44 PM Alex Coplan via Gcc-patches > <gcc-patches@gcc.gnu.org> wrote: > > > > Hi all, > > > > As discussed in the PR, we currently have two different numbering > > schemes for SVE builtins: one for C, and one for C++. This is > > problematic for LTO, where we end up getting confused about which > > intrinsic we're talking about. This patch inserts placeholders into the > > registered_functions vector to ensure that there is a consistent > > numbering scheme for both C and C++. > > > > Initially I tried using a null decl for these placeholders, but this > > trips up some validation when the builtin trees are streamed into lto1, > > Care to elaborate? Note we stream builtins by their function code > and thus expect to be able to materialize them later - materializing > to NULL obviously fails but then the issue is the proper decl isn't there > in that case. Materializing a "dummy" decl just will delay the inevitable > breakage. So the validation I was referring to is this part of tree-streamer-in.c:unpack_ts_function_decl_value_fields: else if (cl == BUILT_IN_MD) { tree result = targetm.builtin_decl (fcode, true); if (!result || result == error_mark_node) fatal_error (input_location, "target specific builtin not available"); } Because we stream the original decl and reconstruct it, I was assuming that we would be using this (real) node everywhere and we would never need to map from function code to tree node. Indeed, grepping for "targetm.builtin_decl", the only results are this use, plus a few uses in the avr backend, so it seems that right now this doesn't matter for AArch64 (unless I'm missing some other path). However, it seems that returning a subtly incorrect result for this taget hook is asking for trouble, and we should return the real node here. Richard, do you have a feeling for how we should do this? Perhaps instantiate real nodes (instead of placeholders) if in_lto_p? Or unconditionally? Thanks, Alex > > > so I ended up building dummy FUNCTION_DECLs as placeholders. > > > > To get better test coverage here, I'm wondering if we could make some of > > the existing ACLE tests also run with LTO? Perhaps as a follow-on patch? > > For now I've just added the specific testcase from the PR. > > > > Bootstrapped and regtested on aarch64-linux-gnu, OK for trunk? > > > > Thanks, > > Alex > > > > --- > > > > gcc/ChangeLog: > > > > PR target/99216 > > * config/aarch64/aarch64-sve-builtins.cc > > (function_builder::add_function): Add placeholder_p argument, build > > placeholder decls if this is set. > > (function_builder::add_unique_function): Instead of conditionally adding > > direct overloads, unconditionally add either a direct overload or a > > placeholder. > > (function_builder::add_overloaded_function): Set placeholder_p if we're > > using C++ overloads. > > (function_builder::add_overloaded_functions): Don't return early for > > m_direct_overloads: we need to add placeholders. > > * config/aarch64/aarch64-sve-builtins.h > > (function_builder::add_function): Add placeholder_p argument. > > > > gcc/testsuite/ChangeLog: > > > > PR target/99216 > > * g++.target/aarch64/sve/pr99216.C: New test.
On Mon, Mar 8, 2021 at 4:14 PM Alex Coplan <alex.coplan@arm.com> wrote: > > On 08/03/2021 14:57, Richard Biener wrote: > > On Mon, Mar 8, 2021 at 12:44 PM Alex Coplan via Gcc-patches > > <gcc-patches@gcc.gnu.org> wrote: > > > > > > Hi all, > > > > > > As discussed in the PR, we currently have two different numbering > > > schemes for SVE builtins: one for C, and one for C++. This is > > > problematic for LTO, where we end up getting confused about which > > > intrinsic we're talking about. This patch inserts placeholders into the > > > registered_functions vector to ensure that there is a consistent > > > numbering scheme for both C and C++. > > > > > > Initially I tried using a null decl for these placeholders, but this > > > trips up some validation when the builtin trees are streamed into lto1, > > > > Care to elaborate? Note we stream builtins by their function code > > and thus expect to be able to materialize them later - materializing > > to NULL obviously fails but then the issue is the proper decl isn't there > > in that case. Materializing a "dummy" decl just will delay the inevitable > > breakage. > > So the validation I was referring to is this part of > tree-streamer-in.c:unpack_ts_function_decl_value_fields: > > else if (cl == BUILT_IN_MD) > { > tree result = targetm.builtin_decl (fcode, true); > if (!result || result == error_mark_node) > fatal_error (input_location, "target specific builtin not available"); > } > > Because we stream the original decl and reconstruct it, I was assuming > that we would be using this (real) node everywhere and we would never > need to map from function code to tree node. Indeed, grepping for > "targetm.builtin_decl", the only results are this use, plus a few uses > in the avr backend, so it seems that right now this doesn't matter for > AArch64 (unless I'm missing some other path). Hmm, I thought we were streaming those by reference - indeed this probably got changed when we stopped doing that for the middle-end builtins which we did because those are often "misrecognized". > However, it seems that returning a subtly incorrect result for this > taget hook is asking for trouble, and we should return the real node > here. Yeah, the intent of this target hook is exactly that. There's now of course the option to simply remove it if it just exists for the checking done above (need to check the avr backend, of course). I suppose the cost in the LTO IR stream isn't so big thus we can safely ditch the optimization forever? > Richard, do you have a feeling for how we should do this? Perhaps > instantiate real nodes (instead of placeholders) if in_lto_p? Or > unconditionally? > > Thanks, > Alex > > > > > > so I ended up building dummy FUNCTION_DECLs as placeholders. > > > > > > To get better test coverage here, I'm wondering if we could make some of > > > the existing ACLE tests also run with LTO? Perhaps as a follow-on patch? > > > For now I've just added the specific testcase from the PR. > > > > > > Bootstrapped and regtested on aarch64-linux-gnu, OK for trunk? > > > > > > Thanks, > > > Alex > > > > > > --- > > > > > > gcc/ChangeLog: > > > > > > PR target/99216 > > > * config/aarch64/aarch64-sve-builtins.cc > > > (function_builder::add_function): Add placeholder_p argument, build > > > placeholder decls if this is set. > > > (function_builder::add_unique_function): Instead of conditionally adding > > > direct overloads, unconditionally add either a direct overload or a > > > placeholder. > > > (function_builder::add_overloaded_function): Set placeholder_p if we're > > > using C++ overloads. > > > (function_builder::add_overloaded_functions): Don't return early for > > > m_direct_overloads: we need to add placeholders. > > > * config/aarch64/aarch64-sve-builtins.h > > > (function_builder::add_function): Add placeholder_p argument. > > > > > > gcc/testsuite/ChangeLog: > > > > > > PR target/99216 > > > * g++.target/aarch64/sve/pr99216.C: New test.
On 08/03/2021 16:21, Richard Biener wrote: > On Mon, Mar 8, 2021 at 4:14 PM Alex Coplan <alex.coplan@arm.com> wrote: > > > > On 08/03/2021 14:57, Richard Biener wrote: > > > On Mon, Mar 8, 2021 at 12:44 PM Alex Coplan via Gcc-patches > > > <gcc-patches@gcc.gnu.org> wrote: > > > > > > > > Hi all, > > > > > > > > As discussed in the PR, we currently have two different numbering > > > > schemes for SVE builtins: one for C, and one for C++. This is > > > > problematic for LTO, where we end up getting confused about which > > > > intrinsic we're talking about. This patch inserts placeholders into the > > > > registered_functions vector to ensure that there is a consistent > > > > numbering scheme for both C and C++. > > > > > > > > Initially I tried using a null decl for these placeholders, but this > > > > trips up some validation when the builtin trees are streamed into lto1, > > > > > > Care to elaborate? Note we stream builtins by their function code > > > and thus expect to be able to materialize them later - materializing > > > to NULL obviously fails but then the issue is the proper decl isn't there > > > in that case. Materializing a "dummy" decl just will delay the inevitable > > > breakage. > > > > So the validation I was referring to is this part of > > tree-streamer-in.c:unpack_ts_function_decl_value_fields: > > > > else if (cl == BUILT_IN_MD) > > { > > tree result = targetm.builtin_decl (fcode, true); > > if (!result || result == error_mark_node) > > fatal_error (input_location, "target specific builtin not available"); > > } > > > > Because we stream the original decl and reconstruct it, I was assuming > > that we would be using this (real) node everywhere and we would never > > need to map from function code to tree node. Indeed, grepping for > > "targetm.builtin_decl", the only results are this use, plus a few uses > > in the avr backend, so it seems that right now this doesn't matter for > > AArch64 (unless I'm missing some other path). > > Hmm, I thought we were streaming those by reference - indeed > this probably got changed when we stopped doing that for > the middle-end builtins which we did because those are often > "misrecognized". > > > However, it seems that returning a subtly incorrect result for this > > taget hook is asking for trouble, and we should return the real node > > here. > > Yeah, the intent of this target hook is exactly that. There's now of course > the option to simply remove it if it just exists for the checking done above > (need to check the avr backend, of course). I suppose the cost in the > LTO IR stream isn't so big thus we can safely ditch the optimization > forever? The option of just removing the target hook seems appealing, if that would be acceptable. It looks like avr could easily be changed to just call its internal implementation (avr_builtin_decl) instead of the target hook. > > > Richard, do you have a feeling for how we should do this? Perhaps > > instantiate real nodes (instead of placeholders) if in_lto_p? Or > > unconditionally? > > > > Thanks, > > Alex > > > > > > > > > so I ended up building dummy FUNCTION_DECLs as placeholders. > > > > > > > > To get better test coverage here, I'm wondering if we could make some of > > > > the existing ACLE tests also run with LTO? Perhaps as a follow-on patch? > > > > For now I've just added the specific testcase from the PR. > > > > > > > > Bootstrapped and regtested on aarch64-linux-gnu, OK for trunk? > > > > > > > > Thanks, > > > > Alex > > > > > > > > --- > > > > > > > > gcc/ChangeLog: > > > > > > > > PR target/99216 > > > > * config/aarch64/aarch64-sve-builtins.cc > > > > (function_builder::add_function): Add placeholder_p argument, build > > > > placeholder decls if this is set. > > > > (function_builder::add_unique_function): Instead of conditionally adding > > > > direct overloads, unconditionally add either a direct overload or a > > > > placeholder. > > > > (function_builder::add_overloaded_function): Set placeholder_p if we're > > > > using C++ overloads. > > > > (function_builder::add_overloaded_functions): Don't return early for > > > > m_direct_overloads: we need to add placeholders. > > > > * config/aarch64/aarch64-sve-builtins.h > > > > (function_builder::add_function): Add placeholder_p argument. > > > > > > > > gcc/testsuite/ChangeLog: > > > > > > > > PR target/99216 > > > > * g++.target/aarch64/sve/pr99216.C: New test.
On Mon, Mar 8, 2021 at 4:46 PM Alex Coplan <alex.coplan@arm.com> wrote: > > On 08/03/2021 16:21, Richard Biener wrote: > > On Mon, Mar 8, 2021 at 4:14 PM Alex Coplan <alex.coplan@arm.com> wrote: > > > > > > On 08/03/2021 14:57, Richard Biener wrote: > > > > On Mon, Mar 8, 2021 at 12:44 PM Alex Coplan via Gcc-patches > > > > <gcc-patches@gcc.gnu.org> wrote: > > > > > > > > > > Hi all, > > > > > > > > > > As discussed in the PR, we currently have two different numbering > > > > > schemes for SVE builtins: one for C, and one for C++. This is > > > > > problematic for LTO, where we end up getting confused about which > > > > > intrinsic we're talking about. This patch inserts placeholders into the > > > > > registered_functions vector to ensure that there is a consistent > > > > > numbering scheme for both C and C++. > > > > > > > > > > Initially I tried using a null decl for these placeholders, but this > > > > > trips up some validation when the builtin trees are streamed into lto1, > > > > > > > > Care to elaborate? Note we stream builtins by their function code > > > > and thus expect to be able to materialize them later - materializing > > > > to NULL obviously fails but then the issue is the proper decl isn't there > > > > in that case. Materializing a "dummy" decl just will delay the inevitable > > > > breakage. > > > > > > So the validation I was referring to is this part of > > > tree-streamer-in.c:unpack_ts_function_decl_value_fields: > > > > > > else if (cl == BUILT_IN_MD) > > > { > > > tree result = targetm.builtin_decl (fcode, true); > > > if (!result || result == error_mark_node) > > > fatal_error (input_location, "target specific builtin not available"); > > > } > > > > > > Because we stream the original decl and reconstruct it, I was assuming > > > that we would be using this (real) node everywhere and we would never > > > need to map from function code to tree node. Indeed, grepping for > > > "targetm.builtin_decl", the only results are this use, plus a few uses > > > in the avr backend, so it seems that right now this doesn't matter for > > > AArch64 (unless I'm missing some other path). > > > > Hmm, I thought we were streaming those by reference - indeed > > this probably got changed when we stopped doing that for > > the middle-end builtins which we did because those are often > > "misrecognized". > > > > > However, it seems that returning a subtly incorrect result for this > > > taget hook is asking for trouble, and we should return the real node > > > here. > > > > Yeah, the intent of this target hook is exactly that. There's now of course > > the option to simply remove it if it just exists for the checking done above > > (need to check the avr backend, of course). I suppose the cost in the > > LTO IR stream isn't so big thus we can safely ditch the optimization > > forever? > > The option of just removing the target hook seems appealing, if that > would be acceptable. It looks like avr could easily be changed to just > call its internal implementation (avr_builtin_decl) instead of the > target hook. OK, let's not rush things but do this for next stage1 if desired. Given you know how the target hook works you can also return integer_zero_node to make the code happy ... (just building fake decls feels somewhat odd) > > > > > Richard, do you have a feeling for how we should do this? Perhaps > > > instantiate real nodes (instead of placeholders) if in_lto_p? Or > > > unconditionally? > > > > > > Thanks, > > > Alex > > > > > > > > > > > > so I ended up building dummy FUNCTION_DECLs as placeholders. > > > > > > > > > > To get better test coverage here, I'm wondering if we could make some of > > > > > the existing ACLE tests also run with LTO? Perhaps as a follow-on patch? > > > > > For now I've just added the specific testcase from the PR. > > > > > > > > > > Bootstrapped and regtested on aarch64-linux-gnu, OK for trunk? > > > > > > > > > > Thanks, > > > > > Alex > > > > > > > > > > --- > > > > > > > > > > gcc/ChangeLog: > > > > > > > > > > PR target/99216 > > > > > * config/aarch64/aarch64-sve-builtins.cc > > > > > (function_builder::add_function): Add placeholder_p argument, build > > > > > placeholder decls if this is set. > > > > > (function_builder::add_unique_function): Instead of conditionally adding > > > > > direct overloads, unconditionally add either a direct overload or a > > > > > placeholder. > > > > > (function_builder::add_overloaded_function): Set placeholder_p if we're > > > > > using C++ overloads. > > > > > (function_builder::add_overloaded_functions): Don't return early for > > > > > m_direct_overloads: we need to add placeholders. > > > > > * config/aarch64/aarch64-sve-builtins.h > > > > > (function_builder::add_function): Add placeholder_p argument. > > > > > > > > > > gcc/testsuite/ChangeLog: > > > > > > > > > > PR target/99216 > > > > > * g++.target/aarch64/sve/pr99216.C: New test.
diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc index 25612d2ea52..c0ba352599c 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.cc +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc @@ -999,12 +999,25 @@ registered_function & function_builder::add_function (const function_instance &instance, const char *name, tree fntype, tree attrs, uint64_t required_extensions, - bool overloaded_p) + bool overloaded_p, + bool placeholder_p) { unsigned int code = vec_safe_length (registered_functions); code = (code << AARCH64_BUILTIN_SHIFT) | AARCH64_BUILTIN_SVE; - tree decl = simulate_builtin_function_decl (input_location, name, fntype, - code, NULL, attrs); + + /* We need to be able to generate placeholders to enusre that we have a + consistent numbering scheme for function codes between the C and C++ + frontends, so that everything ties up in LTO. + + The placeholder needs to be non-NULL and some node other than + error_mark_node to pass validation when the builtin is streamed in to lto1. + + It's also convenient if the placeholder can store the name, so build a + FUNCTION_DECL, but don't register it with the frontend. */ + tree decl = placeholder_p + ? build_decl (input_location, FUNCTION_DECL, get_identifier (name), fntype) + : simulate_builtin_function_decl (input_location, name, fntype, + code, NULL, attrs); registered_function &rfn = *ggc_alloc <registered_function> (); rfn.instance = instance; @@ -1036,7 +1049,7 @@ function_builder::add_unique_function (const function_instance &instance, argument_types.address ()); tree attrs = get_attributes (instance); registered_function &rfn = add_function (instance, name, fntype, attrs, - required_extensions, false); + required_extensions, false, false); /* Enter the function into the hash table. */ hashval_t hash = instance.hash (); @@ -1047,16 +1060,14 @@ function_builder::add_unique_function (const function_instance &instance, /* Also add the function under its overloaded alias, if we want a separate decl for each instance of an overloaded function. */ - if (m_direct_overloads || force_direct_overloads) + char *overload_name = get_name (instance, true); + if (strcmp (name, overload_name) != 0) { - char *overload_name = get_name (instance, true); - if (strcmp (name, overload_name) != 0) - { - /* Attribute lists shouldn't be shared. */ - tree attrs = get_attributes (instance); - add_function (instance, overload_name, fntype, attrs, - required_extensions, false); - } + /* Attribute lists shouldn't be shared. */ + tree attrs = get_attributes (instance); + bool placeholder_p = !(m_direct_overloads || force_direct_overloads); + add_function (instance, overload_name, fntype, attrs, + required_extensions, false, placeholder_p); } obstack_free (&m_string_obstack, name); @@ -1084,7 +1095,7 @@ function_builder::add_overloaded_function (const function_instance &instance, { registered_function &rfn = add_function (instance, name, m_overload_type, NULL_TREE, - required_extensions, true); + required_extensions, true, m_direct_overloads); const char *permanent_name = IDENTIFIER_POINTER (DECL_NAME (rfn.decl)); m_overload_names.put (permanent_name, &rfn); } @@ -1098,9 +1109,6 @@ void function_builder::add_overloaded_functions (const function_group_info &group, mode_suffix_index mode) { - if (m_direct_overloads) - return; - unsigned int explicit_type0 = (*group.shape)->explicit_type_suffix_p (0); unsigned int explicit_type1 = (*group.shape)->explicit_type_suffix_p (1); for (unsigned int pi = 0; group.preds[pi] != NUM_PREDS; ++pi) diff --git a/gcc/config/aarch64/aarch64-sve-builtins.h b/gcc/config/aarch64/aarch64-sve-builtins.h index 620e18841b2..b701f90ac1a 100644 --- a/gcc/config/aarch64/aarch64-sve-builtins.h +++ b/gcc/config/aarch64/aarch64-sve-builtins.h @@ -337,7 +337,8 @@ private: tree get_attributes (const function_instance &); registered_function &add_function (const function_instance &, - const char *, tree, tree, uint64_t, bool); + const char *, tree, tree, + uint64_t, bool, bool); /* The function type to use for functions that are resolved by function_resolver. */ diff --git a/gcc/testsuite/g++.target/aarch64/sve/pr99216.C b/gcc/testsuite/g++.target/aarch64/sve/pr99216.C new file mode 100644 index 00000000000..61a58a7fcf4 --- /dev/null +++ b/gcc/testsuite/g++.target/aarch64/sve/pr99216.C @@ -0,0 +1,5 @@ +/* { dg-do link { target aarch64_asm_sve_ok } } */ +/* { dg-additional-options "-flto" } */ +#include <arm_sve.h> +bool a; +int main() { a = svaddv(svptrue_b8(), svdup_s8(0)); }