diff mbox series

aarch64: Fix SVE ACLE builtins with LTO [PR99216]

Message ID 20210308105710.5o5yl52ld6d2b7rq@arm.com
State New
Headers show
Series aarch64: Fix SVE ACLE builtins with LTO [PR99216] | expand

Commit Message

Alex Coplan March 8, 2021, 10:57 a.m. UTC
Hi all,

As discussed in the PR, we currently have two different numbering
schemes for SVE builtins: one for C, and one for C++. This is
problematic for LTO, where we end up getting confused about which
intrinsic we're talking about. This patch inserts placeholders into the
registered_functions vector to ensure that there is a consistent
numbering scheme for both C and C++.

Initially I tried using a null decl for these placeholders, but this
trips up some validation when the builtin trees are streamed into lto1,
so I ended up building dummy FUNCTION_DECLs as placeholders.

To get better test coverage here, I'm wondering if we could make some of
the existing ACLE tests also run with LTO? Perhaps as a follow-on patch?
For now I've just added the specific testcase from the PR.

Bootstrapped and regtested on aarch64-linux-gnu, OK for trunk?

Thanks,
Alex

---

gcc/ChangeLog:

	PR target/99216
	* config/aarch64/aarch64-sve-builtins.cc
	(function_builder::add_function): Add placeholder_p argument, build
	placeholder decls if this is set.
	(function_builder::add_unique_function): Instead of conditionally adding
	direct overloads, unconditionally add either a direct overload or a
	placeholder.
	(function_builder::add_overloaded_function): Set placeholder_p if we're
	using C++ overloads.
	(function_builder::add_overloaded_functions): Don't return early for
	m_direct_overloads: we need to add placeholders.
	* config/aarch64/aarch64-sve-builtins.h
	(function_builder::add_function): Add placeholder_p argument.

gcc/testsuite/ChangeLog:

	PR target/99216
	* g++.target/aarch64/sve/pr99216.C: New test.

Comments

Richard Biener March 8, 2021, 1:57 p.m. UTC | #1
On Mon, Mar 8, 2021 at 12:44 PM Alex Coplan via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> Hi all,
>
> As discussed in the PR, we currently have two different numbering
> schemes for SVE builtins: one for C, and one for C++. This is
> problematic for LTO, where we end up getting confused about which
> intrinsic we're talking about. This patch inserts placeholders into the
> registered_functions vector to ensure that there is a consistent
> numbering scheme for both C and C++.
>
> Initially I tried using a null decl for these placeholders, but this
> trips up some validation when the builtin trees are streamed into lto1,

Care to elaborate?  Note we stream builtins by their function code
and thus expect to be able to materialize them later - materializing
to NULL obviously fails but then the issue is the proper decl isn't there
in that case.  Materializing a "dummy" decl just will delay the inevitable
breakage.

> so I ended up building dummy FUNCTION_DECLs as placeholders.
>
> To get better test coverage here, I'm wondering if we could make some of
> the existing ACLE tests also run with LTO? Perhaps as a follow-on patch?
> For now I've just added the specific testcase from the PR.
>
> Bootstrapped and regtested on aarch64-linux-gnu, OK for trunk?
>
> Thanks,
> Alex
>
> ---
>
> gcc/ChangeLog:
>
>         PR target/99216
>         * config/aarch64/aarch64-sve-builtins.cc
>         (function_builder::add_function): Add placeholder_p argument, build
>         placeholder decls if this is set.
>         (function_builder::add_unique_function): Instead of conditionally adding
>         direct overloads, unconditionally add either a direct overload or a
>         placeholder.
>         (function_builder::add_overloaded_function): Set placeholder_p if we're
>         using C++ overloads.
>         (function_builder::add_overloaded_functions): Don't return early for
>         m_direct_overloads: we need to add placeholders.
>         * config/aarch64/aarch64-sve-builtins.h
>         (function_builder::add_function): Add placeholder_p argument.
>
> gcc/testsuite/ChangeLog:
>
>         PR target/99216
>         * g++.target/aarch64/sve/pr99216.C: New test.
Richard Sandiford March 8, 2021, 2:10 p.m. UTC | #2
Alex Coplan <alex.coplan@arm.com> writes:
> Hi all,
>
> As discussed in the PR, we currently have two different numbering
> schemes for SVE builtins: one for C, and one for C++. This is
> problematic for LTO, where we end up getting confused about which
> intrinsic we're talking about. This patch inserts placeholders into the
> registered_functions vector to ensure that there is a consistent
> numbering scheme for both C and C++.
>
> Initially I tried using a null decl for these placeholders, but this
> trips up some validation when the builtin trees are streamed into lto1,
> so I ended up building dummy FUNCTION_DECLs as placeholders.
>
> To get better test coverage here, I'm wondering if we could make some of
> the existing ACLE tests also run with LTO? Perhaps as a follow-on patch?
> For now I've just added the specific testcase from the PR.
>
> Bootstrapped and regtested on aarch64-linux-gnu, OK for trunk?

OK, thanks.  I think we should backport this to GCC 10, but as Richi says,
that would need a bump in lto-streamer.h:LTO_minor_version.

Thanks,
Richard

>
> Thanks,
> Alex
>
> ---
>
> gcc/ChangeLog:
>
> 	PR target/99216
> 	* config/aarch64/aarch64-sve-builtins.cc
> 	(function_builder::add_function): Add placeholder_p argument, build
> 	placeholder decls if this is set.
> 	(function_builder::add_unique_function): Instead of conditionally adding
> 	direct overloads, unconditionally add either a direct overload or a
> 	placeholder.
> 	(function_builder::add_overloaded_function): Set placeholder_p if we're
> 	using C++ overloads.
> 	(function_builder::add_overloaded_functions): Don't return early for
> 	m_direct_overloads: we need to add placeholders.
> 	* config/aarch64/aarch64-sve-builtins.h
> 	(function_builder::add_function): Add placeholder_p argument.
>
> gcc/testsuite/ChangeLog:
>
> 	PR target/99216
> 	* g++.target/aarch64/sve/pr99216.C: New test.
>
> diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc
> index 25612d2ea52..c0ba352599c 100644
> --- a/gcc/config/aarch64/aarch64-sve-builtins.cc
> +++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
> @@ -999,12 +999,25 @@ registered_function &
>  function_builder::add_function (const function_instance &instance,
>  				const char *name, tree fntype, tree attrs,
>  				uint64_t required_extensions,
> -				bool overloaded_p)
> +				bool overloaded_p,
> +				bool placeholder_p)
>  {
>    unsigned int code = vec_safe_length (registered_functions);
>    code = (code << AARCH64_BUILTIN_SHIFT) | AARCH64_BUILTIN_SVE;
> -  tree decl = simulate_builtin_function_decl (input_location, name, fntype,
> -					      code, NULL, attrs);
> +
> +  /* We need to be able to generate placeholders to enusre that we have a
> +     consistent numbering scheme for function codes between the C and C++
> +     frontends, so that everything ties up in LTO.
> +
> +     The placeholder needs to be non-NULL and some node other than
> +     error_mark_node to pass validation when the builtin is streamed in to lto1.
> +
> +     It's also convenient if the placeholder can store the name, so build a
> +     FUNCTION_DECL, but don't register it with the frontend.  */
> +  tree decl = placeholder_p
> +    ? build_decl (input_location, FUNCTION_DECL, get_identifier (name), fntype)
> +    : simulate_builtin_function_decl (input_location, name, fntype,
> +				      code, NULL, attrs);
>  
>    registered_function &rfn = *ggc_alloc <registered_function> ();
>    rfn.instance = instance;
> @@ -1036,7 +1049,7 @@ function_builder::add_unique_function (const function_instance &instance,
>  					   argument_types.address ());
>    tree attrs = get_attributes (instance);
>    registered_function &rfn = add_function (instance, name, fntype, attrs,
> -					   required_extensions, false);
> +					   required_extensions, false, false);
>  
>    /* Enter the function into the hash table.  */
>    hashval_t hash = instance.hash ();
> @@ -1047,16 +1060,14 @@ function_builder::add_unique_function (const function_instance &instance,
>  
>    /* Also add the function under its overloaded alias, if we want
>       a separate decl for each instance of an overloaded function.  */
> -  if (m_direct_overloads || force_direct_overloads)
> +  char *overload_name = get_name (instance, true);
> +  if (strcmp (name, overload_name) != 0)
>      {
> -      char *overload_name = get_name (instance, true);
> -      if (strcmp (name, overload_name) != 0)
> -	{
> -	  /* Attribute lists shouldn't be shared.  */
> -	  tree attrs = get_attributes (instance);
> -	  add_function (instance, overload_name, fntype, attrs,
> -			required_extensions, false);
> -	}
> +      /* Attribute lists shouldn't be shared.  */
> +      tree attrs = get_attributes (instance);
> +      bool placeholder_p = !(m_direct_overloads || force_direct_overloads);
> +      add_function (instance, overload_name, fntype, attrs,
> +		    required_extensions, false, placeholder_p);
>      }
>  
>    obstack_free (&m_string_obstack, name);
> @@ -1084,7 +1095,7 @@ function_builder::add_overloaded_function (const function_instance &instance,
>      {
>        registered_function &rfn
>  	= add_function (instance, name, m_overload_type, NULL_TREE,
> -			required_extensions, true);
> +			required_extensions, true, m_direct_overloads);
>        const char *permanent_name = IDENTIFIER_POINTER (DECL_NAME (rfn.decl));
>        m_overload_names.put (permanent_name, &rfn);
>      }
> @@ -1098,9 +1109,6 @@ void
>  function_builder::add_overloaded_functions (const function_group_info &group,
>  					    mode_suffix_index mode)
>  {
> -  if (m_direct_overloads)
> -    return;
> -
>    unsigned int explicit_type0 = (*group.shape)->explicit_type_suffix_p (0);
>    unsigned int explicit_type1 = (*group.shape)->explicit_type_suffix_p (1);
>    for (unsigned int pi = 0; group.preds[pi] != NUM_PREDS; ++pi)
> diff --git a/gcc/config/aarch64/aarch64-sve-builtins.h b/gcc/config/aarch64/aarch64-sve-builtins.h
> index 620e18841b2..b701f90ac1a 100644
> --- a/gcc/config/aarch64/aarch64-sve-builtins.h
> +++ b/gcc/config/aarch64/aarch64-sve-builtins.h
> @@ -337,7 +337,8 @@ private:
>    tree get_attributes (const function_instance &);
>  
>    registered_function &add_function (const function_instance &,
> -				     const char *, tree, tree, uint64_t, bool);
> +				     const char *, tree, tree,
> +				     uint64_t, bool, bool);
>  
>    /* The function type to use for functions that are resolved by
>       function_resolver.  */
> diff --git a/gcc/testsuite/g++.target/aarch64/sve/pr99216.C b/gcc/testsuite/g++.target/aarch64/sve/pr99216.C
> new file mode 100644
> index 00000000000..61a58a7fcf4
> --- /dev/null
> +++ b/gcc/testsuite/g++.target/aarch64/sve/pr99216.C
> @@ -0,0 +1,5 @@
> +/* { dg-do link { target aarch64_asm_sve_ok } } */
> +/* { dg-additional-options "-flto" } */
> +#include <arm_sve.h>
> +bool a;
> +int main() { a = svaddv(svptrue_b8(), svdup_s8(0)); }
Alex Coplan March 8, 2021, 3:14 p.m. UTC | #3
On 08/03/2021 14:57, Richard Biener wrote:
> On Mon, Mar 8, 2021 at 12:44 PM Alex Coplan via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> > Hi all,
> >
> > As discussed in the PR, we currently have two different numbering
> > schemes for SVE builtins: one for C, and one for C++. This is
> > problematic for LTO, where we end up getting confused about which
> > intrinsic we're talking about. This patch inserts placeholders into the
> > registered_functions vector to ensure that there is a consistent
> > numbering scheme for both C and C++.
> >
> > Initially I tried using a null decl for these placeholders, but this
> > trips up some validation when the builtin trees are streamed into lto1,
> 
> Care to elaborate?  Note we stream builtins by their function code
> and thus expect to be able to materialize them later - materializing
> to NULL obviously fails but then the issue is the proper decl isn't there
> in that case.  Materializing a "dummy" decl just will delay the inevitable
> breakage.

So the validation I was referring to is this part of
tree-streamer-in.c:unpack_ts_function_decl_value_fields:

    else if (cl == BUILT_IN_MD)
    {
      tree result = targetm.builtin_decl (fcode, true);
      if (!result || result == error_mark_node)
        fatal_error (input_location, "target specific builtin not available");
    }

Because we stream the original decl and reconstruct it, I was assuming
that we would be using this (real) node everywhere and we would never
need to map from function code to tree node. Indeed, grepping for
"targetm.builtin_decl", the only results are this use, plus a few uses
in the avr backend, so it seems that right now this doesn't matter for
AArch64 (unless I'm missing some other path).

However, it seems that returning a subtly incorrect result for this
taget hook is asking for trouble, and we should return the real node
here.

Richard, do you have a feeling for how we should do this? Perhaps
instantiate real nodes (instead of placeholders) if in_lto_p? Or
unconditionally?

Thanks,
Alex

> 
> > so I ended up building dummy FUNCTION_DECLs as placeholders.
> >
> > To get better test coverage here, I'm wondering if we could make some of
> > the existing ACLE tests also run with LTO? Perhaps as a follow-on patch?
> > For now I've just added the specific testcase from the PR.
> >
> > Bootstrapped and regtested on aarch64-linux-gnu, OK for trunk?
> >
> > Thanks,
> > Alex
> >
> > ---
> >
> > gcc/ChangeLog:
> >
> >         PR target/99216
> >         * config/aarch64/aarch64-sve-builtins.cc
> >         (function_builder::add_function): Add placeholder_p argument, build
> >         placeholder decls if this is set.
> >         (function_builder::add_unique_function): Instead of conditionally adding
> >         direct overloads, unconditionally add either a direct overload or a
> >         placeholder.
> >         (function_builder::add_overloaded_function): Set placeholder_p if we're
> >         using C++ overloads.
> >         (function_builder::add_overloaded_functions): Don't return early for
> >         m_direct_overloads: we need to add placeholders.
> >         * config/aarch64/aarch64-sve-builtins.h
> >         (function_builder::add_function): Add placeholder_p argument.
> >
> > gcc/testsuite/ChangeLog:
> >
> >         PR target/99216
> >         * g++.target/aarch64/sve/pr99216.C: New test.
Richard Biener March 8, 2021, 3:21 p.m. UTC | #4
On Mon, Mar 8, 2021 at 4:14 PM Alex Coplan <alex.coplan@arm.com> wrote:
>
> On 08/03/2021 14:57, Richard Biener wrote:
> > On Mon, Mar 8, 2021 at 12:44 PM Alex Coplan via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> > >
> > > Hi all,
> > >
> > > As discussed in the PR, we currently have two different numbering
> > > schemes for SVE builtins: one for C, and one for C++. This is
> > > problematic for LTO, where we end up getting confused about which
> > > intrinsic we're talking about. This patch inserts placeholders into the
> > > registered_functions vector to ensure that there is a consistent
> > > numbering scheme for both C and C++.
> > >
> > > Initially I tried using a null decl for these placeholders, but this
> > > trips up some validation when the builtin trees are streamed into lto1,
> >
> > Care to elaborate?  Note we stream builtins by their function code
> > and thus expect to be able to materialize them later - materializing
> > to NULL obviously fails but then the issue is the proper decl isn't there
> > in that case.  Materializing a "dummy" decl just will delay the inevitable
> > breakage.
>
> So the validation I was referring to is this part of
> tree-streamer-in.c:unpack_ts_function_decl_value_fields:
>
>     else if (cl == BUILT_IN_MD)
>     {
>       tree result = targetm.builtin_decl (fcode, true);
>       if (!result || result == error_mark_node)
>         fatal_error (input_location, "target specific builtin not available");
>     }
>
> Because we stream the original decl and reconstruct it, I was assuming
> that we would be using this (real) node everywhere and we would never
> need to map from function code to tree node. Indeed, grepping for
> "targetm.builtin_decl", the only results are this use, plus a few uses
> in the avr backend, so it seems that right now this doesn't matter for
> AArch64 (unless I'm missing some other path).

Hmm, I thought we were streaming those by reference - indeed
this probably got changed when we stopped doing that for
the middle-end builtins which we did because those are often
"misrecognized".

> However, it seems that returning a subtly incorrect result for this
> taget hook is asking for trouble, and we should return the real node
> here.

Yeah, the intent of this target hook is exactly that.  There's now of course
the option to simply remove it if it just exists for the checking done above
(need to check the avr backend, of course).  I suppose the cost in the
LTO IR stream isn't so big thus we can safely ditch the optimization
forever?

> Richard, do you have a feeling for how we should do this? Perhaps
> instantiate real nodes (instead of placeholders) if in_lto_p? Or
> unconditionally?
>
> Thanks,
> Alex
>
> >
> > > so I ended up building dummy FUNCTION_DECLs as placeholders.
> > >
> > > To get better test coverage here, I'm wondering if we could make some of
> > > the existing ACLE tests also run with LTO? Perhaps as a follow-on patch?
> > > For now I've just added the specific testcase from the PR.
> > >
> > > Bootstrapped and regtested on aarch64-linux-gnu, OK for trunk?
> > >
> > > Thanks,
> > > Alex
> > >
> > > ---
> > >
> > > gcc/ChangeLog:
> > >
> > >         PR target/99216
> > >         * config/aarch64/aarch64-sve-builtins.cc
> > >         (function_builder::add_function): Add placeholder_p argument, build
> > >         placeholder decls if this is set.
> > >         (function_builder::add_unique_function): Instead of conditionally adding
> > >         direct overloads, unconditionally add either a direct overload or a
> > >         placeholder.
> > >         (function_builder::add_overloaded_function): Set placeholder_p if we're
> > >         using C++ overloads.
> > >         (function_builder::add_overloaded_functions): Don't return early for
> > >         m_direct_overloads: we need to add placeholders.
> > >         * config/aarch64/aarch64-sve-builtins.h
> > >         (function_builder::add_function): Add placeholder_p argument.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >         PR target/99216
> > >         * g++.target/aarch64/sve/pr99216.C: New test.
Alex Coplan March 8, 2021, 3:45 p.m. UTC | #5
On 08/03/2021 16:21, Richard Biener wrote:
> On Mon, Mar 8, 2021 at 4:14 PM Alex Coplan <alex.coplan@arm.com> wrote:
> >
> > On 08/03/2021 14:57, Richard Biener wrote:
> > > On Mon, Mar 8, 2021 at 12:44 PM Alex Coplan via Gcc-patches
> > > <gcc-patches@gcc.gnu.org> wrote:
> > > >
> > > > Hi all,
> > > >
> > > > As discussed in the PR, we currently have two different numbering
> > > > schemes for SVE builtins: one for C, and one for C++. This is
> > > > problematic for LTO, where we end up getting confused about which
> > > > intrinsic we're talking about. This patch inserts placeholders into the
> > > > registered_functions vector to ensure that there is a consistent
> > > > numbering scheme for both C and C++.
> > > >
> > > > Initially I tried using a null decl for these placeholders, but this
> > > > trips up some validation when the builtin trees are streamed into lto1,
> > >
> > > Care to elaborate?  Note we stream builtins by their function code
> > > and thus expect to be able to materialize them later - materializing
> > > to NULL obviously fails but then the issue is the proper decl isn't there
> > > in that case.  Materializing a "dummy" decl just will delay the inevitable
> > > breakage.
> >
> > So the validation I was referring to is this part of
> > tree-streamer-in.c:unpack_ts_function_decl_value_fields:
> >
> >     else if (cl == BUILT_IN_MD)
> >     {
> >       tree result = targetm.builtin_decl (fcode, true);
> >       if (!result || result == error_mark_node)
> >         fatal_error (input_location, "target specific builtin not available");
> >     }
> >
> > Because we stream the original decl and reconstruct it, I was assuming
> > that we would be using this (real) node everywhere and we would never
> > need to map from function code to tree node. Indeed, grepping for
> > "targetm.builtin_decl", the only results are this use, plus a few uses
> > in the avr backend, so it seems that right now this doesn't matter for
> > AArch64 (unless I'm missing some other path).
> 
> Hmm, I thought we were streaming those by reference - indeed
> this probably got changed when we stopped doing that for
> the middle-end builtins which we did because those are often
> "misrecognized".
> 
> > However, it seems that returning a subtly incorrect result for this
> > taget hook is asking for trouble, and we should return the real node
> > here.
> 
> Yeah, the intent of this target hook is exactly that.  There's now of course
> the option to simply remove it if it just exists for the checking done above
> (need to check the avr backend, of course).  I suppose the cost in the
> LTO IR stream isn't so big thus we can safely ditch the optimization
> forever?

The option of just removing the target hook seems appealing, if that
would be acceptable. It looks like avr could easily be changed to just
call its internal implementation (avr_builtin_decl) instead of the
target hook.

> 
> > Richard, do you have a feeling for how we should do this? Perhaps
> > instantiate real nodes (instead of placeholders) if in_lto_p? Or
> > unconditionally?
> >
> > Thanks,
> > Alex
> >
> > >
> > > > so I ended up building dummy FUNCTION_DECLs as placeholders.
> > > >
> > > > To get better test coverage here, I'm wondering if we could make some of
> > > > the existing ACLE tests also run with LTO? Perhaps as a follow-on patch?
> > > > For now I've just added the specific testcase from the PR.
> > > >
> > > > Bootstrapped and regtested on aarch64-linux-gnu, OK for trunk?
> > > >
> > > > Thanks,
> > > > Alex
> > > >
> > > > ---
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > >         PR target/99216
> > > >         * config/aarch64/aarch64-sve-builtins.cc
> > > >         (function_builder::add_function): Add placeholder_p argument, build
> > > >         placeholder decls if this is set.
> > > >         (function_builder::add_unique_function): Instead of conditionally adding
> > > >         direct overloads, unconditionally add either a direct overload or a
> > > >         placeholder.
> > > >         (function_builder::add_overloaded_function): Set placeholder_p if we're
> > > >         using C++ overloads.
> > > >         (function_builder::add_overloaded_functions): Don't return early for
> > > >         m_direct_overloads: we need to add placeholders.
> > > >         * config/aarch64/aarch64-sve-builtins.h
> > > >         (function_builder::add_function): Add placeholder_p argument.
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > >         PR target/99216
> > > >         * g++.target/aarch64/sve/pr99216.C: New test.
Richard Biener March 9, 2021, 8:43 a.m. UTC | #6
On Mon, Mar 8, 2021 at 4:46 PM Alex Coplan <alex.coplan@arm.com> wrote:
>
> On 08/03/2021 16:21, Richard Biener wrote:
> > On Mon, Mar 8, 2021 at 4:14 PM Alex Coplan <alex.coplan@arm.com> wrote:
> > >
> > > On 08/03/2021 14:57, Richard Biener wrote:
> > > > On Mon, Mar 8, 2021 at 12:44 PM Alex Coplan via Gcc-patches
> > > > <gcc-patches@gcc.gnu.org> wrote:
> > > > >
> > > > > Hi all,
> > > > >
> > > > > As discussed in the PR, we currently have two different numbering
> > > > > schemes for SVE builtins: one for C, and one for C++. This is
> > > > > problematic for LTO, where we end up getting confused about which
> > > > > intrinsic we're talking about. This patch inserts placeholders into the
> > > > > registered_functions vector to ensure that there is a consistent
> > > > > numbering scheme for both C and C++.
> > > > >
> > > > > Initially I tried using a null decl for these placeholders, but this
> > > > > trips up some validation when the builtin trees are streamed into lto1,
> > > >
> > > > Care to elaborate?  Note we stream builtins by their function code
> > > > and thus expect to be able to materialize them later - materializing
> > > > to NULL obviously fails but then the issue is the proper decl isn't there
> > > > in that case.  Materializing a "dummy" decl just will delay the inevitable
> > > > breakage.
> > >
> > > So the validation I was referring to is this part of
> > > tree-streamer-in.c:unpack_ts_function_decl_value_fields:
> > >
> > >     else if (cl == BUILT_IN_MD)
> > >     {
> > >       tree result = targetm.builtin_decl (fcode, true);
> > >       if (!result || result == error_mark_node)
> > >         fatal_error (input_location, "target specific builtin not available");
> > >     }
> > >
> > > Because we stream the original decl and reconstruct it, I was assuming
> > > that we would be using this (real) node everywhere and we would never
> > > need to map from function code to tree node. Indeed, grepping for
> > > "targetm.builtin_decl", the only results are this use, plus a few uses
> > > in the avr backend, so it seems that right now this doesn't matter for
> > > AArch64 (unless I'm missing some other path).
> >
> > Hmm, I thought we were streaming those by reference - indeed
> > this probably got changed when we stopped doing that for
> > the middle-end builtins which we did because those are often
> > "misrecognized".
> >
> > > However, it seems that returning a subtly incorrect result for this
> > > taget hook is asking for trouble, and we should return the real node
> > > here.
> >
> > Yeah, the intent of this target hook is exactly that.  There's now of course
> > the option to simply remove it if it just exists for the checking done above
> > (need to check the avr backend, of course).  I suppose the cost in the
> > LTO IR stream isn't so big thus we can safely ditch the optimization
> > forever?
>
> The option of just removing the target hook seems appealing, if that
> would be acceptable. It looks like avr could easily be changed to just
> call its internal implementation (avr_builtin_decl) instead of the
> target hook.

OK, let's not rush things but do this for next stage1 if desired.

Given you know how the target hook works you can also return
integer_zero_node to make the code happy ... (just building fake
decls feels somewhat odd)

> >
> > > Richard, do you have a feeling for how we should do this? Perhaps
> > > instantiate real nodes (instead of placeholders) if in_lto_p? Or
> > > unconditionally?
> > >
> > > Thanks,
> > > Alex
> > >
> > > >
> > > > > so I ended up building dummy FUNCTION_DECLs as placeholders.
> > > > >
> > > > > To get better test coverage here, I'm wondering if we could make some of
> > > > > the existing ACLE tests also run with LTO? Perhaps as a follow-on patch?
> > > > > For now I've just added the specific testcase from the PR.
> > > > >
> > > > > Bootstrapped and regtested on aarch64-linux-gnu, OK for trunk?
> > > > >
> > > > > Thanks,
> > > > > Alex
> > > > >
> > > > > ---
> > > > >
> > > > > gcc/ChangeLog:
> > > > >
> > > > >         PR target/99216
> > > > >         * config/aarch64/aarch64-sve-builtins.cc
> > > > >         (function_builder::add_function): Add placeholder_p argument, build
> > > > >         placeholder decls if this is set.
> > > > >         (function_builder::add_unique_function): Instead of conditionally adding
> > > > >         direct overloads, unconditionally add either a direct overload or a
> > > > >         placeholder.
> > > > >         (function_builder::add_overloaded_function): Set placeholder_p if we're
> > > > >         using C++ overloads.
> > > > >         (function_builder::add_overloaded_functions): Don't return early for
> > > > >         m_direct_overloads: we need to add placeholders.
> > > > >         * config/aarch64/aarch64-sve-builtins.h
> > > > >         (function_builder::add_function): Add placeholder_p argument.
> > > > >
> > > > > gcc/testsuite/ChangeLog:
> > > > >
> > > > >         PR target/99216
> > > > >         * g++.target/aarch64/sve/pr99216.C: New test.
diff mbox series

Patch

diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc b/gcc/config/aarch64/aarch64-sve-builtins.cc
index 25612d2ea52..c0ba352599c 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
@@ -999,12 +999,25 @@  registered_function &
 function_builder::add_function (const function_instance &instance,
 				const char *name, tree fntype, tree attrs,
 				uint64_t required_extensions,
-				bool overloaded_p)
+				bool overloaded_p,
+				bool placeholder_p)
 {
   unsigned int code = vec_safe_length (registered_functions);
   code = (code << AARCH64_BUILTIN_SHIFT) | AARCH64_BUILTIN_SVE;
-  tree decl = simulate_builtin_function_decl (input_location, name, fntype,
-					      code, NULL, attrs);
+
+  /* We need to be able to generate placeholders to enusre that we have a
+     consistent numbering scheme for function codes between the C and C++
+     frontends, so that everything ties up in LTO.
+
+     The placeholder needs to be non-NULL and some node other than
+     error_mark_node to pass validation when the builtin is streamed in to lto1.
+
+     It's also convenient if the placeholder can store the name, so build a
+     FUNCTION_DECL, but don't register it with the frontend.  */
+  tree decl = placeholder_p
+    ? build_decl (input_location, FUNCTION_DECL, get_identifier (name), fntype)
+    : simulate_builtin_function_decl (input_location, name, fntype,
+				      code, NULL, attrs);
 
   registered_function &rfn = *ggc_alloc <registered_function> ();
   rfn.instance = instance;
@@ -1036,7 +1049,7 @@  function_builder::add_unique_function (const function_instance &instance,
 					   argument_types.address ());
   tree attrs = get_attributes (instance);
   registered_function &rfn = add_function (instance, name, fntype, attrs,
-					   required_extensions, false);
+					   required_extensions, false, false);
 
   /* Enter the function into the hash table.  */
   hashval_t hash = instance.hash ();
@@ -1047,16 +1060,14 @@  function_builder::add_unique_function (const function_instance &instance,
 
   /* Also add the function under its overloaded alias, if we want
      a separate decl for each instance of an overloaded function.  */
-  if (m_direct_overloads || force_direct_overloads)
+  char *overload_name = get_name (instance, true);
+  if (strcmp (name, overload_name) != 0)
     {
-      char *overload_name = get_name (instance, true);
-      if (strcmp (name, overload_name) != 0)
-	{
-	  /* Attribute lists shouldn't be shared.  */
-	  tree attrs = get_attributes (instance);
-	  add_function (instance, overload_name, fntype, attrs,
-			required_extensions, false);
-	}
+      /* Attribute lists shouldn't be shared.  */
+      tree attrs = get_attributes (instance);
+      bool placeholder_p = !(m_direct_overloads || force_direct_overloads);
+      add_function (instance, overload_name, fntype, attrs,
+		    required_extensions, false, placeholder_p);
     }
 
   obstack_free (&m_string_obstack, name);
@@ -1084,7 +1095,7 @@  function_builder::add_overloaded_function (const function_instance &instance,
     {
       registered_function &rfn
 	= add_function (instance, name, m_overload_type, NULL_TREE,
-			required_extensions, true);
+			required_extensions, true, m_direct_overloads);
       const char *permanent_name = IDENTIFIER_POINTER (DECL_NAME (rfn.decl));
       m_overload_names.put (permanent_name, &rfn);
     }
@@ -1098,9 +1109,6 @@  void
 function_builder::add_overloaded_functions (const function_group_info &group,
 					    mode_suffix_index mode)
 {
-  if (m_direct_overloads)
-    return;
-
   unsigned int explicit_type0 = (*group.shape)->explicit_type_suffix_p (0);
   unsigned int explicit_type1 = (*group.shape)->explicit_type_suffix_p (1);
   for (unsigned int pi = 0; group.preds[pi] != NUM_PREDS; ++pi)
diff --git a/gcc/config/aarch64/aarch64-sve-builtins.h b/gcc/config/aarch64/aarch64-sve-builtins.h
index 620e18841b2..b701f90ac1a 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.h
+++ b/gcc/config/aarch64/aarch64-sve-builtins.h
@@ -337,7 +337,8 @@  private:
   tree get_attributes (const function_instance &);
 
   registered_function &add_function (const function_instance &,
-				     const char *, tree, tree, uint64_t, bool);
+				     const char *, tree, tree,
+				     uint64_t, bool, bool);
 
   /* The function type to use for functions that are resolved by
      function_resolver.  */
diff --git a/gcc/testsuite/g++.target/aarch64/sve/pr99216.C b/gcc/testsuite/g++.target/aarch64/sve/pr99216.C
new file mode 100644
index 00000000000..61a58a7fcf4
--- /dev/null
+++ b/gcc/testsuite/g++.target/aarch64/sve/pr99216.C
@@ -0,0 +1,5 @@ 
+/* { dg-do link { target aarch64_asm_sve_ok } } */
+/* { dg-additional-options "-flto" } */
+#include <arm_sve.h>
+bool a;
+int main() { a = svaddv(svptrue_b8(), svdup_s8(0)); }