diff mbox series

[nvptx] Fix code-gen for alias attribute

Message ID IA1PR12MB903193CDAE8C74B8C44B4621CE8B2@IA1PR12MB9031.namprd12.prod.outlook.com
State New
Headers show
Series [nvptx] Fix code-gen for alias attribute | expand

Commit Message

Prathamesh Kulkarni Aug. 26, 2024, 10:50 a.m. UTC
Hi,
For the following test (adapted from pr96390.c):

__attribute__((noipa)) int foo () { return 42; }
int bar () __attribute__((alias ("foo")));
int baz () __attribute__((alias ("bar")));

int main ()
{
  int n;
  #pragma omp target map(from:n)
    n = baz ();
  return n;
}

Compiling with -fopenmp -foffload=nvptx-none -foffload=-malias -foffload=-mptx=6.3 results in:

ptxas fatal   : Internal error: alias to unknown symbol
nvptx-as: ptxas returned 255 exit status
nvptx mkoffload: fatal error: ../../install/bin/aarch64-unknown-linux-gnu-accel-nvptx-none-gcc returned 1 exit status
compilation terminated.
lto-wrapper: fatal error: /home/prathameshk/gnu-toolchain/gcc/grcogcc-38/install/libexec/gcc/aarch64-unknown-linux-gnu/15.0.0//accel/nvptx-none/mkoffload returned 1 exit status
compilation terminated. 

This happens because ptx code-gen shows:

// BEGIN GLOBAL FUNCTION DEF: foo
.visible .func (.param.u32 %value_out) foo
{
        .reg.u32 %value;
                mov.u32 %value, 42;
        st.param.u32    [%value_out], %value;
        ret;
}
.visible .func (.param.u32 %value_out) bar;
.alias bar,foo;
.visible .func (.param.u32 %value_out) baz;
.alias baz,bar;

.alias baz, bar is invalid since PTX requires aliasee to be a defined function:
https://sw-docs-dgx-station.nvidia.com/cuda-latest/parallel-thread-execution/latest-internal/#kernel-and-function-directives-alias

The patch uses cgraph_node::get(name)->ultimate_alias_target () instead of the provided value in nvptx_asm_output_def_from_decls.
For the above case, it now generates the following ptx:

.alias baz,foo; 
instead of:
.alias baz,bar;

which fixes the issue.

Does the patch look in the right direction ?

Signed-off-by: Prathamesh Kulkarni <prathameshk@nvidia.com>

Thanks,
Prathamesh
[nvptx] Fix code-gen for alias attribute.

For the following test (adapted from pr96390.c):

__attribute__((noipa)) int foo () { return 42; }
int bar () __attribute__((alias ("foo")));
int baz () __attribute__((alias ("bar")));

int main ()
{
  int n;
  #pragma omp target map(from:n)
    n = baz ();
  return n;
}

gcc emits following ptx for baz:
.visible .func (.param.u32 %value_out) bar;
.alias bar,foo;
.visible .func (.param.u32 %value_out) baz;
.alias baz,bar;

which is incorrect since PTX requires aliasee to be a defined function.
The patch instead uses cgraph_node::get(name)->ultimate_alias_target,
which generates the following PTX:

.visible .func (.param.u32 %value_out) baz;
.alias baz,foo;

gcc/ChangeLog:

	* config/nvptx/nvptx.cc (nvptx_asm_output_def_from_decls): Use
	cgraph_node::get(name)->ultimate_alias_target instead of value.

Signed-off-by: Prathamesh Kulkarni <prathameshk@nvidia.com>

Comments

Prathamesh Kulkarni Sept. 2, 2024, 4:19 a.m. UTC | #1
> -----Original Message-----
> From: Prathamesh Kulkarni <prathameshk@nvidia.com>
> Sent: Monday, August 26, 2024 4:21 PM
> To: Thomas Schwinge <tschwinge@baylibre.com>; gcc-patches@gcc.gnu.org
> Subject: [nvptx] Fix code-gen for alias attribute
> 
> External email: Use caution opening links or attachments
> 
> 
> Hi,
> For the following test (adapted from pr96390.c):
> 
> __attribute__((noipa)) int foo () { return 42; } int bar ()
> __attribute__((alias ("foo"))); int baz () __attribute__((alias
> ("bar")));
> 
> int main ()
> {
>   int n;
>   #pragma omp target map(from:n)
>     n = baz ();
>   return n;
> }
> 
> Compiling with -fopenmp -foffload=nvptx-none -foffload=-malias -
> foffload=-mptx=6.3 results in:
> 
> ptxas fatal   : Internal error: alias to unknown symbol
> nvptx-as: ptxas returned 255 exit status nvptx mkoffload: fatal error:
> ../../install/bin/aarch64-unknown-linux-gnu-accel-nvptx-none-gcc
> returned 1 exit status compilation terminated.
> lto-wrapper: fatal error: /home/prathameshk/gnu-toolchain/gcc/grcogcc-
> 38/install/libexec/gcc/aarch64-unknown-linux-gnu/15.0.0//accel/nvptx-
> none/mkoffload returned 1 exit status compilation terminated.
> 
> This happens because ptx code-gen shows:
> 
> // BEGIN GLOBAL FUNCTION DEF: foo
> .visible .func (.param.u32 %value_out) foo {
>         .reg.u32 %value;
>                 mov.u32 %value, 42;
>         st.param.u32    [%value_out], %value;
>         ret;
> }
> .visible .func (.param.u32 %value_out) bar; .alias bar,foo; .visible
> .func (.param.u32 %value_out) baz; .alias baz,bar;
> 
> .alias baz, bar is invalid since PTX requires aliasee to be a defined
> function:
> https://sw-docs-dgx-station.nvidia.com/cuda-latest/parallel-thread-
> execution/latest-internal/#kernel-and-function-directives-alias
> 
> The patch uses cgraph_node::get(name)->ultimate_alias_target ()
> instead of the provided value in nvptx_asm_output_def_from_decls.
> For the above case, it now generates the following ptx:
> 
> .alias baz,foo;
> instead of:
> .alias baz,bar;
> 
> which fixes the issue.
> 
> Does the patch look in the right direction ?
> 
> Signed-off-by: Prathamesh Kulkarni <prathameshk@nvidia.com>
Hi,
ping: https://gcc.gnu.org/pipermail/gcc-patches/2024-August/661457.html

Thanks,
Prathamesh
> 
> Thanks,
> Prathamesh
Thomas Schwinge Sept. 4, 2024, 9:45 a.m. UTC | #2
Hi!

Honza (or others, of course), there's a question about
'ultimate_alias_target'.

On 2024-08-26T10:50:36+0000, Prathamesh Kulkarni <prathameshk@nvidia.com> wrote:
> For the following test (adapted from pr96390.c):
>
> __attribute__((noipa)) int foo () { return 42; }
> int bar () __attribute__((alias ("foo")));
> int baz () __attribute__((alias ("bar")));

> Compiling [for nvptx] results in:
>
> ptxas fatal   : Internal error: alias to unknown symbol
> nvptx-as: ptxas returned 255 exit status

Prathamesh: thanks for looking into this, and ACK: one of the many
limitations of PTX '.alias'.  :-|

> This happens because ptx code-gen shows:
>
> // BEGIN GLOBAL FUNCTION DEF: foo
> .visible .func (.param.u32 %value_out) foo
> {
>   [...]
> }
> .visible .func (.param.u32 %value_out) bar;
> .alias bar,foo;
> .visible .func (.param.u32 %value_out) baz;
> .alias baz,bar;

> .alias baz, bar is invalid since PTX requires aliasee to be a defined function:
> https://sw-docs-dgx-station.nvidia.com/cuda-latest/parallel-thread-execution/latest-internal/#kernel-and-function-directives-alias

(Us ordinary mortals need to look at
<https://docs.nvidia.com/cuda/parallel-thread-execution/#kernel-and-function-directives-alias>;
please update the Git commit log.)

> The patch uses cgraph_node::get(name)->ultimate_alias_target () instead of the provided value in nvptx_asm_output_def_from_decls.

I confirm that resolving to 'ultimate_alias_target' does work for this
case:

> For the above case, it now generates the following ptx:
>
> .alias baz,foo; 
> instead of:
> .alias baz,bar;
>
> which fixes the issue.

..., but I'm not sure if that's conceptually correct; I'm not familiar
with 'ultimate_alias_target' semantics.  (Honza?)

Also, I wonder whether 'gcc/varasm.cc:do_assemble_alias' is prepared for
'ASM_OUTPUT_DEF_FROM_DECLS' to disregard the specified 'target'/'value'
and instead do its own thing (here, the proposed resolving to
'ultimate_alias_target')?  (No other GCC back end appears to be doing
such a thing; from a quick look, all appear to faithfully use the
specified 'target'/'value'.)

Now, consider the case that the source code is changed as follows:

     __attribute__((noipa)) int foo () { return 42; }
    -int bar () __attribute__((alias ("foo")));
    +int bar () __attribute__((weak, alias ("foo")));
     int baz () __attribute__((alias ("bar")));

With 'ultimate_alias_target', I've checked, you'd then still emit
'.alias baz,foo;', losing the ability to override the weak alias with a
strong 'bar' definition in another compilation unit?

Now, that said: GCC/nvptx for such code currently diagnoses
"error: weak alias definitions not supported [...]" ;-| -- so we may be
safe, after all?  ..., or is there any other way that the resolving to
'ultimate_alias_target' might cause issues?  If not, then at least your
proposed patch shouldn't be causing any harm (doesn't affect
'--target=nvptx-none' test results at all...), and does address one
user-visible issue ('libgomp.c-c++-common/pr96390.c'), and thus makes
sense to install.

> [nvptx] Fix code-gen for alias attribute.

I'd rather suggest something like:
"[nvptx] (Some) support for aliases to aliases" (or similar).

Also, please add "PR target/104957" to the Git commit log, as your change
directly alters this one aspect of PR104957
"[nvptx] Use .alias directive (available starting ptx isa version 6.3)"'s
commit r12-7766-gf8b15e177155960017ac0c5daef8780d1127f91c
"[nvptx] Use .alias directive for mptx >= 6.3":

| Aliases to aliases are not supported (see libgomp.c-c++-common/pr96390.c).
| This is currently not prohibited by the compiler, but with the driver link we
| run into:  "Internal error: alias to unknown symbol" .

... which we then have (some) support for with the proposed code changes:

> --- a/gcc/config/nvptx/nvptx.cc
> +++ b/gcc/config/nvptx/nvptx.cc
> @@ -7583,7 +7583,8 @@ nvptx_mem_local_p (rtx mem)
>    while (0)
>  
>  void
> -nvptx_asm_output_def_from_decls (FILE *stream, tree name, tree value)
> +nvptx_asm_output_def_from_decls (FILE *stream, tree name,
> +				 tree value ATTRIBUTE_UNUSED)
>  {
>    if (nvptx_alias == 0 || !TARGET_PTX_6_3)
>      {
> @@ -7618,7 +7619,8 @@ nvptx_asm_output_def_from_decls (FILE *stream, tree name, tree value)
>        return;
>      }
>  
> -  if (!cgraph_node::get (name)->referred_to_p ())
> +  cgraph_node *cnode = cgraph_node::get (name);
> +  if (!cnode->referred_to_p ())
>      /* Prevent "Internal error: reference to deleted section".  */
>      return;
>  
> @@ -7627,8 +7629,10 @@ nvptx_asm_output_def_from_decls (FILE *stream, tree name, tree value)
>    fputs (s.str ().c_str (), stream);
>  
>    tree id = DECL_ASSEMBLER_NAME (name);
> +  symtab_node *alias_target_node = cnode->ultimate_alias_target ();
> +  tree alias_target_id = DECL_ASSEMBLER_NAME (alias_target_node->decl);
>    NVPTX_ASM_OUTPUT_DEF (stream, IDENTIFIER_POINTER (id),
> -			IDENTIFIER_POINTER (value));
> +			IDENTIFIER_POINTER (alias_target_id));
>  }
>  
>  #undef NVPTX_ASM_OUTPUT_DEF

Please put some rationale comment before the 'ultimate_alias_target'.


All that said, I'm currently working on <https://gcc.gnu.org/PR105018>
"[nvptx] Need better alias support", via
<https://github.com/SourceryTools/nvptx-tools/issues/32>
"[LD] Handle alias in nvptx-ld as nvptx's .alias does not handle it fully".


Grüße
 Thomas
Prathamesh Kulkarni Sept. 23, 2024, 8:24 a.m. UTC | #3
> -----Original Message-----
> From: Thomas Schwinge <tschwinge@baylibre.com>
> Sent: Wednesday, September 4, 2024 3:15 PM
> To: Prathamesh Kulkarni <prathameshk@nvidia.com>; Jan Hubicka
> <hubicka@ucw.cz>; gcc-patches@gcc.gnu.org
> Subject: Re: [nvptx] Fix code-gen for alias attribute
> 
> External email: Use caution opening links or attachments
> 
> 
> Hi!
> 
> Honza (or others, of course), there's a question about
> 'ultimate_alias_target'.
> 
> On 2024-08-26T10:50:36+0000, Prathamesh Kulkarni
> <prathameshk@nvidia.com> wrote:
> > For the following test (adapted from pr96390.c):
> >
> > __attribute__((noipa)) int foo () { return 42; } int bar ()
> > __attribute__((alias ("foo"))); int baz () __attribute__((alias
> > ("bar")));
> 
> > Compiling [for nvptx] results in:
> >
> > ptxas fatal   : Internal error: alias to unknown symbol
> > nvptx-as: ptxas returned 255 exit status
> 
> Prathamesh: thanks for looking into this, and ACK: one of the many
> limitations of PTX '.alias'.  :-|
> 
> > This happens because ptx code-gen shows:
> >
> > // BEGIN GLOBAL FUNCTION DEF: foo
> > .visible .func (.param.u32 %value_out) foo {
> >   [...]
> > }
> > .visible .func (.param.u32 %value_out) bar; .alias bar,foo; .visible
> > .func (.param.u32 %value_out) baz; .alias baz,bar;
> 
> > .alias baz, bar is invalid since PTX requires aliasee to be a defined
> function:
> > https://sw-docs-dgx-station.nvidia.com/cuda-latest/parallel-thread-exe
> > cution/latest-internal/#kernel-and-function-directives-alias
> 
> (Us ordinary mortals need to look at
> <https://docs.nvidia.com/cuda/parallel-thread-execution/#kernel-and-
> function-directives-alias>;
> please update the Git commit log.)
> 
> > The patch uses cgraph_node::get(name)->ultimate_alias_target ()
> instead of the provided value in nvptx_asm_output_def_from_decls.
> 
> I confirm that resolving to 'ultimate_alias_target' does work for this
> case:
> 
> > For the above case, it now generates the following ptx:
> >
> > .alias baz,foo;
> > instead of:
> > .alias baz,bar;
> >
> > which fixes the issue.
> 
> ..., but I'm not sure if that's conceptually correct; I'm not familiar
> with 'ultimate_alias_target' semantics.  (Honza?)
> 
> Also, I wonder whether 'gcc/varasm.cc:do_assemble_alias' is prepared for
> 'ASM_OUTPUT_DEF_FROM_DECLS' to disregard the specified 'target'/'value'
> and instead do its own thing (here, the proposed resolving to
> 'ultimate_alias_target')?  (No other GCC back end appears to be doing
> such a thing; from a quick look, all appear to faithfully use the
> specified 'target'/'value'.)
> 
> Now, consider the case that the source code is changed as follows:
> 
>      __attribute__((noipa)) int foo () { return 42; }
>     -int bar () __attribute__((alias ("foo")));
>     +int bar () __attribute__((weak, alias ("foo")));
>      int baz () __attribute__((alias ("bar")));
> 
> With 'ultimate_alias_target', I've checked, you'd then still emit
> '.alias baz,foo;', losing the ability to override the weak alias with a
> strong 'bar' definition in another compilation unit?
> 
> Now, that said: GCC/nvptx for such code currently diagnoses
> "error: weak alias definitions not supported [...]" ;-| -- so we may be
> safe, after all?  ..., or is there any other way that the resolving to
> 'ultimate_alias_target' might cause issues?  If not, then at least your
> proposed patch shouldn't be causing any harm (doesn't affect '--
> target=nvptx-none' test results at all...), and does address one user-
> visible issue ('libgomp.c-c++-common/pr96390.c'), and thus makes sense
> to install.
> 
> > [nvptx] Fix code-gen for alias attribute.
> 
> I'd rather suggest something like:
> "[nvptx] (Some) support for aliases to aliases" (or similar).
> 
> Also, please add "PR target/104957" to the Git commit log, as your
> change directly alters this one aspect of PR104957 "[nvptx] Use .alias
> directive (available starting ptx isa version 6.3)"'s commit r12-7766-
> gf8b15e177155960017ac0c5daef8780d1127f91c
> "[nvptx] Use .alias directive for mptx >= 6.3":
> 
> | Aliases to aliases are not supported (see libgomp.c-c++-
> common/pr96390.c).
> | This is currently not prohibited by the compiler, but with the driver
> | link we run into:  "Internal error: alias to unknown symbol" .
> 
> ... which we then have (some) support for with the proposed code
> changes:
> 
> > --- a/gcc/config/nvptx/nvptx.cc
> > +++ b/gcc/config/nvptx/nvptx.cc
> > @@ -7583,7 +7583,8 @@ nvptx_mem_local_p (rtx mem)
> >    while (0)
> >
> >  void
> > -nvptx_asm_output_def_from_decls (FILE *stream, tree name, tree value)
> > +nvptx_asm_output_def_from_decls (FILE *stream, tree name,
> > +                              tree value ATTRIBUTE_UNUSED)
> >  {
> >    if (nvptx_alias == 0 || !TARGET_PTX_6_3)
> >      {
> > @@ -7618,7 +7619,8 @@ nvptx_asm_output_def_from_decls (FILE *stream,
> tree name, tree value)
> >        return;
> >      }
> >
> > -  if (!cgraph_node::get (name)->referred_to_p ())
> > +  cgraph_node *cnode = cgraph_node::get (name);  if
> > + (!cnode->referred_to_p ())
> >      /* Prevent "Internal error: reference to deleted section".  */
> >      return;
> >
> > @@ -7627,8 +7629,10 @@ nvptx_asm_output_def_from_decls (FILE *stream,
> tree name, tree value)
> >    fputs (s.str ().c_str (), stream);
> >
> >    tree id = DECL_ASSEMBLER_NAME (name);
> > +  symtab_node *alias_target_node = cnode->ultimate_alias_target ();
> > + tree alias_target_id = DECL_ASSEMBLER_NAME
> > + (alias_target_node->decl);
> >    NVPTX_ASM_OUTPUT_DEF (stream, IDENTIFIER_POINTER (id),
> > -                     IDENTIFIER_POINTER (value));
> > +                     IDENTIFIER_POINTER (alias_target_id));
> >  }
> >
> >  #undef NVPTX_ASM_OUTPUT_DEF
> 
> Please put some rationale comment before the 'ultimate_alias_target'.
> 
> 
> All that said, I'm currently working on <https://gcc.gnu.org/PR105018>
> "[nvptx] Need better alias support", via
> <https://github.com/SourceryTools/nvptx-tools/issues/32>
> "[LD] Handle alias in nvptx-ld as nvptx's .alias does not handle it
> fully".
Hi Thomas,
Thanks for the review and sorry for late reply.
The attached patch addresses the above suggestions.
Does it look OK ?
(Also, could you please test it at your end as well?)

Signed-off-by:

Thanks,
Prathamesh
> 
> 
> Grüße
>  Thomas
nvptx: Partial support for aliases to aliases.

For the following test (adapted from pr96390.c):

__attribute__((noipa)) int foo () { return 42; }
int bar () __attribute__((alias ("foo")));
int baz () __attribute__((alias ("bar")));

int main ()
{
  int n;
  #pragma omp target map(from:n)
    n = baz ();
  return n;
}

gcc emits following ptx for baz:
.visible .func (.param.u32 %value_out) bar;
.alias bar,foo;
.visible .func (.param.u32 %value_out) baz;
.alias baz,bar;

which is incorrect since PTX requires aliasee to be a defined function.
The patch instead uses cgraph_node::get(name)->ultimate_alias_target,
which generates the following PTX:

.visible .func (.param.u32 %value_out) baz;
.alias baz,foo;

gcc/ChangeLog:
	PR target/104957
        * config/nvptx/nvptx.cc (nvptx_asm_output_def_from_decls): Use
        cgraph_node::get(name)->ultimate_alias_target instead of value.

gcc/testsuite/ChangeLog:
	PR target/104957
	* gcc.target/nvptx/alias-to-alias-1.c: Adjust.

Signed-off-by: Prathamesh Kulkarni <prathameshk@nvidia.com>
Co-authored-by: Thomas Schwinge <tschwinge@baylibre.com>

diff --git a/gcc/config/nvptx/nvptx.cc b/gcc/config/nvptx/nvptx.cc
index 4a7c64f05eb..96a1134220e 100644
--- a/gcc/config/nvptx/nvptx.cc
+++ b/gcc/config/nvptx/nvptx.cc
@@ -7582,7 +7582,8 @@ nvptx_mem_local_p (rtx mem)
   while (0)
 
 void
-nvptx_asm_output_def_from_decls (FILE *stream, tree name, tree value)
+nvptx_asm_output_def_from_decls (FILE *stream, tree name,
+				 tree value ATTRIBUTE_UNUSED)
 {
   if (nvptx_alias == 0 || !TARGET_PTX_6_3)
     {
@@ -7617,7 +7618,8 @@ nvptx_asm_output_def_from_decls (FILE *stream, tree name, tree value)
       return;
     }
 
-  if (!cgraph_node::get (name)->referred_to_p ())
+  cgraph_node *cnode = cgraph_node::get (name);
+  if (!cnode->referred_to_p ())
     /* Prevent "Internal error: reference to deleted section".  */
     return;
 
@@ -7626,11 +7628,27 @@ nvptx_asm_output_def_from_decls (FILE *stream, tree name, tree value)
   fputs (s.str ().c_str (), stream);
 
   tree id = DECL_ASSEMBLER_NAME (name);
+
+  /* Walk alias chain to get reference callgraph node.
+     The rationale of using ultimate_alias_target here is that
+     PTX's .alias directive only supports 1-level aliasing where
+     aliasee is function defined in same module.
+
+     So for the following case:
+     int foo() { return 42; }
+     int bar () __attribute__((alias ("foo")));
+     int baz () __attribute__((alias ("bar")));
+
+     should resolve baz to foo:
+     .visible .func (.param.u32 %value_out) baz;
+     .alias baz,foo;  */
+  symtab_node *alias_target_node = cnode->ultimate_alias_target ();
+  tree alias_target_id = DECL_ASSEMBLER_NAME (alias_target_node->decl);
   std::stringstream s_def;
   write_fn_marker (s_def, true, TREE_PUBLIC (name), IDENTIFIER_POINTER (id));
   fputs (s_def.str ().c_str (), stream);
   NVPTX_ASM_OUTPUT_DEF (stream, IDENTIFIER_POINTER (id),
-			IDENTIFIER_POINTER (value));
+			IDENTIFIER_POINTER (alias_target_id));
 }
 
 #undef NVPTX_ASM_OUTPUT_DEF
diff --git a/gcc/testsuite/gcc.target/nvptx/alias-to-alias-1.c b/gcc/testsuite/gcc.target/nvptx/alias-to-alias-1.c
index 7bce7a358c7..08de9e6d69d 100644
--- a/gcc/testsuite/gcc.target/nvptx/alias-to-alias-1.c
+++ b/gcc/testsuite/gcc.target/nvptx/alias-to-alias-1.c
@@ -1,6 +1,8 @@
 /* Alias to alias; 'libgomp.c-c++-common/pr96390.c'.  */
 
-/* { dg-do compile } */
+/* { dg-do link } */
+/* { dg-do run { target nvptx_runtime_alias_ptx } } */
+/* { dg-options -save-temps } */
 /* { dg-add-options nvptx_alias_ptx } */
 
 int v;
@@ -32,7 +34,7 @@ main (void)
 /* { dg-final { scan-assembler-times {(?n)^// BEGIN GLOBAL FUNCTION DECL: baz$} 1 } }
    { dg-final { scan-assembler-times {(?n)^\.visible \.func baz;$} 1 } }
    { dg-final { scan-assembler-times {(?n)^// BEGIN GLOBAL FUNCTION DEF: baz$} 1 } }
-   { dg-final { scan-assembler-times {(?n)^\.alias baz,bar;$} 1 } } */
+   { dg-final { scan-assembler-times {(?n)^\.alias baz,foo;$} 1 } } */
 
 /* { dg-final { scan-assembler-times {(?n)\tcall foo;$} 0 } }
    { dg-final { scan-assembler-times {(?n)\tcall bar;$} 0 } }
Thomas Schwinge Sept. 23, 2024, 2:26 p.m. UTC | #4
Hi Prathamesh!

On 2024-09-23T08:24:36+0000, Prathamesh Kulkarni <prathameshk@nvidia.com> wrote:
> Thanks for the review and sorry for late reply.

No worries.  My replies often are way more delayed...  ;'-|

> The attached patch addresses the above suggestions.
> Does it look OK ?

ACK, thanks!

> (Also, could you please test it at your end as well?)

As expected:

     PASS: gcc.target/nvptx/alias-to-alias-1.c (test for excess errors)
    +PASS: gcc.target/nvptx/alias-to-alias-1.c execution test
     PASS: gcc.target/nvptx/alias-to-alias-1.c scan-assembler-times (?n)\\tcall bar;$ 0
     PASS: gcc.target/nvptx/alias-to-alias-1.c scan-assembler-times (?n)\\tcall baz;$ 1
     PASS: gcc.target/nvptx/alias-to-alias-1.c scan-assembler-times (?n)\\tcall foo;$ 0
     PASS: gcc.target/nvptx/alias-to-alias-1.c scan-assembler-times (?n)^// BEGIN GLOBAL FUNCTION DECL: bar$ 1
     PASS: gcc.target/nvptx/alias-to-alias-1.c scan-assembler-times (?n)^// BEGIN GLOBAL FUNCTION DECL: baz$ 1
     PASS: gcc.target/nvptx/alias-to-alias-1.c scan-assembler-times (?n)^// BEGIN GLOBAL FUNCTION DECL: foo$ 1
     PASS: gcc.target/nvptx/alias-to-alias-1.c scan-assembler-times (?n)^// BEGIN GLOBAL FUNCTION DEF: bar$ 1
     PASS: gcc.target/nvptx/alias-to-alias-1.c scan-assembler-times (?n)^// BEGIN GLOBAL FUNCTION DEF: baz$ 1
     PASS: gcc.target/nvptx/alias-to-alias-1.c scan-assembler-times (?n)^// BEGIN GLOBAL FUNCTION DEF: foo$ 1
     PASS: gcc.target/nvptx/alias-to-alias-1.c scan-assembler-times (?n)^\\.alias bar,foo;$ 1
    -PASS: gcc.target/nvptx/alias-to-alias-1.c scan-assembler-times (?n)^\\.alias baz,bar;$ 1
    +PASS: gcc.target/nvptx/alias-to-alias-1.c scan-assembler-times (?n)^\\.alias baz,foo;$ 1
     PASS: gcc.target/nvptx/alias-to-alias-1.c scan-assembler-times (?n)^\\.visible \\.func bar;$ 1
     PASS: gcc.target/nvptx/alias-to-alias-1.c scan-assembler-times (?n)^\\.visible \\.func baz;$ 1
     PASS: gcc.target/nvptx/alias-to-alias-1.c scan-assembler-times (?n)^\\.visible \\.func foo$ 1
     PASS: gcc.target/nvptx/alias-to-alias-1.c scan-assembler-times (?n)^\\.visible \\.func foo;$ 1


Grüße
 Thomas


> nvptx: Partial support for aliases to aliases.
>
> For the following test (adapted from pr96390.c):
>
> __attribute__((noipa)) int foo () { return 42; }
> int bar () __attribute__((alias ("foo")));
> int baz () __attribute__((alias ("bar")));
>
> int main ()
> {
>   int n;
>   #pragma omp target map(from:n)
>     n = baz ();
>   return n;
> }
>
> gcc emits following ptx for baz:
> .visible .func (.param.u32 %value_out) bar;
> .alias bar,foo;
> .visible .func (.param.u32 %value_out) baz;
> .alias baz,bar;
>
> which is incorrect since PTX requires aliasee to be a defined function.
> The patch instead uses cgraph_node::get(name)->ultimate_alias_target,
> which generates the following PTX:
>
> .visible .func (.param.u32 %value_out) baz;
> .alias baz,foo;
>
> gcc/ChangeLog:
> 	PR target/104957
>         * config/nvptx/nvptx.cc (nvptx_asm_output_def_from_decls): Use
>         cgraph_node::get(name)->ultimate_alias_target instead of value.
>
> gcc/testsuite/ChangeLog:
> 	PR target/104957
> 	* gcc.target/nvptx/alias-to-alias-1.c: Adjust.
>
> Signed-off-by: Prathamesh Kulkarni <prathameshk@nvidia.com>
> Co-authored-by: Thomas Schwinge <tschwinge@baylibre.com>
>
> diff --git a/gcc/config/nvptx/nvptx.cc b/gcc/config/nvptx/nvptx.cc
> index 4a7c64f05eb..96a1134220e 100644
> --- a/gcc/config/nvptx/nvptx.cc
> +++ b/gcc/config/nvptx/nvptx.cc
> @@ -7582,7 +7582,8 @@ nvptx_mem_local_p (rtx mem)
>    while (0)
>  
>  void
> -nvptx_asm_output_def_from_decls (FILE *stream, tree name, tree value)
> +nvptx_asm_output_def_from_decls (FILE *stream, tree name,
> +				 tree value ATTRIBUTE_UNUSED)
>  {
>    if (nvptx_alias == 0 || !TARGET_PTX_6_3)
>      {
> @@ -7617,7 +7618,8 @@ nvptx_asm_output_def_from_decls (FILE *stream, tree name, tree value)
>        return;
>      }
>  
> -  if (!cgraph_node::get (name)->referred_to_p ())
> +  cgraph_node *cnode = cgraph_node::get (name);
> +  if (!cnode->referred_to_p ())
>      /* Prevent "Internal error: reference to deleted section".  */
>      return;
>  
> @@ -7626,11 +7628,27 @@ nvptx_asm_output_def_from_decls (FILE *stream, tree name, tree value)
>    fputs (s.str ().c_str (), stream);
>  
>    tree id = DECL_ASSEMBLER_NAME (name);
> +
> +  /* Walk alias chain to get reference callgraph node.
> +     The rationale of using ultimate_alias_target here is that
> +     PTX's .alias directive only supports 1-level aliasing where
> +     aliasee is function defined in same module.
> +
> +     So for the following case:
> +     int foo() { return 42; }
> +     int bar () __attribute__((alias ("foo")));
> +     int baz () __attribute__((alias ("bar")));
> +
> +     should resolve baz to foo:
> +     .visible .func (.param.u32 %value_out) baz;
> +     .alias baz,foo;  */
> +  symtab_node *alias_target_node = cnode->ultimate_alias_target ();
> +  tree alias_target_id = DECL_ASSEMBLER_NAME (alias_target_node->decl);
>    std::stringstream s_def;
>    write_fn_marker (s_def, true, TREE_PUBLIC (name), IDENTIFIER_POINTER (id));
>    fputs (s_def.str ().c_str (), stream);
>    NVPTX_ASM_OUTPUT_DEF (stream, IDENTIFIER_POINTER (id),
> -			IDENTIFIER_POINTER (value));
> +			IDENTIFIER_POINTER (alias_target_id));
>  }
>  
>  #undef NVPTX_ASM_OUTPUT_DEF
> diff --git a/gcc/testsuite/gcc.target/nvptx/alias-to-alias-1.c b/gcc/testsuite/gcc.target/nvptx/alias-to-alias-1.c
> index 7bce7a358c7..08de9e6d69d 100644
> --- a/gcc/testsuite/gcc.target/nvptx/alias-to-alias-1.c
> +++ b/gcc/testsuite/gcc.target/nvptx/alias-to-alias-1.c
> @@ -1,6 +1,8 @@
>  /* Alias to alias; 'libgomp.c-c++-common/pr96390.c'.  */
>  
> -/* { dg-do compile } */
> +/* { dg-do link } */
> +/* { dg-do run { target nvptx_runtime_alias_ptx } } */
> +/* { dg-options -save-temps } */
>  /* { dg-add-options nvptx_alias_ptx } */
>  
>  int v;
> @@ -32,7 +34,7 @@ main (void)
>  /* { dg-final { scan-assembler-times {(?n)^// BEGIN GLOBAL FUNCTION DECL: baz$} 1 } }
>     { dg-final { scan-assembler-times {(?n)^\.visible \.func baz;$} 1 } }
>     { dg-final { scan-assembler-times {(?n)^// BEGIN GLOBAL FUNCTION DEF: baz$} 1 } }
> -   { dg-final { scan-assembler-times {(?n)^\.alias baz,bar;$} 1 } } */
> +   { dg-final { scan-assembler-times {(?n)^\.alias baz,foo;$} 1 } } */
>  
>  /* { dg-final { scan-assembler-times {(?n)\tcall foo;$} 0 } }
>     { dg-final { scan-assembler-times {(?n)\tcall bar;$} 0 } }
diff mbox series

Patch

diff --git a/gcc/config/nvptx/nvptx.cc b/gcc/config/nvptx/nvptx.cc
index 2a8f713c680..9688b0e6f2d 100644
--- a/gcc/config/nvptx/nvptx.cc
+++ b/gcc/config/nvptx/nvptx.cc
@@ -7583,7 +7583,8 @@  nvptx_mem_local_p (rtx mem)
   while (0)
 
 void
-nvptx_asm_output_def_from_decls (FILE *stream, tree name, tree value)
+nvptx_asm_output_def_from_decls (FILE *stream, tree name,
+				 tree value ATTRIBUTE_UNUSED)
 {
   if (nvptx_alias == 0 || !TARGET_PTX_6_3)
     {
@@ -7618,7 +7619,8 @@  nvptx_asm_output_def_from_decls (FILE *stream, tree name, tree value)
       return;
     }
 
-  if (!cgraph_node::get (name)->referred_to_p ())
+  cgraph_node *cnode = cgraph_node::get (name);
+  if (!cnode->referred_to_p ())
     /* Prevent "Internal error: reference to deleted section".  */
     return;
 
@@ -7627,8 +7629,10 @@  nvptx_asm_output_def_from_decls (FILE *stream, tree name, tree value)
   fputs (s.str ().c_str (), stream);
 
   tree id = DECL_ASSEMBLER_NAME (name);
+  symtab_node *alias_target_node = cnode->ultimate_alias_target ();
+  tree alias_target_id = DECL_ASSEMBLER_NAME (alias_target_node->decl);
   NVPTX_ASM_OUTPUT_DEF (stream, IDENTIFIER_POINTER (id),
-			IDENTIFIER_POINTER (value));
+			IDENTIFIER_POINTER (alias_target_id));
 }
 
 #undef NVPTX_ASM_OUTPUT_DEF