diff mbox series

[v5,2/5] C++: Support clang compatible [[musttail]] (PR83324)

Message ID 20240505181458.2903045-2-ak@linux.intel.com
State New
Headers show
Series [v5,1/5] Improve must tail in RTL backend | expand

Commit Message

Andi Kleen May 5, 2024, 6:14 p.m. UTC
This patch implements a clang compatible [[musttail]] attribute for
returns.

musttail is useful as an alternative to computed goto for interpreters.
With computed goto the interpreter function usually ends up very big
which causes problems with register allocation and other per function
optimizations not scaling. With musttail the interpreter can be instead
written as a sequence of smaller functions that call each other. To
avoid unbounded stack growth this requires forcing a sibling call, which
this attribute does. It guarantees an error if the call cannot be tail
called which allows the programmer to fix it instead of risking a stack
overflow. Unlike computed goto it is also type-safe.

It turns out that David Malcolm had already implemented middle/backend
support for a musttail attribute back in 2016, but it wasn't exposed
to any frontend other than a special plugin.

This patch adds a [[gnu::musttail]] attribute for C++ that can be added
to return statements. The return statement must be a direct call
(it does not follow dependencies), which is similar to what clang
implements. It then uses the existing must tail infrastructure.

For compatibility it also detects clang::musttail

One problem is that tree-tailcall usually fails when optimization
is disabled, which implies the attribute only really works with
optimization on. But that seems to be a reasonable limitation.

Passes bootstrap and full test

	PR83324

gcc/cp/ChangeLog:

	* cp-tree.h (finish_return_stmt): Add musttail_p.
	(check_return_expr): Dito.
	* parser.cc (cp_parser_statement): Handle [[musttail]].
	(cp_parser_std_attribute): Dito.
	(cp_parser_init_statement): Dito.
	(cp_parser_jump_statement): Dito.
	* semantics.cc (finish_return_stmt): Dito.
	* typeck.cc (check_return_expr): Handle musttail_p flag.
---
 gcc/cp/cp-tree.h    |  4 ++--
 gcc/cp/parser.cc    | 30 ++++++++++++++++++++++++------
 gcc/cp/semantics.cc |  6 +++---
 gcc/cp/typeck.cc    | 20 ++++++++++++++++++--
 4 files changed, 47 insertions(+), 13 deletions(-)

Comments

Jason Merrill May 7, 2024, 3:02 a.m. UTC | #1
On 5/5/24 14:14, Andi Kleen wrote:
> This patch implements a clang compatible [[musttail]] attribute for
> returns.

Thanks.

> musttail is useful as an alternative to computed goto for interpreters.
> With computed goto the interpreter function usually ends up very big
> which causes problems with register allocation and other per function
> optimizations not scaling. With musttail the interpreter can be instead
> written as a sequence of smaller functions that call each other. To
> avoid unbounded stack growth this requires forcing a sibling call, which
> this attribute does. It guarantees an error if the call cannot be tail
> called which allows the programmer to fix it instead of risking a stack
> overflow. Unlike computed goto it is also type-safe.
> 
> It turns out that David Malcolm had already implemented middle/backend
> support for a musttail attribute back in 2016, but it wasn't exposed
> to any frontend other than a special plugin.
> 
> This patch adds a [[gnu::musttail]] attribute for C++ that can be added
> to return statements. The return statement must be a direct call
> (it does not follow dependencies), which is similar to what clang
> implements. It then uses the existing must tail infrastructure.
> 
> For compatibility it also detects clang::musttail
> 
> One problem is that tree-tailcall usually fails when optimization
> is disabled, which implies the attribute only really works with
> optimization on. But that seems to be a reasonable limitation.
> 
> Passes bootstrap and full test
> 
> 	PR83324
> 
> gcc/cp/ChangeLog:
> 
> 	* cp-tree.h (finish_return_stmt): Add musttail_p.
> 	(check_return_expr): Dito.
> 	* parser.cc (cp_parser_statement): Handle [[musttail]].
> 	(cp_parser_std_attribute): Dito.
> 	(cp_parser_init_statement): Dito.
> 	(cp_parser_jump_statement): Dito.
> 	* semantics.cc (finish_return_stmt): Dito.
> 	* typeck.cc (check_return_expr): Handle musttail_p flag.
> ---
>   gcc/cp/cp-tree.h    |  4 ++--
>   gcc/cp/parser.cc    | 30 ++++++++++++++++++++++++------
>   gcc/cp/semantics.cc |  6 +++---
>   gcc/cp/typeck.cc    | 20 ++++++++++++++++++--
>   4 files changed, 47 insertions(+), 13 deletions(-)
> 
> @@ -12734,9 +12734,27 @@ cp_parser_statement (cp_parser* parser, tree in_statement_expr,
>   						     NULL_TREE, false);
>   	  break;
>   
> +	case RID_RETURN:
> +	  {
> +	    bool musttail_p = false;
> +	    std_attrs = process_stmt_hotness_attribute (std_attrs, attrs_loc);
> +	    if (lookup_attribute ("gnu", "musttail", std_attrs))
> +	      {
> +		musttail_p = true;
> +		std_attrs = remove_attribute ("gnu", "musttail", std_attrs);
> +	      }
> +	    // support this for compatibility
> +	    if (lookup_attribute ("clang", "musttail", std_attrs))
> +	      {
> +		musttail_p = true;
> +		std_attrs = remove_attribute ("clang", "musttail", std_attrs);
> +	      }
> +	    statement = cp_parser_jump_statement (parser, musttail_p);

It seems to me that if we were to pass &std_attrs to 
cp_parser_jump_statement, we could handle this entirely in that function 
rather than adding a flag to finish_return_stmt and check_return_stmt.

> @@ -30189,7 +30207,7 @@ cp_parser_std_attribute (cp_parser *parser, tree attr_ns)
>       /* Maybe we don't expect to see any arguments for this attribute.  */
>       const attribute_spec *as
>         = lookup_attribute_spec (TREE_PURPOSE (attribute));
> -    if (as && as->max_length == 0)
> +    if ((as && as->max_length == 0) || is_attribute_p ("musttail", attr_id))

I'd prefer to add an attribute to the table, rather than special-case it 
here; apart from consistency, it seems likely that someone will later 
want to apply it to a function.

You need a template testcase; I expect it doesn't work in templates with 
the current patch.  It's probably enough to copy it in tsubst_expr where 
we currently propagate CALL_EXPR_OPERATOR_SYNTAX.

You also need a testcase where the function returns a class; in that 
case the call will often appear as AGGR_INIT_EXPR rather than CALL_EXPR, 
so you'll need to handle that as well.  And see the places that copy 
flags like CALL_EXPR_OPERATOR_SYNTAX between CALL_EXPR and AGGR_INIT_EXPR.

Jason
Andi Kleen May 14, 2024, 5:24 p.m. UTC | #2
Hi Jason,

On Mon, May 06, 2024 at 11:02:20PM -0400, Jason Merrill wrote:
> > @@ -30189,7 +30207,7 @@ cp_parser_std_attribute (cp_parser *parser, tree attr_ns)
> >       /* Maybe we don't expect to see any arguments for this attribute.  */
> >       const attribute_spec *as
> >         = lookup_attribute_spec (TREE_PURPOSE (attribute));
> > -    if (as && as->max_length == 0)
> > +    if ((as && as->max_length == 0) || is_attribute_p ("musttail", attr_id))
> 
> I'd prefer to add an attribute to the table, rather than special-case it
> here; apart from consistency, it seems likely that someone will later want
> to apply it to a function.

Just to clarify. I can add it to the table, but it would be a nop there
for now because the table is not used for statement attributes by
the current parser.

> 
> You need a template testcase; I expect it doesn't work in templates with the
> current patch.  It's probably enough to copy it in tsubst_expr where we
> currently propagate CALL_EXPR_OPERATOR_SYNTAX.

I tried it with the appended test case, everything seems to work without
changes.

Does it cover the cases you were concerned about?


> 
> You also need a testcase where the function returns a class; in that case
> the call will often appear as AGGR_INIT_EXPR rather than CALL_EXPR, so
> you'll need to handle that as well.  And see the places that copy flags like
> CALL_EXPR_OPERATOR_SYNTAX between CALL_EXPR and AGGR_INIT_EXPR.

Dito.

-Andi


/* { dg-do compile { target { tail_call } } } */
/* { dg-options "-O2" } */
/* { dg-additional-options "-fdelayed-branch" { target sparc*-*-* } } */

class Foo {
public:
  int a, b;
  Foo(int a, int b) : a(a), b(b) {}
};

Foo __attribute__((noinline,noclone,noipa))
callee (int i)
{
  return Foo(i, i+1);
}

Foo __attribute__((noinline,noclone,noipa))
caller (int i)
{
  [[gnu::musttail]] return callee (i + 1);
}

template<typename T>
T __attribute__((noinline,noclone,noipa)) foo (T i)
{
  return i + 1;
}

int
caller2 (int k)
{
  [[gnu::musttail]] return foo<int>(1);
}

template<typename T>
T caller3 (T v)
{
  [[gnu::musttail]] return foo<T>(v);
}

int call3(int i)
{
  [[gnu::musttail]] return caller3<int>(i + 1);
}

class Bar {
  int a;
public:
  Bar(int a) : a(a) {}
  Bar operator+(Bar o) { return Bar(a + o.a); } 
};

Bar
caller3 (Bar k)
{
  [[gnu::musttail]] return caller3<Bar>(Bar(99));
}
Jason Merrill May 14, 2024, 9:52 p.m. UTC | #3
On 5/14/24 13:24, Andi Kleen wrote:
> Hi Jason,
> 
> On Mon, May 06, 2024 at 11:02:20PM -0400, Jason Merrill wrote:
>>> @@ -30189,7 +30207,7 @@ cp_parser_std_attribute (cp_parser *parser, tree attr_ns)
>>>        /* Maybe we don't expect to see any arguments for this attribute.  */
>>>        const attribute_spec *as
>>>          = lookup_attribute_spec (TREE_PURPOSE (attribute));
>>> -    if (as && as->max_length == 0)
>>> +    if ((as && as->max_length == 0) || is_attribute_p ("musttail", attr_id))
>>
>> I'd prefer to add an attribute to the table, rather than special-case it
>> here; apart from consistency, it seems likely that someone will later want
>> to apply it to a function.
> 
> Just to clarify. I can add it to the table, but it would be a nop there
> for now because the table is not used for statement attributes by
> the current parser.

Agreed.

>> You need a template testcase; I expect it doesn't work in templates with the
>> current patch.  It's probably enough to copy it in tsubst_expr where we
>> currently propagate CALL_EXPR_OPERATOR_SYNTAX.
> 
> I tried it with the appended test case, everything seems to work without
> changes.
> 
> Does it cover the cases you were concerned about?

Not fully; this testcase doesn't seem to check for errors if tail-call 
fails, only whether the syntax is accepted.  So it would pass if the 
attribute were simply ignored.

Did you also see this comment?

> It seems to me that if we were to pass &std_attrs to cp_parser_jump_statement, we could handle this entirely in that function rather than adding a flag to finish_return_stmt and check_return_stmt. 

Jason
Andi Kleen May 14, 2024, 11:23 p.m. UTC | #4
> > > You need a template testcase; I expect it doesn't work in templates with the
> > > current patch.  It's probably enough to copy it in tsubst_expr where we
> > > currently propagate CALL_EXPR_OPERATOR_SYNTAX.
> > 
> > I tried it with the appended test case, everything seems to work without
> > changes.
> > 
> > Does it cover the cases you were concerned about?
> 
> Not fully; this testcase doesn't seem to check for errors if tail-call
> fails, only whether the syntax is accepted.  So it would pass if the
> attribute were simply ignored.

Okay I'm not clear how I would do that. Pattern match the assembler 
in a target specific test case? From looking at the assembler output
everything got tail converted.

> 
> Did you also see this comment?
> 
> > It seems to me that if we were to pass &std_attrs to
> > cp_parser_jump_statement, we could handle this entirely in that function
> > rather than adding a flag to finish_return_stmt and check_return_stmt.

Yes. I did that change.

-Andi
Jason Merrill May 21, 2024, 6:02 p.m. UTC | #5
On 5/14/24 19:23, Andi Kleen wrote:
>>>> You need a template testcase; I expect it doesn't work in templates with the
>>>> current patch.  It's probably enough to copy it in tsubst_expr where we
>>>> currently propagate CALL_EXPR_OPERATOR_SYNTAX.
>>>
>>> I tried it with the appended test case, everything seems to work without
>>> changes.
>>>
>>> Does it cover the cases you were concerned about?
>>
>> Not fully; this testcase doesn't seem to check for errors if tail-call
>> fails, only whether the syntax is accepted.  So it would pass if the
>> attribute were simply ignored.
> 
> Okay I'm not clear how I would do that. Pattern match the assembler
> in a target specific test case? From looking at the assembler output
> everything got tail converted.

Write a testcase where the tail-call optimization can't happen, perhaps 
because the caller and callee disagree on return type:

int f();

double h() { [[gnu::musttail]] return f(); } // error

template <class T>
T g() { [[gnu::musttail]] return f(); }

int main()
{
   g<int>();
   g<double>(); // should error, but doesn't with v6 patch set
}

Jason
diff mbox series

Patch

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 52d6841559ca..ef5f0039ece2 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -7782,7 +7782,7 @@  extern void finish_while_stmt			(tree);
 extern tree begin_do_stmt			(void);
 extern void finish_do_body			(tree);
 extern void finish_do_stmt		(tree, tree, bool, tree, bool);
-extern tree finish_return_stmt			(tree);
+extern tree finish_return_stmt			(tree, bool = false);
 extern tree begin_for_scope			(tree *);
 extern tree begin_for_stmt			(tree, tree);
 extern void finish_init_stmt			(tree);
@@ -8294,7 +8294,7 @@  extern tree composite_pointer_type		(const op_location_t &,
 						 tsubst_flags_t);
 extern tree merge_types				(tree, tree);
 extern tree strip_array_domain			(tree);
-extern tree check_return_expr			(tree, bool *, bool *);
+extern tree check_return_expr			(tree, bool *, bool *, bool);
 extern tree spaceship_type			(tree, tsubst_flags_t = tf_warning_or_error);
 extern tree genericize_spaceship		(location_t, tree, tree, tree);
 extern tree cp_build_binary_op                  (const op_location_t &,
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 66ce161252c7..e1bf92628ac3 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -2467,7 +2467,7 @@  static tree cp_parser_perform_range_for_lookup
 static tree cp_parser_range_for_member_function
   (tree, tree);
 static tree cp_parser_jump_statement
-  (cp_parser *);
+  (cp_parser *, bool = false);
 static void cp_parser_declaration_statement
   (cp_parser *);
 
@@ -12734,9 +12734,27 @@  cp_parser_statement (cp_parser* parser, tree in_statement_expr,
 						     NULL_TREE, false);
 	  break;
 
+	case RID_RETURN:
+	  {
+	    bool musttail_p = false;
+	    std_attrs = process_stmt_hotness_attribute (std_attrs, attrs_loc);
+	    if (lookup_attribute ("gnu", "musttail", std_attrs))
+	      {
+		musttail_p = true;
+		std_attrs = remove_attribute ("gnu", "musttail", std_attrs);
+	      }
+	    // support this for compatibility
+	    if (lookup_attribute ("clang", "musttail", std_attrs))
+	      {
+		musttail_p = true;
+		std_attrs = remove_attribute ("clang", "musttail", std_attrs);
+	      }
+	    statement = cp_parser_jump_statement (parser, musttail_p);
+	  }
+	  break;
+
 	case RID_BREAK:
 	case RID_CONTINUE:
-	case RID_RETURN:
 	case RID_CO_RETURN:
 	case RID_GOTO:
 	  std_attrs = process_stmt_hotness_attribute (std_attrs, attrs_loc);
@@ -14782,7 +14800,7 @@  cp_parser_init_statement (cp_parser *parser, tree *decl)
   return false;
 }
 
-/* Parse a jump-statement.
+/* Parse a jump-statement.  MUSTTAIL_P indicates a musttail attribute.
 
    jump-statement:
      break ;
@@ -14800,7 +14818,7 @@  cp_parser_init_statement (cp_parser *parser, tree *decl)
    Returns the new BREAK_STMT, CONTINUE_STMT, RETURN_EXPR, or GOTO_EXPR.  */
 
 static tree
-cp_parser_jump_statement (cp_parser* parser)
+cp_parser_jump_statement (cp_parser* parser, bool musttail_p)
 {
   tree statement = error_mark_node;
   cp_token *token;
@@ -14884,7 +14902,7 @@  cp_parser_jump_statement (cp_parser* parser)
 	else if (FNDECL_USED_AUTO (current_function_decl) && in_discarded_stmt)
 	  /* Don't deduce from a discarded return statement.  */;
 	else
-	  statement = finish_return_stmt (expr);
+	  statement = finish_return_stmt (expr, musttail_p);
 	/* Look for the final `;'.  */
 	cp_parser_require (parser, CPP_SEMICOLON, RT_SEMICOLON);
       }
@@ -30189,7 +30207,7 @@  cp_parser_std_attribute (cp_parser *parser, tree attr_ns)
     /* Maybe we don't expect to see any arguments for this attribute.  */
     const attribute_spec *as
       = lookup_attribute_spec (TREE_PURPOSE (attribute));
-    if (as && as->max_length == 0)
+    if ((as && as->max_length == 0) || is_attribute_p ("musttail", attr_id))
       {
 	error_at (token->location, "%qE attribute does not take any arguments",
 		  attr_id);
diff --git a/gcc/cp/semantics.cc b/gcc/cp/semantics.cc
index b8c2bf8771fc..cac1a949c7d3 100644
--- a/gcc/cp/semantics.cc
+++ b/gcc/cp/semantics.cc
@@ -1387,16 +1387,16 @@  finish_do_stmt (tree cond, tree do_stmt, bool ivdep, tree unroll,
 }
 
 /* Finish a return-statement.  The EXPRESSION returned, if any, is as
-   indicated.  */
+   indicated.  MUSTTAIL_P indicates a mustcall attribute.  */
 
 tree
-finish_return_stmt (tree expr)
+finish_return_stmt (tree expr, bool musttail_p)
 {
   tree r;
   bool no_warning;
   bool dangling;
 
-  expr = check_return_expr (expr, &no_warning, &dangling);
+  expr = check_return_expr (expr, &no_warning, &dangling, musttail_p);
 
   if (error_operand_p (expr)
       || (flag_openmp && !check_omp_return ()))
diff --git a/gcc/cp/typeck.cc b/gcc/cp/typeck.cc
index a25f8622651d..9a5668796d2e 100644
--- a/gcc/cp/typeck.cc
+++ b/gcc/cp/typeck.cc
@@ -11036,10 +11036,12 @@  maybe_warn_pessimizing_move (tree expr, tree type, bool return_p)
    the DECL_RESULT for the function.  Set *NO_WARNING to true if
    code reaches end of non-void function warning shouldn't be issued
    on this RETURN_EXPR.  Set *DANGLING to true if code returns the
-   address of a local variable.  */
+   address of a local variable.  MUSTTAIL_P indicates a musttail
+   return.  */
 
 tree
-check_return_expr (tree retval, bool *no_warning, bool *dangling)
+check_return_expr (tree retval, bool *no_warning, bool *dangling,
+		   bool musttail_p)
 {
   tree result;
   /* The type actually returned by the function.  */
@@ -11053,6 +11055,20 @@  check_return_expr (tree retval, bool *no_warning, bool *dangling)
   *no_warning = false;
   *dangling = false;
 
+  if (musttail_p)
+    {
+      tree t = retval;
+      if (TREE_CODE (t) == TARGET_EXPR)
+	t = TARGET_EXPR_INITIAL (retval);
+      if (TREE_CODE (t) != CALL_EXPR)
+	{
+	  error_at (loc, "cannot tail-call: return value must be a call");
+	  return error_mark_node;
+	}
+      else
+	CALL_EXPR_MUST_TAIL_CALL (t) = 1;
+    }
+
   /* A `volatile' function is one that isn't supposed to return, ever.
      (This is a G++ extension, used to get better code for functions
      that call the `volatile' function.)  */