Message ID | 20240505181458.2903045-1-ak@linux.intel.com |
---|---|
State | New |
Headers | show |
Series | [v5,1/5] Improve must tail in RTL backend | expand |
On Sun, May 5, 2024 at 8:16 PM Andi Kleen <ak@linux.intel.com> wrote: > > - Give error messages for all causes of non sibling call generation > - Don't override choices of other non sibling call checks with > must tail. This causes ICEs. The must tail attribute now only > overrides flag_optimize_sibling_calls locally. > - Error out when tree-tailcall failed to mark a must-tail call > sibcall. In this case it doesn't know the true reason and only gives > a vague message (this could be improved, but it's already useful without > that) tree-tailcall usually fails without optimization, so must > adjust the existing must-tail plugin test to specify -O2. > > PR83324 > > gcc/ChangeLog: > > * calls.cc (expand_call): Fix mustcall implementation. > > gcc/testsuite/ChangeLog: > > * gcc.dg/plugin/must-tail-call-1.c: Adjust. > --- > gcc/calls.cc | 30 ++++++++++++------- > .../gcc.dg/plugin/must-tail-call-1.c | 1 + > 2 files changed, 21 insertions(+), 10 deletions(-) > > diff --git a/gcc/calls.cc b/gcc/calls.cc > index 21d78f9779fe..a6b8ee44cc29 100644 > --- a/gcc/calls.cc > +++ b/gcc/calls.cc > @@ -2650,7 +2650,9 @@ expand_call (tree exp, rtx target, int ignore) > /* The type of the function being called. */ > tree fntype; > bool try_tail_call = CALL_EXPR_TAILCALL (exp); > - bool must_tail_call = CALL_EXPR_MUST_TAIL_CALL (exp); > + /* tree-tailcall decided not to do tail calls. Error for the musttail case. */ > + if (!try_tail_call) > + maybe_complain_about_tail_call (exp, "other reasons"); > int pass; > > /* Register in which non-BLKmode value will be returned, > @@ -3022,10 +3024,22 @@ expand_call (tree exp, rtx target, int ignore) > pushed these optimizations into -O2. Don't try if we're already > expanding a call, as that means we're an argument. Don't try if > there's cleanups, as we know there's code to follow the call. */ > - if (currently_expanding_call++ != 0 > - || (!flag_optimize_sibling_calls && !CALL_FROM_THUNK_P (exp)) > - || args_size.var > - || dbg_cnt (tail_call) == false) > + if (currently_expanding_call++ != 0) > + { > + maybe_complain_about_tail_call (exp, "inside another call"); > + try_tail_call = 0; > + } > + if (!flag_optimize_sibling_calls > + && !CALL_FROM_THUNK_P (exp) > + && !CALL_EXPR_MUST_TAIL_CALL (exp)) > + try_tail_call = 0; > + if (args_size.var) If we are both inside another call and run into this we give two errors, but I guess that's OK ... > + { > + /* ??? correct message? */ > + maybe_complain_about_tail_call (exp, "stack space needed"); args_size.var != NULL_TREE means the argument size is not constant. I'm quite sure this is an overly conservative check. > + try_tail_call = 0; > + } > + if (dbg_cnt (tail_call) == false) > try_tail_call = 0; > > /* Workaround buggy C/C++ wrappers around Fortran routines with > @@ -3046,15 +3060,11 @@ expand_call (tree exp, rtx target, int ignore) > if (MEM_P (*iter)) > { > try_tail_call = 0; > + maybe_complain_about_tail_call (exp, "hidden string length argument"); "hidden string length argument passed on stack" from what I read the code. > break; > } > } > > - /* If the user has marked the function as requiring tail-call > - optimization, attempt it. */ > - if (must_tail_call) > - try_tail_call = 1; > - > /* Rest of purposes for tail call optimizations to fail. */ > if (try_tail_call) > try_tail_call = can_implement_as_sibling_call_p (exp, > diff --git a/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c b/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c > index 3a6d4cceaba7..44af361e2925 100644 > --- a/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c > +++ b/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c > @@ -1,4 +1,5 @@ > /* { dg-do compile { target tail_call } } */ > +/* { dg-options "-O2" } */ So I think this is unfortunate - I think when there's a must-tail attribute we should either run the tailcall pass to check the call even at -O0 or trust the user with correctness (hoping no optimization interfered with the ability to tail-call). What were the ICEs you ran into? I would guess it's for example problematic to duplicate must-tail calls? Thanks, Richard. > /* { dg-options "-fdelayed-branch" { target sparc*-*-* } } */ > > extern void abort (void); > -- > 2.44.0 >
> > diff --git a/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c b/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c > > index 3a6d4cceaba7..44af361e2925 100644 > > --- a/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c > > +++ b/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c > > @@ -1,4 +1,5 @@ > > /* { dg-do compile { target tail_call } } */ > > +/* { dg-options "-O2" } */ > > So I think this is unfortunate - I think when there's a must-tail attribute > we should either run the tailcall pass to check the call even at -O0 or > trust the user with correctness (hoping no optimization interfered with > the ability to tail-call). > > What were the ICEs you ran into? I just tried and I don't see ICEs with the current test cases anymore. Unfortunately I didn't save the earlier ones and I'm not fully sure what test case i used then. I'll undo that change and watch it. Maybe together with moving the tree pass that's enough to fix -O0. -Andi
On Tue, May 14, 2024 at 04:15:08PM +0200, Richard Biener wrote: > On Sun, May 5, 2024 at 8:16 PM Andi Kleen <ak@linux.intel.com> wrote: > > > > - Give error messages for all causes of non sibling call generation > > - Don't override choices of other non sibling call checks with > > must tail. This causes ICEs. The must tail attribute now only > > overrides flag_optimize_sibling_calls locally. > > - Error out when tree-tailcall failed to mark a must-tail call > > sibcall. In this case it doesn't know the true reason and only gives > > a vague message (this could be improved, but it's already useful without > > that) tree-tailcall usually fails without optimization, so must > > adjust the existing must-tail plugin test to specify -O2. > > > > PR83324 > > > > gcc/ChangeLog: > > > > * calls.cc (expand_call): Fix mustcall implementation. > > > > gcc/testsuite/ChangeLog: > > > > * gcc.dg/plugin/must-tail-call-1.c: Adjust. > > --- > > gcc/calls.cc | 30 ++++++++++++------- > > .../gcc.dg/plugin/must-tail-call-1.c | 1 + > > 2 files changed, 21 insertions(+), 10 deletions(-) > > > > diff --git a/gcc/calls.cc b/gcc/calls.cc > > index 21d78f9779fe..a6b8ee44cc29 100644 > > --- a/gcc/calls.cc > > +++ b/gcc/calls.cc > > @@ -2650,7 +2650,9 @@ expand_call (tree exp, rtx target, int ignore) > > /* The type of the function being called. */ > > tree fntype; > > bool try_tail_call = CALL_EXPR_TAILCALL (exp); > > - bool must_tail_call = CALL_EXPR_MUST_TAIL_CALL (exp); > > + /* tree-tailcall decided not to do tail calls. Error for the musttail case. */ > > + if (!try_tail_call) > > + maybe_complain_about_tail_call (exp, "other reasons"); > > int pass; > > > > /* Register in which non-BLKmode value will be returned, > > @@ -3022,10 +3024,22 @@ expand_call (tree exp, rtx target, int ignore) > > pushed these optimizations into -O2. Don't try if we're already > > expanding a call, as that means we're an argument. Don't try if > > there's cleanups, as we know there's code to follow the call. */ > > - if (currently_expanding_call++ != 0 > > - || (!flag_optimize_sibling_calls && !CALL_FROM_THUNK_P (exp)) > > - || args_size.var > > - || dbg_cnt (tail_call) == false) > > + if (currently_expanding_call++ != 0) > > + { > > + maybe_complain_about_tail_call (exp, "inside another call"); > > + try_tail_call = 0; > > + } > > + if (!flag_optimize_sibling_calls > > + && !CALL_FROM_THUNK_P (exp) > > + && !CALL_EXPR_MUST_TAIL_CALL (exp)) > > + try_tail_call = 0; > > + if (args_size.var) > > If we are both inside another call and run into this we give two errors, > but I guess that's OK ... > > > + { > > + /* ??? correct message? */ > > + maybe_complain_about_tail_call (exp, "stack space needed"); > > args_size.var != NULL_TREE means the argument size is not constant. > I'm quite sure this is an overly conservative check. > > > + try_tail_call = 0; > > + } > > + if (dbg_cnt (tail_call) == false) > > try_tail_call = 0; > > > > /* Workaround buggy C/C++ wrappers around Fortran routines with > > @@ -3046,15 +3060,11 @@ expand_call (tree exp, rtx target, int ignore) > > if (MEM_P (*iter)) > > { > > try_tail_call = 0; > > + maybe_complain_about_tail_call (exp, "hidden string length argument"); > > "hidden string length argument passed on stack" > > from what I read the code. > > > break; > > } > > } > > > > - /* If the user has marked the function as requiring tail-call > > - optimization, attempt it. */ > > - if (must_tail_call) > > - try_tail_call = 1; > > - > > /* Rest of purposes for tail call optimizations to fail. */ > > if (try_tail_call) > > try_tail_call = can_implement_as_sibling_call_p (exp, > > diff --git a/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c b/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c > > index 3a6d4cceaba7..44af361e2925 100644 > > --- a/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c > > +++ b/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c > > @@ -1,4 +1,5 @@ > > /* { dg-do compile { target tail_call } } */ > > +/* { dg-options "-O2" } */ > > So I think this is unfortunate - I think when there's a must-tail attribute > we should either run the tailcall pass to check the call even at -O0 or > trust the user with correctness (hoping no optimization interfered with > the ability to tail-call). > > What were the ICEs you ran into? > > I would guess it's for example problematic to duplicate must-tail calls? I experimented more with this, the problem I have currently is that in -O0 when returning a struct I get something like D.2846 = caller3<Bar> (D.2836); [must tail call] <bb 3> : D.2836 ={v} {CLOBBER(eos)}; return D.2846; And this always fails this check in tree-tailcall: /* If the statement references memory or volatile operands, fail. */ if (gimple_references_memory_p (stmt) || gimple_has_volatile_ops (stmt)) return; In a optimized build this passes, but with -O0 it always fails when the pass is placed before pass_optimizations_g. I assume it's some problem with mem ssa form. Any ideas how to fix that? Otherwise I can restrict musttail to non structs. -Andi
On Mon, May 20, 2024 at 6:53 AM Andi Kleen <ak@linux.intel.com> wrote: > > On Tue, May 14, 2024 at 04:15:08PM +0200, Richard Biener wrote: > > On Sun, May 5, 2024 at 8:16 PM Andi Kleen <ak@linux.intel.com> wrote: > > > > > > - Give error messages for all causes of non sibling call generation > > > - Don't override choices of other non sibling call checks with > > > must tail. This causes ICEs. The must tail attribute now only > > > overrides flag_optimize_sibling_calls locally. > > > - Error out when tree-tailcall failed to mark a must-tail call > > > sibcall. In this case it doesn't know the true reason and only gives > > > a vague message (this could be improved, but it's already useful without > > > that) tree-tailcall usually fails without optimization, so must > > > adjust the existing must-tail plugin test to specify -O2. > > > > > > PR83324 > > > > > > gcc/ChangeLog: > > > > > > * calls.cc (expand_call): Fix mustcall implementation. > > > > > > gcc/testsuite/ChangeLog: > > > > > > * gcc.dg/plugin/must-tail-call-1.c: Adjust. > > > --- > > > gcc/calls.cc | 30 ++++++++++++------- > > > .../gcc.dg/plugin/must-tail-call-1.c | 1 + > > > 2 files changed, 21 insertions(+), 10 deletions(-) > > > > > > diff --git a/gcc/calls.cc b/gcc/calls.cc > > > index 21d78f9779fe..a6b8ee44cc29 100644 > > > --- a/gcc/calls.cc > > > +++ b/gcc/calls.cc > > > @@ -2650,7 +2650,9 @@ expand_call (tree exp, rtx target, int ignore) > > > /* The type of the function being called. */ > > > tree fntype; > > > bool try_tail_call = CALL_EXPR_TAILCALL (exp); > > > - bool must_tail_call = CALL_EXPR_MUST_TAIL_CALL (exp); > > > + /* tree-tailcall decided not to do tail calls. Error for the musttail case. */ > > > + if (!try_tail_call) > > > + maybe_complain_about_tail_call (exp, "other reasons"); > > > int pass; > > > > > > /* Register in which non-BLKmode value will be returned, > > > @@ -3022,10 +3024,22 @@ expand_call (tree exp, rtx target, int ignore) > > > pushed these optimizations into -O2. Don't try if we're already > > > expanding a call, as that means we're an argument. Don't try if > > > there's cleanups, as we know there's code to follow the call. */ > > > - if (currently_expanding_call++ != 0 > > > - || (!flag_optimize_sibling_calls && !CALL_FROM_THUNK_P (exp)) > > > - || args_size.var > > > - || dbg_cnt (tail_call) == false) > > > + if (currently_expanding_call++ != 0) > > > + { > > > + maybe_complain_about_tail_call (exp, "inside another call"); > > > + try_tail_call = 0; > > > + } > > > + if (!flag_optimize_sibling_calls > > > + && !CALL_FROM_THUNK_P (exp) > > > + && !CALL_EXPR_MUST_TAIL_CALL (exp)) > > > + try_tail_call = 0; > > > + if (args_size.var) > > > > If we are both inside another call and run into this we give two errors, > > but I guess that's OK ... > > > > > + { > > > + /* ??? correct message? */ > > > + maybe_complain_about_tail_call (exp, "stack space needed"); > > > > args_size.var != NULL_TREE means the argument size is not constant. > > I'm quite sure this is an overly conservative check. > > > > > + try_tail_call = 0; > > > + } > > > + if (dbg_cnt (tail_call) == false) > > > try_tail_call = 0; > > > > > > /* Workaround buggy C/C++ wrappers around Fortran routines with > > > @@ -3046,15 +3060,11 @@ expand_call (tree exp, rtx target, int ignore) > > > if (MEM_P (*iter)) > > > { > > > try_tail_call = 0; > > > + maybe_complain_about_tail_call (exp, "hidden string length argument"); > > > > "hidden string length argument passed on stack" > > > > from what I read the code. > > > > > break; > > > } > > > } > > > > > > - /* If the user has marked the function as requiring tail-call > > > - optimization, attempt it. */ > > > - if (must_tail_call) > > > - try_tail_call = 1; > > > - > > > /* Rest of purposes for tail call optimizations to fail. */ > > > if (try_tail_call) > > > try_tail_call = can_implement_as_sibling_call_p (exp, > > > diff --git a/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c b/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c > > > index 3a6d4cceaba7..44af361e2925 100644 > > > --- a/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c > > > +++ b/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c > > > @@ -1,4 +1,5 @@ > > > /* { dg-do compile { target tail_call } } */ > > > +/* { dg-options "-O2" } */ > > > > So I think this is unfortunate - I think when there's a must-tail attribute > > we should either run the tailcall pass to check the call even at -O0 or > > trust the user with correctness (hoping no optimization interfered with > > the ability to tail-call). > > > > What were the ICEs you ran into? > > > > I would guess it's for example problematic to duplicate must-tail calls? > > I experimented more with this, the problem I have currently is that in > -O0 when returning a struct I get something like > > D.2846 = caller3<Bar> (D.2836); [must tail call] > > <bb 3> : > D.2836 ={v} {CLOBBER(eos)}; > return D.2846; > > And this always fails this check in tree-tailcall: > > /* If the statement references memory or volatile operands, fail. */ > if (gimple_references_memory_p (stmt) > || gimple_has_volatile_ops (stmt)) > return; I can't see how this triggers on the IL above, the loop should have ignored both the return and the clobber and when recursing to the predecessor stop before the above check when runnig into the call? > In a optimized build this passes, but with -O0 it always fails > when the pass is placed before pass_optimizations_g. I assume > it's some problem with mem ssa form. > > Any ideas how to fix that? Otherwise I can restrict musttail to non > structs. I wonder how this works when optimizing? > > -Andi
> I can't see how this triggers on the IL above, the loop should have > ignored both the return and the clobber and when recursing to > the predecessor stop before the above check when runnig into the > call? Yes, I tracked that down later. The problem was that there were multiple successors to the BB due to exception handling, which makes the find_tail_calls walker give up. Putting the new pass after ehcleanup fixed that, but there are still cases when ehcleanup cannot get rid of them and then it gives up. musttail checking at expand time still works, but can only give a vague error message. > > > In a optimized build this passes, but with -O0 it always fails > > when the pass is placed before pass_optimizations_g. I assume > > it's some problem with mem ssa form. > > > > Any ideas how to fix that? Otherwise I can restrict musttail to non > > structs. > > I wonder how this works when optimizing? It just doesn't. You need optimization to do tail calls with structs. The only alternative would be to detect the situation and pull in some extra passes. Also even with optimization it only works for structs that fit into registers. This could be maybe fixed, but is out of scope for this patch kit. -Andi
On Tue, May 21, 2024 at 3:35 PM Andi Kleen <ak@linux.intel.com> wrote: > > > I can't see how this triggers on the IL above, the loop should have > > ignored both the return and the clobber and when recursing to > > the predecessor stop before the above check when runnig into the > > call? > > Yes, I tracked that down later. The problem was that there > were multiple successors to the BB due to exception handling, > which makes the find_tail_calls walker give up. > > Putting the new pass after ehcleanup fixed that, but there > are still cases when ehcleanup cannot get rid of them and > then it gives up. musttail checking at expand time still > works, but can only give a vague error message. > > > > > > In a optimized build this passes, but with -O0 it always fails > > > when the pass is placed before pass_optimizations_g. I assume > > > it's some problem with mem ssa form. > > > > > > Any ideas how to fix that? Otherwise I can restrict musttail to non > > > structs. > > > > I wonder how this works when optimizing? > > It just doesn't. You need optimization to do tail calls with > structs. The only alternative would be to detect the situation > and pull in some extra passes. > > Also even with optimization it only works for structs that > fit into registers. This could be maybe fixed, but is out of scope > for this patch kit. I see. I do wonder how we should deal with the inherent dependence on optimization for [[musttail]] to work then? "Solve" the problem with good documentation? Offer a -fignore-musttail option to allow a -O0 build to at least succeed? But then [[musttail]] would rather be [[shouldtail]] and can no longer be for correctness? How does clang solve this? Richard. > > -Andi
> I see. I do wonder how we should deal with the inherent > dependence on optimization for [[musttail]] to work then? "Solve" > the problem with good documentation? For now that's good enough I think. If it's a significant problem it can always be improved later. > Offer a -fignore-musttail > option to allow a -O0 build to at least succeed? But then [[musttail]] > would rather be [[shouldtail]] and can no longer be for correctness? I don't think ignore-musttail is a good idea because an interpreter relying on this would almost certainly die very quickly after overflowing its stack. The feature is only useful when it can be relied on. > How does clang solve this? clang++ 17 seems to handle both cases (large struct in memory or struct fitting into registers) without optimization. I haven't checked the LLVM source on the exact details. -Andi
diff --git a/gcc/calls.cc b/gcc/calls.cc index 21d78f9779fe..a6b8ee44cc29 100644 --- a/gcc/calls.cc +++ b/gcc/calls.cc @@ -2650,7 +2650,9 @@ expand_call (tree exp, rtx target, int ignore) /* The type of the function being called. */ tree fntype; bool try_tail_call = CALL_EXPR_TAILCALL (exp); - bool must_tail_call = CALL_EXPR_MUST_TAIL_CALL (exp); + /* tree-tailcall decided not to do tail calls. Error for the musttail case. */ + if (!try_tail_call) + maybe_complain_about_tail_call (exp, "other reasons"); int pass; /* Register in which non-BLKmode value will be returned, @@ -3022,10 +3024,22 @@ expand_call (tree exp, rtx target, int ignore) pushed these optimizations into -O2. Don't try if we're already expanding a call, as that means we're an argument. Don't try if there's cleanups, as we know there's code to follow the call. */ - if (currently_expanding_call++ != 0 - || (!flag_optimize_sibling_calls && !CALL_FROM_THUNK_P (exp)) - || args_size.var - || dbg_cnt (tail_call) == false) + if (currently_expanding_call++ != 0) + { + maybe_complain_about_tail_call (exp, "inside another call"); + try_tail_call = 0; + } + if (!flag_optimize_sibling_calls + && !CALL_FROM_THUNK_P (exp) + && !CALL_EXPR_MUST_TAIL_CALL (exp)) + try_tail_call = 0; + if (args_size.var) + { + /* ??? correct message? */ + maybe_complain_about_tail_call (exp, "stack space needed"); + try_tail_call = 0; + } + if (dbg_cnt (tail_call) == false) try_tail_call = 0; /* Workaround buggy C/C++ wrappers around Fortran routines with @@ -3046,15 +3060,11 @@ expand_call (tree exp, rtx target, int ignore) if (MEM_P (*iter)) { try_tail_call = 0; + maybe_complain_about_tail_call (exp, "hidden string length argument"); break; } } - /* If the user has marked the function as requiring tail-call - optimization, attempt it. */ - if (must_tail_call) - try_tail_call = 1; - /* Rest of purposes for tail call optimizations to fail. */ if (try_tail_call) try_tail_call = can_implement_as_sibling_call_p (exp, diff --git a/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c b/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c index 3a6d4cceaba7..44af361e2925 100644 --- a/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c +++ b/gcc/testsuite/gcc.dg/plugin/must-tail-call-1.c @@ -1,4 +1,5 @@ /* { dg-do compile { target tail_call } } */ +/* { dg-options "-O2" } */ /* { dg-options "-fdelayed-branch" { target sparc*-*-* } } */ extern void abort (void);