Message ID | 20240127135234.354114-1-hjl.tools@gmail.com |
---|---|
State | New |
Headers | show |
Series | x86: Save callee-saved registers in noreturn functions for -O0/-Og | expand |
On Sat, Jan 27, 2024 at 05:52:34AM -0800, H.J. Lu wrote: > @@ -3391,7 +3392,9 @@ ix86_set_func_type (tree fndecl) > function is marked as noreturn in the IR output, which leads the > incompatible attribute error in LTO1. */ > bool has_no_callee_saved_registers > - = (((TREE_NOTHROW (fndecl) || !flag_exceptions) > + = ((optimize > + && !optimize_debug Shouldn't that be opt_for_fn (fndecl, optimize) and ditto for optimize_debug? I mean, aren't the options not restored yet when this function is called (i.e. remain in whatever state they were in the previous function or global state)? Also, shouldn't the lookup_attribute ("noreturn" check be the first one? I mean, noreturn functions are quite rare and so checking all the other conditions upon each set_cfun could waste too much compile time. Also, why check "noreturn" attribute rather than TREE_THIS_VOLATILE (fndecl)? Jakub
On Sat, Jan 27, 2024 at 6:09 AM Jakub Jelinek <jakub@redhat.com> wrote: > > On Sat, Jan 27, 2024 at 05:52:34AM -0800, H.J. Lu wrote: > > @@ -3391,7 +3392,9 @@ ix86_set_func_type (tree fndecl) > > function is marked as noreturn in the IR output, which leads the > > incompatible attribute error in LTO1. */ > > bool has_no_callee_saved_registers > > - = (((TREE_NOTHROW (fndecl) || !flag_exceptions) > > + = ((optimize > > + && !optimize_debug > > Shouldn't that be opt_for_fn (fndecl, optimize) and ditto for > optimize_debug? > I mean, aren't the options not restored yet when this function is called > (i.e. remain in whatever state they were in the previous function or > global state)? store_parm_decls is called when parsing a function. store_parm_decls calls allocate_struct_function which calls invoke_set_current_function_hook (fndecl); which has /* Change optimization options if needed. */ if (optimization_current_node != opts) { optimization_current_node = opts; cl_optimization_restore (&global_options, &global_options_set, TREE_OPTIMIZATION (opts)); } targetm.set_current_function (fndecl); which calls ix86_set_current_function after global_options has been updated. ix86_set_func_type is called from ix86_set_current_function. I don't see an issue with optimize and optimize_debug here. > Also, shouldn't the lookup_attribute ("noreturn" check be the first one? > I mean, noreturn functions are quite rare and so checking all the other I will fix it and updated one testcase with __attribute__((noreturn, optimize("-Og"))) > conditions upon each set_cfun could waste too much compile time. > > Also, why check "noreturn" attribute rather than > TREE_THIS_VOLATILE (fndecl)? > The comments above this code has NB: Don't use TREE_THIS_VOLATILE to check if this is a noreturn function. The local-pure-const pass turns an interrupt function into a noreturn function by setting TREE_THIS_VOLATILE. Normally the local-pure-const pass is run after ix86_set_func_type is called. When the local-pure-const pass is enabled for LTO, the interrupt function is marked as noreturn in the IR output, which leads the incompatible attribute error in LTO1. Thanks.
On Sat, Jan 27, 2024 at 07:00:03AM -0800, H.J. Lu wrote: > On Sat, Jan 27, 2024 at 6:09 AM Jakub Jelinek <jakub@redhat.com> wrote: > > > > On Sat, Jan 27, 2024 at 05:52:34AM -0800, H.J. Lu wrote: > > > @@ -3391,7 +3392,9 @@ ix86_set_func_type (tree fndecl) > > > function is marked as noreturn in the IR output, which leads the > > > incompatible attribute error in LTO1. */ > > > bool has_no_callee_saved_registers > > > - = (((TREE_NOTHROW (fndecl) || !flag_exceptions) > > > + = ((optimize > > > + && !optimize_debug > > > > Shouldn't that be opt_for_fn (fndecl, optimize) and ditto for > > optimize_debug? > > I mean, aren't the options not restored yet when this function is called > > (i.e. remain in whatever state they were in the previous function or > > global state)? > > store_parm_decls is called when parsing a function. store_parm_decls > calls allocate_struct_function which calls > > invoke_set_current_function_hook (fndecl); > > which has > > /* Change optimization options if needed. */ > if (optimization_current_node != opts) > { > optimization_current_node = opts; > cl_optimization_restore (&global_options, &global_options_set, > TREE_OPTIMIZATION (opts)); > } > > targetm.set_current_function (fndecl); > > which calls ix86_set_current_function after global_options > has been updated. ix86_set_func_type is called from > ix86_set_current_function. Sorry, you're right, I just saw option restore later in ix86_set_current_function and missed that it is target option restore only. > > Also, why check "noreturn" attribute rather than > > TREE_THIS_VOLATILE (fndecl)? > > > > The comments above this code has > > NB: Don't use TREE_THIS_VOLATILE to check if this is a noreturn > function. The local-pure-const pass turns an interrupt function > into a noreturn function by setting TREE_THIS_VOLATILE. Normally > the local-pure-const pass is run after ix86_set_func_type is called. > When the local-pure-const pass is enabled for LTO, the interrupt > function is marked as noreturn in the IR output, which leads the > incompatible attribute error in LTO1. So in that case, I think it would be best to test TREE_THIS_VOLATILE (fndecl) && lookup_attribute ("noreturn", DECL_ATTRIBUTES (fndecl)) && ... because if it doesn't have noreturn attribute, it will not have TREE_THIS_VOLATILE set and TREE_THIS_VOLATILE is much cheaper to test than looking an attribute. Jakub
On Mon, Jan 29, 2024 at 2:11 AM Jakub Jelinek <jakub@redhat.com> wrote: > > On Sat, Jan 27, 2024 at 07:00:03AM -0800, H.J. Lu wrote: > > On Sat, Jan 27, 2024 at 6:09 AM Jakub Jelinek <jakub@redhat.com> wrote: > > > > > > On Sat, Jan 27, 2024 at 05:52:34AM -0800, H.J. Lu wrote: > > > > @@ -3391,7 +3392,9 @@ ix86_set_func_type (tree fndecl) > > > > function is marked as noreturn in the IR output, which leads the > > > > incompatible attribute error in LTO1. */ > > > > bool has_no_callee_saved_registers > > > > - = (((TREE_NOTHROW (fndecl) || !flag_exceptions) > > > > + = ((optimize > > > > + && !optimize_debug > > > > > > Shouldn't that be opt_for_fn (fndecl, optimize) and ditto for > > > optimize_debug? > > > I mean, aren't the options not restored yet when this function is called > > > (i.e. remain in whatever state they were in the previous function or > > > global state)? > > > > store_parm_decls is called when parsing a function. store_parm_decls > > calls allocate_struct_function which calls > > > > invoke_set_current_function_hook (fndecl); > > > > which has > > > > /* Change optimization options if needed. */ > > if (optimization_current_node != opts) > > { > > optimization_current_node = opts; > > cl_optimization_restore (&global_options, &global_options_set, > > TREE_OPTIMIZATION (opts)); > > } > > > > targetm.set_current_function (fndecl); > > > > which calls ix86_set_current_function after global_options > > has been updated. ix86_set_func_type is called from > > ix86_set_current_function. > > Sorry, you're right, I just saw option restore later in ix86_set_current_function > and missed that it is target option restore only. > > > > Also, why check "noreturn" attribute rather than > > > TREE_THIS_VOLATILE (fndecl)? > > > > > > > The comments above this code has > > > > NB: Don't use TREE_THIS_VOLATILE to check if this is a noreturn > > function. The local-pure-const pass turns an interrupt function > > into a noreturn function by setting TREE_THIS_VOLATILE. Normally > > the local-pure-const pass is run after ix86_set_func_type is called. > > When the local-pure-const pass is enabled for LTO, the interrupt > > function is marked as noreturn in the IR output, which leads the > > incompatible attribute error in LTO1. > > So in that case, I think it would be best to test > TREE_THIS_VOLATILE (fndecl) > && lookup_attribute ("noreturn", DECL_ATTRIBUTES (fndecl)) > && ... > because if it doesn't have noreturn attribute, it will not have > TREE_THIS_VOLATILE set and TREE_THIS_VOLATILE is much cheaper to test than > looking an attribute. > Fixed in the v3 patch: https://patchwork.sourceware.org/project/gcc/list/?series=30308 Thanks.
diff --git a/gcc/config/i386/i386-options.cc b/gcc/config/i386/i386-options.cc index 473f5359fc9..5ff5560df7a 100644 --- a/gcc/config/i386/i386-options.cc +++ b/gcc/config/i386/i386-options.cc @@ -3381,7 +3381,8 @@ static void ix86_set_func_type (tree fndecl) { /* No need to save and restore callee-saved registers for a noreturn - function with nothrow or compiled with -fno-exceptions. + function with nothrow or compiled with -fno-exceptions unless when + compiling with -O0 or -Og. NB: Don't use TREE_THIS_VOLATILE to check if this is a noreturn function. The local-pure-const pass turns an interrupt function @@ -3391,7 +3392,9 @@ ix86_set_func_type (tree fndecl) function is marked as noreturn in the IR output, which leads the incompatible attribute error in LTO1. */ bool has_no_callee_saved_registers - = (((TREE_NOTHROW (fndecl) || !flag_exceptions) + = ((optimize + && !optimize_debug + && (TREE_NOTHROW (fndecl) || !flag_exceptions) && lookup_attribute ("noreturn", DECL_ATTRIBUTES (fndecl))) || lookup_attribute ("no_callee_saved_registers", TYPE_ATTRIBUTES (TREE_TYPE (fndecl)))); diff --git a/gcc/testsuite/gcc.target/i386/pr38534-5.c b/gcc/testsuite/gcc.target/i386/pr38534-5.c new file mode 100644 index 00000000000..91c0c0f8c59 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr38534-5.c @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-options "-O0 -mtune-ctrl=^prologue_using_move,^epilogue_using_move" } */ + +#define ARRAY_SIZE 256 + +extern int array[ARRAY_SIZE][ARRAY_SIZE][ARRAY_SIZE]; +extern int value (int, int, int) +#ifndef __x86_64__ +__attribute__ ((regparm(3))) +#endif +; + +void +__attribute__((noreturn)) +no_return_to_caller (void) +{ + unsigned i, j, k; + for (i = ARRAY_SIZE; i > 0; --i) + for (j = ARRAY_SIZE; j > 0; --j) + for (k = ARRAY_SIZE; k > 0; --k) + array[i - 1][j - 1][k - 1] = value (i, j, k); + while (1); +} + +/* { dg-final { scan-assembler "push" } } */ +/* { dg-final { scan-assembler-not "pop" } } */ diff --git a/gcc/testsuite/gcc.target/i386/pr38534-6.c b/gcc/testsuite/gcc.target/i386/pr38534-6.c new file mode 100644 index 00000000000..756e1ec81f5 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr38534-6.c @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-options "-Og -mtune-ctrl=^prologue_using_move,^epilogue_using_move" } */ + +#define ARRAY_SIZE 256 + +extern int array[ARRAY_SIZE][ARRAY_SIZE][ARRAY_SIZE]; +extern int value (int, int, int) +#ifndef __x86_64__ +__attribute__ ((regparm(3))) +#endif +; + +void +__attribute__((noreturn)) +no_return_to_caller (void) +{ + unsigned i, j, k; + for (i = ARRAY_SIZE; i > 0; --i) + for (j = ARRAY_SIZE; j > 0; --j) + for (k = ARRAY_SIZE; k > 0; --k) + array[i - 1][j - 1][k - 1] = value (i, j, k); + while (1); +} + +/* { dg-final { scan-assembler "push" } } */ +/* { dg-final { scan-assembler-not "pop" } } */