Message ID | 20241002211443.2035722-1-ak@linux.intel.com |
---|---|
State | New |
Headers | show |
Series | [v1] Add -ftime-report-wall | expand |
On Wed, 2024-10-02 at 14:14 -0700, Andi Kleen wrote: > From: Andi Kleen <ak@gcc.gnu.org> > > Time vars normally use times(2) to get the user/sys/wall time, which > is always a > system call. I don't think the system time is very useful because > most overhead > is in user time. If we only use the wall (or monotonic) time modern > OS have an > optimized path to get it directly from a CPU instruction like RDTSC > without system call, which is much faster. > > Comparing the overhead with tramp3d: > > ./gcc/cc1plus -quiet ../tsrc/tramp3d-v4.i ran > 1.03 ± 0.00 times faster than ./gcc/cc1plus -quiet -ftime-report- > wall ../tsrc/tramp3d-v4.i > 1.18 ± 0.00 times faster than ./gcc/cc1plus -quiet -ftime-report > ../tsrc/tramp3d-v4.i > > -ftime-report costs 18% (excluding the output), while -ftime-report- > wall > only costs 3%, so is nearly free. So it would be feasible for some > build > system to always enable it and break down the build time into passes. > > The drawback is that if there is context switching with other > programs > the time will be overestimated, however for the common case that the > system is not oversubscribed it is more accurate because each > measurement has less overhead. > > Add a -ftime-report-wall option. It actually uses the POSIX monotonic > time, > so strictly it's not wall clock, but it's still a reasonable name. > > Bootstrapped on x86_64-linux with full test suite run. Note that if the user requests SARIF output e.g. with -fdiagnostics-format=sarif-stderr then any timevar data from -ftime-report is written in JSON form as part of the SARIF, rather than in text form to stderr (see 75d623946d4b6ea80a777b789b116d4b4a2298dc). I see that the proposed patch leaves the user and sys stats as zero, and conditionalizes what's printed for text output as part of timer::print. Should it also do something similar in make_json_for_timevar_time_def for the json output, and not add the properties for "user" and "sys" if the data hasn't been gathered? Hope I'm reading the patch correctly. Thanks Dave > > gcc/ChangeLog: > > * common.opt (ftime-report-wall): Add. > * common.opt.urls: Regenerate. > * doc/invoke.texi: (ftime-report-wall): Document > * gcc.cc (try_generate_repro): Check for -ftime-report-wall. > * timevar.cc (get_time): Use clock_gettime if enabled. > (timer::print): Print only wall time for time_report_wall. > * toplev.cc (toplev::start_timevars): Check for > time_report_wall. > > gcc/testsuite/ChangeLog: > > * g++.dg/ext/timevar3.C: New test. > --- > gcc/common.opt | 4 ++++ > gcc/common.opt.urls | 3 +++ > gcc/doc/invoke.texi | 7 +++++++ > gcc/gcc.cc | 3 ++- > gcc/testsuite/g++.dg/ext/timevar3.C | 14 +++++++++++++ > gcc/timevar.cc | 31 +++++++++++++++++++++++---- > -- > gcc/toplev.cc | 3 ++- > 7 files changed, 57 insertions(+), 8 deletions(-) > create mode 100644 gcc/testsuite/g++.dg/ext/timevar3.C > > diff --git a/gcc/common.opt b/gcc/common.opt > index d270e524ff45..e9fb15e28d80 100644 > --- a/gcc/common.opt > +++ b/gcc/common.opt > @@ -3010,6 +3010,10 @@ ftime-report > Common Var(time_report) > Report the time taken by each compiler pass. > > +ftime-report-wall > +Common Var(time_report_wall) > +Report the wall time taken by each compiler. > + > ftime-report-details > Common Var(time_report_details) > Record times taken by sub-phases separately. > diff --git a/gcc/common.opt.urls b/gcc/common.opt.urls > index e31736cd9945..6e79a8f9390b 100644 > --- a/gcc/common.opt.urls > +++ b/gcc/common.opt.urls > @@ -1378,6 +1378,9 @@ UrlSuffix(gcc/Optimize-Options.html#index- > fthread-jumps) > ftime-report > UrlSuffix(gcc/Developer-Options.html#index-ftime-report) > > +ftime-report-wall > +UrlSuffix(gcc/Developer-Options.html#index-ftime-report-wall) > + > ftime-report-details > UrlSuffix(gcc/Developer-Options.html#index-ftime-report-details) > > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi > index e199522f62c7..80cb355f5d79 100644 > --- a/gcc/doc/invoke.texi > +++ b/gcc/doc/invoke.texi > @@ -784,6 +784,7 @@ Objective-C and Objective-C++ Dialects}. > -frandom-seed=@var{string} -fsched-verbose=@var{n} > -fsel-sched-verbose -fsel-sched-dump-cfg -fsel-sched-pipelining- > verbose > -fstats -fstack-usage -ftime-report -ftime-report-details > +-ftime-report-wall > -fvar-tracking-assignments-toggle -gtoggle > -print-file-name=@var{library} -print-libgcc-file-name > -print-multi-directory -print-multi-lib -print-multi-os-directory > @@ -21026,6 +21027,12 @@ slightly different place within the > compiler. > @item -ftime-report-details > Record the time consumed by infrastructure parts separately for each > pass. > > +@opindex ftime-report-wall > +@item -ftime-report-wall > +Report statistics about compiler pass time consumpion, but only > using wall > +time. This is faster than @option{-ftime-report}, but can be more > +influenced by background jobs. > + > @opindex fira-verbose > @item -fira-verbose=@var{n} > Control the verbosity of the dump file for the integrated register > allocator. > diff --git a/gcc/gcc.cc b/gcc/gcc.cc > index 16fed46fb35f..8d3046eb7874 100644 > --- a/gcc/gcc.cc > +++ b/gcc/gcc.cc > @@ -7964,7 +7964,8 @@ try_generate_repro (const char **argv) > it might varry between invocations. */ > else if (! strcmp (argv[nargs], "-quiet")) > quiet = 1; > - else if (! strcmp (argv[nargs], "-ftime-report")) > + else if (! strcmp (argv[nargs], "-ftime-report") > + || ! strcmp (argv[nargs], "-ftime-report-wall")) > return; > > if (out_arg == -1 || !quiet) > diff --git a/gcc/testsuite/g++.dg/ext/timevar3.C > b/gcc/testsuite/g++.dg/ext/timevar3.C > new file mode 100644 > index 000000000000..b003f37f9654 > --- /dev/null > +++ b/gcc/testsuite/g++.dg/ext/timevar3.C > @@ -0,0 +1,14 @@ > +// PR c++/52248 > +// { dg-options "-ftime-report-wall" } > +// { dg-allow-blank-lines-in-output 1 } > +// { dg-prune-output "Time variable" } > +// { dg-prune-output "k" } > +// { dg-prune-output " 0 " } > +// { dg-prune-output "checks" } > +// { dg-prune-output "\[0-9\]+%" } > + > +void > +foo () > +{ > + goto lab; // { dg-error "not defined" } > +} > diff --git a/gcc/timevar.cc b/gcc/timevar.cc > index 68bcf44864f9..bf788f735621 100644 > --- a/gcc/timevar.cc > +++ b/gcc/timevar.cc > @@ -243,6 +243,15 @@ get_time (struct timevar_time_def *now) > now->ggc_mem = timevar_ggc_mem_total; > > { > +#if _POSIX_C_SOURCE >= 199309L > + if (time_report_wall) > + { > + struct timespec ts; > + clock_gettime (CLOCK_MONOTONIC, &ts); > + now->wall = ts.tv_sec * 1000000000 + ts.tv_nsec; > + return; > + } > +#endif > #ifdef USE_TIMES > struct tms tms; > now->wall = times (&tms) * ticks_to_nanosec; > @@ -682,14 +691,16 @@ timer::print_row (FILE *fp, > > #ifdef HAVE_USER_TIME > /* Print user-mode time for this process. */ > - fprintf (fp, "%7.2f (%3.0f%%)", > + if (!time_report_wall) > + fprintf (fp, "%7.2f (%3.0f%%)", > nanosec_to_floating_sec (elapsed.user), > percent_of (total->user, elapsed.user)); > #endif /* HAVE_USER_TIME */ > > #ifdef HAVE_SYS_TIME > + if (!time_report_wall) > /* Print system-mode time for this process. */ > - fprintf (fp, "%7.2f (%3.0f%%)", > + fprintf (fp, "%7.2f (%3.0f%%)", > nanosec_to_floating_sec (elapsed.sys), > percent_of (total->sys, elapsed.sys)); > #endif /* HAVE_SYS_TIME */ > @@ -754,7 +765,10 @@ timer::print (FILE *fp) > TIMEVAR. */ > m_start_time = now; > > - fprintf (fp, "\n%-35s%16s%14s%14s%14s\n", "Time variable", "usr", > "sys", > + if (time_report_wall) > + fprintf (fp, "\n%-35s%16s%14s\n", "Time variable", "wall", > "GGC"); > + else > + fprintf (fp, "\n%-35s%16s%14s%14s%14s\n", "Time variable", > "usr", "sys", > "wall", "GGC"); > if (m_jit_client_items) > fputs ("GCC items:\n", fp); > @@ -810,13 +824,18 @@ timer::print (FILE *fp) > /* Print total time. */ > fprintf (fp, " %-35s:", "TOTAL"); > #ifdef HAVE_USER_TIME > - fprintf (fp, "%7.2f ", nanosec_to_floating_sec (total- > >user)); > + if (!time_report_wall) > + fprintf (fp, "%7.2f ", nanosec_to_floating_sec (total- > >user)); > #endif > #ifdef HAVE_SYS_TIME > - fprintf (fp, "%8.2f ", nanosec_to_floating_sec (total->sys)); > + if (!time_report_wall) > + fprintf (fp, "%8.2f ", nanosec_to_floating_sec (total- > >sys)); > #endif > #ifdef HAVE_WALL_TIME > - fprintf (fp, "%8.2f ", nanosec_to_floating_sec (total- > >wall)); > + if (!time_report_wall) > + fprintf (fp, "%8.2f ", nanosec_to_floating_sec (total- > >wall)); > + else > + fprintf (fp, "%7.2f ", nanosec_to_floating_sec (total- > >wall)); > #endif > fprintf (fp, PRsa (7) "\n", SIZE_AMOUNT (total->ggc_mem)); > > diff --git a/gcc/toplev.cc b/gcc/toplev.cc > index 5df59b79c803..87fa3f4c8833 100644 > --- a/gcc/toplev.cc > +++ b/gcc/toplev.cc > @@ -2247,7 +2247,8 @@ toplev::~toplev () > void > toplev::start_timevars () > { > - if (time_report || !quiet_flag || flag_detailed_statistics) > + if (time_report || time_report_wall || !quiet_flag > + || flag_detailed_statistics) > timevar_init (); > > timevar_start (TV_TOTAL);
> Note that if the user requests SARIF output e.g. with > -fdiagnostics-format=sarif-stderr > then any timevar data from -ftime-report is written in JSON form as > part of the SARIF, rather than in text form to stderr (see > 75d623946d4b6ea80a777b789b116d4b4a2298dc). > > I see that the proposed patch leaves the user and sys stats as zero, > and conditionalizes what's printed for text output as part of > timer::print. Should it also do something similar in > make_json_for_timevar_time_def for the json output, and not add the > properties for "user" and "sys" if the data hasn't been gathered? > Hope I'm reading the patch correctly. Yes that's right. I mainly adjusted the human output for cosmetic reasons. For machine readable i guess it is better to have a stable schema and not skip fields to avoid pain for parsers. So I left it alone. -Andi
On Thu, 2024-10-03 at 00:37 -0700, Andi Kleen wrote: > > Note that if the user requests SARIF output e.g. with > > -fdiagnostics-format=sarif-stderr > > then any timevar data from -ftime-report is written in JSON form as > > part of the SARIF, rather than in text form to stderr (see > > 75d623946d4b6ea80a777b789b116d4b4a2298dc). > > > > I see that the proposed patch leaves the user and sys stats as > > zero, > > and conditionalizes what's printed for text output as part of > > timer::print. Should it also do something similar in > > make_json_for_timevar_time_def for the json output, and not add the > > properties for "user" and "sys" if the data hasn't been gathered? > > > Hope I'm reading the patch correctly. > > Yes that's right. Thanks. > > I mainly adjusted the human output for cosmetic reasons. > > For machine readable i guess it is better to have a stable schema > and not skip fields to avoid pain for parsers. So I left it alone. The only consumer I know of for the JSON time report data is in the integration tests I wrote for -fanalyzer, which assumes that all fields are present when printing, and then goes on to use the "user" times for summarizing; see this commit FWIW: https://github.com/davidmalcolm/gcc-analyzer-integration-tests/commit/5420ce968e6eae886e61486555b54fd460e0d35f I'm not planning to use -ftime-report-wall, but given that my summarization code is using the "user" times I think I'd prefer it the property wasn't present rather than contained a bogus value that I might mistakenly use. The existing docs do say: "The precise format of this JSON data is subject to change": https://gcc.gnu.org/onlinedocs/gcc/Developer-Options.html#index-ftime-report and there isn't a formal schema for this written down anywhere. Hope this is constructive Dave
> The only consumer I know of for the JSON time report data is in the > integration tests I wrote for -fanalyzer, which assumes that all fields > are present when printing, and then goes on to use the "user" times for > summarizing; see this commit FWIW: > https://github.com/davidmalcolm/gcc-analyzer-integration-tests/commit/5420ce968e6eae886e61486555b54fd460e0d35f It seems to be broken even without my changes: % ./gcc/cc1plus -ftime-report -fdiagnostics-format=sarif-file ../tsrc/tramp3d-v4.i cc1plus: internal compiler error: Segmentation fault 0x27206ee internal_error(char const*, ...) ../../gcc/gcc/diagnostic-global-context.cc:517 0x133401f crash_signal ../../gcc/gcc/toplev.cc:321 0x27e7934 htab_hash_string ../../gcc/libiberty/hashtab.c:838 0x2715dde string_hash::hash(char const*) ../../gcc/gcc/hash-traits.h:239 0x2715dde simple_hashmap_traits<default_hash_traits<nofree_string_hash>, sarif_artifact*>::hash(char const* const&) ../../gcc/gcc/hash-map-traits.h:50 0x2715dde hash_map<nofree_string_hash, sarif_artifact*, simple_hashmap_traits<default_hash_traits<nofree_string_hash>, sarif_artifact*> >::get(char const* const&) ../../gcc/gcc/hash-map.h:191 0x2715dde ordered_hash_map<nofree_string_hash, sarif_artifact*, simple_hashmap_traits<default_hash_traits<nofree_string_hash>, sarif_artifact*> >::get(char const* const&) ../../gcc/gcc/ordered-hash-map.h:76 0x2715dde sarif_builder::get_or_create_artifact(char const*, diagnostic_artifact_role, bool) ../../gcc/gcc/diagnostic-format-sarif.cc:2892 0x2716403 sarif_output_format::sarif_output_format(diagnostic_context&, line_maps const*, char const*, bool) ../../gcc/gcc/diagnostic-format-sarif.cc:3154 0x2716403 sarif_file_output_format::sarif_file_output_format(diagnostic_context&, line_maps const*, char const*, bool, char const*) ../../gcc/gcc/diagnostic-format-sarif.cc:3193 0x2716403 std::enable_if<!std::is_array<sarif_file_output_format>::value, std::unique_ptr<sarif_file_output_format, std::default_delete<sarif_file_output_format> > >::type make_unique<sarif_file_output_format, diagnostic_context&, line_maps const*&, char const*&, bool&, char const*&>(diagnostic_context&, line_maps const*&, char const*&, bool&, char const*&) ../../gcc/gcc/make-unique.h:41 0x2716403 diagnostic_output_format_init_sarif_file(diagnostic_context&, line_maps const*, char const*, bool, char const*) ../../gcc/gcc/diagnostic-format-sarif.cc:3392 0x26f0522 common_handle_option(gcc_options*, gcc_options*, cl_decoded_option const*, unsigned int, int, unsigned int, cl_option_handlers const*, diagnostic_context*, void (*)()) ../../gcc/gcc/opts.cc:2968 0x26f5728 handle_option ../../gcc/gcc/opts-common.cc:1316 0x26f585e read_cmdline_option(gcc_options*, gcc_options*, cl_decoded_option*, unsigned int, unsigned int, cl_option_handlers const*, diagnostic_context*) ../../gcc/gcc/opts-common.cc:1646 0x120f194 read_cmdline_options ../../gcc/gcc/opts-global.cc:242 0x120f194 decode_options(gcc_options*, gcc_options*, cl_decoded_option*, unsigned int, unsigned int, diagnostic_context*, void (*)()) ../../gcc/gcc/opts-global.cc:329 Please submit a full bug report, with preprocessed source (by using -freport-bug). Please include the complete backtrace with any bug report. See <https://gcc.gnu.org/bugs/> for instructions.
On Thu, 2024-10-03 at 11:15 -0700, Andi Kleen wrote: > > The only consumer I know of for the JSON time report data is in the > > integration tests I wrote for -fanalyzer, which assumes that all > > fields > > are present when printing, and then goes on to use the "user" times > > for > > summarizing; see this commit FWIW: > > https://github.com/davidmalcolm/gcc-analyzer-integration-tests/commit/5420ce968e6eae886e61486555b54fd460e0d35f > > It seems to be broken even without my changes: > > > % ./gcc/cc1plus -ftime-report -fdiagnostics-format=sarif-file > ../tsrc/tramp3d-v4.i > cc1plus: internal compiler error: Segmentation fault Oops, thanks; I'm tracking this as https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116978 and working on a fix. Dave
diff --git a/gcc/common.opt b/gcc/common.opt index d270e524ff45..e9fb15e28d80 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -3010,6 +3010,10 @@ ftime-report Common Var(time_report) Report the time taken by each compiler pass. +ftime-report-wall +Common Var(time_report_wall) +Report the wall time taken by each compiler. + ftime-report-details Common Var(time_report_details) Record times taken by sub-phases separately. diff --git a/gcc/common.opt.urls b/gcc/common.opt.urls index e31736cd9945..6e79a8f9390b 100644 --- a/gcc/common.opt.urls +++ b/gcc/common.opt.urls @@ -1378,6 +1378,9 @@ UrlSuffix(gcc/Optimize-Options.html#index-fthread-jumps) ftime-report UrlSuffix(gcc/Developer-Options.html#index-ftime-report) +ftime-report-wall +UrlSuffix(gcc/Developer-Options.html#index-ftime-report-wall) + ftime-report-details UrlSuffix(gcc/Developer-Options.html#index-ftime-report-details) diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index e199522f62c7..80cb355f5d79 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -784,6 +784,7 @@ Objective-C and Objective-C++ Dialects}. -frandom-seed=@var{string} -fsched-verbose=@var{n} -fsel-sched-verbose -fsel-sched-dump-cfg -fsel-sched-pipelining-verbose -fstats -fstack-usage -ftime-report -ftime-report-details +-ftime-report-wall -fvar-tracking-assignments-toggle -gtoggle -print-file-name=@var{library} -print-libgcc-file-name -print-multi-directory -print-multi-lib -print-multi-os-directory @@ -21026,6 +21027,12 @@ slightly different place within the compiler. @item -ftime-report-details Record the time consumed by infrastructure parts separately for each pass. +@opindex ftime-report-wall +@item -ftime-report-wall +Report statistics about compiler pass time consumpion, but only using wall +time. This is faster than @option{-ftime-report}, but can be more +influenced by background jobs. + @opindex fira-verbose @item -fira-verbose=@var{n} Control the verbosity of the dump file for the integrated register allocator. diff --git a/gcc/gcc.cc b/gcc/gcc.cc index 16fed46fb35f..8d3046eb7874 100644 --- a/gcc/gcc.cc +++ b/gcc/gcc.cc @@ -7964,7 +7964,8 @@ try_generate_repro (const char **argv) it might varry between invocations. */ else if (! strcmp (argv[nargs], "-quiet")) quiet = 1; - else if (! strcmp (argv[nargs], "-ftime-report")) + else if (! strcmp (argv[nargs], "-ftime-report") + || ! strcmp (argv[nargs], "-ftime-report-wall")) return; if (out_arg == -1 || !quiet) diff --git a/gcc/testsuite/g++.dg/ext/timevar3.C b/gcc/testsuite/g++.dg/ext/timevar3.C new file mode 100644 index 000000000000..b003f37f9654 --- /dev/null +++ b/gcc/testsuite/g++.dg/ext/timevar3.C @@ -0,0 +1,14 @@ +// PR c++/52248 +// { dg-options "-ftime-report-wall" } +// { dg-allow-blank-lines-in-output 1 } +// { dg-prune-output "Time variable" } +// { dg-prune-output "k" } +// { dg-prune-output " 0 " } +// { dg-prune-output "checks" } +// { dg-prune-output "\[0-9\]+%" } + +void +foo () +{ + goto lab; // { dg-error "not defined" } +} diff --git a/gcc/timevar.cc b/gcc/timevar.cc index 68bcf44864f9..bf788f735621 100644 --- a/gcc/timevar.cc +++ b/gcc/timevar.cc @@ -243,6 +243,15 @@ get_time (struct timevar_time_def *now) now->ggc_mem = timevar_ggc_mem_total; { +#if _POSIX_C_SOURCE >= 199309L + if (time_report_wall) + { + struct timespec ts; + clock_gettime (CLOCK_MONOTONIC, &ts); + now->wall = ts.tv_sec * 1000000000 + ts.tv_nsec; + return; + } +#endif #ifdef USE_TIMES struct tms tms; now->wall = times (&tms) * ticks_to_nanosec; @@ -682,14 +691,16 @@ timer::print_row (FILE *fp, #ifdef HAVE_USER_TIME /* Print user-mode time for this process. */ - fprintf (fp, "%7.2f (%3.0f%%)", + if (!time_report_wall) + fprintf (fp, "%7.2f (%3.0f%%)", nanosec_to_floating_sec (elapsed.user), percent_of (total->user, elapsed.user)); #endif /* HAVE_USER_TIME */ #ifdef HAVE_SYS_TIME + if (!time_report_wall) /* Print system-mode time for this process. */ - fprintf (fp, "%7.2f (%3.0f%%)", + fprintf (fp, "%7.2f (%3.0f%%)", nanosec_to_floating_sec (elapsed.sys), percent_of (total->sys, elapsed.sys)); #endif /* HAVE_SYS_TIME */ @@ -754,7 +765,10 @@ timer::print (FILE *fp) TIMEVAR. */ m_start_time = now; - fprintf (fp, "\n%-35s%16s%14s%14s%14s\n", "Time variable", "usr", "sys", + if (time_report_wall) + fprintf (fp, "\n%-35s%16s%14s\n", "Time variable", "wall", "GGC"); + else + fprintf (fp, "\n%-35s%16s%14s%14s%14s\n", "Time variable", "usr", "sys", "wall", "GGC"); if (m_jit_client_items) fputs ("GCC items:\n", fp); @@ -810,13 +824,18 @@ timer::print (FILE *fp) /* Print total time. */ fprintf (fp, " %-35s:", "TOTAL"); #ifdef HAVE_USER_TIME - fprintf (fp, "%7.2f ", nanosec_to_floating_sec (total->user)); + if (!time_report_wall) + fprintf (fp, "%7.2f ", nanosec_to_floating_sec (total->user)); #endif #ifdef HAVE_SYS_TIME - fprintf (fp, "%8.2f ", nanosec_to_floating_sec (total->sys)); + if (!time_report_wall) + fprintf (fp, "%8.2f ", nanosec_to_floating_sec (total->sys)); #endif #ifdef HAVE_WALL_TIME - fprintf (fp, "%8.2f ", nanosec_to_floating_sec (total->wall)); + if (!time_report_wall) + fprintf (fp, "%8.2f ", nanosec_to_floating_sec (total->wall)); + else + fprintf (fp, "%7.2f ", nanosec_to_floating_sec (total->wall)); #endif fprintf (fp, PRsa (7) "\n", SIZE_AMOUNT (total->ggc_mem)); diff --git a/gcc/toplev.cc b/gcc/toplev.cc index 5df59b79c803..87fa3f4c8833 100644 --- a/gcc/toplev.cc +++ b/gcc/toplev.cc @@ -2247,7 +2247,8 @@ toplev::~toplev () void toplev::start_timevars () { - if (time_report || !quiet_flag || flag_detailed_statistics) + if (time_report || time_report_wall || !quiet_flag + || flag_detailed_statistics) timevar_init (); timevar_start (TV_TOTAL);
From: Andi Kleen <ak@gcc.gnu.org> Time vars normally use times(2) to get the user/sys/wall time, which is always a system call. I don't think the system time is very useful because most overhead is in user time. If we only use the wall (or monotonic) time modern OS have an optimized path to get it directly from a CPU instruction like RDTSC without system call, which is much faster. Comparing the overhead with tramp3d: ./gcc/cc1plus -quiet ../tsrc/tramp3d-v4.i ran 1.03 ± 0.00 times faster than ./gcc/cc1plus -quiet -ftime-report-wall ../tsrc/tramp3d-v4.i 1.18 ± 0.00 times faster than ./gcc/cc1plus -quiet -ftime-report ../tsrc/tramp3d-v4.i -ftime-report costs 18% (excluding the output), while -ftime-report-wall only costs 3%, so is nearly free. So it would be feasible for some build system to always enable it and break down the build time into passes. The drawback is that if there is context switching with other programs the time will be overestimated, however for the common case that the system is not oversubscribed it is more accurate because each measurement has less overhead. Add a -ftime-report-wall option. It actually uses the POSIX monotonic time, so strictly it's not wall clock, but it's still a reasonable name. Bootstrapped on x86_64-linux with full test suite run. gcc/ChangeLog: * common.opt (ftime-report-wall): Add. * common.opt.urls: Regenerate. * doc/invoke.texi: (ftime-report-wall): Document * gcc.cc (try_generate_repro): Check for -ftime-report-wall. * timevar.cc (get_time): Use clock_gettime if enabled. (timer::print): Print only wall time for time_report_wall. * toplev.cc (toplev::start_timevars): Check for time_report_wall. gcc/testsuite/ChangeLog: * g++.dg/ext/timevar3.C: New test. --- gcc/common.opt | 4 ++++ gcc/common.opt.urls | 3 +++ gcc/doc/invoke.texi | 7 +++++++ gcc/gcc.cc | 3 ++- gcc/testsuite/g++.dg/ext/timevar3.C | 14 +++++++++++++ gcc/timevar.cc | 31 +++++++++++++++++++++++------ gcc/toplev.cc | 3 ++- 7 files changed, 57 insertions(+), 8 deletions(-) create mode 100644 gcc/testsuite/g++.dg/ext/timevar3.C