From patchwork Wed Oct 2 21:14:43 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Andi Kleen X-Patchwork-Id: 1992172 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.a=rsa-sha256 header.s=Intel header.b=bVYZloGd; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [IPv6:2620:52:3:1:0:246e:9693:128c]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4XJndb5SQpz1xt3 for ; Thu, 3 Oct 2024 07:15:14 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 29F55385DDC6 for ; Wed, 2 Oct 2024 21:15:12 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.16]) by sourceware.org (Postfix) with ESMTPS id 25C4D3858D34; Wed, 2 Oct 2024 21:14:52 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 25C4D3858D34 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=linux.intel.com Authentication-Results: sourceware.org; spf=none smtp.mailfrom=linux.intel.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 25C4D3858D34 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=198.175.65.16 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1727903695; cv=none; b=vN1m0VhtaN3/hOCjvv1GY7sAibVSKbsja7Gg8yubnu68TGCQLfxLDItjpdJ4dUICrzewEZkC0OKhgoOH38KRstGB1SYx/lAU+GXv06XKLWmf9xhs6I4/alv+neH7fKKSHl2UFAHjvGK0gW37qw2zSvT8wrNDm2eAXJ7fWbhqsyQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1727903695; c=relaxed/simple; bh=kTfsHCmuUWnJdbnia95L36IJFZAWZI8zrth/iLOMU7A=; h=DKIM-Signature:From:To:Subject:Date:Message-ID:MIME-Version; b=Rni11TZcxPZuxYG8kEfU4v2O6ooPtZSWdDMYyoGjuSxYInygJGVsoOIbD+/Vj5/Q/lK3zx8p3L3SWeJNHC8q3KublKTTvt4LQ4jPZ06P9HIC1JcXxP8TIVTlHhEEgTfGdvRYGcLxJg2A079jpjcDlLdVH022Gm33JgKISbGq9i4= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1727903692; x=1759439692; h=from:to:cc:subject:date:message-id:mime-version: content-transfer-encoding; bh=kTfsHCmuUWnJdbnia95L36IJFZAWZI8zrth/iLOMU7A=; b=bVYZloGd98fKRsolMMrOoSdCYXNGoMjTOu15B2yFx9QWGSjBNWBbZNNU w3JbAPpR+JdgjfWkF7i6YsMxWEvX3Gyt6g5XME2DgiYehEOTNmG8tUX9l JiqS7bE0RhvNHEipvYYuiXJHZgGxErefjkMGEGVGDYxCfUEIw3E3MYapc g67LdWOtPbw/jy5anXVV0EwyEqCZom+6D8ojn3doVzbV5I1crTyQ7GoYI 0yIK79qQ1amwlGqG/QcL9KQ7htxrOobh6UwhRMI6XQpvC/jnwmluoSW+S OCj0n/xCC0Godx40j94J6Z3wf30R9ER4qdGd8eFn+xBTODzgSthJ7FhId Q==; X-CSE-ConnectionGUID: tDe2UTb5TEaWeSFH/NkUFw== X-CSE-MsgGUID: eBpCs8ELQJOPMdWENlvzZg== X-IronPort-AV: E=McAfee;i="6700,10204,11213"; a="27175885" X-IronPort-AV: E=Sophos;i="6.11,172,1725346800"; d="scan'208";a="27175885" Received: from orviesa008.jf.intel.com ([10.64.159.148]) by orvoesa108.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Oct 2024 14:14:51 -0700 X-CSE-ConnectionGUID: 0YD+xCFiRb+W3JiUkHlIEA== X-CSE-MsgGUID: EgVE653LQPWhoLL3LLyu/A== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.11,172,1725346800"; d="scan'208";a="74938321" Received: from tassilo.jf.intel.com ([10.54.38.190]) by orviesa008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 02 Oct 2024 14:14:51 -0700 From: Andi Kleen To: gcc-patches@gcc.gnu.org Cc: Andi Kleen Subject: [PATCH v1] Add -ftime-report-wall Date: Wed, 2 Oct 2024 14:14:43 -0700 Message-ID: <20241002211443.2035722-1-ak@linux.intel.com> X-Mailer: git-send-email 2.46.2 MIME-Version: 1.0 X-Spam-Status: No, score=-11.0 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_EF, GIT_PATCH_0, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org From: Andi Kleen Time vars normally use times(2) to get the user/sys/wall time, which is always a system call. I don't think the system time is very useful because most overhead is in user time. If we only use the wall (or monotonic) time modern OS have an optimized path to get it directly from a CPU instruction like RDTSC without system call, which is much faster. Comparing the overhead with tramp3d: ./gcc/cc1plus -quiet ../tsrc/tramp3d-v4.i ran 1.03 ± 0.00 times faster than ./gcc/cc1plus -quiet -ftime-report-wall ../tsrc/tramp3d-v4.i 1.18 ± 0.00 times faster than ./gcc/cc1plus -quiet -ftime-report ../tsrc/tramp3d-v4.i -ftime-report costs 18% (excluding the output), while -ftime-report-wall only costs 3%, so is nearly free. So it would be feasible for some build system to always enable it and break down the build time into passes. The drawback is that if there is context switching with other programs the time will be overestimated, however for the common case that the system is not oversubscribed it is more accurate because each measurement has less overhead. Add a -ftime-report-wall option. It actually uses the POSIX monotonic time, so strictly it's not wall clock, but it's still a reasonable name. Bootstrapped on x86_64-linux with full test suite run. gcc/ChangeLog: * common.opt (ftime-report-wall): Add. * common.opt.urls: Regenerate. * doc/invoke.texi: (ftime-report-wall): Document * gcc.cc (try_generate_repro): Check for -ftime-report-wall. * timevar.cc (get_time): Use clock_gettime if enabled. (timer::print): Print only wall time for time_report_wall. * toplev.cc (toplev::start_timevars): Check for time_report_wall. gcc/testsuite/ChangeLog: * g++.dg/ext/timevar3.C: New test. --- gcc/common.opt | 4 ++++ gcc/common.opt.urls | 3 +++ gcc/doc/invoke.texi | 7 +++++++ gcc/gcc.cc | 3 ++- gcc/testsuite/g++.dg/ext/timevar3.C | 14 +++++++++++++ gcc/timevar.cc | 31 +++++++++++++++++++++++------ gcc/toplev.cc | 3 ++- 7 files changed, 57 insertions(+), 8 deletions(-) create mode 100644 gcc/testsuite/g++.dg/ext/timevar3.C diff --git a/gcc/common.opt b/gcc/common.opt index d270e524ff45..e9fb15e28d80 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -3010,6 +3010,10 @@ ftime-report Common Var(time_report) Report the time taken by each compiler pass. +ftime-report-wall +Common Var(time_report_wall) +Report the wall time taken by each compiler. + ftime-report-details Common Var(time_report_details) Record times taken by sub-phases separately. diff --git a/gcc/common.opt.urls b/gcc/common.opt.urls index e31736cd9945..6e79a8f9390b 100644 --- a/gcc/common.opt.urls +++ b/gcc/common.opt.urls @@ -1378,6 +1378,9 @@ UrlSuffix(gcc/Optimize-Options.html#index-fthread-jumps) ftime-report UrlSuffix(gcc/Developer-Options.html#index-ftime-report) +ftime-report-wall +UrlSuffix(gcc/Developer-Options.html#index-ftime-report-wall) + ftime-report-details UrlSuffix(gcc/Developer-Options.html#index-ftime-report-details) diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index e199522f62c7..80cb355f5d79 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -784,6 +784,7 @@ Objective-C and Objective-C++ Dialects}. -frandom-seed=@var{string} -fsched-verbose=@var{n} -fsel-sched-verbose -fsel-sched-dump-cfg -fsel-sched-pipelining-verbose -fstats -fstack-usage -ftime-report -ftime-report-details +-ftime-report-wall -fvar-tracking-assignments-toggle -gtoggle -print-file-name=@var{library} -print-libgcc-file-name -print-multi-directory -print-multi-lib -print-multi-os-directory @@ -21026,6 +21027,12 @@ slightly different place within the compiler. @item -ftime-report-details Record the time consumed by infrastructure parts separately for each pass. +@opindex ftime-report-wall +@item -ftime-report-wall +Report statistics about compiler pass time consumpion, but only using wall +time. This is faster than @option{-ftime-report}, but can be more +influenced by background jobs. + @opindex fira-verbose @item -fira-verbose=@var{n} Control the verbosity of the dump file for the integrated register allocator. diff --git a/gcc/gcc.cc b/gcc/gcc.cc index 16fed46fb35f..8d3046eb7874 100644 --- a/gcc/gcc.cc +++ b/gcc/gcc.cc @@ -7964,7 +7964,8 @@ try_generate_repro (const char **argv) it might varry between invocations. */ else if (! strcmp (argv[nargs], "-quiet")) quiet = 1; - else if (! strcmp (argv[nargs], "-ftime-report")) + else if (! strcmp (argv[nargs], "-ftime-report") + || ! strcmp (argv[nargs], "-ftime-report-wall")) return; if (out_arg == -1 || !quiet) diff --git a/gcc/testsuite/g++.dg/ext/timevar3.C b/gcc/testsuite/g++.dg/ext/timevar3.C new file mode 100644 index 000000000000..b003f37f9654 --- /dev/null +++ b/gcc/testsuite/g++.dg/ext/timevar3.C @@ -0,0 +1,14 @@ +// PR c++/52248 +// { dg-options "-ftime-report-wall" } +// { dg-allow-blank-lines-in-output 1 } +// { dg-prune-output "Time variable" } +// { dg-prune-output "k" } +// { dg-prune-output " 0 " } +// { dg-prune-output "checks" } +// { dg-prune-output "\[0-9\]+%" } + +void +foo () +{ + goto lab; // { dg-error "not defined" } +} diff --git a/gcc/timevar.cc b/gcc/timevar.cc index 68bcf44864f9..bf788f735621 100644 --- a/gcc/timevar.cc +++ b/gcc/timevar.cc @@ -243,6 +243,15 @@ get_time (struct timevar_time_def *now) now->ggc_mem = timevar_ggc_mem_total; { +#if _POSIX_C_SOURCE >= 199309L + if (time_report_wall) + { + struct timespec ts; + clock_gettime (CLOCK_MONOTONIC, &ts); + now->wall = ts.tv_sec * 1000000000 + ts.tv_nsec; + return; + } +#endif #ifdef USE_TIMES struct tms tms; now->wall = times (&tms) * ticks_to_nanosec; @@ -682,14 +691,16 @@ timer::print_row (FILE *fp, #ifdef HAVE_USER_TIME /* Print user-mode time for this process. */ - fprintf (fp, "%7.2f (%3.0f%%)", + if (!time_report_wall) + fprintf (fp, "%7.2f (%3.0f%%)", nanosec_to_floating_sec (elapsed.user), percent_of (total->user, elapsed.user)); #endif /* HAVE_USER_TIME */ #ifdef HAVE_SYS_TIME + if (!time_report_wall) /* Print system-mode time for this process. */ - fprintf (fp, "%7.2f (%3.0f%%)", + fprintf (fp, "%7.2f (%3.0f%%)", nanosec_to_floating_sec (elapsed.sys), percent_of (total->sys, elapsed.sys)); #endif /* HAVE_SYS_TIME */ @@ -754,7 +765,10 @@ timer::print (FILE *fp) TIMEVAR. */ m_start_time = now; - fprintf (fp, "\n%-35s%16s%14s%14s%14s\n", "Time variable", "usr", "sys", + if (time_report_wall) + fprintf (fp, "\n%-35s%16s%14s\n", "Time variable", "wall", "GGC"); + else + fprintf (fp, "\n%-35s%16s%14s%14s%14s\n", "Time variable", "usr", "sys", "wall", "GGC"); if (m_jit_client_items) fputs ("GCC items:\n", fp); @@ -810,13 +824,18 @@ timer::print (FILE *fp) /* Print total time. */ fprintf (fp, " %-35s:", "TOTAL"); #ifdef HAVE_USER_TIME - fprintf (fp, "%7.2f ", nanosec_to_floating_sec (total->user)); + if (!time_report_wall) + fprintf (fp, "%7.2f ", nanosec_to_floating_sec (total->user)); #endif #ifdef HAVE_SYS_TIME - fprintf (fp, "%8.2f ", nanosec_to_floating_sec (total->sys)); + if (!time_report_wall) + fprintf (fp, "%8.2f ", nanosec_to_floating_sec (total->sys)); #endif #ifdef HAVE_WALL_TIME - fprintf (fp, "%8.2f ", nanosec_to_floating_sec (total->wall)); + if (!time_report_wall) + fprintf (fp, "%8.2f ", nanosec_to_floating_sec (total->wall)); + else + fprintf (fp, "%7.2f ", nanosec_to_floating_sec (total->wall)); #endif fprintf (fp, PRsa (7) "\n", SIZE_AMOUNT (total->ggc_mem)); diff --git a/gcc/toplev.cc b/gcc/toplev.cc index 5df59b79c803..87fa3f4c8833 100644 --- a/gcc/toplev.cc +++ b/gcc/toplev.cc @@ -2247,7 +2247,8 @@ toplev::~toplev () void toplev::start_timevars () { - if (time_report || !quiet_flag || flag_detailed_statistics) + if (time_report || time_report_wall || !quiet_flag + || flag_detailed_statistics) timevar_init (); timevar_start (TV_TOTAL);