From patchwork Thu Jul 4 08:25:56 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jørgen Kvalsvik
X-Patchwork-Id: 1956721
From: Jørgen Kvalsvik
To: gcc-patches@gcc.gnu.org
Cc: hubicka@ucw.cz, Jørgen Kvalsvik
Subject: [PATCH] gcov: Cache source files
Date: Thu, 4 Jul 2024 10:25:56 +0200
Message-Id: <20240704082556.1079778-1-j@lambda.is>
List-Id: Gcc-patches mailing list

Cache the source files as they are read, rather than discarding them
at the end of output_lines (), and move the reading of the source file
to the new function slurp.  This patch does not really change anything
other than moving the file reading out of output_lines (), but it sets
gcov up for more interaction with the source file.  The motivating
example is reporting coverage on functions from different source
files, notably C++ headers and ((always_inline)) functions.

Here is an example of what gcov does today:

hello.h:

    inline __attribute__((always_inline))
    int hello (const char *s)
    {
      if (s)
        printf ("hello, %s!\n", s);
      else
        printf ("hello, world!\n");
      return 0;
    }

hello.c:

    int notmain(const char *entity)
    {
      return hello (entity);
    }

    int main()
    {
      const char *empty = 0;
      if (!empty)
        hello (empty);
      else
        puts ("Goodbye!");
    }

    $ gcov -abc hello
    function notmain called 0 returned 0% blocks executed 0%
        #####:    4:int notmain(const char *entity)
        %%%%%:    4-block 2
    branch  0 never executed (fallthrough)
    branch  1 never executed
            -:    5:{
        #####:    6:  return hello (entity);
        %%%%%:    6-block 7
            -:    7:}

Clearly there is a branch in notmain, but the branch comes from the
inlining of hello.  This is not very obvious from looking at the
output.  Here is hello.h.gcov:

            -:    3:inline __attribute__((always_inline))
            -:    4:int hello (const char *s)
            -:    5:{
        #####:    6:  if (s)
        %%%%%:    6-block 3
    branch  0 never executed (fallthrough)
    branch  1 never executed
        %%%%%:    6-block 2
    branch  2 never executed (fallthrough)
    branch  3 never executed
        #####:    7:    printf ("hello, %s!\n", s);
        %%%%%:    7-block 4
    call    0 never executed
        %%%%%:    7-block 3
    call    1 never executed
            -:    8:  else
        #####:    9:    printf ("hello, world!\n");
        %%%%%:    9-block 5
    call    0 never executed
        %%%%%:    9-block 4
    call    1 never executed
        #####:   10:  return 0;
        %%%%%:   10-block 6
        %%%%%:   10-block 5
            -:   11:}

The blocks from the different call sites have all been interleaved.

The reporting could be tuned to list the inlined function, too, like
this:

            1:    4:int notmain(const char *entity)
            -: == inlined from hello.h ==
            1:    6:  if (s)
    branch  0 taken 0 (fallthrough)
    branch  1 taken 1
        #####:    7:    printf ("hello, %s!\n", s);
        %%%%%:    7-block 3
    call    0 never executed
            -:    8:  else
            1:    9:    printf ("hello, world!\n");
            1:    9-block 4
    call    0 returned 1
            1:   10:  return 0;
            1:   10-block 5
            -: == inlined from hello.h (end) ==
            -:    5:{
            1:    6:  return hello (entity);
            1:    6-block 7
            -:    7:}

Implementing something to this effect relies on having the sources for
both files (hello.c, hello.h) available, which is what this patch sets
up.

Note that the previous reading code leaked the source file contents,
so storing them explicitly is neither a big departure nor a
significant performance cost.  I verified this with valgrind:

With slurp:

    $ valgrind gcov ./hello
    == == Memcheck, a memory error detector
    == == Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
    == == Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
    == == Command: ./gcc/gcov demo
    == ==
    File 'hello.c'
    Lines executed:100.00% of 4
    Creating 'hello.c.gcov'

    File 'hello.h'
    Lines executed:75.00% of 4
    Creating 'hello.h.gcov'
    == ==
    == == HEAP SUMMARY:
    == ==     in use at exit: 84,907 bytes in 54 blocks
    == ==   total heap usage: 254 allocs, 200 frees, 137,156 bytes allocated
    == ==
    == == LEAK SUMMARY:
    == ==    definitely lost: 1,237 bytes in 22 blocks
    == ==    indirectly lost: 562 bytes in 18 blocks
    == ==      possibly lost: 0 bytes in 0 blocks
    == ==    still reachable: 83,108 bytes in 14 blocks
    == ==                       of which reachable via heuristic:
    == ==                         newarray           : 1,544 bytes in 1 blocks
    == ==         suppressed: 0 bytes in 0 blocks
    == == Rerun with --leak-check=full to see details of leaked memory
    == ==
    == == For lists of detected and suppressed errors, rerun with: -s
    == == ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Without slurp:

    $ valgrind gcov ./demo
    == == Memcheck, a memory error detector
    == == Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
    == == Using Valgrind-3.19.0 and LibVEX; rerun with -h for copyright info
    == == Command: ./gcc/gcov demo
    == ==
    File 'hello.c'
    Lines executed:100.00% of 4
    Creating 'hello.c.gcov'

    File 'hello.h'
    Lines executed:75.00% of 4
    Creating 'hello.h.gcov'

    Lines executed:87.50% of 8
    == ==
    == == HEAP SUMMARY:
    == ==     in use at exit: 85,316 bytes in 82 blocks
    == ==   total heap usage: 250 allocs, 168 frees, 137,084 bytes allocated
    == ==
    == == LEAK SUMMARY:
    == ==    definitely lost: 1,646 bytes in 50 blocks
    == ==    indirectly lost: 562 bytes in 18 blocks
    == ==      possibly lost: 0 bytes in 0 blocks
    == ==    still reachable: 83,108 bytes in 14 blocks
    == ==                       of which reachable via heuristic:
    == ==                         newarray           : 1,544 bytes in 1 blocks
    == ==         suppressed: 0 bytes in 0 blocks
    == == Rerun with --leak-check=full to see details of leaked memory
    == ==
    == == For lists of detected and suppressed errors, rerun with: -s
    == == ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

gcc/ChangeLog:

	* gcov.cc (release_structures): Release source_lines.
	(slurp): New function.
	(output_lines): Read sources with slurp.
---
 gcc/gcov.cc | 67 +++++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 50 insertions(+), 17 deletions(-)

diff --git a/gcc/gcov.cc b/gcc/gcov.cc
index 2e4bd9d3c5d..c5323a5753b 100644
--- a/gcc/gcov.cc
+++ b/gcc/gcov.cc
@@ -549,6 +549,11 @@ static vector<name_map> names;
    a file being read multiple times.  */
 static vector<char *> processed_files;
 
+/* The contents of a source file.  The nth SOURCE_LINES entry is the
+   contents of the nth SOURCES, or empty if it has not or could not be
+   read.  */
+static vector<vector<const char *>*> source_lines;
+
 /* This holds data summary information.  */
 
 static unsigned object_runs;
@@ -1734,6 +1739,14 @@ release_structures (void)
        it != functions.end (); it++)
     delete (*it);
 
+  for (vector<const char *> *lines : source_lines)
+    {
+      for (const char *line : *lines)
+        free (const_cast<char *> (line));
+      delete (lines);
+    }
+  source_lines.resize (0);
+
   sources.resize (0);
   names.resize (0);
   functions.resize (0);
@@ -3154,6 +3167,41 @@ read_line (FILE *file)
   return pos ? string : NULL;
 }
 
+/* Get the vector with the contents SRC, possibly from a cache.  If
+   the reading fails, a message prefixed with LINE_START is written to
+   GCOV_FILE.  */
+static const vector<const char *>&
+slurp (const source_info &src, FILE *gcov_file,
+       const char *line_start)
+{
+  if (source_lines.size () <= src.index)
+    source_lines.resize (src.index + 1);
+
+  /* Store vector pointers so that the returned references remain
+     stable and won't be broken by successive calls to slurp.  */
+  if (!source_lines[src.index])
+    source_lines[src.index] = new vector<const char *> ();
+
+  if (!source_lines[src.index]->empty ())
+    return *source_lines[src.index];
+
+  FILE *source_file = fopen (src.name, "r");
+  if (!source_file)
+    fnotice (stderr, "Cannot open source file %s\n", src.name);
+  else if (src.file_time == 0)
+    fprintf (gcov_file, "%sSource is newer than graph\n", line_start);
+
+  const char *retval;
+  vector<const char *> &lines = *source_lines[src.index];
+  if (source_file)
+    while ((retval = read_line (source_file)))
+      lines.push_back (xstrdup (retval));
+
+  if (source_file)
+    fclose (source_file);
+  return lines;
+}
+
 /* Pad string S with spaces from left to have total width equal to 9.  */
 
 static void
@@ -3343,9 +3391,6 @@ output_lines (FILE *gcov_file, const source_info *src)
 #define  DEFAULT_LINE_START "        -:    0:"
 #define FN_SEPARATOR "------------------\n"
 
-  FILE *source_file;
-  const char *retval;
-
   /* Print colorization legend.  */
   if (flag_use_colors)
     fprintf (gcov_file, "%s",
@@ -3372,17 +3417,8 @@ output_lines (FILE *gcov_file, const source_info *src)
       fprintf (gcov_file, DEFAULT_LINE_START "Runs:%u\n", object_runs);
     }
 
-  source_file = fopen (src->name, "r");
-  if (!source_file)
-    fnotice (stderr, "Cannot open source file %s\n", src->name);
-  else if (src->file_time == 0)
-    fprintf (gcov_file, DEFAULT_LINE_START "Source is newer than graph\n");
-
-  vector<const char *> source_lines;
-  if (source_file)
-    while ((retval = read_line (source_file)) != NULL)
-      source_lines.push_back (xstrdup (retval));
-
+  const vector<const char *> &source_lines = slurp (*src, gcov_file,
+                                                    DEFAULT_LINE_START);
   unsigned line_start_group = 0;
   vector<function_info *> *fns;
 
@@ -3479,7 +3515,4 @@ output_lines (FILE *gcov_file, const source_info *src)
           line_start_group = 0;
         }
     }
-
-  if (source_file)
-    fclose (source_file);
 }
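
For readers who want to try the caching idea from slurp () outside of
gcov, here is a minimal standalone sketch.  It is not the code from the
patch: the source_info struct below is a hypothetical stand-in with
only two fields, std::string and std::unique_ptr replace the
xstrdup'd lines and raw pointers, and the diagnostics are left out.
What it keeps is the part the patch comment calls out: the outer vector
stores pointers, so growing it for a higher index does not invalidate
references handed out earlier.

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for gcov's source_info; only the fields used here.
struct source_info
{
  std::size_t index;   // position of this file in the list of sources
  std::string name;    // path of the source file
};

// One slot per source file.  The slots hold pointers, so resizing the
// outer vector moves the pointers but not the line vectors behind them.
static std::vector<std::unique_ptr<std::vector<std::string>>> source_lines;

// Return the lines of SRC, reading the file only if the cached vector is
// still empty.  An unreadable file yields an empty vector.
static const std::vector<std::string> &
slurp (const source_info &src)
{
  if (source_lines.size () <= src.index)
    source_lines.resize (src.index + 1);

  std::unique_ptr<std::vector<std::string>> &slot = source_lines[src.index];
  if (!slot)
    slot.reset (new std::vector<std::string> ());
  if (!slot->empty ())
    return *slot;   // cache hit, no file access

  std::ifstream in (src.name);
  for (std::string line; std::getline (in, line);)
    slot->push_back (line);
  return *slot;
}
```

As in the patch, an empty vector is the only "not read yet" marker, so
in this sketch a file that is empty or cannot be opened is tried again
on every call.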