From patchwork Fri Jun 21 12:55:35 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Malcolm X-Patchwork-Id: 1950774 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (1024-bit key; unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256 header.s=mimecast20190719 header.b=VolZoNeR; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4W5HRT59Rmz20Wb for ; Fri, 21 Jun 2024 22:56:21 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id DCC8E3898C47 for ; Fri, 21 Jun 2024 12:56:19 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by sourceware.org (Postfix) with ESMTPS id 1B5D2389850B for ; Fri, 21 Jun 2024 12:55:51 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 1B5D2389850B Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 1B5D2389850B Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1718974553; cv=none; b=meRWljeoI84IxeGCJHOnJB9nuZRQR5Gt4vEzXnNziPUmedwldT2dCCnUhhE47IHQpbXnD6r9ZdnB6WW72d5r4E7BYc/h9DdTfKZEBW7pDAMGEKWUsffwDWi35HItGULblCot/sAcGw2MCCJ3NyqXtJxHB4ZZQQtuldH5ZLi1LbA= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1718974553; c=relaxed/simple; bh=MdKOHtgeFYNA6iJCAxkgDyKET/orgpo0yDArDg8gNS0=; h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version; b=m+uH2Vn+4wZ2uaXzSDZ6uwrvP2dd3sKghWVbxzHnIVf/6gv67Lpva/7FDSjP+rQDdroN1Pxd1hPYOcs55jJNC6rUY9hXS5OWQ0TvFt7T/aFbSxXImJxjz53UrL6KJKXWreH/9rA9LaB+aNnJ78KaHkZgO1h5fTcB8wh7JvUSxJs= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1718974550; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding; bh=h1ZmcolZCu4Bdh7yVXtbCwpqBzQFfDx4SxCyUL5aw78=; b=VolZoNeRTixhRfdMlQI5MDvcRTx5KFUSKPvhSFHkNmfXHWIrHOSEvCtfZOEZjnQRQzgwyb UGGobLfGUp1oSaWVB+w3eFnrLyYsbfxxrzawL6IYjf9frHjCtdxvk7cmaOv4LdihE1uzQe H+oNom+jzUk9892woJjQ4Pd0GMwQvpY= Received: from mx-prod-mc-02.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-17-FUuyt7AuNia416iiAP9ELQ-1; Fri, 21 Jun 2024 08:55:48 -0400 X-MC-Unique: FUuyt7AuNia416iiAP9ELQ-1 Received: from mx-prod-int-03.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-03.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.12]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-02.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id A787519560B1 for ; Fri, 21 Jun 2024 12:55:47 +0000 (UTC) Received: from t14s.localdomain.com (unknown [10.22.10.107]) by mx-prod-int-03.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id ADD9B19560AE; Fri, 21 Jun 2024 12:55:45 +0000 (UTC) From: David Malcolm To: gcc-patches@gcc.gnu.org Cc: David Malcolm Subject: [pushed 1/2] diagnostics: fixes to SARIF output [PR109360] Date: Fri, 21 Jun 2024 08:55:35 -0400 Message-Id: <20240621125536.1134456-1-dmalcolm@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.0 on 10.30.177.12 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com X-Spam-Status: No, score=-10.4 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, RCVD_IN_SBL_CSS, SPF_HELO_NONE, SPF_NONE, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org When adding validation of .sarif files against the schema (PR testsuite/109360) I discovered various issues where we were generating invalid .sarif files. Specifically, in c-c++-common/diagnostic-format-sarif-file-bad-utf8-pr109098-1.c the relatedLocations for the "note" diagnostics were missing column numbers, leading to validation failure due to non-unique elements, such as multiple: "message": {"text": "invalid UTF-8 character "}}, on line 25 with no column information. Root cause is that for some diagnostics in libcpp we have a location_t representing the line as a whole, setting a column_override on the rich_location (since the line hasn't been fully read yet). We were handling this column override for plain text output, but not for .sarif output. Similarly, in diagnostic-format-sarif-file-pr111700.c there is a warning emitted on "line 0" of the file, whereas SARIF requires line numbers to be positive. We also use column == 0 internally to mean "the line as a whole", whereas SARIF required column numbers to be positive. This patch fixes these various issues. Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu. Successful run of analyzer integration tests on x86_64-pc-linux-gnu. Pushed to trunk as r15-1540-g9f4fdc3acebcf6. gcc/ChangeLog: PR testsuite/109360 * diagnostic-format-sarif.cc (sarif_builder::make_location_object): Pass any column override from rich_loc to maybe_make_physical_location_object. (sarif_builder::maybe_make_physical_location_object): Add "column_override" param and pass it to maybe_make_region_object. (sarif_builder::maybe_make_region_object): Add "column_override" param and use it when the location has 0 for a column. Don't add "startLine", "startColumn", "endLine", or "endColumn" if the values aren't positive. (sarif_builder::maybe_make_region_object_for_context): Don't add "startLine" or "endLine" if the values aren't positive. libcpp/ChangeLog: PR testsuite/109360 * include/rich-location.h (rich_location::get_column_override): New accessor. Signed-off-by: David Malcolm --- gcc/diagnostic-format-sarif.cc | 75 ++++++++++++++++++++++++---------- libcpp/include/rich-location.h | 2 + 2 files changed, 56 insertions(+), 21 deletions(-) diff --git a/gcc/diagnostic-format-sarif.cc b/gcc/diagnostic-format-sarif.cc index 79116f051bc1..acf2aa875c48 100644 --- a/gcc/diagnostic-format-sarif.cc +++ b/gcc/diagnostic-format-sarif.cc @@ -243,11 +243,13 @@ private: json::array *maybe_make_kinds_array (diagnostic_event::meaning m) const; json::object * maybe_make_physical_location_object (location_t loc, - enum diagnostic_artifact_role role); + enum diagnostic_artifact_role role, + int column_override); json::object *make_artifact_location_object (location_t loc); json::object *make_artifact_location_object (const char *filename); json::object *make_artifact_location_object_for_pwd () const; - json::object *maybe_make_region_object (location_t loc) const; + json::object *maybe_make_region_object (location_t loc, + int column_override) const; json::object *maybe_make_region_object_for_context (location_t loc) const; json::object *make_region_object_for_hint (const fixit_hint &hint) const; json::object *make_multiformat_message_string (const char *msg) const; @@ -924,8 +926,9 @@ sarif_builder::make_location_object (const rich_location &rich_loc, location_t loc = rich_loc.get_loc (); /* "physicalLocation" property (SARIF v2.1.0 section 3.28.3). */ - if (json::object *phs_loc_obj = maybe_make_physical_location_object (loc, - role)) + if (json::object *phs_loc_obj + = maybe_make_physical_location_object (loc, role, + rich_loc.get_column_override ())) location_obj->set ("physicalLocation", phs_loc_obj); /* "logicalLocations" property (SARIF v2.1.0 section 3.28.4). */ @@ -946,7 +949,7 @@ sarif_builder::make_location_object (const diagnostic_event &event, /* "physicalLocation" property (SARIF v2.1.0 section 3.28.3). */ location_t loc = event.get_location (); if (json::object *phs_loc_obj - = maybe_make_physical_location_object (loc, role)) + = maybe_make_physical_location_object (loc, role, 0)) location_obj->set ("physicalLocation", phs_loc_obj); /* "logicalLocations" property (SARIF v2.1.0 section 3.28.4). */ @@ -961,7 +964,10 @@ sarif_builder::make_location_object (const diagnostic_event &event, return location_obj; } -/* Make a physicalLocation object (SARIF v2.1.0 section 3.29) for LOC, +/* Make a physicalLocation object (SARIF v2.1.0 section 3.29) for LOC. + + If COLUMN_OVERRIDE is non-zero, then use it as the column number + if LOC has no column information. Ensure that we have an artifact object for the file, adding ROLE to it, and flagging that we will attempt to embed the contents of the artifact @@ -970,7 +976,8 @@ sarif_builder::make_location_object (const diagnostic_event &event, json::object * sarif_builder:: maybe_make_physical_location_object (location_t loc, - enum diagnostic_artifact_role role) + enum diagnostic_artifact_role role, + int column_override) { if (loc <= BUILTINS_LOCATION || LOCATION_FILE (loc) == NULL) return NULL; @@ -983,7 +990,8 @@ maybe_make_physical_location_object (location_t loc, get_or_create_artifact (LOCATION_FILE (loc), role, true); /* "region" property (SARIF v2.1.0 section 3.29.4). */ - if (json::object *region_obj = maybe_make_region_object (loc)) + if (json::object *region_obj = maybe_make_region_object (loc, + column_override)) phys_loc_obj->set ("region", region_obj); /* "contextRegion" property (SARIF v2.1.0 section 3.29.5). */ @@ -1089,10 +1097,14 @@ sarif_builder::get_sarif_column (expanded_location exploc) const } /* Make a region object (SARIF v2.1.0 section 3.30) for LOC, - or return NULL. */ + or return NULL. + + If COLUMN_OVERRIDE is non-zero, then use it as the column number + if LOC has no column information. */ json::object * -sarif_builder::maybe_make_region_object (location_t loc) const +sarif_builder::maybe_make_region_object (location_t loc, + int column_override) const { location_t caret_loc = get_pure_location (loc); @@ -1114,21 +1126,40 @@ sarif_builder::maybe_make_region_object (location_t loc) const json::object *region_obj = new json::object (); /* "startLine" property (SARIF v2.1.0 section 3.30.5) */ - region_obj->set_integer ("startLine", exploc_start.line); + if (exploc_start.line > 0) + region_obj->set_integer ("startLine", exploc_start.line); - /* "startColumn" property (SARIF v2.1.0 section 3.30.6) */ - region_obj->set_integer ("startColumn", get_sarif_column (exploc_start)); + /* "startColumn" property (SARIF v2.1.0 section 3.30.6). + + We use column == 0 to mean the whole line, so omit the column + information for this case, unless COLUMN_OVERRIDE is non-zero, + (for handling certain awkward lexer diagnostics) */ + + if (exploc_start.column == 0 && column_override) + /* Use the provided column number. */ + exploc_start.column = column_override; + + if (exploc_start.column > 0) + { + int start_column = get_sarif_column (exploc_start); + region_obj->set_integer ("startColumn", start_column); + } /* "endLine" property (SARIF v2.1.0 section 3.30.7) */ - if (exploc_finish.line != exploc_start.line) + if (exploc_finish.line != exploc_start.line + && exploc_finish.line > 0) region_obj->set_integer ("endLine", exploc_finish.line); /* "endColumn" property (SARIF v2.1.0 section 3.30.8). - This expresses the column immediately beyond the range. */ - { - int next_column = sarif_builder::get_sarif_column (exploc_finish) + 1; - region_obj->set_integer ("endColumn", next_column); - } + This expresses the column immediately beyond the range. + + We use column == 0 to mean the whole line, so omit the column + information for this case. */ + if (exploc_finish.column > 0) + { + int next_column = get_sarif_column (exploc_finish) + 1; + region_obj->set_integer ("endColumn", next_column); + } return region_obj; } @@ -1164,10 +1195,12 @@ sarif_builder::maybe_make_region_object_for_context (location_t loc) const json::object *region_obj = new json::object (); /* "startLine" property (SARIF v2.1.0 section 3.30.5) */ - region_obj->set_integer ("startLine", exploc_start.line); + if (exploc_start.line > 0) + region_obj->set_integer ("startLine", exploc_start.line); /* "endLine" property (SARIF v2.1.0 section 3.30.7) */ - if (exploc_finish.line != exploc_start.line) + if (exploc_finish.line != exploc_start.line + && exploc_finish.line > 0) region_obj->set_integer ("endLine", exploc_finish.line); /* "snippet" property (SARIF v2.1.0 section 3.30.13). */ diff --git a/libcpp/include/rich-location.h b/libcpp/include/rich-location.h index be424cb4b65f..cc4f888f458b 100644 --- a/libcpp/include/rich-location.h +++ b/libcpp/include/rich-location.h @@ -514,6 +514,8 @@ class rich_location const line_maps *get_line_table () const { return m_line_table; } + int get_column_override () const { return m_column_override; } + private: bool reject_impossible_fixit (location_t where); void stop_supporting_fixits ();