From patchwork Wed Aug 2 18:45:47 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Eric Feng
X-Patchwork-Id: 1816131
To: polacek@redhat.com
Cc: gcc-patches@gcc.gnu.org, joseph@codesourcery.com, dmalcolm@redhat.com, Eric Feng
Subject: [PATCH v3] analyzer: stash values for CPython plugin [PR107646]
Date: Wed, 2 Aug 2023 14:45:47 -0400
Message-Id: <20230802184547.26983-1-ef2648@columbia.edu>
X-Mailer: git-send-email 2.32.0 (Apple Git-132)
List-Id: Gcc-patches mailing list
From: Eric Feng
Reply-To: Eric Feng

Revised:
-- Remove superfluous { }
-- Reword
diagnostic

---

This patch adds a hook to the end of ana::on_finish_translation_unit
which calls relevant stashing-related callbacks registered during plugin
initialization. This feature is used to stash named types and global
variables for a CPython analyzer plugin [PR107646].

gcc/analyzer/ChangeLog:

        PR analyzer/107646
        * analyzer-language.cc (register_finish_translation_unit_callback):
        New function.
        (run_callbacks): New function.
        (on_finish_translation_unit): Call run_callbacks.
        * analyzer-language.h (GCC_ANALYZER_LANGUAGE_H): New include.
        (class translation_unit): New vfuncs.

gcc/c/ChangeLog:

        PR analyzer/107646
        * c-parser.cc: New functions on stashing values for the
        analyzer.

gcc/testsuite/ChangeLog:

        PR analyzer/107646
        * gcc.dg/plugin/plugin.exp: Add new plugin and test.
        * gcc.dg/plugin/analyzer_cpython_plugin.c: New plugin.
        * gcc.dg/plugin/cpython-plugin-test-1.c: New test.

Signed-off-by: Eric Feng
---
 gcc/analyzer/analyzer-language.cc             |  22 ++
 gcc/analyzer/analyzer-language.h              |   9 +
 gcc/c/c-parser.cc                             |  24 ++
 .../gcc.dg/plugin/analyzer_cpython_plugin.c   | 230 ++++++++++++++++++
 .../gcc.dg/plugin/cpython-plugin-test-1.c     |   8 +
 gcc/testsuite/gcc.dg/plugin/plugin.exp        |   2 +
 6 files changed, 295 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-1.c

diff --git a/gcc/analyzer/analyzer-language.cc b/gcc/analyzer/analyzer-language.cc
index 2c8910906ee..85400288a93 100644
--- a/gcc/analyzer/analyzer-language.cc
+++ b/gcc/analyzer/analyzer-language.cc
@@ -35,6 +35,26 @@ static GTY (()) hash_map<tree, tree> *analyzer_stashed_constants;
 #if ENABLE_ANALYZER
 
 namespace ana {
+static vec<finish_translation_unit_callback>
+    *finish_translation_unit_callbacks;
+
+void
+register_finish_translation_unit_callback (
+    finish_translation_unit_callback callback)
+{
+  if (!finish_translation_unit_callbacks)
+    vec_alloc (finish_translation_unit_callbacks, 1);
+  finish_translation_unit_callbacks->safe_push (callback);
+}
+
+static void
+run_callbacks (logger *logger, const translation_unit &tu)
+{
+  for (auto const &cb : finish_translation_unit_callbacks)
+    {
+      cb (logger, tu);
+    }
+}
 
 /* Call into TU to try to find a value for NAME.
    If found, stash its value within analyzer_stashed_constants.  */
@@ -102,6 +122,8 @@ on_finish_translation_unit (const translation_unit &tu)
     the_logger.set_logger (new logger (logfile, 0, 0,
                                        *global_dc->printer));
   stash_named_constants (the_logger.get_logger (), tu);
+
+  run_callbacks (the_logger.get_logger (), tu);
 }
 
 /* Lookup NAME in the named constants stashed when the frontend TU finished.
diff --git a/gcc/analyzer/analyzer-language.h b/gcc/analyzer/analyzer-language.h
index 00f85aba041..8deea52d627 100644
--- a/gcc/analyzer/analyzer-language.h
+++ b/gcc/analyzer/analyzer-language.h
@@ -21,6 +21,8 @@ along with GCC; see the file COPYING3.  If not see
 #ifndef GCC_ANALYZER_LANGUAGE_H
 #define GCC_ANALYZER_LANGUAGE_H
 
+#include "analyzer/analyzer-logging.h"
+
 #if ENABLE_ANALYZER
 
 namespace ana {
@@ -35,8 +37,15 @@ class translation_unit
      have been seen).  If it is defined and an integer (e.g. either as a
      macro or enum), return the INTEGER_CST value, otherwise return NULL.  */
   virtual tree lookup_constant_by_id (tree id) const = 0;
+  virtual tree lookup_type_by_id (tree id) const = 0;
+  virtual tree lookup_global_var_by_id (tree id) const = 0;
 };
 
+typedef void (*finish_translation_unit_callback)
+   (logger *, const translation_unit &);
+void register_finish_translation_unit_callback (
+    finish_translation_unit_callback callback);
+
 /* Analyzer hook for frontends to call at the end of the TU.  */
 void on_finish_translation_unit (const translation_unit &tu);
 
diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index cf82b0306d1..a3f216d90f8 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -1695,6 +1695,30 @@ public:
     return NULL_TREE;
   }
 
+  tree
+  lookup_type_by_id (tree id) const final override
+  {
+    if (tree type_decl = lookup_name (id))
+      if (TREE_CODE (type_decl) == TYPE_DECL)
+        {
+          tree record_type = TREE_TYPE (type_decl);
+          if (TREE_CODE (record_type) == RECORD_TYPE)
+            return record_type;
+        }
+
+    return NULL_TREE;
+  }
+
+  tree
+  lookup_global_var_by_id (tree id) const final override
+  {
+    if (tree var_decl = lookup_name (id))
+      if (TREE_CODE (var_decl) == VAR_DECL)
+        return var_decl;
+
+    return NULL_TREE;
+  }
+
 private:
   /* Attempt to get an INTEGER_CST from MACRO.
      Only handle the simplest cases: where MACRO's definition is a single
diff --git a/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c b/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
new file mode 100644
index 00000000000..9ecc42d4465
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
@@ -0,0 +1,230 @@
+/* -fanalyzer plugin for CPython extension modules  */
+/* { dg-options "-g" } */
+
+#define INCLUDE_MEMORY
+#include "gcc-plugin.h"
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tree.h"
+#include "function.h"
+#include "basic-block.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "diagnostic-core.h"
+#include "graphviz.h"
+#include "options.h"
+#include "cgraph.h"
+#include "tree-dfa.h"
+#include "stringpool.h"
+#include "convert.h"
+#include "target.h"
+#include "fold-const.h"
+#include "tree-pretty-print.h"
+#include "diagnostic-color.h"
+#include "diagnostic-metadata.h"
+#include "tristate.h"
+#include "bitmap.h"
+#include "selftest.h"
+#include "function.h"
+#include "json.h"
+#include "analyzer/analyzer.h"
+#include "analyzer/analyzer-language.h"
+#include "analyzer/analyzer-logging.h"
+#include "ordered-hash-map.h"
+#include "options.h"
+#include "cgraph.h"
+#include "cfg.h"
+#include "digraph.h"
+#include "analyzer/supergraph.h"
+#include "sbitmap.h"
+#include "analyzer/call-string.h"
+#include "analyzer/program-point.h"
+#include "analyzer/store.h"
+#include "analyzer/region-model.h"
+#include "analyzer/call-details.h"
+#include "analyzer/call-info.h"
+#include "make-unique.h"
+
+int plugin_is_GPL_compatible;
+
+#if ENABLE_ANALYZER
+static GTY (()) hash_map<tree, tree> *analyzer_stashed_types;
+static GTY (()) hash_map<tree, tree> *analyzer_stashed_globals;
+
+namespace ana
+{
+static tree pyobj_record = NULL_TREE;
+static tree varobj_record = NULL_TREE;
+static tree pylistobj_record = NULL_TREE;
+static tree pylongobj_record = NULL_TREE;
+static tree pylongtype_vardecl = NULL_TREE;
+static tree pylisttype_vardecl = NULL_TREE;
+
+static tree
+get_field_by_name (tree type, const char *name)
+{
+  for (tree field = TYPE_FIELDS (type); field; field = TREE_CHAIN (field))
+    {
+      if (TREE_CODE (field) == FIELD_DECL)
+        {
+          const char *field_name = IDENTIFIER_POINTER (DECL_NAME (field));
+          if (strcmp (field_name, name) == 0)
+            return field;
+        }
+    }
+  return NULL_TREE;
+}
+
+static void
+maybe_stash_named_type (logger *logger, const translation_unit &tu,
+                        const char *name)
+{
+  LOG_FUNC_1 (logger, "name: %qs", name);
+  if (!analyzer_stashed_types)
+    analyzer_stashed_types = hash_map<tree, tree>::create_ggc ();
+
+  tree id = get_identifier (name);
+  if (tree t = tu.lookup_type_by_id (id))
+    {
+      gcc_assert (TREE_CODE (t) == RECORD_TYPE);
+      analyzer_stashed_types->put (id, t);
+      if (logger)
+        logger->log ("found %qs: %qE", name, t);
+    }
+  else
+    {
+      if (logger)
+        logger->log ("%qs: not found", name);
+    }
+}
+
+static void
+maybe_stash_global_var (logger *logger, const translation_unit &tu,
+                        const char *name)
+{
+  LOG_FUNC_1 (logger, "name: %qs", name);
+  if (!analyzer_stashed_globals)
+    analyzer_stashed_globals = hash_map<tree, tree>::create_ggc ();
+
+  tree id = get_identifier (name);
+  if (tree t = tu.lookup_global_var_by_id (id))
+    {
+      gcc_assert (TREE_CODE (t) == VAR_DECL);
+      analyzer_stashed_globals->put (id, t);
+      if (logger)
+        logger->log ("found %qs: %qE", name, t);
+    }
+  else
+    {
+      if (logger)
+        logger->log ("%qs: not found", name);
+    }
+}
+
+static void
+stash_named_types (logger *logger, const translation_unit &tu)
+{
+  LOG_SCOPE (logger);
+
+  maybe_stash_named_type (logger, tu, "PyObject");
+  maybe_stash_named_type (logger, tu, "PyListObject");
+  maybe_stash_named_type (logger, tu, "PyVarObject");
+  maybe_stash_named_type (logger, tu, "PyLongObject");
+}
+
+static void
+stash_global_vars (logger *logger, const translation_unit &tu)
+{
+  LOG_SCOPE (logger);
+
+  maybe_stash_global_var (logger, tu, "PyLong_Type");
+  maybe_stash_global_var (logger, tu, "PyList_Type");
+}
+
+static tree
+get_stashed_type_by_name (const char *name)
+{
+  if (!analyzer_stashed_types)
+    return NULL_TREE;
+  tree id = get_identifier (name);
+  if (tree *slot = analyzer_stashed_types->get (id))
+    {
+      gcc_assert (TREE_CODE (*slot) == RECORD_TYPE);
+      return *slot;
+    }
+  return NULL_TREE;
+}
+
+static tree
+get_stashed_global_var_by_name (const char *name)
+{
+  if (!analyzer_stashed_globals)
+    return NULL_TREE;
+  tree id = get_identifier (name);
+  if (tree *slot = analyzer_stashed_globals->get (id))
+    {
+      gcc_assert (TREE_CODE (*slot) == VAR_DECL);
+      return *slot;
+    }
+  return NULL_TREE;
+}
+
+static void
+init_py_structs ()
+{
+  pyobj_record = get_stashed_type_by_name ("PyObject");
+  varobj_record = get_stashed_type_by_name ("PyVarObject");
+  pylistobj_record = get_stashed_type_by_name ("PyListObject");
+  pylongobj_record = get_stashed_type_by_name ("PyLongObject");
+  pylongtype_vardecl = get_stashed_global_var_by_name ("PyLong_Type");
+  pylisttype_vardecl = get_stashed_global_var_by_name ("PyList_Type");
+}
+
+void
+sorry_no_cpython_plugin ()
+{
+  sorry ("%qs definitions not found."
+         " Please ensure to %qs.",
+         "Python/C API", "#include <Python.h>");
+}
+
+static void
+cpython_analyzer_init_cb (void *gcc_data, void * /*user_data */)
+{
+  ana::plugin_analyzer_init_iface *iface
+      = (ana::plugin_analyzer_init_iface *)gcc_data;
+  LOG_SCOPE (iface->get_logger ());
+  if (0)
+    inform (input_location, "got here: cpython_analyzer_init_cb");
+
+  init_py_structs ();
+
+  if (pyobj_record == NULL_TREE)
+    {
+      sorry_no_cpython_plugin ();
+      return;
+    }
+}
+} // namespace ana
+
+#endif /* #if ENABLE_ANALYZER */
+
+int
+plugin_init (struct plugin_name_args *plugin_info,
+             struct plugin_gcc_version *version)
+{
+#if ENABLE_ANALYZER
+  const char *plugin_name = plugin_info->base_name;
+  if (0)
+    inform (input_location, "got here; %qs", plugin_name);
+  ana::register_finish_translation_unit_callback (&stash_named_types);
+  ana::register_finish_translation_unit_callback (&stash_global_vars);
+  register_callback (plugin_info->base_name, PLUGIN_ANALYZER_INIT,
+                     ana::cpython_analyzer_init_cb,
+                     NULL); /* void *user_data */
+#else
+  sorry_no_analyzer ();
+#endif
+  return 0;
+}
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-1.c b/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-1.c
new file mode 100644
index 00000000000..c105074042a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-1.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-fanalyzer" } */
+/* { dg-require-effective-target analyzer } */
+/* { dg-message "'Python/C API' definitions not found. Please ensure to '#include <Python.h>'." "" { target *-*-* } 0 } */
+
+void test_no_python_plugin ()
+{
+}
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.dg/plugin/plugin.exp b/gcc/testsuite/gcc.dg/plugin/plugin.exp
index 60723a20eda..09c45394b1f 100644
--- a/gcc/testsuite/gcc.dg/plugin/plugin.exp
+++ b/gcc/testsuite/gcc.dg/plugin/plugin.exp
@@ -160,6 +160,8 @@ set plugin_test_list [list \
           taint-CVE-2011-0521-5-fixed.c \
           taint-CVE-2011-0521-6.c \
           taint-antipatterns-1.c } \
+    { analyzer_cpython_plugin.c \
+          cpython-plugin-test-1.c } \
 ]
 
 foreach plugin_test $plugin_test_list {