diff mbox series

analyzer: implement reference count checking for CPython plugin [PR107646]

Message ID 20230829043155.17651-1-ef2648@columbia.edu
State New
Headers show
Series analyzer: implement reference count checking for CPython plugin [PR107646] | expand

Commit Message

Eric Feng Aug. 29, 2023, 4:31 a.m. UTC
Hi Dave,

Thanks for the feedback. I've addressed the changes you mentioned in
addition to adding more test cases. I've also taken this chance to 
split the test files according to known function subclasses, as you previously 
suggested. Since there were also some changes to the core analyzer, I've done a
bootstrap and regtested the patch as well. Does it look OK for trunk?

Best,
Eric

---

This patch introduces initial support for reference count checking of
PyObjects in relation to the Python/C API for the CPython plugin.
Additionally, the core analyzer underwent several modifications to
accommodate this feature. These include:

- Introducing support for callbacks at the end of
  region_model::pop_frame. This is our current point of validation for
  the reference count of PyObjects.
- An added optional custom stmt_finder parameter to
  region_model_context::warn. This aids in emitting a diagnostic
  concerning the reference count, especially when the stmt_finder is
  NULL, which is currently the case during region_model::pop_frame.

The current diagnostic we emit relating to the reference count
appears as follows:

rc3.c:23:10: warning: expected <variable name belonging to m_base_region> to have reference count: ‘1’ but ob_refcnt field is: ‘2’
   23 |   return list;
      |          ^~~~
  ‘create_py_object’: events 1-4
    |
    |    4 |   PyObject* item = PyLong_FromLong(3);
    |      |                    ^~~~~~~~~~~~~~~~~~
    |      |                    |
    |      |                    (1) when ‘PyLong_FromLong’ succeeds
    |    5 |   PyObject* list = PyList_New(1);
    |      |                    ~~~~~~~~~~~~~
    |      |                    |
    |      |                    (2) when ‘PyList_New’ succeeds
    |......
    |   14 |   PyList_Append(list, item);
    |      |   ~~~~~~~~~~~~~~~~~~~~~~~~~
    |      |   |
    |      |   (3) when ‘PyList_Append’ succeeds, moving buffer
    |......
    |   23 |   return list;
    |      |          ~~~~
    |      |          |
    |      |          (4) here
    |

This is a WIP in several ways:
- Enhancing the diagnostic for better clarity. For instance, users should
  expect to see the variable name 'item' instead of the placeholder in the
  diagnostic above.
- Currently, functions returning PyObject * are assumed to always produce
  a new reference.
- The validation of reference count is only for PyObjects created within a
  function body. Verifying reference counts for PyObjects passed as
  parameters is not supported in this patch.

gcc/analyzer/ChangeLog:
  PR analyzer/107646
	* engine.cc (impl_region_model_context::warn): New optional parameter.
	* exploded-graph.h (class impl_region_model_context): Likewise.
	* region-model.cc (region_model::pop_frame): New callback feature for
  * region_model::pop_frame.
	* region-model.h (struct append_regions_cb_data): Likewise.
	(class region_model): Likewise.
	(class region_model_context): New optional parameter.
	(class region_model_context_decorator): Likewise.

gcc/testsuite/ChangeLog:
  PR analyzer/107646
	* gcc.dg/plugin/analyzer_cpython_plugin.c: Implements reference count
  * checking for PyObjects.
	* gcc.dg/plugin/cpython-plugin-test-2.c: Moved to...
	* gcc.dg/plugin/cpython-plugin-test-PyList_Append.c: ...here (and
  * added more tests).
	* gcc.dg/plugin/cpython-plugin-test-1.c: Moved to...
	* gcc.dg/plugin/cpython-plugin-test-no-plugin.c: ...here (and added
  * more tests).
	* gcc.dg/plugin/plugin.exp: New tests.
	* gcc.dg/plugin/cpython-plugin-test-PyList_New.c: New test.
	* gcc.dg/plugin/cpython-plugin-test-PyLong_FromLong.c: New test.
	* gcc.dg/plugin/cpython-plugin-test-refcnt-checking.c: New test.

Signed-off-by: Eric Feng <ef2648@columbia.edu>

---
 gcc/analyzer/engine.cc                        |   8 +-
 gcc/analyzer/exploded-graph.h                 |   4 +-
 gcc/analyzer/region-model.cc                  |   3 +
 gcc/analyzer/region-model.h                   |  48 ++-
 .../gcc.dg/plugin/analyzer_cpython_plugin.c   | 376 +++++++++++++++++-
 ....c => cpython-plugin-test-PyList_Append.c} |  56 +--
 .../plugin/cpython-plugin-test-PyList_New.c   |  38 ++
 .../cpython-plugin-test-PyLong_FromLong.c     |  38 ++
 ...st-1.c => cpython-plugin-test-no-plugin.c} |   0
 .../cpython-plugin-test-refcnt-checking.c     |  78 ++++
 gcc/testsuite/gcc.dg/plugin/plugin.exp        |   5 +-
 11 files changed, 612 insertions(+), 42 deletions(-)
 rename gcc/testsuite/gcc.dg/plugin/{cpython-plugin-test-2.c => cpython-plugin-test-PyList_Append.c} (64%)
 create mode 100644 gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-PyList_New.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-PyLong_FromLong.c
 rename gcc/testsuite/gcc.dg/plugin/{cpython-plugin-test-1.c => cpython-plugin-test-no-plugin.c} (100%)
 create mode 100644 gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-refcnt-checking.c

Comments

Eric Feng Aug. 29, 2023, 4:35 a.m. UTC | #1
On Tue, Aug 29, 2023 at 12:32 AM Eric Feng <ef2648@columbia.edu> wrote:
>
> Hi Dave,
>
> Thanks for the feedback. I've addressed the changes you mentioned in
> addition to adding more test cases. I've also taken this chance to
> split the test files according to known function subclasses, as you previously
> suggested. Since there were also some changes to the core analyzer, I've done a
> bootstrap and regtested the patch as well. Does it look OK for trunk?
Apologies — I forgot to mention that bootstrap and regtest was done on
aarch64-unknown-linux-gnu.
>
> Best,
> Eric
>
> ---
>
> This patch introduces initial support for reference count checking of
> PyObjects in relation to the Python/C API for the CPython plugin.
> Additionally, the core analyzer underwent several modifications to
> accommodate this feature. These include:
>
> - Introducing support for callbacks at the end of
>   region_model::pop_frame. This is our current point of validation for
>   the reference count of PyObjects.
> - An added optional custom stmt_finder parameter to
>   region_model_context::warn. This aids in emitting a diagnostic
>   concerning the reference count, especially when the stmt_finder is
>   NULL, which is currently the case during region_model::pop_frame.
>
> The current diagnostic we emit relating to the reference count
> appears as follows:
>
> rc3.c:23:10: warning: expected <variable name belonging to m_base_region> to have reference count: ‘1’ but ob_refcnt field is: ‘2’
>    23 |   return list;
>       |          ^~~~
>   ‘create_py_object’: events 1-4
>     |
>     |    4 |   PyObject* item = PyLong_FromLong(3);
>     |      |                    ^~~~~~~~~~~~~~~~~~
>     |      |                    |
>     |      |                    (1) when ‘PyLong_FromLong’ succeeds
>     |    5 |   PyObject* list = PyList_New(1);
>     |      |                    ~~~~~~~~~~~~~
>     |      |                    |
>     |      |                    (2) when ‘PyList_New’ succeeds
>     |......
>     |   14 |   PyList_Append(list, item);
>     |      |   ~~~~~~~~~~~~~~~~~~~~~~~~~
>     |      |   |
>     |      |   (3) when ‘PyList_Append’ succeeds, moving buffer
>     |......
>     |   23 |   return list;
>     |      |          ~~~~
>     |      |          |
>     |      |          (4) here
>     |
>
> This is a WIP in several ways:
> - Enhancing the diagnostic for better clarity. For instance, users should
>   expect to see the variable name 'item' instead of the placeholder in the
>   diagnostic above.
> - Currently, functions returning PyObject * are assumed to always produce
>   a new reference.
> - The validation of reference count is only for PyObjects created within a
>   function body. Verifying reference counts for PyObjects passed as
>   parameters is not supported in this patch.
>
> gcc/analyzer/ChangeLog:
>   PR analyzer/107646
>         * engine.cc (impl_region_model_context::warn): New optional parameter.
>         * exploded-graph.h (class impl_region_model_context): Likewise.
>         * region-model.cc (region_model::pop_frame): New callback feature for
>   * region_model::pop_frame.
>         * region-model.h (struct append_regions_cb_data): Likewise.
>         (class region_model): Likewise.
>         (class region_model_context): New optional parameter.
>         (class region_model_context_decorator): Likewise.
>
> gcc/testsuite/ChangeLog:
>   PR analyzer/107646
>         * gcc.dg/plugin/analyzer_cpython_plugin.c: Implements reference count
>   * checking for PyObjects.
>         * gcc.dg/plugin/cpython-plugin-test-2.c: Moved to...
>         * gcc.dg/plugin/cpython-plugin-test-PyList_Append.c: ...here (and
>   * added more tests).
>         * gcc.dg/plugin/cpython-plugin-test-1.c: Moved to...
>         * gcc.dg/plugin/cpython-plugin-test-no-plugin.c: ...here (and added
>   * more tests).
>         * gcc.dg/plugin/plugin.exp: New tests.
>         * gcc.dg/plugin/cpython-plugin-test-PyList_New.c: New test.
>         * gcc.dg/plugin/cpython-plugin-test-PyLong_FromLong.c: New test.
>         * gcc.dg/plugin/cpython-plugin-test-refcnt-checking.c: New test.
>
> Signed-off-by: Eric Feng <ef2648@columbia.edu>
>
> ---
>  gcc/analyzer/engine.cc                        |   8 +-
>  gcc/analyzer/exploded-graph.h                 |   4 +-
>  gcc/analyzer/region-model.cc                  |   3 +
>  gcc/analyzer/region-model.h                   |  48 ++-
>  .../gcc.dg/plugin/analyzer_cpython_plugin.c   | 376 +++++++++++++++++-
>  ....c => cpython-plugin-test-PyList_Append.c} |  56 +--
>  .../plugin/cpython-plugin-test-PyList_New.c   |  38 ++
>  .../cpython-plugin-test-PyLong_FromLong.c     |  38 ++
>  ...st-1.c => cpython-plugin-test-no-plugin.c} |   0
>  .../cpython-plugin-test-refcnt-checking.c     |  78 ++++
>  gcc/testsuite/gcc.dg/plugin/plugin.exp        |   5 +-
>  11 files changed, 612 insertions(+), 42 deletions(-)
>  rename gcc/testsuite/gcc.dg/plugin/{cpython-plugin-test-2.c => cpython-plugin-test-PyList_Append.c} (64%)
>  create mode 100644 gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-PyList_New.c
>  create mode 100644 gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-PyLong_FromLong.c
>  rename gcc/testsuite/gcc.dg/plugin/{cpython-plugin-test-1.c => cpython-plugin-test-no-plugin.c} (100%)
>  create mode 100644 gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-refcnt-checking.c
>
> diff --git a/gcc/analyzer/engine.cc b/gcc/analyzer/engine.cc
> index a1908cdb364..736a41ecdaf 100644
> --- a/gcc/analyzer/engine.cc
> +++ b/gcc/analyzer/engine.cc
> @@ -115,10 +115,12 @@ impl_region_model_context (program_state *state,
>  }
>
>  bool
> -impl_region_model_context::warn (std::unique_ptr<pending_diagnostic> d)
> +impl_region_model_context::warn (std::unique_ptr<pending_diagnostic> d,
> +                                const stmt_finder *custom_finder)
>  {
>    LOG_FUNC (get_logger ());
> -  if (m_stmt == NULL && m_stmt_finder == NULL)
> +  auto curr_stmt_finder = custom_finder ? custom_finder : m_stmt_finder;
> +  if (m_stmt == NULL && curr_stmt_finder == NULL)
>      {
>        if (get_logger ())
>         get_logger ()->log ("rejecting diagnostic: no stmt");
> @@ -129,7 +131,7 @@ impl_region_model_context::warn (std::unique_ptr<pending_diagnostic> d)
>        bool terminate_path = d->terminate_path_p ();
>        if (m_eg->get_diagnostic_manager ().add_diagnostic
>           (m_enode_for_diag, m_enode_for_diag->get_supernode (),
> -          m_stmt, m_stmt_finder, std::move (d)))
> +          m_stmt, curr_stmt_finder, std::move (d)))
>         {
>           if (m_path_ctxt
>               && terminate_path
> diff --git a/gcc/analyzer/exploded-graph.h b/gcc/analyzer/exploded-graph.h
> index 5a7ab645bfe..6e9a5ef58c7 100644
> --- a/gcc/analyzer/exploded-graph.h
> +++ b/gcc/analyzer/exploded-graph.h
> @@ -56,7 +56,8 @@ class impl_region_model_context : public region_model_context
>                              uncertainty_t *uncertainty,
>                              logger *logger = NULL);
>
> -  bool warn (std::unique_ptr<pending_diagnostic> d) final override;
> +  bool warn (std::unique_ptr<pending_diagnostic> d,
> +            const stmt_finder *custom_finder = NULL) final override;
>    void add_note (std::unique_ptr<pending_note> pn) final override;
>    void add_event (std::unique_ptr<checker_event> event) final override;
>    void on_svalue_leak (const svalue *) override;
> @@ -107,6 +108,7 @@ class impl_region_model_context : public region_model_context
>                          std::unique_ptr<sm_context> *out_sm_context) override;
>
>    const gimple *get_stmt () const override { return m_stmt; }
> +  const exploded_graph *get_eg () const override { return m_eg; }
>
>    exploded_graph *m_eg;
>    log_user m_logger;
> diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
> index 4f31a6dcf0f..eb4f976b83a 100644
> --- a/gcc/analyzer/region-model.cc
> +++ b/gcc/analyzer/region-model.cc
> @@ -82,6 +82,8 @@ along with GCC; see the file COPYING3.  If not see
>
>  namespace ana {
>
> +auto_vec<pop_frame_callback> region_model::pop_frame_callbacks;
> +
>  /* Dump T to PP in language-independent form, for debugging/logging/dumping
>     purposes.  */
>
> @@ -5422,6 +5424,7 @@ region_model::pop_frame (tree result_lvalue,
>      }
>
>    unbind_region_and_descendents (frame_reg,POISON_KIND_POPPED_STACK);
> +  notify_on_pop_frame (this, retval, ctxt);
>  }
>
>  /* Get the number of frames in this region_model's stack.  */
> diff --git a/gcc/analyzer/region-model.h b/gcc/analyzer/region-model.h
> index 10b2a59e787..440ea6d828d 100644
> --- a/gcc/analyzer/region-model.h
> +++ b/gcc/analyzer/region-model.h
> @@ -236,6 +236,10 @@ public:
>
>  struct append_regions_cb_data;
>
> +typedef void (*pop_frame_callback) (const region_model *model,
> +                                   const svalue *retval,
> +                                   region_model_context *ctxt);
> +
>  /* A region_model encapsulates a representation of the state of memory, with
>     a tree of regions, along with their associated values.
>     The representation is graph-like because values can be pointers to
> @@ -532,6 +536,20 @@ class region_model
>    get_builtin_kf (const gcall *call,
>                   region_model_context *ctxt = NULL) const;
>
> +  static void
> +  register_pop_frame_callback (const pop_frame_callback &callback)
> +  {
> +    pop_frame_callbacks.safe_push (callback);
> +  }
> +
> +  static void
> +  notify_on_pop_frame (const region_model *model, const svalue *retval,
> +                      region_model_context *ctxt)
> +  {
> +    for (auto &callback : pop_frame_callbacks)
> +       callback (model, retval, ctxt);
> +  }
> +
>  private:
>    const region *get_lvalue_1 (path_var pv, region_model_context *ctxt) const;
>    const svalue *get_rvalue_1 (path_var pv, region_model_context *ctxt) const;
> @@ -621,6 +639,7 @@ private:
>                                                 tree callee_fndecl,
>                                                 region_model_context *ctxt) const;
>
> +  static auto_vec<pop_frame_callback> pop_frame_callbacks;
>    /* Storing this here to avoid passing it around everywhere.  */
>    region_model_manager *const m_mgr;
>
> @@ -649,8 +668,10 @@ class region_model_context
>  {
>   public:
>    /* Hook for clients to store pending diagnostics.
> -     Return true if the diagnostic was stored, or false if it was deleted.  */
> -  virtual bool warn (std::unique_ptr<pending_diagnostic> d) = 0;
> +     Return true if the diagnostic was stored, or false if it was deleted.
> +     Optionally provide a custom stmt_finder.  */
> +  virtual bool warn (std::unique_ptr<pending_diagnostic> d,
> +                    const stmt_finder *custom_finder = NULL) = 0;
>
>    /* Hook for clients to add a note to the last previously stored
>       pending diagnostic.  */
> @@ -757,6 +778,8 @@ class region_model_context
>
>    /* Get the current statement, if any.  */
>    virtual const gimple *get_stmt () const = 0;
> +
> +  virtual const exploded_graph *get_eg () const = 0;
>  };
>
>  /* A "do nothing" subclass of region_model_context.  */
> @@ -764,7 +787,8 @@ class region_model_context
>  class noop_region_model_context : public region_model_context
>  {
>  public:
> -  bool warn (std::unique_ptr<pending_diagnostic>) override { return false; }
> +  bool warn (std::unique_ptr<pending_diagnostic> d,
> +            const stmt_finder *custom_finder) override { return false; }
>    void add_note (std::unique_ptr<pending_note>) override;
>    void add_event (std::unique_ptr<checker_event>) override;
>    void on_svalue_leak (const svalue *) override {}
> @@ -812,6 +836,7 @@ public:
>    }
>
>    const gimple *get_stmt () const override { return NULL; }
> +  const exploded_graph *get_eg () const override { return NULL; }
>  };
>
>  /* A subclass of region_model_context for determining if operations fail
> @@ -840,7 +865,8 @@ private:
>  class region_model_context_decorator : public region_model_context
>  {
>   public:
> -  bool warn (std::unique_ptr<pending_diagnostic> d) override
> +  bool warn (std::unique_ptr<pending_diagnostic> d,
> +            const stmt_finder *custom_finder)
>    {
>      if (m_inner)
>        return m_inner->warn (std::move (d));
> @@ -978,6 +1004,14 @@ class region_model_context_decorator : public region_model_context
>        return nullptr;
>    }
>
> +  const exploded_graph *get_eg () const override
> +  {
> +    if (m_inner)
> +       return m_inner->get_eg ();
> +    else
> +       return nullptr;
> +  }
> +
>  protected:
>    region_model_context_decorator (region_model_context *inner)
>    : m_inner (inner)
> @@ -993,7 +1027,8 @@ protected:
>  class annotating_context : public region_model_context_decorator
>  {
>  public:
> -  bool warn (std::unique_ptr<pending_diagnostic> d) override
> +  bool warn (std::unique_ptr<pending_diagnostic> d,
> +            const stmt_finder *custom_finder) override
>    {
>      if (m_inner)
>        if (m_inner->warn (std::move (d)))
> @@ -1158,7 +1193,8 @@ using namespace ::selftest;
>  class test_region_model_context : public noop_region_model_context
>  {
>  public:
> -  bool warn (std::unique_ptr<pending_diagnostic> d) final override
> +  bool warn (std::unique_ptr<pending_diagnostic> d,
> +            const stmt_finder *custom_finder) final override
>    {
>      m_diagnostics.safe_push (d.release ());
>      return true;
> diff --git a/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c b/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
> index 7cd72e8a886..b2caed8fc1b 100644
> --- a/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
> +++ b/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
> @@ -44,6 +44,7 @@
>  #include "analyzer/region-model.h"
>  #include "analyzer/call-details.h"
>  #include "analyzer/call-info.h"
> +#include "analyzer/exploded-graph.h"
>  #include "make-unique.h"
>
>  int plugin_is_GPL_compatible;
> @@ -191,6 +192,372 @@ public:
>    }
>  };
>
> +/* This is just a copy of leak_stmt_finder for now (subject to change if
> + * necssary)  */
> +
> +class refcnt_stmt_finder : public stmt_finder
> +{
> +public:
> +  refcnt_stmt_finder (const exploded_graph &eg, tree var)
> +      : m_eg (eg), m_var (var)
> +  {
> +  }
> +
> +  std::unique_ptr<stmt_finder>
> +  clone () const final override
> +  {
> +    return make_unique<refcnt_stmt_finder> (m_eg, m_var);
> +  }
> +
> +  const gimple *
> +  find_stmt (const exploded_path &epath) final override
> +  {
> +    logger *const logger = m_eg.get_logger ();
> +    LOG_FUNC (logger);
> +
> +    if (m_var && TREE_CODE (m_var) == SSA_NAME)
> +      {
> +       /* Locate the final write to this SSA name in the path.  */
> +       const gimple *def_stmt = SSA_NAME_DEF_STMT (m_var);
> +
> +       int idx_of_def_stmt;
> +       bool found = epath.find_stmt_backwards (def_stmt, &idx_of_def_stmt);
> +       if (!found)
> +         goto not_found;
> +
> +       /* What was the next write to the underlying var
> +          after the SSA name was set? (if any).  */
> +
> +       for (unsigned idx = idx_of_def_stmt + 1; idx < epath.m_edges.length ();
> +            ++idx)
> +         {
> +           const exploded_edge *eedge = epath.m_edges[idx];
> +           if (logger)
> +                   logger->log ("eedge[%i]: EN %i -> EN %i", idx,
> +                                eedge->m_src->m_index,
> +                                eedge->m_dest->m_index);
> +           const exploded_node *dst_node = eedge->m_dest;
> +           const program_point &dst_point = dst_node->get_point ();
> +           const gimple *stmt = dst_point.get_stmt ();
> +           if (!stmt)
> +                   continue;
> +           if (const gassign *assign = dyn_cast<const gassign *> (stmt))
> +                   {
> +                           tree lhs = gimple_assign_lhs (assign);
> +                           if (TREE_CODE (lhs) == SSA_NAME
> +                               && SSA_NAME_VAR (lhs) == SSA_NAME_VAR (m_var))
> +                                   return assign;
> +                   }
> +         }
> +      }
> +
> +  not_found:
> +
> +    /* Look backwards for the first statement with a location.  */
> +    int i;
> +    const exploded_edge *eedge;
> +    FOR_EACH_VEC_ELT_REVERSE (epath.m_edges, i, eedge)
> +    {
> +      if (logger)
> +       logger->log ("eedge[%i]: EN %i -> EN %i", i, eedge->m_src->m_index,
> +                    eedge->m_dest->m_index);
> +      const exploded_node *dst_node = eedge->m_dest;
> +      const program_point &dst_point = dst_node->get_point ();
> +      const gimple *stmt = dst_point.get_stmt ();
> +      if (stmt)
> +       if (get_pure_location (stmt->location) != UNKNOWN_LOCATION)
> +         return stmt;
> +    }
> +
> +    gcc_unreachable ();
> +    return NULL;
> +  }
> +
> +private:
> +  const exploded_graph &m_eg;
> +  tree m_var;
> +};
> +
> +class refcnt_mismatch : public pending_diagnostic_subclass<refcnt_mismatch>
> +{
> +public:
> +  refcnt_mismatch (const region *base_region,
> +                               const svalue *ob_refcnt,
> +                               const svalue *actual_refcnt,
> +        tree reg_tree)
> +      : m_base_region (base_region), m_ob_refcnt (ob_refcnt),
> +       m_actual_refcnt (actual_refcnt), m_reg_tree(reg_tree)
> +  {
> +  }
> +
> +  const char *
> +  get_kind () const final override
> +  {
> +    return "refcnt_mismatch";
> +  }
> +
> +  bool
> +  operator== (const refcnt_mismatch &other) const
> +  {
> +    return (m_base_region == other.m_base_region
> +           && m_ob_refcnt == other.m_ob_refcnt
> +           && m_actual_refcnt == other.m_actual_refcnt);
> +  }
> +
> +  int get_controlling_option () const final override
> +  {
> +    return 0;
> +  }
> +
> +  bool
> +  emit (rich_location *rich_loc, logger *) final override
> +  {
> +    diagnostic_metadata m;
> +    bool warned;
> +    // just assuming constants for now
> +    auto actual_refcnt
> +       = m_actual_refcnt->dyn_cast_constant_svalue ()->get_constant ();
> +    auto ob_refcnt = m_ob_refcnt->dyn_cast_constant_svalue ()->get_constant ();
> +    warned = warning_meta (
> +       rich_loc, m, get_controlling_option (),
> +       "expected <variable name belonging to m_base_region> to have "
> +       "reference count: %qE but ob_refcnt field is: %qE",
> +       actual_refcnt, ob_refcnt);
> +
> +    // location_t loc = rich_loc->get_loc ();
> +    // foo (loc);
> +    return warned;
> +  }
> +
> +  void mark_interesting_stuff (interesting_t *interest) final override
> +  {
> +    if (m_base_region)
> +      interest->add_region_creation (m_base_region);
> +  }
> +
> +private:
> +
> +  void foo(location_t loc) const
> +  {
> +    inform(loc, "something is up right here");
> +  }
> +  const region *m_base_region;
> +  const svalue *m_ob_refcnt;
> +  const svalue *m_actual_refcnt;
> +  tree m_reg_tree;
> +};
> +
> +/* Retrieves the svalue associated with the ob_refcnt field of the base region.
> + */
> +static const svalue *
> +retrieve_ob_refcnt_sval (const region *base_reg, const region_model *model,
> +                        region_model_context *ctxt)
> +{
> +  region_model_manager *mgr = model->get_manager ();
> +  tree ob_refcnt_tree = get_field_by_name (pyobj_record, "ob_refcnt");
> +  const region *ob_refcnt_region
> +      = mgr->get_field_region (base_reg, ob_refcnt_tree);
> +  const svalue *ob_refcnt_sval
> +      = model->get_store_value (ob_refcnt_region, ctxt);
> +  return ob_refcnt_sval;
> +}
> +
> +static void
> +increment_region_refcnt (hash_map<const region *, int> &map, const region *key)
> +{
> +  bool existed;
> +  auto &refcnt = map.get_or_insert (key, &existed);
> +  refcnt = existed ? refcnt + 1 : 1;
> +}
> +
> +
> +/* Recursively fills in region_to_refcnt with the references owned by
> +   pyobj_ptr_sval.  */
> +static void
> +count_expected_pyobj_references (const region_model *model,
> +                          hash_map<const region *, int> &region_to_refcnt,
> +                          const svalue *pyobj_ptr_sval,
> +                          hash_set<const region *> &seen)
> +{
> +  if (!pyobj_ptr_sval)
> +    return;
> +
> +  const auto *pyobj_region_sval = pyobj_ptr_sval->dyn_cast_region_svalue ();
> +  const auto *pyobj_initial_sval = pyobj_ptr_sval->dyn_cast_initial_svalue ();
> +  if (!pyobj_region_sval && !pyobj_initial_sval)
> +    return;
> +
> +  // todo: support initial sval (e.g passed in as parameter)
> +  if (pyobj_initial_sval)
> +    {
> +  //     increment_region_refcnt (region_to_refcnt,
> +       //                     pyobj_initial_sval->get_region ());
> +      return;
> +    }
> +
> +  const region *pyobj_region = pyobj_region_sval->get_pointee ();
> +  if (!pyobj_region || seen.contains (pyobj_region))
> +    return;
> +
> +  seen.add (pyobj_region);
> +
> +  if (pyobj_ptr_sval->get_type () == pyobj_ptr_tree)
> +    increment_region_refcnt (region_to_refcnt, pyobj_region);
> +
> +  const auto *curr_store = model->get_store ();
> +  const auto *retval_cluster = curr_store->get_cluster (pyobj_region);
> +  if (!retval_cluster)
> +    return;
> +
> +  const auto &retval_binding_map = retval_cluster->get_map ();
> +
> +  for (const auto &binding : retval_binding_map)
> +    {
> +      const svalue *binding_sval = binding.second;
> +      const svalue *unwrapped_sval = binding_sval->unwrap_any_unmergeable ();
> +      const region *pointee = unwrapped_sval->maybe_get_region ();
> +
> +      if (pointee && pointee->get_kind () == RK_HEAP_ALLOCATED)
> +       count_expected_pyobj_references (model, region_to_refcnt, binding_sval,
> +                                        seen);
> +    }
> +}
> +
> +/* Compare ob_refcnt field vs the actual reference count of a region */
> +static void
> +check_refcnt (const region_model *model, region_model_context *ctxt,
> +             const hash_map<const ana::region *,
> +                            int>::iterator::reference_pair region_refcnt)
> +{
> +  region_model_manager *mgr = model->get_manager ();
> +  const auto &curr_region = region_refcnt.first;
> +  const auto &actual_refcnt = region_refcnt.second;
> +  const svalue *ob_refcnt_sval = retrieve_ob_refcnt_sval (curr_region, model, ctxt);
> +  const svalue *actual_refcnt_sval = mgr->get_or_create_int_cst (
> +      ob_refcnt_sval->get_type (), actual_refcnt);
> +
> +  if (ob_refcnt_sval != actual_refcnt_sval)
> +  {
> +    // todo: fix this
> +    tree reg_tree = model->get_representative_tree (curr_region);
> +
> +    const auto &eg = ctxt->get_eg ();
> +    refcnt_stmt_finder finder (*eg, reg_tree);
> +    auto pd = make_unique<refcnt_mismatch> (curr_region, ob_refcnt_sval,
> +                                           actual_refcnt_sval, reg_tree);
> +    if (pd && eg)
> +    ctxt->warn (std::move (pd), &finder);
> +  }
> +}
> +
> +static void
> +check_refcnts (const region_model *model, const svalue *retval,
> +           region_model_context *ctxt,
> +           hash_map<const region *, int> &region_to_refcnt)
> +{
> +  for (const auto &region_refcnt : region_to_refcnt)
> +  {
> +    check_refcnt(model, ctxt, region_refcnt);
> +  }
> +}
> +
> +/* Validates the reference count of all Python objects. */
> +void
> +pyobj_refcnt_checker (const region_model *model, const svalue *retval,
> +                   region_model_context *ctxt)
> +{
> +  if (!ctxt)
> +  return;
> +
> +  auto region_to_refcnt = hash_map<const region *, int> ();
> +  auto seen_regions = hash_set<const region *> ();
> +
> +  count_expected_pyobj_references (model, region_to_refcnt, retval, seen_regions);
> +  check_refcnts (model, retval, ctxt, region_to_refcnt);
> +}
> +
> +/* Counts the actual pyobject references from all clusters in the model's
> + * store. */
> +static void
> +count_all_references (const region_model *model,
> +                     hash_map<const region *, int> &region_to_refcnt)
> +{
> +  for (const auto &cluster : *model->get_store ())
> +  {
> +    auto curr_region = cluster.first;
> +    if (curr_region->get_kind () != RK_HEAP_ALLOCATED)
> +    continue;
> +
> +    increment_region_refcnt (region_to_refcnt, curr_region);
> +
> +    auto binding_cluster = cluster.second;
> +    for (const auto &binding : binding_cluster->get_map ())
> +    {
> +         const svalue *binding_sval = binding.second;
> +
> +         const svalue *unwrapped_sval
> +             = binding_sval->unwrap_any_unmergeable ();
> +         // if (unwrapped_sval->get_type () != pyobj_ptr_tree)
> +         // continue;
> +
> +         const region *pointee = unwrapped_sval->maybe_get_region ();
> +         if (!pointee || pointee->get_kind () != RK_HEAP_ALLOCATED)
> +           continue;
> +
> +         increment_region_refcnt (region_to_refcnt, pointee);
> +    }
> +  }
> +}
> +
> +static void
> +dump_refcnt_info (const hash_map<const region *, int> &region_to_refcnt,
> +                 const region_model *model, region_model_context *ctxt)
> +{
> +  region_model_manager *mgr = model->get_manager ();
> +  pretty_printer pp;
> +  pp_format_decoder (&pp) = default_tree_printer;
> +  pp_show_color (&pp) = pp_show_color (global_dc->printer);
> +  pp.buffer->stream = stderr;
> +
> +  for (const auto &region_refcnt : region_to_refcnt)
> +  {
> +    auto region = region_refcnt.first;
> +    auto actual_refcnt = region_refcnt.second;
> +    const svalue *ob_refcnt_sval
> +       = retrieve_ob_refcnt_sval (region, model, ctxt);
> +    const svalue *actual_refcnt_sval = mgr->get_or_create_int_cst (
> +       ob_refcnt_sval->get_type (), actual_refcnt);
> +
> +    region->dump_to_pp (&pp, true);
> +    pp_string (&pp, " — ob_refcnt: ");
> +    ob_refcnt_sval->dump_to_pp (&pp, true);
> +    pp_string (&pp, " actual refcnt: ");
> +    actual_refcnt_sval->dump_to_pp (&pp, true);
> +    pp_newline (&pp);
> +  }
> +  pp_string (&pp, "~~~~~~~~\n");
> +  pp_flush (&pp);
> +}
> +
> +class kf_analyzer_cpython_dump_refcounts : public known_function
> +{
> +public:
> +  bool matches_call_types_p (const call_details &cd) const final override
> +  {
> +    return cd.num_args () == 0;
> +  }
> +  void impl_call_pre (const call_details &cd) const final override
> +  {
> +    region_model_context *ctxt = cd.get_ctxt ();
> +    if (!ctxt)
> +      return;
> +    region_model *model = cd.get_model ();
> +    auto region_to_refcnt = hash_map<const region *, int> ();
> +    count_all_references(model, region_to_refcnt);
> +    dump_refcnt_info(region_to_refcnt, model, ctxt);
> +  }
> +};
> +
>  /* Some concessions were made to
>  simplify the analysis process when comparing kf_PyList_Append with the
>  real implementation. In particular, PyList_Append performs some
> @@ -927,6 +1294,10 @@ cpython_analyzer_init_cb (void *gcc_data, void * /*user_data */)
>    iface->register_known_function ("PyList_New", make_unique<kf_PyList_New> ());
>    iface->register_known_function ("PyLong_FromLong",
>                                    make_unique<kf_PyLong_FromLong> ());
> +
> +  iface->register_known_function (
> +      "__analyzer_cpython_dump_refcounts",
> +      make_unique<kf_analyzer_cpython_dump_refcounts> ());
>  }
>  } // namespace ana
>
> @@ -940,8 +1311,9 @@ plugin_init (struct plugin_name_args *plugin_info,
>    const char *plugin_name = plugin_info->base_name;
>    if (0)
>      inform (input_location, "got here; %qs", plugin_name);
> -  ana::register_finish_translation_unit_callback (&stash_named_types);
> -  ana::register_finish_translation_unit_callback (&stash_global_vars);
> +  register_finish_translation_unit_callback (&stash_named_types);
> +  register_finish_translation_unit_callback (&stash_global_vars);
> +  region_model::register_pop_frame_callback(pyobj_refcnt_checker);
>    register_callback (plugin_info->base_name, PLUGIN_ANALYZER_INIT,
>                       ana::cpython_analyzer_init_cb,
>                       NULL); /* void *user_data */
> diff --git a/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-2.c b/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-PyList_Append.c
> similarity index 64%
> rename from gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-2.c
> rename to gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-PyList_Append.c
> index 19b5c17428a..9912f9105d4 100644
> --- a/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-2.c
> +++ b/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-PyList_Append.c
> @@ -8,34 +8,6 @@
>  #include <Python.h>
>  #include "../analyzer/analyzer-decls.h"
>
> -PyObject *
> -test_PyList_New (Py_ssize_t len)
> -{
> -  PyObject *obj = PyList_New (len);
> -  if (obj)
> -    {
> -     __analyzer_eval (obj->ob_refcnt == 1); /* { dg-warning "TRUE" } */
> -     __analyzer_eval (PyList_CheckExact (obj)); /* { dg-warning "TRUE" } */
> -    }
> -  else
> -    __analyzer_dump_path (); /* { dg-message "path" } */
> -  return obj;
> -}
> -
> -PyObject *
> -test_PyLong_New (long n)
> -{
> -  PyObject *obj = PyLong_FromLong (n);
> -  if (obj)
> -    {
> -     __analyzer_eval (obj->ob_refcnt == 1); /* { dg-warning "TRUE" } */
> -     __analyzer_eval (PyLong_CheckExact (obj)); /* { dg-warning "TRUE" } */
> -    }
> -  else
> -    __analyzer_dump_path (); /* { dg-message "path" } */
> -  return obj;
> -}
> -
>  PyObject *
>  test_PyListAppend (long n)
>  {
> @@ -43,6 +15,7 @@ test_PyListAppend (long n)
>    PyObject *list = PyList_New (0);
>    PyList_Append(list, item);
>    return list; /* { dg-warning "leak of 'item'" } */
> +  /* { dg-warning "reference count" "" { target *-*-* } .-1 } */
>  }
>
>  PyObject *
> @@ -67,6 +40,7 @@ test_PyListAppend_2 (long n)
>    else
>      __analyzer_eval (item->ob_refcnt == 2); /* { dg-warning "TRUE" } */
>    return list; /* { dg-warning "leak of 'item'" } */
> +  /* { dg-warning "reference count" "" { target *-*-* } .-1 } */
>  }
>
>
> @@ -75,4 +49,30 @@ test_PyListAppend_3 (PyObject *item, PyObject *list)
>  {
>    PyList_Append (list, item);
>    return list;
> +}
> +
> +PyObject *
> +test_PyListAppend_4 (long n)
> +{
> +  PyObject *item = PyLong_FromLong (n);
> +  PyObject *list = NULL;
> +  PyList_Append(list, item);
> +  return list;
> +}
> +
> +PyObject *
> +test_PyListAppend_5 ()
> +{
> +  PyObject *list = PyList_New (0);
> +  PyList_Append(list, NULL);
> +  return list;
> +}
> +
> +PyObject *
> +test_PyListAppend_6 ()
> +{
> +  PyObject *item = NULL;
> +  PyObject *list = NULL;
> +  PyList_Append(list, item);
> +  return list;
>  }
> \ No newline at end of file
> diff --git a/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-PyList_New.c b/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-PyList_New.c
> new file mode 100644
> index 00000000000..492d4f7d58d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-PyList_New.c
> @@ -0,0 +1,38 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target analyzer } */
> +/* { dg-options "-fanalyzer" } */
> +/* { dg-require-python-h "" } */
> +
> +
> +#define PY_SSIZE_T_CLEAN
> +#include <Python.h>
> +#include "../analyzer/analyzer-decls.h"
> +
> +PyObject *
> +test_PyList_New (Py_ssize_t len)
> +{
> +  PyObject *obj = PyList_New (len);
> +  if (obj)
> +    {
> +     __analyzer_eval (obj->ob_refcnt == 1); /* { dg-warning "TRUE" } */
> +     __analyzer_eval (PyList_CheckExact (obj)); /* { dg-warning "TRUE" } */
> +    }
> +  else
> +    __analyzer_dump_path (); /* { dg-message "path" } */
> +  return obj;
> +}
> +
> +void
> +test_PyList_New_2 ()
> +{
> +  PyObject *obj = PyList_New (0);
> +} /* { dg-warning "leak of 'obj'" } */
> +
> +PyObject *test_stray_incref_PyList ()
> +{
> +  PyObject *p = PyList_New (2);
> +  if (p)
> +    Py_INCREF (p);
> +  return p;
> +  /* { dg-warning "reference count" "" { target *-*-* } .-1 } */
> +}
> \ No newline at end of file
> diff --git a/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-PyLong_FromLong.c b/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-PyLong_FromLong.c
> new file mode 100644
> index 00000000000..97b29849302
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-PyLong_FromLong.c
> @@ -0,0 +1,38 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target analyzer } */
> +/* { dg-options "-fanalyzer" } */
> +/* { dg-require-python-h "" } */
> +
> +
> +#define PY_SSIZE_T_CLEAN
> +#include <Python.h>
> +#include "../analyzer/analyzer-decls.h"
> +
> +PyObject *
> +test_PyLong_New (long n)
> +{
> +  PyObject *obj = PyLong_FromLong (n);
> +  if (obj)
> +    {
> +     __analyzer_eval (obj->ob_refcnt == 1); /* { dg-warning "TRUE" } */
> +     __analyzer_eval (PyLong_CheckExact (obj)); /* { dg-warning "TRUE" } */
> +    }
> +  else
> +    __analyzer_dump_path (); /* { dg-message "path" } */
> +  return obj;
> +}
> +
> +void
> +test_PyLong_New_2 (long n)
> +{
> +  PyObject *obj = PyLong_FromLong (n);
> +} /* { dg-warning "leak of 'obj'" } */
> +
> +PyObject *test_stray_incref_PyLong (long val)
> +{
> +  PyObject *p = PyLong_FromLong (val);
> +  if (p)
> +    Py_INCREF (p);
> +  return p;
> +  /* { dg-warning "reference count" "" { target *-*-* } .-1 } */
> +}
> \ No newline at end of file
> diff --git a/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-1.c b/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-no-plugin.c
> similarity index 100%
> rename from gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-1.c
> rename to gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-no-plugin.c
> diff --git a/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-refcnt-checking.c b/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-refcnt-checking.c
> new file mode 100644
> index 00000000000..9912f9105d4
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-refcnt-checking.c
> @@ -0,0 +1,78 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target analyzer } */
> +/* { dg-options "-fanalyzer" } */
> +/* { dg-require-python-h "" } */
> +
> +
> +#define PY_SSIZE_T_CLEAN
> +#include <Python.h>
> +#include "../analyzer/analyzer-decls.h"
> +
> +PyObject *
> +test_PyListAppend (long n)
> +{
> +  PyObject *item = PyLong_FromLong (n);
> +  PyObject *list = PyList_New (0);
> +  PyList_Append(list, item);
> +  return list; /* { dg-warning "leak of 'item'" } */
> +  /* { dg-warning "reference count" "" { target *-*-* } .-1 } */
> +}
> +
> +PyObject *
> +test_PyListAppend_2 (long n)
> +{
> +  PyObject *item = PyLong_FromLong (n);
> +  if (!item)
> +       return NULL;
> +
> +  __analyzer_eval (item->ob_refcnt == 1); /* { dg-warning "TRUE" } */
> +  PyObject *list = PyList_New (n);
> +  if (!list)
> +  {
> +       Py_DECREF(item);
> +       return NULL;
> +  }
> +
> +  __analyzer_eval (list->ob_refcnt == 1); /* { dg-warning "TRUE" } */
> +
> +  if (PyList_Append (list, item) < 0)
> +    __analyzer_eval (item->ob_refcnt == 1); /* { dg-warning "TRUE" } */
> +  else
> +    __analyzer_eval (item->ob_refcnt == 2); /* { dg-warning "TRUE" } */
> +  return list; /* { dg-warning "leak of 'item'" } */
> +  /* { dg-warning "reference count" "" { target *-*-* } .-1 } */
> +}
> +
> +
> +PyObject *
> +test_PyListAppend_3 (PyObject *item, PyObject *list)
> +{
> +  PyList_Append (list, item);
> +  return list;
> +}
> +
> +PyObject *
> +test_PyListAppend_4 (long n)
> +{
> +  PyObject *item = PyLong_FromLong (n);
> +  PyObject *list = NULL;
> +  PyList_Append(list, item);
> +  return list;
> +}
> +
> +PyObject *
> +test_PyListAppend_5 ()
> +{
> +  PyObject *list = PyList_New (0);
> +  PyList_Append(list, NULL);
> +  return list;
> +}
> +
> +PyObject *
> +test_PyListAppend_6 ()
> +{
> +  PyObject *item = NULL;
> +  PyObject *list = NULL;
> +  PyList_Append(list, item);
> +  return list;
> +}
> \ No newline at end of file
> diff --git a/gcc/testsuite/gcc.dg/plugin/plugin.exp b/gcc/testsuite/gcc.dg/plugin/plugin.exp
> index e1ed2d2589e..cbef6da8d86 100644
> --- a/gcc/testsuite/gcc.dg/plugin/plugin.exp
> +++ b/gcc/testsuite/gcc.dg/plugin/plugin.exp
> @@ -161,8 +161,9 @@ set plugin_test_list [list \
>           taint-CVE-2011-0521-6.c \
>           taint-antipatterns-1.c } \
>      { analyzer_cpython_plugin.c \
> -         cpython-plugin-test-1.c \
> -         cpython-plugin-test-2.c } \
> +         cpython-plugin-test-PyList_Append.c \
> +         cpython-plugin-test-PyList_New.c \
> +         cpython-plugin-test-PyLong_FromLong.c } \
>  ]
>
>  foreach plugin_test $plugin_test_list {
> --
> 2.30.2
>
David Malcolm Aug. 29, 2023, 9:08 p.m. UTC | #2
On Tue, 2023-08-29 at 00:31 -0400, Eric Feng wrote:
> Hi Dave,

Hi Eric.

Thanks for the updated patch.

A few nits below; this is OK for trunk with them fixed...

[...snip...]

> 
> gcc/analyzer/ChangeLog:
>   PR analyzer/107646
> 	* engine.cc (impl_region_model_context::warn): New optional parameter.
> 	* exploded-graph.h (class impl_region_model_context): Likewise.
> 	* region-model.cc (region_model::pop_frame): New callback feature for
>   * region_model::pop_frame.
> 	* region-model.h (struct append_regions_cb_data): Likewise.
> 	(class region_model): Likewise.
> 	(class region_model_context): New optional parameter.
> 	(class region_model_context_decorator): Likewise.
> 
> gcc/testsuite/ChangeLog:
>   PR analyzer/107646
> 	* gcc.dg/plugin/analyzer_cpython_plugin.c: Implements reference count
>   * checking for PyObjects.
> 	* gcc.dg/plugin/cpython-plugin-test-2.c: Moved to...
> 	* gcc.dg/plugin/cpython-plugin-test-PyList_Append.c: ...here (and
>   * added more tests).
> 	* gcc.dg/plugin/cpython-plugin-test-1.c: Moved to...
> 	* gcc.dg/plugin/cpython-plugin-test-no-plugin.c: ...here (and added
>   * more tests).
> 	* gcc.dg/plugin/plugin.exp: New tests.
> 	* gcc.dg/plugin/cpython-plugin-test-PyList_New.c: New test.
> 	* gcc.dg/plugin/cpython-plugin-test-PyLong_FromLong.c: New test.
> 	* gcc.dg/plugin/cpython-plugin-test-refcnt-checking.c: New test.

The ChangeLog formatting here seems wrong; lines starting with a '*'
should refer to a filename.  Continuation lines begin with just a tab
character.

[...snip...]

> diff --git a/gcc/analyzer/region-model.h b/gcc/analyzer/region-model.h
> index 10b2a59e787..440ea6d828d 100644
> --- a/gcc/analyzer/region-model.h
> +++ b/gcc/analyzer/region-model.h

[...snip...]

> @@ -840,7 +865,8 @@ private:
>  class region_model_context_decorator : public region_model_context
>  {
>   public:
> -  bool warn (std::unique_ptr<pending_diagnostic> d) override
> +  bool warn (std::unique_ptr<pending_diagnostic> d,
> +	     const stmt_finder *custom_finder)
>    {
>      if (m_inner)
>        return m_inner->warn (std::move (d));

This should presumably pass the custom_finder on to the 2nd argument of
m_inner->warn, rather than have the inner call to warn implicitly use
the NULL default arg.

[...snip...]

> diff --git a/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-1.c b/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-no-plugin.c
> similarity index 100%
> rename from gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-1.c
> rename to gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-no-plugin.c

Looks like
  "-no-Python-h.c"
would be a better suffix than
  "-no-plugin.c"
as it's the include that's missing, not the plugin.

[...snip...]

> diff --git a/gcc/testsuite/gcc.dg/plugin/plugin.exp b/gcc/testsuite/gcc.dg/plugin/plugin.exp
> index e1ed2d2589e..cbef6da8d86 100644
> --- a/gcc/testsuite/gcc.dg/plugin/plugin.exp
> +++ b/gcc/testsuite/gcc.dg/plugin/plugin.exp
> @@ -161,8 +161,9 @@ set plugin_test_list [list \
>  	  taint-CVE-2011-0521-6.c \
>  	  taint-antipatterns-1.c } \
>      { analyzer_cpython_plugin.c \
> -	  cpython-plugin-test-1.c \
> -	  cpython-plugin-test-2.c } \
> +	  cpython-plugin-test-PyList_Append.c \
> +	  cpython-plugin-test-PyList_New.c \
> +	  cpython-plugin-test-PyLong_FromLong.c } \

Looks like this is missing:
  cpython-plugin-test-no-plugin.c
and
  cpython-plugin-test-refcnt-checking.c
(though as noted above"cpython-plugin-test-no-Python-h.c" would be a
better name for the former)

so it wasn't actually compiling these tests.

Be sure to doublecheck that these tests pass when updating.

[...snip...]

OK for trunk with the above nits fixed.

Dave
Hans-Peter Nilsson Sept. 1, 2023, 2:49 a.m. UTC | #3
(Looks like this was committed as r14-3580-g597b9ec69bca8a)

> Cc: gcc@gcc.gnu.org, gcc-patches@gcc.gnu.org, Eric Feng <ef2648@columbia.edu>
> From: Eric Feng via Gcc <gcc@gcc.gnu.org>

> gcc/testsuite/ChangeLog:
>   PR analyzer/107646
> 	* gcc.dg/plugin/analyzer_cpython_plugin.c: Implements reference count
>   * checking for PyObjects.
> 	* gcc.dg/plugin/cpython-plugin-test-2.c: Moved to...
> 	* gcc.dg/plugin/cpython-plugin-test-PyList_Append.c: ...here (and
>   * added more tests).
> 	* gcc.dg/plugin/cpython-plugin-test-1.c: Moved to...
> 	* gcc.dg/plugin/cpython-plugin-test-no-plugin.c: ...here (and added
>   * more tests).
> 	* gcc.dg/plugin/plugin.exp: New tests.
> 	* gcc.dg/plugin/cpython-plugin-test-PyList_New.c: New test.
> 	* gcc.dg/plugin/cpython-plugin-test-PyLong_FromLong.c: New test.
> 	* gcc.dg/plugin/cpython-plugin-test-refcnt-checking.c: New test.

It seems this was more or less a rewrite, but that said,
it's generally preferable to always *add* tests, never *modify* them.

>  .../gcc.dg/plugin/analyzer_cpython_plugin.c   | 376 +++++++++++++++++-

^^^ Ouch!  Was it not within reason to keep that test as it
was, and just add another test?

Anyway, the test after rewrite fails, and for some targets
like cris-elf and apparently m68k-linux, yields an error.
I see a PR was already opened.

Also, mostly for future reference, several files in the
patch miss a final newline, as seen by a "\ No newline at
end of file"-marker.

I think I found the problem; a mismatch between default C++
language standard between host-gcc and target-gcc.

(It's actually *not* as simple as "auto var = typeofvar<bar>()"
not being recognized in C++11 --or else there'd be an error
for the hash_set declaration too, which I just changed for
consistency-- but it's close enough for me.)

With this, retesting plugin.exp for cris-elf works.

Ok to commit?

-- >8 --
From: Hans-Peter Nilsson <hp@axis.com>
Date: Fri, 1 Sep 2023 04:36:03 +0200
Subject: [PATCH] testsuite: Fix analyzer_cpython_plugin.c declarations, PR testsuite/111264

Also, add missing newline at end of file.

	PR testsuite/111264
	* gcc.dg/plugin/analyzer_cpython_plugin.c: Make declarations
	C++11-compatible.
---
 gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c b/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
index 7af520436549..bf1982e79c37 100644
--- a/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
+++ b/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
@@ -477,8 +477,8 @@ pyobj_refcnt_checker (const region_model *model,
   if (!ctxt)
     return;
 
-  auto region_to_refcnt = hash_map<const region *, int> ();
-  auto seen_regions = hash_set<const region *> ();
+  hash_map<const region *, int> region_to_refcnt;
+  hash_set<const region *> seen_regions;
 
   count_pyobj_references (model, region_to_refcnt, retval, seen_regions);
   check_refcnts (model, old_model, retval, ctxt, region_to_refcnt);
@@ -561,7 +561,7 @@ public:
     if (!ctxt)
       return;
     region_model *model = cd.get_model ();
-    auto region_to_refcnt = hash_map<const region *, int> ();
+    hash_map<const region *, int> region_to_refcnt;
     count_all_references(model, region_to_refcnt);
     dump_refcnt_info(region_to_refcnt, model, ctxt);
   }
@@ -1330,4 +1330,4 @@ plugin_init (struct plugin_name_args *plugin_info,
   sorry_no_analyzer ();
 #endif
   return 0;
-}
\ No newline at end of file
+}
David Malcolm Sept. 1, 2023, 2:51 p.m. UTC | #4
On Fri, 2023-09-01 at 04:49 +0200, Hans-Peter Nilsson wrote:
> (Looks like this was committed as r14-3580-g597b9ec69bca8a)
> 
> > Cc: gcc@gcc.gnu.org, gcc-patches@gcc.gnu.org, Eric Feng
> > <ef2648@columbia.edu>
> > From: Eric Feng via Gcc <gcc@gcc.gnu.org>
> 
> > gcc/testsuite/ChangeLog:
> >   PR analyzer/107646
> >         * gcc.dg/plugin/analyzer_cpython_plugin.c: Implements
> > reference count
> >   * checking for PyObjects.
> >         * gcc.dg/plugin/cpython-plugin-test-2.c: Moved to...
> >         * gcc.dg/plugin/cpython-plugin-test-PyList_Append.c:
> > ...here (and
> >   * added more tests).
> >         * gcc.dg/plugin/cpython-plugin-test-1.c: Moved to...
> >         * gcc.dg/plugin/cpython-plugin-test-no-plugin.c: ...here
> > (and added
> >   * more tests).
> >         * gcc.dg/plugin/plugin.exp: New tests.
> >         * gcc.dg/plugin/cpython-plugin-test-PyList_New.c: New test.
> >         * gcc.dg/plugin/cpython-plugin-test-PyLong_FromLong.c: New
> > test.
> >         * gcc.dg/plugin/cpython-plugin-test-refcnt-checking.c: New
> > test.
> 
> It seems this was more or less a rewrite, but that said,
> it's generally preferable to always *add* tests, never *modify* them.
> 
> >  .../gcc.dg/plugin/analyzer_cpython_plugin.c   | 376
> > +++++++++++++++++-
> 
> ^^^ Ouch!  Was it not within reason to keep that test as it
> was, and just add another test?
> 
> Anyway, the test after rewrite fails, and for some targets
> like cris-elf and apparently m68k-linux, yields an error.
> I see a PR was already opened.
> 
> Also, mostly for future reference, several files in the
> patch miss a final newline, as seen by a "\ No newline at
> end of file"-marker.
> 
> I think I found the problem; a mismatch between default C++
> language standard between host-gcc and target-gcc.
> 
> (It's actually *not* as simple as "auto var = typeofvar<bar>()"
> not being recognized in C++11 --or else there'd be an error
> for the hash_set declaration too, which I just changed for
> consistency-- but it's close enough for me.)
> 
> With this, retesting plugin.exp for cris-elf works.
> 
> Ok to commit?

Sorry about the failing tests.

Thanks for the patch; please go ahead and commit.

Dave

> 
> -- >8 --
> From: Hans-Peter Nilsson <hp@axis.com>
> Date: Fri, 1 Sep 2023 04:36:03 +0200
> Subject: [PATCH] testsuite: Fix analyzer_cpython_plugin.c
> declarations, PR testsuite/111264
> 
> Also, add missing newline at end of file.
> 
>         PR testsuite/111264
>         * gcc.dg/plugin/analyzer_cpython_plugin.c: Make declarations
>         C++11-compatible.
> ---
>  gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
> b/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
> index 7af520436549..bf1982e79c37 100644
> --- a/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
> +++ b/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
> @@ -477,8 +477,8 @@ pyobj_refcnt_checker (const region_model *model,
>    if (!ctxt)
>      return;
>  
> -  auto region_to_refcnt = hash_map<const region *, int> ();
> -  auto seen_regions = hash_set<const region *> ();
> +  hash_map<const region *, int> region_to_refcnt;
> +  hash_set<const region *> seen_regions;
>  
>    count_pyobj_references (model, region_to_refcnt, retval,
> seen_regions);
>    check_refcnts (model, old_model, retval, ctxt, region_to_refcnt);
> @@ -561,7 +561,7 @@ public:
>      if (!ctxt)
>        return;
>      region_model *model = cd.get_model ();
> -    auto region_to_refcnt = hash_map<const region *, int> ();
> +    hash_map<const region *, int> region_to_refcnt;
>      count_all_references(model, region_to_refcnt);
>      dump_refcnt_info(region_to_refcnt, model, ctxt);
>    }
> @@ -1330,4 +1330,4 @@ plugin_init (struct plugin_name_args
> *plugin_info,
>    sorry_no_analyzer ();
>  #endif
>    return 0;
> -}
> \ No newline at end of file
> +}
Eric Feng Sept. 1, 2023, 9:07 p.m. UTC | #5
Thank you for the patch!

On Fri, Sep 1, 2023 at 10:51 AM David Malcolm <dmalcolm@redhat.com> wrote:
>
> On Fri, 2023-09-01 at 04:49 +0200, Hans-Peter Nilsson wrote:
> > (Looks like this was committed as r14-3580-g597b9ec69bca8a)
> >
> > > Cc: gcc@gcc.gnu.org, gcc-patches@gcc.gnu.org, Eric Feng
> > > <ef2648@columbia.edu>
> > > From: Eric Feng via Gcc <gcc@gcc.gnu.org>
> >
> > > gcc/testsuite/ChangeLog:
> > >   PR analyzer/107646
> > >         * gcc.dg/plugin/analyzer_cpython_plugin.c: Implements
> > > reference count
> > >   * checking for PyObjects.
> > >         * gcc.dg/plugin/cpython-plugin-test-2.c: Moved to...
> > >         * gcc.dg/plugin/cpython-plugin-test-PyList_Append.c:
> > > ...here (and
> > >   * added more tests).
> > >         * gcc.dg/plugin/cpython-plugin-test-1.c: Moved to...
> > >         * gcc.dg/plugin/cpython-plugin-test-no-plugin.c: ...here
> > > (and added
> > >   * more tests).
> > >         * gcc.dg/plugin/plugin.exp: New tests.
> > >         * gcc.dg/plugin/cpython-plugin-test-PyList_New.c: New test.
> > >         * gcc.dg/plugin/cpython-plugin-test-PyLong_FromLong.c: New
> > > test.
> > >         * gcc.dg/plugin/cpython-plugin-test-refcnt-checking.c: New
> > > test.
> >
> > It seems this was more or less a rewrite, but that said,
> > it's generally preferable to always *add* tests, never *modify* them.
> >
> > >  .../gcc.dg/plugin/analyzer_cpython_plugin.c   | 376
> > > +++++++++++++++++-
> >
> > ^^^ Ouch!  Was it not within reason to keep that test as it
> > was, and just add another test?
Thanks for the feedback. To clarify, 'analyzer_cpython_plugin.c' is
not a test itself but rather a plugin that currently lives within the
testsuite. The core of the test cases were also not modified, rather I
renamed certain filenames containing them for clarity (unless this is
what you meant in terms of modification, in which case noted) and
added to them. However, I understand the preference and will keep that
in mind.
> >
> > Anyway, the test after rewrite fails, and for some targets
> > like cris-elf and apparently m68k-linux, yields an error.
> > I see a PR was already opened.
> >
> > Also, mostly for future reference, several files in the
> > patch miss a final newline, as seen by a "\ No newline at
> > end of file"-marker.
Noted.
> >
> > I think I found the problem; a mismatch between default C++
> > language standard between host-gcc and target-gcc.
> >
> > (It's actually *not* as simple as "auto var = typeofvar<bar>()"
> > not being recognized in C++11 --or else there'd be an error
> > for the hash_set declaration too, which I just changed for
> > consistency-- but it's close enough for me.)
> >
> > With this, retesting plugin.exp for cris-elf works.
Sounds good, thanks again! I was also curious about why hash_map had
an issue here with that syntax whilst hash_set did not, so I tried to
investigate a bit further. I believe the issue was due to the compiler
having trouble disambiguating between the hash_map constructors in
C++11.

From the error message we received:

test/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c:480:58:
error: no matching function for call to 'hash_map<const ana::region*,
int>::hash_map(hash_map<const ana::region*, int>)'
   auto region_to_refcnt = hash_map<const region *, int> ();

I think the compiler is mistakenly interpreting the call here as we
would like to create a new hash_map object using its copy constructor,
but we "forgot" to provide the object to be copied, rather than our
intention of using the default constructor.

Looking at hash_map.h and hash_set.h seems to support this hypothesis,
as hash_map has two constructors, one of which resembles a copy
constructor with additional arguments:
https://github.com/gcc-mirror/gcc/blob/master/gcc/hash-map.h#L147.
Perhaps the default arguments here further complicated the ambiguity
as to which constructor to use in the presence of the empty
parenthesis.

On the other hand, hash_set has only the default constructor with
default parameters, and thus there is no ambiguity:
https://github.com/gcc-mirror/gcc/blob/master/gcc/hash-set.h#L40.

I assume this ambiguity was cleared up by later versions, and thus we
observed no problems in C++17. However, I am certainly still a
relative newbie of C++, so please anyone feel free to correct my
reasoning and chime in!
> >
> > Ok to commit?
>
> Sorry about the failing tests.
>
> Thanks for the patch; please go ahead and commit.
>
> Dave
>
> >
> > -- >8 --
> > From: Hans-Peter Nilsson <hp@axis.com>
> > Date: Fri, 1 Sep 2023 04:36:03 +0200
> > Subject: [PATCH] testsuite: Fix analyzer_cpython_plugin.c
> > declarations, PR testsuite/111264
> >
> > Also, add missing newline at end of file.
> >
> >         PR testsuite/111264
> >         * gcc.dg/plugin/analyzer_cpython_plugin.c: Make declarations
> >         C++11-compatible.
> > ---
> >  gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c | 8 ++++----
> >  1 file changed, 4 insertions(+), 4 deletions(-)
> >
> > diff --git a/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
> > b/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
> > index 7af520436549..bf1982e79c37 100644
> > --- a/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
> > +++ b/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
> > @@ -477,8 +477,8 @@ pyobj_refcnt_checker (const region_model *model,
> >    if (!ctxt)
> >      return;
> >
> > -  auto region_to_refcnt = hash_map<const region *, int> ();
> > -  auto seen_regions = hash_set<const region *> ();
> > +  hash_map<const region *, int> region_to_refcnt;
> > +  hash_set<const region *> seen_regions;
> >
> >    count_pyobj_references (model, region_to_refcnt, retval,
> > seen_regions);
> >    check_refcnts (model, old_model, retval, ctxt, region_to_refcnt);
> > @@ -561,7 +561,7 @@ public:
> >      if (!ctxt)
> >        return;
> >      region_model *model = cd.get_model ();
> > -    auto region_to_refcnt = hash_map<const region *, int> ();
> > +    hash_map<const region *, int> region_to_refcnt;
> >      count_all_references(model, region_to_refcnt);
> >      dump_refcnt_info(region_to_refcnt, model, ctxt);
> >    }
> > @@ -1330,4 +1330,4 @@ plugin_init (struct plugin_name_args
> > *plugin_info,
> >    sorry_no_analyzer ();
> >  #endif
> >    return 0;
> > -}
> > \ No newline at end of file
> > +}
>

Best,
Eric
diff mbox series

Patch

diff --git a/gcc/analyzer/engine.cc b/gcc/analyzer/engine.cc
index a1908cdb364..736a41ecdaf 100644
--- a/gcc/analyzer/engine.cc
+++ b/gcc/analyzer/engine.cc
@@ -115,10 +115,12 @@  impl_region_model_context (program_state *state,
 }
 
 bool
-impl_region_model_context::warn (std::unique_ptr<pending_diagnostic> d)
+impl_region_model_context::warn (std::unique_ptr<pending_diagnostic> d,
+				 const stmt_finder *custom_finder)
 {
   LOG_FUNC (get_logger ());
-  if (m_stmt == NULL && m_stmt_finder == NULL)
+  auto curr_stmt_finder = custom_finder ? custom_finder : m_stmt_finder;
+  if (m_stmt == NULL && curr_stmt_finder == NULL)
     {
       if (get_logger ())
 	get_logger ()->log ("rejecting diagnostic: no stmt");
@@ -129,7 +131,7 @@  impl_region_model_context::warn (std::unique_ptr<pending_diagnostic> d)
       bool terminate_path = d->terminate_path_p ();
       if (m_eg->get_diagnostic_manager ().add_diagnostic
 	  (m_enode_for_diag, m_enode_for_diag->get_supernode (),
-	   m_stmt, m_stmt_finder, std::move (d)))
+	   m_stmt, curr_stmt_finder, std::move (d)))
 	{
 	  if (m_path_ctxt
 	      && terminate_path
diff --git a/gcc/analyzer/exploded-graph.h b/gcc/analyzer/exploded-graph.h
index 5a7ab645bfe..6e9a5ef58c7 100644
--- a/gcc/analyzer/exploded-graph.h
+++ b/gcc/analyzer/exploded-graph.h
@@ -56,7 +56,8 @@  class impl_region_model_context : public region_model_context
 			     uncertainty_t *uncertainty,
 			     logger *logger = NULL);
 
-  bool warn (std::unique_ptr<pending_diagnostic> d) final override;
+  bool warn (std::unique_ptr<pending_diagnostic> d,
+	     const stmt_finder *custom_finder = NULL) final override;
   void add_note (std::unique_ptr<pending_note> pn) final override;
   void add_event (std::unique_ptr<checker_event> event) final override;
   void on_svalue_leak (const svalue *) override;
@@ -107,6 +108,7 @@  class impl_region_model_context : public region_model_context
 			 std::unique_ptr<sm_context> *out_sm_context) override;
 
   const gimple *get_stmt () const override { return m_stmt; }
+  const exploded_graph *get_eg () const override { return m_eg; }
 
   exploded_graph *m_eg;
   log_user m_logger;
diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index 4f31a6dcf0f..eb4f976b83a 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -82,6 +82,8 @@  along with GCC; see the file COPYING3.  If not see
 
 namespace ana {
 
+auto_vec<pop_frame_callback> region_model::pop_frame_callbacks;
+
 /* Dump T to PP in language-independent form, for debugging/logging/dumping
    purposes.  */
 
@@ -5422,6 +5424,7 @@  region_model::pop_frame (tree result_lvalue,
     }
 
   unbind_region_and_descendents (frame_reg,POISON_KIND_POPPED_STACK);
+  notify_on_pop_frame (this, retval, ctxt);
 }
 
 /* Get the number of frames in this region_model's stack.  */
diff --git a/gcc/analyzer/region-model.h b/gcc/analyzer/region-model.h
index 10b2a59e787..440ea6d828d 100644
--- a/gcc/analyzer/region-model.h
+++ b/gcc/analyzer/region-model.h
@@ -236,6 +236,10 @@  public:
 
 struct append_regions_cb_data;
 
+typedef void (*pop_frame_callback) (const region_model *model,
+				    const svalue *retval,
+				    region_model_context *ctxt);
+
 /* A region_model encapsulates a representation of the state of memory, with
    a tree of regions, along with their associated values.
    The representation is graph-like because values can be pointers to
@@ -532,6 +536,20 @@  class region_model
   get_builtin_kf (const gcall *call,
 		  region_model_context *ctxt = NULL) const;
 
+  static void
+  register_pop_frame_callback (const pop_frame_callback &callback)
+  {
+    pop_frame_callbacks.safe_push (callback);
+  }
+
+  static void
+  notify_on_pop_frame (const region_model *model, const svalue *retval,
+		       region_model_context *ctxt)
+  {
+    for (auto &callback : pop_frame_callbacks)
+	callback (model, retval, ctxt);
+  }
+
 private:
   const region *get_lvalue_1 (path_var pv, region_model_context *ctxt) const;
   const svalue *get_rvalue_1 (path_var pv, region_model_context *ctxt) const;
@@ -621,6 +639,7 @@  private:
 						tree callee_fndecl,
 						region_model_context *ctxt) const;
 
+  static auto_vec<pop_frame_callback> pop_frame_callbacks;
   /* Storing this here to avoid passing it around everywhere.  */
   region_model_manager *const m_mgr;
 
@@ -649,8 +668,10 @@  class region_model_context
 {
  public:
   /* Hook for clients to store pending diagnostics.
-     Return true if the diagnostic was stored, or false if it was deleted.  */
-  virtual bool warn (std::unique_ptr<pending_diagnostic> d) = 0;
+     Return true if the diagnostic was stored, or false if it was deleted.
+     Optionally provide a custom stmt_finder.  */
+  virtual bool warn (std::unique_ptr<pending_diagnostic> d,
+		     const stmt_finder *custom_finder = NULL) = 0;
 
   /* Hook for clients to add a note to the last previously stored
      pending diagnostic.  */
@@ -757,6 +778,8 @@  class region_model_context
 
   /* Get the current statement, if any.  */
   virtual const gimple *get_stmt () const = 0;
+
+  virtual const exploded_graph *get_eg () const = 0;
 };
 
 /* A "do nothing" subclass of region_model_context.  */
@@ -764,7 +787,8 @@  class region_model_context
 class noop_region_model_context : public region_model_context
 {
 public:
-  bool warn (std::unique_ptr<pending_diagnostic>) override { return false; }
+  bool warn (std::unique_ptr<pending_diagnostic> d,
+	     const stmt_finder *custom_finder) override { return false; }
   void add_note (std::unique_ptr<pending_note>) override;
   void add_event (std::unique_ptr<checker_event>) override;
   void on_svalue_leak (const svalue *) override {}
@@ -812,6 +836,7 @@  public:
   }
 
   const gimple *get_stmt () const override { return NULL; }
+  const exploded_graph *get_eg () const override { return NULL; }
 };
 
 /* A subclass of region_model_context for determining if operations fail
@@ -840,7 +865,8 @@  private:
 class region_model_context_decorator : public region_model_context
 {
  public:
-  bool warn (std::unique_ptr<pending_diagnostic> d) override
+  bool warn (std::unique_ptr<pending_diagnostic> d,
+	     const stmt_finder *custom_finder)
   {
     if (m_inner)
       return m_inner->warn (std::move (d));
@@ -978,6 +1004,14 @@  class region_model_context_decorator : public region_model_context
       return nullptr;
   }
 
+  const exploded_graph *get_eg () const override
+  {
+    if (m_inner)
+	return m_inner->get_eg ();
+    else
+	return nullptr;
+  }
+
 protected:
   region_model_context_decorator (region_model_context *inner)
   : m_inner (inner)
@@ -993,7 +1027,8 @@  protected:
 class annotating_context : public region_model_context_decorator
 {
 public:
-  bool warn (std::unique_ptr<pending_diagnostic> d) override
+  bool warn (std::unique_ptr<pending_diagnostic> d,
+	     const stmt_finder *custom_finder) override
   {
     if (m_inner)
       if (m_inner->warn (std::move (d)))
@@ -1158,7 +1193,8 @@  using namespace ::selftest;
 class test_region_model_context : public noop_region_model_context
 {
 public:
-  bool warn (std::unique_ptr<pending_diagnostic> d) final override
+  bool warn (std::unique_ptr<pending_diagnostic> d,
+	     const stmt_finder *custom_finder) final override
   {
     m_diagnostics.safe_push (d.release ());
     return true;
diff --git a/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c b/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
index 7cd72e8a886..b2caed8fc1b 100644
--- a/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
+++ b/gcc/testsuite/gcc.dg/plugin/analyzer_cpython_plugin.c
@@ -44,6 +44,7 @@ 
 #include "analyzer/region-model.h"
 #include "analyzer/call-details.h"
 #include "analyzer/call-info.h"
+#include "analyzer/exploded-graph.h"
 #include "make-unique.h"
 
 int plugin_is_GPL_compatible;
@@ -191,6 +192,372 @@  public:
   }
 };
 
+/* This is just a copy of leak_stmt_finder for now (subject to change if
+ * necssary)  */
+
+class refcnt_stmt_finder : public stmt_finder
+{
+public:
+  refcnt_stmt_finder (const exploded_graph &eg, tree var)
+      : m_eg (eg), m_var (var)
+  {
+  }
+
+  std::unique_ptr<stmt_finder>
+  clone () const final override
+  {
+    return make_unique<refcnt_stmt_finder> (m_eg, m_var);
+  }
+
+  const gimple *
+  find_stmt (const exploded_path &epath) final override
+  {
+    logger *const logger = m_eg.get_logger ();
+    LOG_FUNC (logger);
+
+    if (m_var && TREE_CODE (m_var) == SSA_NAME)
+      {
+	/* Locate the final write to this SSA name in the path.  */
+	const gimple *def_stmt = SSA_NAME_DEF_STMT (m_var);
+
+	int idx_of_def_stmt;
+	bool found = epath.find_stmt_backwards (def_stmt, &idx_of_def_stmt);
+	if (!found)
+	  goto not_found;
+
+	/* What was the next write to the underlying var
+	   after the SSA name was set? (if any).  */
+
+	for (unsigned idx = idx_of_def_stmt + 1; idx < epath.m_edges.length ();
+	     ++idx)
+	  {
+	    const exploded_edge *eedge = epath.m_edges[idx];
+	    if (logger)
+		    logger->log ("eedge[%i]: EN %i -> EN %i", idx,
+				 eedge->m_src->m_index,
+				 eedge->m_dest->m_index);
+	    const exploded_node *dst_node = eedge->m_dest;
+	    const program_point &dst_point = dst_node->get_point ();
+	    const gimple *stmt = dst_point.get_stmt ();
+	    if (!stmt)
+		    continue;
+	    if (const gassign *assign = dyn_cast<const gassign *> (stmt))
+		    {
+			    tree lhs = gimple_assign_lhs (assign);
+			    if (TREE_CODE (lhs) == SSA_NAME
+				&& SSA_NAME_VAR (lhs) == SSA_NAME_VAR (m_var))
+				    return assign;
+		    }
+	  }
+      }
+
+  not_found:
+
+    /* Look backwards for the first statement with a location.  */
+    int i;
+    const exploded_edge *eedge;
+    FOR_EACH_VEC_ELT_REVERSE (epath.m_edges, i, eedge)
+    {
+      if (logger)
+	logger->log ("eedge[%i]: EN %i -> EN %i", i, eedge->m_src->m_index,
+		     eedge->m_dest->m_index);
+      const exploded_node *dst_node = eedge->m_dest;
+      const program_point &dst_point = dst_node->get_point ();
+      const gimple *stmt = dst_point.get_stmt ();
+      if (stmt)
+	if (get_pure_location (stmt->location) != UNKNOWN_LOCATION)
+	  return stmt;
+    }
+
+    gcc_unreachable ();
+    return NULL;
+  }
+
+private:
+  const exploded_graph &m_eg;
+  tree m_var;
+};
+
+class refcnt_mismatch : public pending_diagnostic_subclass<refcnt_mismatch>
+{
+public:
+  refcnt_mismatch (const region *base_region,
+				const svalue *ob_refcnt,
+				const svalue *actual_refcnt,
+        tree reg_tree)
+      : m_base_region (base_region), m_ob_refcnt (ob_refcnt),
+	m_actual_refcnt (actual_refcnt), m_reg_tree(reg_tree)
+  {
+  }
+
+  const char *
+  get_kind () const final override
+  {
+    return "refcnt_mismatch";
+  }
+
+  bool
+  operator== (const refcnt_mismatch &other) const
+  {
+    return (m_base_region == other.m_base_region
+	    && m_ob_refcnt == other.m_ob_refcnt
+	    && m_actual_refcnt == other.m_actual_refcnt);
+  }
+
+  int get_controlling_option () const final override
+  {
+    return 0;
+  }
+
+  bool
+  emit (rich_location *rich_loc, logger *) final override
+  {
+    diagnostic_metadata m;
+    bool warned;
+    // just assuming constants for now
+    auto actual_refcnt
+	= m_actual_refcnt->dyn_cast_constant_svalue ()->get_constant ();
+    auto ob_refcnt = m_ob_refcnt->dyn_cast_constant_svalue ()->get_constant ();
+    warned = warning_meta (
+	rich_loc, m, get_controlling_option (),
+	"expected <variable name belonging to m_base_region> to have "
+	"reference count: %qE but ob_refcnt field is: %qE",
+	actual_refcnt, ob_refcnt);
+
+    // location_t loc = rich_loc->get_loc ();
+    // foo (loc);
+    return warned;
+  }
+
+  void mark_interesting_stuff (interesting_t *interest) final override
+  {
+    if (m_base_region)
+      interest->add_region_creation (m_base_region);
+  }
+
+private:
+
+  void foo(location_t loc) const 
+  {
+    inform(loc, "something is up right here");
+  }
+  const region *m_base_region;
+  const svalue *m_ob_refcnt;
+  const svalue *m_actual_refcnt;
+  tree m_reg_tree;
+};
+
+/* Retrieves the svalue associated with the ob_refcnt field of the base region.
+ */
+static const svalue *
+retrieve_ob_refcnt_sval (const region *base_reg, const region_model *model,
+			 region_model_context *ctxt)
+{
+  region_model_manager *mgr = model->get_manager ();
+  tree ob_refcnt_tree = get_field_by_name (pyobj_record, "ob_refcnt");
+  const region *ob_refcnt_region
+      = mgr->get_field_region (base_reg, ob_refcnt_tree);
+  const svalue *ob_refcnt_sval
+      = model->get_store_value (ob_refcnt_region, ctxt);
+  return ob_refcnt_sval;
+}
+
+static void
+increment_region_refcnt (hash_map<const region *, int> &map, const region *key)
+{
+  bool existed;
+  auto &refcnt = map.get_or_insert (key, &existed);
+  refcnt = existed ? refcnt + 1 : 1;
+}
+
+
+/* Recursively fills in region_to_refcnt with the references owned by
+   pyobj_ptr_sval.  */
+static void
+count_expected_pyobj_references (const region_model *model,
+			   hash_map<const region *, int> &region_to_refcnt,
+			   const svalue *pyobj_ptr_sval,
+			   hash_set<const region *> &seen)
+{
+  if (!pyobj_ptr_sval)
+    return;
+
+  const auto *pyobj_region_sval = pyobj_ptr_sval->dyn_cast_region_svalue ();
+  const auto *pyobj_initial_sval = pyobj_ptr_sval->dyn_cast_initial_svalue ();
+  if (!pyobj_region_sval && !pyobj_initial_sval)
+    return;
+
+  // todo: support initial sval (e.g passed in as parameter)
+  if (pyobj_initial_sval)
+    {
+  //     increment_region_refcnt (region_to_refcnt,
+	// 		       pyobj_initial_sval->get_region ());
+      return;
+    }
+
+  const region *pyobj_region = pyobj_region_sval->get_pointee ();
+  if (!pyobj_region || seen.contains (pyobj_region))
+    return;
+
+  seen.add (pyobj_region);
+
+  if (pyobj_ptr_sval->get_type () == pyobj_ptr_tree)
+    increment_region_refcnt (region_to_refcnt, pyobj_region);
+
+  const auto *curr_store = model->get_store ();
+  const auto *retval_cluster = curr_store->get_cluster (pyobj_region);
+  if (!retval_cluster)
+    return;
+
+  const auto &retval_binding_map = retval_cluster->get_map ();
+
+  for (const auto &binding : retval_binding_map)
+    {
+      const svalue *binding_sval = binding.second;
+      const svalue *unwrapped_sval = binding_sval->unwrap_any_unmergeable ();
+      const region *pointee = unwrapped_sval->maybe_get_region ();
+
+      if (pointee && pointee->get_kind () == RK_HEAP_ALLOCATED)
+	count_expected_pyobj_references (model, region_to_refcnt, binding_sval,
+					 seen);
+    }
+}
+
+/* Compare ob_refcnt field vs the actual reference count of a region */
+static void
+check_refcnt (const region_model *model, region_model_context *ctxt,
+	      const hash_map<const ana::region *,
+			     int>::iterator::reference_pair region_refcnt)
+{
+  region_model_manager *mgr = model->get_manager ();
+  const auto &curr_region = region_refcnt.first;
+  const auto &actual_refcnt = region_refcnt.second;
+  const svalue *ob_refcnt_sval = retrieve_ob_refcnt_sval (curr_region, model, ctxt);
+  const svalue *actual_refcnt_sval = mgr->get_or_create_int_cst (
+      ob_refcnt_sval->get_type (), actual_refcnt);
+
+  if (ob_refcnt_sval != actual_refcnt_sval)
+  {
+    // todo: fix this
+    tree reg_tree = model->get_representative_tree (curr_region);
+
+    const auto &eg = ctxt->get_eg ();
+    refcnt_stmt_finder finder (*eg, reg_tree);
+    auto pd = make_unique<refcnt_mismatch> (curr_region, ob_refcnt_sval,
+					    actual_refcnt_sval, reg_tree);
+    if (pd && eg)
+    ctxt->warn (std::move (pd), &finder);
+  }
+}
+
+static void
+check_refcnts (const region_model *model, const svalue *retval,
+	    region_model_context *ctxt,
+	    hash_map<const region *, int> &region_to_refcnt)
+{
+  for (const auto &region_refcnt : region_to_refcnt)
+  {
+    check_refcnt(model, ctxt, region_refcnt);
+  }
+}
+
+/* Validates the reference count of all Python objects. */
+void
+pyobj_refcnt_checker (const region_model *model, const svalue *retval,
+		    region_model_context *ctxt)
+{
+  if (!ctxt)
+  return;
+
+  auto region_to_refcnt = hash_map<const region *, int> ();
+  auto seen_regions = hash_set<const region *> ();
+
+  count_expected_pyobj_references (model, region_to_refcnt, retval, seen_regions);
+  check_refcnts (model, retval, ctxt, region_to_refcnt);
+}
+
+/* Counts the actual pyobject references from all clusters in the model's
+ * store. */
+static void
+count_all_references (const region_model *model,
+		      hash_map<const region *, int> &region_to_refcnt)
+{
+  for (const auto &cluster : *model->get_store ())
+  {
+    auto curr_region = cluster.first;
+    if (curr_region->get_kind () != RK_HEAP_ALLOCATED)
+    continue;
+
+    increment_region_refcnt (region_to_refcnt, curr_region);
+
+    auto binding_cluster = cluster.second;
+    for (const auto &binding : binding_cluster->get_map ())
+    {
+	  const svalue *binding_sval = binding.second;
+
+	  const svalue *unwrapped_sval
+	      = binding_sval->unwrap_any_unmergeable ();
+	  // if (unwrapped_sval->get_type () != pyobj_ptr_tree)
+	  // continue;
+
+	  const region *pointee = unwrapped_sval->maybe_get_region ();
+	  if (!pointee || pointee->get_kind () != RK_HEAP_ALLOCATED)
+	    continue;
+
+	  increment_region_refcnt (region_to_refcnt, pointee);
+    }
+  }
+}
+
+static void
+dump_refcnt_info (const hash_map<const region *, int> &region_to_refcnt,
+		  const region_model *model, region_model_context *ctxt)
+{
+  region_model_manager *mgr = model->get_manager ();
+  pretty_printer pp;
+  pp_format_decoder (&pp) = default_tree_printer;
+  pp_show_color (&pp) = pp_show_color (global_dc->printer);
+  pp.buffer->stream = stderr;
+
+  for (const auto &region_refcnt : region_to_refcnt)
+  {
+    auto region = region_refcnt.first;
+    auto actual_refcnt = region_refcnt.second;
+    const svalue *ob_refcnt_sval
+	= retrieve_ob_refcnt_sval (region, model, ctxt);
+    const svalue *actual_refcnt_sval = mgr->get_or_create_int_cst (
+	ob_refcnt_sval->get_type (), actual_refcnt);
+
+    region->dump_to_pp (&pp, true);
+    pp_string (&pp, " — ob_refcnt: ");
+    ob_refcnt_sval->dump_to_pp (&pp, true);
+    pp_string (&pp, " actual refcnt: ");
+    actual_refcnt_sval->dump_to_pp (&pp, true);
+    pp_newline (&pp);
+  }
+  pp_string (&pp, "~~~~~~~~\n");
+  pp_flush (&pp);
+}
+
+class kf_analyzer_cpython_dump_refcounts : public known_function
+{
+public:
+  bool matches_call_types_p (const call_details &cd) const final override
+  {
+    return cd.num_args () == 0;
+  }
+  void impl_call_pre (const call_details &cd) const final override
+  {
+    region_model_context *ctxt = cd.get_ctxt ();
+    if (!ctxt)
+      return;
+    region_model *model = cd.get_model ();
+    auto region_to_refcnt = hash_map<const region *, int> ();
+    count_all_references(model, region_to_refcnt);
+    dump_refcnt_info(region_to_refcnt, model, ctxt);
+  }
+};
+
 /* Some concessions were made to
 simplify the analysis process when comparing kf_PyList_Append with the
 real implementation. In particular, PyList_Append performs some
@@ -927,6 +1294,10 @@  cpython_analyzer_init_cb (void *gcc_data, void * /*user_data */)
   iface->register_known_function ("PyList_New", make_unique<kf_PyList_New> ());
   iface->register_known_function ("PyLong_FromLong",
                                   make_unique<kf_PyLong_FromLong> ());
+
+  iface->register_known_function (
+      "__analyzer_cpython_dump_refcounts",
+      make_unique<kf_analyzer_cpython_dump_refcounts> ());
 }
 } // namespace ana
 
@@ -940,8 +1311,9 @@  plugin_init (struct plugin_name_args *plugin_info,
   const char *plugin_name = plugin_info->base_name;
   if (0)
     inform (input_location, "got here; %qs", plugin_name);
-  ana::register_finish_translation_unit_callback (&stash_named_types);
-  ana::register_finish_translation_unit_callback (&stash_global_vars);
+  register_finish_translation_unit_callback (&stash_named_types);
+  register_finish_translation_unit_callback (&stash_global_vars);
+  region_model::register_pop_frame_callback(pyobj_refcnt_checker);
   register_callback (plugin_info->base_name, PLUGIN_ANALYZER_INIT,
                      ana::cpython_analyzer_init_cb,
                      NULL); /* void *user_data */
diff --git a/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-2.c b/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-PyList_Append.c
similarity index 64%
rename from gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-2.c
rename to gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-PyList_Append.c
index 19b5c17428a..9912f9105d4 100644
--- a/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-2.c
+++ b/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-PyList_Append.c
@@ -8,34 +8,6 @@ 
 #include <Python.h>
 #include "../analyzer/analyzer-decls.h"
 
-PyObject *
-test_PyList_New (Py_ssize_t len)
-{
-  PyObject *obj = PyList_New (len);
-  if (obj)
-    {
-     __analyzer_eval (obj->ob_refcnt == 1); /* { dg-warning "TRUE" } */
-     __analyzer_eval (PyList_CheckExact (obj)); /* { dg-warning "TRUE" } */
-    }
-  else
-    __analyzer_dump_path (); /* { dg-message "path" } */
-  return obj;
-}
-
-PyObject *
-test_PyLong_New (long n)
-{
-  PyObject *obj = PyLong_FromLong (n);
-  if (obj)
-    {
-     __analyzer_eval (obj->ob_refcnt == 1); /* { dg-warning "TRUE" } */
-     __analyzer_eval (PyLong_CheckExact (obj)); /* { dg-warning "TRUE" } */
-    }
-  else
-    __analyzer_dump_path (); /* { dg-message "path" } */
-  return obj;
-}
-
 PyObject *
 test_PyListAppend (long n)
 {
@@ -43,6 +15,7 @@  test_PyListAppend (long n)
   PyObject *list = PyList_New (0);
   PyList_Append(list, item);
   return list; /* { dg-warning "leak of 'item'" } */
+  /* { dg-warning "reference count" "" { target *-*-* } .-1 } */
 }
 
 PyObject *
@@ -67,6 +40,7 @@  test_PyListAppend_2 (long n)
   else
     __analyzer_eval (item->ob_refcnt == 2); /* { dg-warning "TRUE" } */
   return list; /* { dg-warning "leak of 'item'" } */
+  /* { dg-warning "reference count" "" { target *-*-* } .-1 } */
 }
 
 
@@ -75,4 +49,30 @@  test_PyListAppend_3 (PyObject *item, PyObject *list)
 {
   PyList_Append (list, item);
   return list;
+}
+
+PyObject *
+test_PyListAppend_4 (long n)
+{
+  PyObject *item = PyLong_FromLong (n);
+  PyObject *list = NULL;
+  PyList_Append(list, item);
+  return list;
+}
+
+PyObject *
+test_PyListAppend_5 ()
+{
+  PyObject *list = PyList_New (0);
+  PyList_Append(list, NULL);
+  return list;
+}
+
+PyObject *
+test_PyListAppend_6 ()
+{
+  PyObject *item = NULL;
+  PyObject *list = NULL;
+  PyList_Append(list, item);
+  return list;
 }
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-PyList_New.c b/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-PyList_New.c
new file mode 100644
index 00000000000..492d4f7d58d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-PyList_New.c
@@ -0,0 +1,38 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target analyzer } */
+/* { dg-options "-fanalyzer" } */
+/* { dg-require-python-h "" } */
+
+
+#define PY_SSIZE_T_CLEAN
+#include <Python.h>
+#include "../analyzer/analyzer-decls.h"
+
+PyObject *
+test_PyList_New (Py_ssize_t len)
+{
+  PyObject *obj = PyList_New (len);
+  if (obj)
+    {
+     __analyzer_eval (obj->ob_refcnt == 1); /* { dg-warning "TRUE" } */
+     __analyzer_eval (PyList_CheckExact (obj)); /* { dg-warning "TRUE" } */
+    }
+  else
+    __analyzer_dump_path (); /* { dg-message "path" } */
+  return obj;
+}
+
+void
+test_PyList_New_2 ()
+{
+  PyObject *obj = PyList_New (0);
+} /* { dg-warning "leak of 'obj'" } */
+
+PyObject *test_stray_incref_PyList ()
+{
+  PyObject *p = PyList_New (2);
+  if (p)
+    Py_INCREF (p);
+  return p;
+  /* { dg-warning "reference count" "" { target *-*-* } .-1 } */
+}
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-PyLong_FromLong.c b/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-PyLong_FromLong.c
new file mode 100644
index 00000000000..97b29849302
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-PyLong_FromLong.c
@@ -0,0 +1,38 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target analyzer } */
+/* { dg-options "-fanalyzer" } */
+/* { dg-require-python-h "" } */
+
+
+#define PY_SSIZE_T_CLEAN
+#include <Python.h>
+#include "../analyzer/analyzer-decls.h"
+
+PyObject *
+test_PyLong_New (long n)
+{
+  PyObject *obj = PyLong_FromLong (n);
+  if (obj)
+    {
+     __analyzer_eval (obj->ob_refcnt == 1); /* { dg-warning "TRUE" } */
+     __analyzer_eval (PyLong_CheckExact (obj)); /* { dg-warning "TRUE" } */
+    }
+  else
+    __analyzer_dump_path (); /* { dg-message "path" } */
+  return obj;
+}
+
+void
+test_PyLong_New_2 (long n)
+{
+  PyObject *obj = PyLong_FromLong (n);
+} /* { dg-warning "leak of 'obj'" } */
+
+PyObject *test_stray_incref_PyLong (long val)
+{
+  PyObject *p = PyLong_FromLong (val);
+  if (p)
+    Py_INCREF (p);
+  return p;
+  /* { dg-warning "reference count" "" { target *-*-* } .-1 } */
+}
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-1.c b/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-no-plugin.c
similarity index 100%
rename from gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-1.c
rename to gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-no-plugin.c
diff --git a/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-refcnt-checking.c b/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-refcnt-checking.c
new file mode 100644
index 00000000000..9912f9105d4
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/plugin/cpython-plugin-test-refcnt-checking.c
@@ -0,0 +1,78 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target analyzer } */
+/* { dg-options "-fanalyzer" } */
+/* { dg-require-python-h "" } */
+
+
+#define PY_SSIZE_T_CLEAN
+#include <Python.h>
+#include "../analyzer/analyzer-decls.h"
+
+PyObject *
+test_PyListAppend (long n)
+{
+  PyObject *item = PyLong_FromLong (n);
+  PyObject *list = PyList_New (0);
+  PyList_Append(list, item);
+  return list; /* { dg-warning "leak of 'item'" } */
+  /* { dg-warning "reference count" "" { target *-*-* } .-1 } */
+}
+
+PyObject *
+test_PyListAppend_2 (long n)
+{
+  PyObject *item = PyLong_FromLong (n);
+  if (!item)
+	return NULL;
+
+  __analyzer_eval (item->ob_refcnt == 1); /* { dg-warning "TRUE" } */
+  PyObject *list = PyList_New (n);
+  if (!list)
+  {
+	Py_DECREF(item);
+	return NULL;
+  }
+
+  __analyzer_eval (list->ob_refcnt == 1); /* { dg-warning "TRUE" } */
+
+  if (PyList_Append (list, item) < 0)
+    __analyzer_eval (item->ob_refcnt == 1); /* { dg-warning "TRUE" } */
+  else
+    __analyzer_eval (item->ob_refcnt == 2); /* { dg-warning "TRUE" } */
+  return list; /* { dg-warning "leak of 'item'" } */
+  /* { dg-warning "reference count" "" { target *-*-* } .-1 } */
+}
+
+
+PyObject *
+test_PyListAppend_3 (PyObject *item, PyObject *list)
+{
+  PyList_Append (list, item);
+  return list;
+}
+
+PyObject *
+test_PyListAppend_4 (long n)
+{
+  PyObject *item = PyLong_FromLong (n);
+  PyObject *list = NULL;
+  PyList_Append(list, item);
+  return list;
+}
+
+PyObject *
+test_PyListAppend_5 ()
+{
+  PyObject *list = PyList_New (0);
+  PyList_Append(list, NULL);
+  return list;
+}
+
+PyObject *
+test_PyListAppend_6 ()
+{
+  PyObject *item = NULL;
+  PyObject *list = NULL;
+  PyList_Append(list, item);
+  return list;
+}
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.dg/plugin/plugin.exp b/gcc/testsuite/gcc.dg/plugin/plugin.exp
index e1ed2d2589e..cbef6da8d86 100644
--- a/gcc/testsuite/gcc.dg/plugin/plugin.exp
+++ b/gcc/testsuite/gcc.dg/plugin/plugin.exp
@@ -161,8 +161,9 @@  set plugin_test_list [list \
 	  taint-CVE-2011-0521-6.c \
 	  taint-antipatterns-1.c } \
     { analyzer_cpython_plugin.c \
-	  cpython-plugin-test-1.c \
-	  cpython-plugin-test-2.c } \
+	  cpython-plugin-test-PyList_Append.c \
+	  cpython-plugin-test-PyList_New.c \
+	  cpython-plugin-test-PyLong_FromLong.c } \
 ]
 
 foreach plugin_test $plugin_test_list {