diff mbox series

[v2] Provide more contexts for -Warray-bounds warning messages

Message ID 20240712140327.709430-1-qing.zhao@oracle.com
State New
Headers show
Series [v2] Provide more contexts for -Warray-bounds warning messages | expand

Commit Message

Qing Zhao July 12, 2024, 2:03 p.m. UTC
due to code duplication from jump threading [PR109071]
Control this with a new option -fdiagnostic-explain-harder.

Compared to V1, the major difference are: (address David's comments)

1. Change the name of the option from:

-fdiagnostic-try-to-explain-harder 
To
-fdiagnostic-explain-harder 

2. Sync the commit comments with the real output of the compilation message.

3. Add one more event in the end of the path to repeat the out-of-bound
   issue.

4. Fixed the unnecessary changes in Makefile.in.

5. Add new copy_history_diagnostic_path.[cc|h] to implement a new
class copy_history_diagnostic_path : public diagnostic_path

for copy_history_t. 

6. Only building the rich locaiton and populating the path when warning_at
is called.

There are two comments from David that I didn't addressed in this version:

1. Make regenerate-opt-urls.
will do this in a later version. 

2. Add a ⚠️  emoji for the last event. 
I didn't add this yet since I think the current message is clear enough.
might not worth the effort to add this emoji (it's not that straightforward
to add on). 

With this new version, the message emitted by GCC:

$gcc -O2 -Wall -fdiagnostics-explain-harder -c -o t.o t.c
t.c: In function ‘sparx5_set’:
t.c:12:23: warning: array subscript 4 is above array bounds of ‘int[4]’ [-Warray-bounds=]
   12 |   int *val = &sg->vals[index];
      |               ~~~~~~~~^~~~~~~
  ‘sparx5_set’: events 1-2
    4 |   if (*index >= 4)
      |      ^
      |      |
      |      (1) when the condition is evaluated to true
......
   12 |   int *val = &sg->vals[index];
      |               ~~~~~~~~~~~~~~~
      |                       |
      |                       (2) out of array bounds here
t.c:8:18: note: while referencing ‘vals’
    8 | struct nums {int vals[4];};
      |                  ^~~~

Bootstrapped and regression tested on both aarch64 and x86. no issues.

Let me know any further comments and suggestions.

thanks.

Qing

==================================================
$ cat t.c
extern void warn(void);
static inline void assign(int val, int *regs, int *index)
{
  if (*index >= 4)
    warn();
  *regs = val;
}
struct nums {int vals[4];};

void sparx5_set (int *ptr, struct nums *sg, int index)
{
  int *val = &sg->vals[index];

  assign(0,    ptr, &index);
  assign(*val, ptr, &index);
}

$ gcc -Wall -O2  -c -o t.o t.c
t.c: In function ‘sparx5_set’:
t.c:12:23: warning: array subscript 4 is above array bounds of ‘int[4]’ [-Warray-bounds=]
   12 |   int *val = &sg->vals[index];
      |               ~~~~~~~~^~~~~~~
t.c:8:18: note: while referencing ‘vals’
    8 | struct nums {int vals[4];};
      |                  ^~~~

In the above, Although the warning is correct in theory, the warning message
itself is confusing to the end-user since there is information that cannot
be connected to the source code directly.

It will be a nice improvement to add more information in the warning message
to report where such index value come from.

In order to achieve this, we add a new data structure copy_history to record
the condition and the transformation that triggered the code duplication.
Whenever there is a code duplication due to some specific transformations,
such as jump threading, loop switching, etc, a copy_history structure is
created and attached to the duplicated gimple statement.

During array out-of-bound checking or other warning checking, the copy_history
that was attached to the gimple statement is used to form a sequence of
diagnostic events that are added to the corresponding rich location to be used
to report the warning message.

This behavior is controled by the new option -fdiagnostic-explain-harder
which is off by default.

With this change, by adding -fdiagnostic-explain-harder,
the warning message for the above testing case is now:

$ gcc -Wall -O2 -fdiagnostics-explain-harder -c -o t.o t.c
t.c: In function ‘sparx5_set’:
t.c:12:23: warning: array subscript 4 is above array bounds of ‘int[4]’ [-Warray-bounds=]
   12 |   int *val = &sg->vals[index];
      |               ~~~~~~~~^~~~~~~
  ‘sparx5_set’: events 1-2
    4 |   if (*index >= 4)
      |      ^
      |      |
      |      (1) when the condition is evaluated to true
......
   12 |   int *val = &sg->vals[index];
      |               ~~~~~~~~~~~~~~~
      |                       |
      |                       (2) out of array bounds here
t.c:8:18: note: while referencing ‘vals’
    8 | struct nums {int vals[4];};
      |                  ^~~~

gcc/ChangeLog:

	PR tree-optimization/109071

	* Makefile.in (OBJS): Add diagnostic-copy-history.o
	and copy-history-diagnostic-path.o.
	* gcc/common.opt (fdiagnostics-explain-harder): New option.
	* gcc/doc/invoke.texi (fdiagnostics-explain-harder): Add
	documentation for the new option.
	* gimple-array-bounds.cc (build_rich_location_with_diagnostic_path):
	New function.
	(check_out_of_bounds_and_warn): Add one new parameter. Use rich
	location with copy_history_diagnostic_path for warning_at.
	(array_bounds_checker::check_array_ref): Use rich location with
	copy_history_diagnostic_path for warning_at.
	(array_bounds_checker::check_mem_ref): Add one new parameter.
	Use rich location with copy_history_diagnostic_path for warning_at.
	(array_bounds_checker::check_addr_expr): Use rich location with
	copy_history_diagnostic_path for warning_at.
	(array_bounds_checker::check_array_bounds): Call check_mem_ref with
	one more parameter.
	* gimple-array-bounds.h: Update prototype for check_mem_ref.
	* gimple-iterator.cc (gsi_remove): (gsi_remove): Remove the copy
	history when removing the gimple.
	* toplev.cc (toplev::finalize):  Call copy_history_finalize.
	* tree-ssa-threadupdate.cc (set_copy_history_to_stmts_in_bb): New
	function.
	(back_jt_path_registry::duplicate_thread_path): Call the new function
	to set copy history for every duplicated gimple.
	* copy-history-diagnostic-path.cc: New file.
	* copy-history-diagnostic-path.h: New file.
	* diagnostic-copy-history.cc: New file.
	* diagnostic-copy-history.h: New file.

gcc/testsuite/ChangeLog:

	PR tree-optimization/109071

	* gcc.dg/pr109071.c: New test.
---
 gcc/Makefile.in                     |   2 +
 gcc/common.opt                      |   4 +
 gcc/copy-history-diagnostic-path.cc |  86 +++++++++++++++
 gcc/copy-history-diagnostic-path.h  |  87 +++++++++++++++
 gcc/diagnostic-copy-history.cc      | 163 ++++++++++++++++++++++++++++
 gcc/diagnostic-copy-history.h       |  80 ++++++++++++++
 gcc/doc/invoke.texi                 |   7 ++
 gcc/gimple-array-bounds.cc          |  94 +++++++++++++---
 gcc/gimple-array-bounds.h           |   2 +-
 gcc/gimple-iterator.cc              |   3 +
 gcc/testsuite/gcc.dg/pr109071.c     |  43 ++++++++
 gcc/toplev.cc                       |   3 +
 gcc/tree-ssa-threadupdate.cc        |  42 +++++++
 13 files changed, 598 insertions(+), 18 deletions(-)
 create mode 100644 gcc/copy-history-diagnostic-path.cc
 create mode 100644 gcc/copy-history-diagnostic-path.h
 create mode 100644 gcc/diagnostic-copy-history.cc
 create mode 100644 gcc/diagnostic-copy-history.h
 create mode 100644 gcc/testsuite/gcc.dg/pr109071.c

Comments

Qing Zhao July 22, 2024, 1:01 p.m. UTC | #1
Hi, Richard,

Could you please take a look at the patch and let me know any comment you have (especially on the middle-end part)?

David, let me know if you have further comment and suggestions. 

Thanks a lot.

Qing

> On Jul 12, 2024, at 10:03, Qing Zhao <qing.zhao@oracle.com> wrote:
> 
> due to code duplication from jump threading [PR109071]
> Control this with a new option -fdiagnostic-explain-harder.
> 
> Compared to V1, the major difference are: (address David's comments)
> 
> 1. Change the name of the option from:
> 
> -fdiagnostic-try-to-explain-harder 
> To
> -fdiagnostic-explain-harder 
> 
> 2. Sync the commit comments with the real output of the compilation message.
> 
> 3. Add one more event in the end of the path to repeat the out-of-bound
>   issue.
> 
> 4. Fixed the unnecessary changes in Makefile.in.
> 
> 5. Add new copy_history_diagnostic_path.[cc|h] to implement a new
> class copy_history_diagnostic_path : public diagnostic_path
> 
> for copy_history_t. 
> 
> 6. Only building the rich locaiton and populating the path when warning_at
> is called.
> 
> There are two comments from David that I didn't addressed in this version:
> 
> 1. Make regenerate-opt-urls.
> will do this in a later version. 
> 
> 2. Add a ⚠️  emoji for the last event. 
> I didn't add this yet since I think the current message is clear enough.
> might not worth the effort to add this emoji (it's not that straightforward
> to add on). 
> 
> With this new version, the message emitted by GCC:
> 
> $gcc -O2 -Wall -fdiagnostics-explain-harder -c -o t.o t.c
> t.c: In function ‘sparx5_set’:
> t.c:12:23: warning: array subscript 4 is above array bounds of ‘int[4]’ [-Warray-bounds=]
>   12 |   int *val = &sg->vals[index];
>      |               ~~~~~~~~^~~~~~~
>  ‘sparx5_set’: events 1-2
>    4 |   if (*index >= 4)
>      |      ^
>      |      |
>      |      (1) when the condition is evaluated to true
> ......
>   12 |   int *val = &sg->vals[index];
>      |               ~~~~~~~~~~~~~~~
>      |                       |
>      |                       (2) out of array bounds here
> t.c:8:18: note: while referencing ‘vals’
>    8 | struct nums {int vals[4];};
>      |                  ^~~~
> 
> Bootstrapped and regression tested on both aarch64 and x86. no issues.
> 
> Let me know any further comments and suggestions.
> 
> thanks.
> 
> Qing
> 
> ==================================================
> $ cat t.c
> extern void warn(void);
> static inline void assign(int val, int *regs, int *index)
> {
>  if (*index >= 4)
>    warn();
>  *regs = val;
> }
> struct nums {int vals[4];};
> 
> void sparx5_set (int *ptr, struct nums *sg, int index)
> {
>  int *val = &sg->vals[index];
> 
>  assign(0,    ptr, &index);
>  assign(*val, ptr, &index);
> }
> 
> $ gcc -Wall -O2  -c -o t.o t.c
> t.c: In function ‘sparx5_set’:
> t.c:12:23: warning: array subscript 4 is above array bounds of ‘int[4]’ [-Warray-bounds=]
>   12 |   int *val = &sg->vals[index];
>      |               ~~~~~~~~^~~~~~~
> t.c:8:18: note: while referencing ‘vals’
>    8 | struct nums {int vals[4];};
>      |                  ^~~~
> 
> In the above, Although the warning is correct in theory, the warning message
> itself is confusing to the end-user since there is information that cannot
> be connected to the source code directly.
> 
> It will be a nice improvement to add more information in the warning message
> to report where such index value come from.
> 
> In order to achieve this, we add a new data structure copy_history to record
> the condition and the transformation that triggered the code duplication.
> Whenever there is a code duplication due to some specific transformations,
> such as jump threading, loop switching, etc, a copy_history structure is
> created and attached to the duplicated gimple statement.
> 
> During array out-of-bound checking or other warning checking, the copy_history
> that was attached to the gimple statement is used to form a sequence of
> diagnostic events that are added to the corresponding rich location to be used
> to report the warning message.
> 
> This behavior is controled by the new option -fdiagnostic-explain-harder
> which is off by default.
> 
> With this change, by adding -fdiagnostic-explain-harder,
> the warning message for the above testing case is now:
> 
> $ gcc -Wall -O2 -fdiagnostics-explain-harder -c -o t.o t.c
> t.c: In function ‘sparx5_set’:
> t.c:12:23: warning: array subscript 4 is above array bounds of ‘int[4]’ [-Warray-bounds=]
>   12 |   int *val = &sg->vals[index];
>      |               ~~~~~~~~^~~~~~~
>  ‘sparx5_set’: events 1-2
>    4 |   if (*index >= 4)
>      |      ^
>      |      |
>      |      (1) when the condition is evaluated to true
> ......
>   12 |   int *val = &sg->vals[index];
>      |               ~~~~~~~~~~~~~~~
>      |                       |
>      |                       (2) out of array bounds here
> t.c:8:18: note: while referencing ‘vals’
>    8 | struct nums {int vals[4];};
>      |                  ^~~~
> 
> gcc/ChangeLog:
> 
> PR tree-optimization/109071
> 
> * Makefile.in (OBJS): Add diagnostic-copy-history.o
> and copy-history-diagnostic-path.o.
> * gcc/common.opt (fdiagnostics-explain-harder): New option.
> * gcc/doc/invoke.texi (fdiagnostics-explain-harder): Add
> documentation for the new option.
> * gimple-array-bounds.cc (build_rich_location_with_diagnostic_path):
> New function.
> (check_out_of_bounds_and_warn): Add one new parameter. Use rich
> location with copy_history_diagnostic_path for warning_at.
> (array_bounds_checker::check_array_ref): Use rich location with
> copy_history_diagnostic_path for warning_at.
> (array_bounds_checker::check_mem_ref): Add one new parameter.
> Use rich location with copy_history_diagnostic_path for warning_at.
> (array_bounds_checker::check_addr_expr): Use rich location with
> copy_history_diagnostic_path for warning_at.
> (array_bounds_checker::check_array_bounds): Call check_mem_ref with
> one more parameter.
> * gimple-array-bounds.h: Update prototype for check_mem_ref.
> * gimple-iterator.cc (gsi_remove): (gsi_remove): Remove the copy
> history when removing the gimple.
> * toplev.cc (toplev::finalize):  Call copy_history_finalize.
> * tree-ssa-threadupdate.cc (set_copy_history_to_stmts_in_bb): New
> function.
> (back_jt_path_registry::duplicate_thread_path): Call the new function
> to set copy history for every duplicated gimple.
> * copy-history-diagnostic-path.cc: New file.
> * copy-history-diagnostic-path.h: New file.
> * diagnostic-copy-history.cc: New file.
> * diagnostic-copy-history.h: New file.
> 
> gcc/testsuite/ChangeLog:
> 
> PR tree-optimization/109071
> 
> * gcc.dg/pr109071.c: New test.
> ---
> gcc/Makefile.in                     |   2 +
> gcc/common.opt                      |   4 +
> gcc/copy-history-diagnostic-path.cc |  86 +++++++++++++++
> gcc/copy-history-diagnostic-path.h  |  87 +++++++++++++++
> gcc/diagnostic-copy-history.cc      | 163 ++++++++++++++++++++++++++++
> gcc/diagnostic-copy-history.h       |  80 ++++++++++++++
> gcc/doc/invoke.texi                 |   7 ++
> gcc/gimple-array-bounds.cc          |  94 +++++++++++++---
> gcc/gimple-array-bounds.h           |   2 +-
> gcc/gimple-iterator.cc              |   3 +
> gcc/testsuite/gcc.dg/pr109071.c     |  43 ++++++++
> gcc/toplev.cc                       |   3 +
> gcc/tree-ssa-threadupdate.cc        |  42 +++++++
> 13 files changed, 598 insertions(+), 18 deletions(-)
> create mode 100644 gcc/copy-history-diagnostic-path.cc
> create mode 100644 gcc/copy-history-diagnostic-path.h
> create mode 100644 gcc/diagnostic-copy-history.cc
> create mode 100644 gcc/diagnostic-copy-history.h
> create mode 100644 gcc/testsuite/gcc.dg/pr109071.c
> 
> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
> index 7d3ea27a6ab..f8bbfda845f 100644
> --- a/gcc/Makefile.in
> +++ b/gcc/Makefile.in
> @@ -1436,6 +1436,8 @@ OBJS = \
> df-problems.o \
> df-scan.o \
> dfp.o \
> + diagnostic-copy-history.o \
> + copy-history-diagnostic-path.o \
> digraph.o \
> dojump.o \
> dominance.o \
> diff --git a/gcc/common.opt b/gcc/common.opt
> index a300470bbc5..714280390d9 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -1558,6 +1558,10 @@ fdiagnostics-minimum-margin-width=
> Common Joined UInteger Var(diagnostics_minimum_margin_width) Init(6)
> Set minimum width of left margin of source code when showing source.
> 
> +fdiagnostics-explain-harder
> +Common Var(flag_diagnostics_explain_harder)
> +Collect and print more context information for diagnostics.
> +
> fdisable-
> Common Joined RejectNegative Var(common_deferred_options) Defer
> -fdisable-[tree|rtl|ipa]-<pass>=range1+range2 Disable an optimization pass.
> diff --git a/gcc/copy-history-diagnostic-path.cc b/gcc/copy-history-diagnostic-path.cc
> new file mode 100644
> index 00000000000..65e5c056165
> --- /dev/null
> +++ b/gcc/copy-history-diagnostic-path.cc
> @@ -0,0 +1,86 @@
> +/* Classes for implementing diagnostic paths for copy_history_t.
> +   Copyright (C) 2024-2024 Free Software Foundation, Inc.
> +   Contributed by Qing Zhao<qing.zhao@oracle.com>
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify it under
> +the terms of the GNU General Public License as published by the Free
> +Software Foundation; either version 3, or (at your option) any later
> +version.
> +
> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> +for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3.  If not see
> +<http://www.gnu.org/licenses/>.  */
> +
> +#include "config.h"
> +#include "system.h"
> +#include "coretypes.h"
> +#include "function.h"
> +#include "copy-history-diagnostic-path.h"
> +
> +bool
> +copy_history_diagnostic_path::same_function_p (int event_idx_a,
> +       int event_idx_b) const
> +{
> +  return (m_events[event_idx_a]->get_fndecl ()
> +  == m_events[event_idx_b]->get_fndecl ());
> +}
> +
> +/* Add an event to this path at LOC within function FNDECL at
> +   stack depth DEPTH.
> +
> +   Use m_context's printer to format FMT, as the text of the new
> +   event.  */
> +
> +void
> +copy_history_diagnostic_path::add_event (location_t loc, tree fndecl, int depth,
> + const char *fmt, ...)
> +{
> +  pretty_printer *pp = m_event_pp;
> +  pp_clear_output_area (pp);
> +
> +  rich_location rich_loc (line_table, UNKNOWN_LOCATION);
> +
> +  va_list ap;
> +
> +  va_start (ap, fmt);
> +
> +  text_info ti (fmt, &ap, 0, nullptr, &rich_loc);
> +  pp_format (pp, &ti);
> +  pp_output_formatted_text (pp);
> +
> +  va_end (ap);
> +
> +  simple_diagnostic_event *new_event
> +    = new simple_diagnostic_event (loc, fndecl, depth, pp_formatted_text (pp));
> +  m_events.safe_push (new_event);
> +
> +  pp_clear_output_area (pp);
> +
> +  return;
> +}
> +
> +/* Populate the diagnostic_path from the copy_history.  */
> +void
> +copy_history_diagnostic_path::populate_path_from_copy_history ()
> +{
> +  if (!m_copy_history)
> +    return;
> +
> +  for (copy_history_t cur_ch = m_copy_history; cur_ch;
> +       cur_ch = cur_ch->prev_copy)
> +    add_event (cur_ch->condition, cfun->decl, 1,
> +       "when the condition is evaluated to %s",
> +       cur_ch->is_true_path ? "true" : "false");
> +
> +  /* Add an end of path warning event in the end of the path.  */
> +  add_event (m_location, cfun->decl, 1,
> +     "out of array bounds here");
> +  return;
> +}
> diff --git a/gcc/copy-history-diagnostic-path.h b/gcc/copy-history-diagnostic-path.h
> new file mode 100644
> index 00000000000..377e9136070
> --- /dev/null
> +++ b/gcc/copy-history-diagnostic-path.h
> @@ -0,0 +1,87 @@
> +/* Classes for implementing diagnostic paths for copy_history_t.
> +   Copyright (C) 2024-2024 Free Software Foundation, Inc.
> +   Contributed by Qing Zhao<qing.zhao@oracle.com>
> +
> +This file is part of GCC.
> +
> +GCC is free software; you can redistribute it and/or modify it under
> +the terms of the GNU General Public License as published by the Free
> +Software Foundation; either version 3, or (at your option) any later
> +version.
> +
> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> +for more details.
> +
> +You should have received a copy of the GNU General Public License
> +along with GCC; see the file COPYING3.  If not see
> +<http://www.gnu.org/licenses/>.  */
> +
> +#ifndef GCC_COPY_HISTORY_DIAGNOSTIC_PATH_H
> +#define GCC_COPY_HISTORY_DIAGNOSTIC_PATH_H
> +
> +#include "diagnostic-path.h"
> +#include "simple-diagnostic-path.h"
> +#include "diagnostic-copy-history.h"
> +
> +/* An implementation of diagnostic_path for copy_history_t, as a
> +   vector of simple_diagnostic_event instances.  */
> +
> +class copy_history_diagnostic_path : public diagnostic_path
> +{
> + public:
> +  copy_history_diagnostic_path (pretty_printer *event_pp,
> + copy_history_t cp_history,
> + location_t loc)
> +  : diagnostic_path (),
> +    m_thread ("main"),
> +    m_event_pp (event_pp),
> +    m_copy_history (cp_history),
> +    m_location (loc)
> +  {}
> +
> +  unsigned num_events () const final override
> +  {
> +    return m_events.length ();
> +  }
> +  const diagnostic_event & get_event (int idx) const final override
> +  {
> +    return *m_events[idx];
> +  }
> +  unsigned num_threads () const final override
> +  {
> +    return 1;
> +  }
> +  const diagnostic_thread &
> +  get_thread (diagnostic_thread_id_t) const final override
> +  {
> +    return m_thread;
> +  }
> +  bool
> +  same_function_p (int event_idx_a,
> +   int event_idx_b) const final override;
> +
> +  void add_event (location_t loc, tree fndecl, int depth,
> +  const char *fmt, ...);
> +
> +  void populate_path_from_copy_history ();
> +
> + private:
> +  simple_diagnostic_thread m_thread;
> +
> +  /* The events that have occurred along this path.  */
> +  auto_delete_vec<simple_diagnostic_event> m_events;
> +
> +  pretty_printer *m_event_pp;
> +
> +  /* The copy_history associated with this path.  */
> +  copy_history_t m_copy_history;
> +
> +  /* The location for the gimple statement where the
> +     diagnostic message emitted.  */
> +  location_t m_location;
> +
> +};
> +
> +#endif /* ! GCC_COPY_HISTORY_DIAGNOSTIC_PATH_H */
> diff --git a/gcc/diagnostic-copy-history.cc b/gcc/diagnostic-copy-history.cc
> new file mode 100644
> index 00000000000..2733d137622
> --- /dev/null
> +++ b/gcc/diagnostic-copy-history.cc
> @@ -0,0 +1,163 @@
> +/* Functions to handle copy history.
> +   Copyright (C) 2024-2024 Free Software Foundation, Inc.
> +   Contributed by Qing Zhao <qing.zhao@oracle.com>
> +
> +   This file is part of GCC.
> +
> +   GCC is free software; you can redistribute it and/or modify it under
> +   the terms of the GNU General Public License as published by the Free
> +   Software Foundation; either version 3, or (at your option) any later
> +   version.
> +
> +   GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +   WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +   FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> +   for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with GCC; see the file COPYING3.  If not see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#include "config.h"
> +#include "system.h"
> +#include "coretypes.h"
> +#include "backend.h"
> +#include "tree.h"
> +#include "diagnostic-copy-history.h"
> +
> +/* A mapping from a gimple to a pointer to the copy history of it.  */
> +static copy_history_map_t *copy_history_map;
> +
> +/* Obstack for copy history.  */
> +static struct obstack copy_history_obstack;
> +
> +/* Create a new copy history.  */
> +
> +copy_history_t
> +create_copy_history (location_t condition,
> +     bool is_true_path,
> +     enum copy_reason reason,
> +     copy_history_t prev_copy)
> +{
> +  static bool initialized = false;
> +
> +  if (!initialized)
> +    {
> +      gcc_obstack_init (&copy_history_obstack);
> +      initialized = true;
> +    }
> +
> +  copy_history_t copy_history
> +    = (copy_history_t) obstack_alloc (&copy_history_obstack,
> +      sizeof (struct copy_history));
> +  copy_history->condition = condition;
> +  copy_history->is_true_path = is_true_path;
> +  copy_history->reason = reason;
> +  copy_history->prev_copy = prev_copy;
> +  return copy_history;
> +}
> +
> +/* Insert the copy history for the gimple STMT assuming the linked list
> +   of CP_HISTORY does not have duplications.  It's the caller's
> +   responsibility to make sure that the linked list of CP_HISTORY does
> +   not have duplications.  */
> +
> +void
> +insert_copy_history (gimple *stmt, copy_history_t cp_history)
> +{
> +  if (!copy_history_map)
> +    copy_history_map = new copy_history_map_t;
> +
> +  copy_history_map->put (stmt, cp_history);
> +  return;
> +}
> +
> +/* Get the copy history for the gimple STMT, return NULL when there is
> +   no associated copy history.  */
> +
> +copy_history_t
> +get_copy_history (gimple *stmt)
> +{
> +  if (!copy_history_map)
> +    return NULL;
> +
> +  if (const copy_history_t *cp_history_p = copy_history_map->get (stmt))
> +    return *cp_history_p;
> +
> +  return NULL;
> +}
> +
> +/* Remove the copy history for STMT.  */
> +
> +void
> +remove_copy_history (gimple *stmt)
> +{
> +  if (!copy_history_map)
> +    return;
> +  copy_history_map->remove (stmt);
> +  return;
> +}
> +
> +/* Check whether the cond_location, is_true_path and reason existed
> + * in the OLD_COPY_HISTORY.  */
> +
> +static bool
> +is_copy_history_existed (location_t cond_location, bool is_true_path,
> + enum copy_reason reason,
> + copy_history_t old_copy_history)
> +{
> +  for (copy_history_t cur_ch = old_copy_history; cur_ch;
> +       cur_ch = cur_ch->prev_copy)
> +    if ((cur_ch->condition == cond_location)
> + && (cur_ch->is_true_path == is_true_path)
> + && (cur_ch->reason == reason))
> +      return true;
> +
> +  return false;
> +}
> +
> +/* Set copy history for the gimple STMT.  Return TRUE when a new copy
> + * history is created and inserted.  Otherwise return FALSE.  */
> +
> +bool
> +set_copy_history (gimple *stmt, location_t cond_location,
> +  bool is_true_path, enum copy_reason reason)
> +{
> +
> +  /* First, get the old copy history associated with this STMT.  */
> +  copy_history_t old_cp_history = get_copy_history (stmt);
> +
> +  /* If the current copy history is not in the STMT's copy history linked
> +     list yet, create the new copy history, put the old_copy_history as the
> +     prev_copy of it.  */
> +  copy_history_t new_cp_history = NULL;
> +  if (!is_copy_history_existed (cond_location, is_true_path,
> + reason, old_cp_history))
> +    new_cp_history
> +      = create_copy_history (cond_location, is_true_path,
> +     reason, old_cp_history);
> +
> +  /* Insert the copy history into the hash map.  */
> +  if (new_cp_history)
> +    {
> +      insert_copy_history (stmt, new_cp_history);
> +      return true;
> +    }
> +
> +  return false;
> +}
> +
> +/* Reset all state for diagnostic-copy-history.cc so that we can rerun the
> +   compiler within the same process.  For use by toplev::finalize.  */
> +
> +void
> +copy_history_finalize (void)
> +{
> +  if (copy_history_map)
> +    {
> +      delete copy_history_map;
> +      copy_history_map = NULL;
> +    }
> +  obstack_free (&copy_history_obstack, NULL);
> +  return;
> +}
> diff --git a/gcc/diagnostic-copy-history.h b/gcc/diagnostic-copy-history.h
> new file mode 100644
> index 00000000000..0c2c55a2a81
> --- /dev/null
> +++ b/gcc/diagnostic-copy-history.h
> @@ -0,0 +1,80 @@
> +/* Copy history associated with a gimple statement to record its history
> +   of duplications due to different transformations.
> +   The copy history will be used to construct events for later diagnostic.
> +
> +   Copyright (C) 2024-2024 Free Software Foundation, Inc.
> +   Contributed by Qing Zhao <qing.zhao@oracle.com>
> +
> +   This file is part of GCC.
> +
> +   GCC is free software; you can redistribute it and/or modify it under
> +   the terms of the GNU General Public License as published by the Free
> +   Software Foundation; either version 3, or (at your option) any later
> +   version.
> +
> +   GCC is distributed in the hope that it will be useful, but WITHOUT ANY
> +   WARRANTY; without even the implied warranty of MERCHANTABILITY or
> +   FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
> +   for more details.
> +
> +   You should have received a copy of the GNU General Public License
> +   along with GCC; see the file COPYING3.  If not see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +#ifndef DIAGNOSTIC_COPY_HISTORY_H_INCLUDED
> +#define DIAGNOSTIC_COPY_HISTORY_H_INCLUDED
> +
> +#include "hash-map.h"
> +#include "line-map.h"
> +
> +/* An enum for the reason why this copy is made.  Right now, there are
> +   three reasons, we can add more if needed.  */
> +enum copy_reason {
> +  COPY_BY_THREAD_JUMP,
> +  COPY_BY_LOOP_UNSWITCH,
> +  COPY_BY_LOOP_UNROLL
> +};
> +
> +/* This data structure records the information when a statement is
> +   duplicated during different transformations.  Such information
> +   will be used by the later diagnostic messages to report more
> +   contexts of the warnings or errors.  */
> +struct copy_history {
> +  /* The location of the condition statement that triggered the code
> +     duplication.  */
> +  location_t condition;
> +
> +  /* Whether this copy is on the TRUE path of the condition.  */
> +  bool is_true_path;
> +
> +  /* The reason for the code duplication.  */
> +  enum copy_reason reason;
> +
> +  /* This statement itself might be a previous code duplication.  */
> +  struct copy_history *prev_copy;
> +};
> +
> +typedef struct copy_history *copy_history_t;
> +
> +/* Create a new copy history.  */
> +extern copy_history_t create_copy_history (location_t, bool,
> +   enum copy_reason, copy_history_t);
> +
> +typedef hash_map<gimple *, copy_history_t> copy_history_map_t;
> +
> +/* Get the copy history for the gimple STMT, return NULL when there is
> +   no associated copy history.  */
> +extern copy_history_t get_copy_history (gimple *);
> +
> +/* Remove the copy history for STMT from the copy_history_map.  */
> +extern void remove_copy_history (gimple *);
> +
> +/* Set copy history for the gimple STMT.  */
> +extern bool set_copy_history (gimple *, location_t,
> +      bool, enum copy_reason);
> +
> +/* Reset all state for diagnostic-copy-history.cc so that we can rerun the
> +   compiler within the same process.  For use by toplev::finalize.  */
> +extern void copy_history_finalize (void);
> +
> +#endif // DIAGNOSTIC_COPY_HISTORY_H_INCLUDED
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index b37c7af7a39..bb0c6a87523 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -321,6 +321,7 @@ Objective-C and Objective-C++ Dialects}.
> -fdiagnostics-column-origin=@var{origin}
> -fdiagnostics-escape-format=@r{[}unicode@r{|}bytes@r{]}
> -fdiagnostics-text-art-charset=@r{[}none@r{|}ascii@r{|}unicode@r{|}emoji@r{]}}
> +-fdiagnostics-explain-harder
> 
> @item Warning Options
> @xref{Warning Options,,Options to Request or Suppress Warnings}.
> @@ -5518,6 +5519,12 @@ left margin.
> This option controls the minimum width of the left margin printed by
> @option{-fdiagnostics-show-line-numbers}.  It defaults to 6.
> 
> +@opindex fdiagnostics-explain-harder
> +@item -fdiagnostics-explain-harder
> +With this option, the compiler collects more context information for
> +diagnostics and emits them to the users to provide more hints on how
> +the diagnostics come from.
> +
> @opindex fdiagnostics-parseable-fixits
> @item -fdiagnostics-parseable-fixits
> Emit fix-it hints in a machine-parseable format, suitable for consumption
> diff --git a/gcc/gimple-array-bounds.cc b/gcc/gimple-array-bounds.cc
> index 1637a2fc4f4..ba98ab3b413 100644
> --- a/gcc/gimple-array-bounds.cc
> +++ b/gcc/gimple-array-bounds.cc
> @@ -31,6 +31,9 @@ along with GCC; see the file COPYING3.  If not see
> #include "tree-dfa.h"
> #include "fold-const.h"
> #include "diagnostic-core.h"
> +#include "simple-diagnostic-path.h"
> +#include "diagnostic-copy-history.h"
> +#include "copy-history-diagnostic-path.h"
> #include "intl.h"
> #include "tree-vrp.h"
> #include "alloc-pool.h"
> @@ -254,6 +257,25 @@ get_up_bounds_for_array_ref (tree ref, tree *decl,
>   return;
> }
> 
> +/* Build a rich location for LOCATION, and populate the diagonistic path
> +   for it.  */
> +
> +static rich_location *
> +build_rich_location_with_diagnostic_path (location_t location, gimple *stmt)
> +{
> +  /* Generate a rich location for this location.  */
> +  rich_location *richloc = new rich_location (line_table, location);
> +
> +  copy_history_t cp_history = stmt ? get_copy_history (stmt) : NULL;
> +  copy_history_diagnostic_path *path
> +    = new copy_history_diagnostic_path (global_dc->printer,
> + cp_history, location);
> +  path->populate_path_from_copy_history ();
> +
> +  richloc->set_path (path);
> +  return richloc;
> +}
> +
> /* Given the LOW_SUB_ORG, LOW_SUB and UP_SUB, and the computed UP_BOUND
>    and UP_BOUND_P1, check whether the array reference REF is out of bound.
>    When out of bounds, set OUT_OF_BOUND to true.
> @@ -262,6 +284,7 @@ get_up_bounds_for_array_ref (tree ref, tree *decl,
> 
> static bool
> check_out_of_bounds_and_warn (location_t location, tree ref,
> +      gimple *stmt,
>      tree low_sub_org, tree low_sub, tree up_sub,
>      tree up_bound, tree up_bound_p1,
>      const irange *vr,
> @@ -280,9 +303,13 @@ check_out_of_bounds_and_warn (location_t location, tree ref,
>     {
>       *out_of_bound = true;
>       if (for_array_bound)
> - warned = warning_at (location, OPT_Warray_bounds_,
> + {
> +  rich_location *richloc
> +    = build_rich_location_with_diagnostic_path (location, stmt);
> +  warned = warning_at (richloc, OPT_Warray_bounds_,
>     "array subscript %E is outside array"
>     " bounds of %qT", low_sub_org, artype);
> + }
>     }
> 
>   if (warned)
> @@ -299,10 +326,14 @@ check_out_of_bounds_and_warn (location_t location, tree ref,
> {
>  *out_of_bound = true;
>  if (for_array_bound)
> -    warned = warning_at (location, OPT_Warray_bounds_,
> +    {
> +      rich_location *richloc
> + = build_rich_location_with_diagnostic_path (location, stmt);
> +      warned = warning_at (richloc, OPT_Warray_bounds_,
> "array subscript [%E, %E] is outside "
> "array bounds of %qT",
> low_sub, up_sub, artype);
> +    }
> }
>     }
>   else if (up_bound
> @@ -313,18 +344,26 @@ check_out_of_bounds_and_warn (location_t location, tree ref,
>     {
>       *out_of_bound = true;
>       if (for_array_bound)
> - warned = warning_at (location, OPT_Warray_bounds_,
> + {
> +  rich_location *richloc
> +    = build_rich_location_with_diagnostic_path (location, stmt);
> +  warned = warning_at (richloc, OPT_Warray_bounds_,
>     "array subscript %E is above array bounds of %qT",
>     up_sub, artype);
> + }
>     }
>   else if (TREE_CODE (low_sub) == INTEGER_CST
>   && tree_int_cst_lt (low_sub, low_bound))
>     {
>       *out_of_bound = true;
>       if (for_array_bound)
> - warned = warning_at (location, OPT_Warray_bounds_,
> + {
> +  rich_location *richloc
> +    = build_rich_location_with_diagnostic_path (location, stmt);
> +  warned = warning_at (richloc, OPT_Warray_bounds_,
>     "array subscript %E is below array bounds of %qT",
>     low_sub, artype);
> + }
>     }
>   return warned;
> }
> @@ -388,21 +427,24 @@ array_bounds_checker::check_array_ref (location_t location, tree ref,
> }
>     }
> 
> -  warned = check_out_of_bounds_and_warn (location, ref,
> +  warned = check_out_of_bounds_and_warn (location, ref, stmt,
> low_sub_org, low_sub, up_sub,
> up_bound, up_bound_p1, &vr,
> ignore_off_by_one, warn_array_bounds,
> &out_of_bound);
> 
> -
>   if (!warned && sam == special_array_member::int_0)
> -    warned = warning_at (location, OPT_Wzero_length_bounds,
> +    {
> +      rich_location *richloc
> + = build_rich_location_with_diagnostic_path (location, stmt);
> +      warned = warning_at (richloc, OPT_Wzero_length_bounds,
> (TREE_CODE (low_sub) == INTEGER_CST
>  ? G_("array subscript %E is outside the bounds "
>       "of an interior zero-length array %qT")
>  : G_("array subscript %qE is outside the bounds "
>       "of an interior zero-length array %qT")),
> low_sub, artype);
> +    }
> 
>   if (warned && dump_file && (dump_flags & TDF_DETAILS))
>     {
> @@ -419,8 +461,10 @@ array_bounds_checker::check_array_ref (location_t location, tree ref,
>  || sam == special_array_member::trail_n)
>       && DECL_NOT_FLEXARRAY (afield_decl))
>     {
> +      rich_location *richloc
> + = build_rich_location_with_diagnostic_path (location, stmt);
>       bool warned1
> - = warning_at (location, OPT_Wstrict_flex_arrays,
> + = warning_at (richloc, OPT_Wstrict_flex_arrays,
>      "trailing array %qT should not be used as "
>      "a flexible array member",
>      artype);
> @@ -478,6 +522,7 @@ array_bounds_checker::check_array_ref (location_t location, tree ref,
> 
> bool
> array_bounds_checker::check_mem_ref (location_t location, tree ref,
> +     gimple *stmt,
>     bool ignore_off_by_one)
> {
>   if (warning_suppressed_p (ref, OPT_Warray_bounds_))
> @@ -580,16 +625,24 @@ array_bounds_checker::check_mem_ref (location_t location, tree ref,
>   if (lboob)
>     {
>       if (offrange[0] == offrange[1])
> - warned = warning_at (location, OPT_Warray_bounds_,
> + {
> +  rich_location *richloc
> +    = build_rich_location_with_diagnostic_path (location, stmt);
> +  warned = warning_at (richloc, OPT_Warray_bounds_,
>     "array subscript %wi is outside array bounds "
>     "of %qT",
>     offrange[0].to_shwi (), reftype);
> + }
>       else
> - warned = warning_at (location, OPT_Warray_bounds_,
> + {
> +  rich_location *richloc
> +    = build_rich_location_with_diagnostic_path (location, stmt);
> +  warned = warning_at (richloc, OPT_Warray_bounds_,
>     "array subscript [%wi, %wi] is outside "
>     "array bounds of %qT",
>     offrange[0].to_shwi (),
>     offrange[1].to_shwi (), reftype);
> + }
>     }
>   else if (uboob && !ignore_off_by_one)
>     {
> @@ -599,8 +652,9 @@ array_bounds_checker::check_mem_ref (location_t location, tree ref,
>   it were an untyped array of bytes.  */
> backtype = build_array_type_nelts (unsigned_char_type_node,
>   aref.sizrng[1].to_uhwi ());
> -
> -      warned = warning_at (location, OPT_Warray_bounds_,
> +      rich_location *richloc
> + = build_rich_location_with_diagnostic_path (location, stmt);
> +      warned = warning_at (richloc, OPT_Warray_bounds_,
>   "array subscript %<%T[%wi]%> is partly "
>   "outside array bounds of %qT",
>   axstype, offrange[0].to_shwi (), backtype);
> @@ -623,7 +677,9 @@ array_bounds_checker::check_mem_ref (location_t location, tree ref,
>     {
>       HOST_WIDE_INT tmpidx = (aref.offmax[i] / eltsize).to_shwi ();
> 
> -      if (warning_at (location, OPT_Warray_bounds_,
> +      rich_location *richloc
> + = build_rich_location_with_diagnostic_path (location, stmt);
> +      if (warning_at (richloc, OPT_Warray_bounds_,
>      "intermediate array offset %wi is outside array bounds "
>      "of %qT", tmpidx, reftype))
> {
> @@ -656,7 +712,7 @@ array_bounds_checker::check_addr_expr (location_t location, tree t,
>  ignore_off_by_one = false;
> }
>       else if (TREE_CODE (t) == MEM_REF)
> - warned = check_mem_ref (location, t, ignore_off_by_one);
> + warned = check_mem_ref (location, t, stmt, ignore_off_by_one);
> 
>       if (warned)
> suppress_warning (t, OPT_Warray_bounds_);
> @@ -702,7 +758,9 @@ array_bounds_checker::check_addr_expr (location_t location, tree t,
>  dump_generic_expr (MSG_NOTE, TDF_SLIM, t);
>  fprintf (dump_file, "\n");
> }
> -      warned = warning_at (location, OPT_Warray_bounds_,
> +      rich_location *richloc
> + = build_rich_location_with_diagnostic_path (location, stmt);
> +      warned = warning_at (richloc, OPT_Warray_bounds_,
>   "array subscript %wi is below "
>   "array bounds of %qT",
>   idx.to_shwi (), TREE_TYPE (tem));
> @@ -716,7 +774,9 @@ array_bounds_checker::check_addr_expr (location_t location, tree t,
>  dump_generic_expr (MSG_NOTE, TDF_SLIM, t);
>  fprintf (dump_file, "\n");
> }
> -      warned = warning_at (location, OPT_Warray_bounds_,
> +      rich_location *richloc
> + = build_rich_location_with_diagnostic_path (location, stmt);
> +      warned = warning_at (richloc, OPT_Warray_bounds_,
>   "array subscript %wu is above "
>   "array bounds of %qT",
>   idx.to_uhwi (), TREE_TYPE (tem));
> @@ -811,7 +871,7 @@ array_bounds_checker::check_array_bounds (tree *tp, int *walk_subtree,
>     warned = checker->check_array_ref (location, t, wi->stmt,
>       false/*ignore_off_by_one*/);
>   else if (TREE_CODE (t) == MEM_REF)
> -    warned = checker->check_mem_ref (location, t,
> +    warned = checker->check_mem_ref (location, t, wi->stmt,
>     false /*ignore_off_by_one*/);
>   else if (TREE_CODE (t) == ADDR_EXPR)
>     {
> diff --git a/gcc/gimple-array-bounds.h b/gcc/gimple-array-bounds.h
> index aa7ca8e9730..2d1d48d1e94 100644
> --- a/gcc/gimple-array-bounds.h
> +++ b/gcc/gimple-array-bounds.h
> @@ -33,7 +33,7 @@ public:
> private:
>   static tree check_array_bounds (tree *tp, int *walk_subtree, void *data);
>   bool check_array_ref (location_t, tree, gimple *, bool ignore_off_by_one);
> -  bool check_mem_ref (location_t, tree, bool ignore_off_by_one);
> +  bool check_mem_ref (location_t, tree, gimple *, bool ignore_off_by_one);
>   void check_addr_expr (location_t, tree, gimple *);
>   void get_value_range (irange &r, const_tree op, gimple *);
> 
> diff --git a/gcc/gimple-iterator.cc b/gcc/gimple-iterator.cc
> index 93646262eac..472a79360d7 100644
> --- a/gcc/gimple-iterator.cc
> +++ b/gcc/gimple-iterator.cc
> @@ -33,6 +33,7 @@ along with GCC; see the file COPYING3.  If not see
> #include "tree-ssa.h"
> #include "value-prof.h"
> #include "gimplify.h"
> +#include "diagnostic-copy-history.h"
> 
> 
> /* Mark the statement STMT as modified, and update it.  */
> @@ -581,6 +582,8 @@ gsi_remove (gimple_stmt_iterator *i, bool remove_permanently)
> cfun->debug_marker_count--;
>       require_eh_edge_purge = remove_stmt_from_eh_lp (stmt);
>       gimple_remove_stmt_histograms (cfun, stmt);
> +      if (get_copy_history (stmt) != NULL)
> + remove_copy_history (stmt);
>     }
> 
>   /* Update the iterator and re-wire the links in I->SEQ.  */
> diff --git a/gcc/testsuite/gcc.dg/pr109071.c b/gcc/testsuite/gcc.dg/pr109071.c
> new file mode 100644
> index 00000000000..95ae576451e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr109071.c
> @@ -0,0 +1,43 @@
> +/* PR tree-optimization/109071 need more context for -Warray-bounds warnings
> +   due to code duplication from jump threading.  */  
> +/* { dg-options "-O2 -Wall -fdiagnostics-explain-harder" } */
> +/* { dg-additional-options "-fdiagnostics-show-line-numbers -fdiagnostics-path-format=inline-events -fdiagnostics-show-caret" } */
> +/* { dg-enable-nn-line-numbers "" } */
> +
> +extern void warn(void);
> +static inline void assign(int val, int *regs, int index)
> +{
> +  if (index >= 4)
> +    warn();
> +  *regs = val;
> +}
> +struct nums {int vals[4];};
> +
> +void sparx5_set (int *ptr, struct nums *sg, int index)
> +{
> +  int *val = &sg->vals[index]; /* { dg-warning "is above array bounds" } */
> +
> +  assign(0,    ptr, index);
> +  assign(*val, ptr, index);
> +}
> +/* { dg-begin-multiline-output "" }
> +   NN |   int *val = &sg->vals[index];
> +      |               ~~~~~~~~^~~~~~~
> +  'sparx5_set': events 1-2
> +   NN |   if (index >= 4)
> +      |      ^
> +      |      |
> +      |      (1) when the condition is evaluated to true
> +......
> +   { dg-end-multiline-output "" } */
> +
> +/* { dg-begin-multiline-output "" }
> +      |               ~~~~~~~~~~~~~~~
> +      |                       |
> +      |                       (2) out of array bounds here
> +   { dg-end-multiline-output "" } */
> +
> +/* { dg-begin-multiline-output "" }
> +   NN | struct nums {int vals[4];};
> +      |                  ^~~~
> +   { dg-end-multiline-output "" } */
> diff --git a/gcc/toplev.cc b/gcc/toplev.cc
> index f715f977b72..e9ae55f64be 100644
> --- a/gcc/toplev.cc
> +++ b/gcc/toplev.cc
> @@ -41,6 +41,7 @@ along with GCC; see the file COPYING3.  If not see
> #include "cgraph.h"
> #include "coverage.h"
> #include "diagnostic.h"
> +#include "diagnostic-copy-history.h"
> #include "varasm.h"
> #include "tree-inline.h"
> #include "realmpfr.h" /* For GMP/MPFR/MPC versions, in print_version.  */
> @@ -2381,6 +2382,8 @@ toplev::finalize (void)
>   tree_cc_finalize ();
>   reginfo_cc_finalize ();
> 
> +  copy_history_finalize ();
> +
>   /* save_decoded_options uses opts_obstack, so these must
>      be cleaned up together.  */
>   obstack_free (&opts_obstack, NULL);
> diff --git a/gcc/tree-ssa-threadupdate.cc b/gcc/tree-ssa-threadupdate.cc
> index fa61ba9512b..9292f0c2aa1 100644
> --- a/gcc/tree-ssa-threadupdate.cc
> +++ b/gcc/tree-ssa-threadupdate.cc
> @@ -36,6 +36,7 @@ along with GCC; see the file COPYING3.  If not see
> #include "tree-cfg.h"
> #include "tree-vectorizer.h"
> #include "tree-pass.h"
> +#include "diagnostic-copy-history.h"
> 
> /* Given a block B, update the CFG and SSA graph to reflect redirecting
>    one or more in-edges to B to instead reach the destination of an
> @@ -2371,6 +2372,37 @@ back_jt_path_registry::adjust_paths_after_duplication (unsigned curr_path_num)
>     }
> }
> 
> +/* Set copy history to all the stmts in the basic block BB based on
> +   the entry edge ENTRY and whether this basic block is on the true
> +   path of the condition in the destination of the edge ENTRY.
> +   Return TRUE when copy history are set, otherwise return FALSE.  */
> +
> +static bool
> +set_copy_history_to_stmts_in_bb (basic_block bb, edge entry, bool is_true_path)
> +{
> +  /* First, get the condition statement in the destination of the
> +     edge ENTRY.  */
> +  basic_block cond_block = entry->dest;
> +  gimple *cond_stmt = NULL;
> +  gimple_stmt_iterator gsi;
> +  gsi = gsi_last_bb (cond_block);
> +  /* if the cond_block ends with a conditional statement, get it.  */
> +  if (!gsi_end_p (gsi)
> +      && gsi_stmt (gsi)
> +      && (gimple_code (gsi_stmt (gsi)) == GIMPLE_COND))
> +    cond_stmt = gsi_stmt (gsi);
> +
> +  if (!cond_stmt)
> +    return false;
> +
> +  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
> +    set_copy_history (gsi_stmt (gsi),
> +      gimple_location (cond_stmt),
> +      is_true_path,
> +      COPY_BY_THREAD_JUMP);
> +  return true;
> +}
> +
> /* Duplicates a jump-thread path of N_REGION basic blocks.
>    The ENTRY edge is redirected to the duplicate of the region.
> 
> @@ -2419,6 +2451,16 @@ back_jt_path_registry::duplicate_thread_path (edge entry,
>   copy_bbs (region, n_region, region_copy, &exit, 1, &exit_copy, loop,
>    split_edge_bb_loc (entry), false);
> 
> +  /* Set the copy history for all the stmts in both original and copied
> +     basic blocks.  The copied regions are in the true path.  */
> +  if (flag_diagnostics_explain_harder)
> +    for (i = 0; i < n_region; i++)
> +      {
> + set_copy_history_to_stmts_in_bb (region[i], entry, false);
> + set_copy_history_to_stmts_in_bb (region_copy[i], entry, true);
> +      }
> +
> +
>   /* Fix up: copy_bbs redirects all edges pointing to copied blocks.  The
>      following code ensures that all the edges exiting the jump-thread path are
>      redirected back to the original code: these edges are exceptions
> -- 
> 2.31.1
>
Qing Zhao July 29, 2024, 3:32 p.m. UTC | #2
The 2nd ping for the following patch:

https://gcc.gnu.org/pipermail/gcc-patches/2024-July/657150.html

thanks.

Qing

> On Jul 22, 2024, at 09:01, Qing Zhao <qing.zhao@oracle.com> wrote:
> 
> Hi, Richard,
> 
> Could you please take a look at the patch and let me know any comment you have (especially on the middle-end part)?
> 
> David, let me know if you have further comment and suggestions. 
> 
> Thanks a lot.
> 
> Qing
> 
>> On Jul 12, 2024, at 10:03, Qing Zhao <qing.zhao@oracle.com> wrote:
>> 
>> due to code duplication from jump threading [PR109071]
>> Control this with a new option -fdiagnostic-explain-harder.
>> 
>> Compared to V1, the major difference are: (address David's comments)
>> 
>> 1. Change the name of the option from:
>> 
>> -fdiagnostic-try-to-explain-harder 
>> To
>> -fdiagnostic-explain-harder 
>> 
>> 2. Sync the commit comments with the real output of the compilation message.
>> 
>> 3. Add one more event in the end of the path to repeat the out-of-bound
>>  issue.
>> 
>> 4. Fixed the unnecessary changes in Makefile.in.
>> 
>> 5. Add new copy_history_diagnostic_path.[cc|h] to implement a new
>> class copy_history_diagnostic_path : public diagnostic_path
>> 
>> for copy_history_t. 
>> 
>> 6. Only building the rich locaiton and populating the path when warning_at
>> is called.
>> 
>> There are two comments from David that I didn't addressed in this version:
>> 
>> 1. Make regenerate-opt-urls.
>> will do this in a later version. 
>> 
>> 2. Add a ⚠️  emoji for the last event. 
>> I didn't add this yet since I think the current message is clear enough.
>> might not worth the effort to add this emoji (it's not that straightforward
>> to add on). 
>> 
>> With this new version, the message emitted by GCC:
>> 
>> $gcc -O2 -Wall -fdiagnostics-explain-harder -c -o t.o t.c
>> t.c: In function ‘sparx5_set’:
>> t.c:12:23: warning: array subscript 4 is above array bounds of ‘int[4]’ [-Warray-bounds=]
>>  12 |   int *val = &sg->vals[index];
>>     |               ~~~~~~~~^~~~~~~
>> ‘sparx5_set’: events 1-2
>>   4 |   if (*index >= 4)
>>     |      ^
>>     |      |
>>     |      (1) when the condition is evaluated to true
>> ......
>>  12 |   int *val = &sg->vals[index];
>>     |               ~~~~~~~~~~~~~~~
>>     |                       |
>>     |                       (2) out of array bounds here
>> t.c:8:18: note: while referencing ‘vals’
>>   8 | struct nums {int vals[4];};
>>     |                  ^~~~
>> 
>> Bootstrapped and regression tested on both aarch64 and x86. no issues.
>> 
>> Let me know any further comments and suggestions.
>> 
>> thanks.
>> 
>> Qing
>> 
>> ==================================================
>> $ cat t.c
>> extern void warn(void);
>> static inline void assign(int val, int *regs, int *index)
>> {
>> if (*index >= 4)
>>   warn();
>> *regs = val;
>> }
>> struct nums {int vals[4];};
>> 
>> void sparx5_set (int *ptr, struct nums *sg, int index)
>> {
>> int *val = &sg->vals[index];
>> 
>> assign(0,    ptr, &index);
>> assign(*val, ptr, &index);
>> }
>> 
>> $ gcc -Wall -O2  -c -o t.o t.c
>> t.c: In function ‘sparx5_set’:
>> t.c:12:23: warning: array subscript 4 is above array bounds of ‘int[4]’ [-Warray-bounds=]
>>  12 |   int *val = &sg->vals[index];
>>     |               ~~~~~~~~^~~~~~~
>> t.c:8:18: note: while referencing ‘vals’
>>   8 | struct nums {int vals[4];};
>>     |                  ^~~~
>> 
>> In the above, Although the warning is correct in theory, the warning message
>> itself is confusing to the end-user since there is information that cannot
>> be connected to the source code directly.
>> 
>> It will be a nice improvement to add more information in the warning message
>> to report where such index value come from.
>> 
>> In order to achieve this, we add a new data structure copy_history to record
>> the condition and the transformation that triggered the code duplication.
>> Whenever there is a code duplication due to some specific transformations,
>> such as jump threading, loop switching, etc, a copy_history structure is
>> created and attached to the duplicated gimple statement.
>> 
>> During array out-of-bound checking or other warning checking, the copy_history
>> that was attached to the gimple statement is used to form a sequence of
>> diagnostic events that are added to the corresponding rich location to be used
>> to report the warning message.
>> 
>> This behavior is controled by the new option -fdiagnostic-explain-harder
>> which is off by default.
>> 
>> With this change, by adding -fdiagnostic-explain-harder,
>> the warning message for the above testing case is now:
>> 
>> $ gcc -Wall -O2 -fdiagnostics-explain-harder -c -o t.o t.c
>> t.c: In function ‘sparx5_set’:
>> t.c:12:23: warning: array subscript 4 is above array bounds of ‘int[4]’ [-Warray-bounds=]
>>  12 |   int *val = &sg->vals[index];
>>     |               ~~~~~~~~^~~~~~~
>> ‘sparx5_set’: events 1-2
>>   4 |   if (*index >= 4)
>>     |      ^
>>     |      |
>>     |      (1) when the condition is evaluated to true
>> ......
>>  12 |   int *val = &sg->vals[index];
>>     |               ~~~~~~~~~~~~~~~
>>     |                       |
>>     |                       (2) out of array bounds here
>> t.c:8:18: note: while referencing ‘vals’
>>   8 | struct nums {int vals[4];};
>>     |                  ^~~~
>> 
>> gcc/ChangeLog:
>> 
>> PR tree-optimization/109071
>> 
>> * Makefile.in (OBJS): Add diagnostic-copy-history.o
>> and copy-history-diagnostic-path.o.
>> * gcc/common.opt (fdiagnostics-explain-harder): New option.
>> * gcc/doc/invoke.texi (fdiagnostics-explain-harder): Add
>> documentation for the new option.
>> * gimple-array-bounds.cc (build_rich_location_with_diagnostic_path):
>> New function.
>> (check_out_of_bounds_and_warn): Add one new parameter. Use rich
>> location with copy_history_diagnostic_path for warning_at.
>> (array_bounds_checker::check_array_ref): Use rich location with
>> copy_history_diagnostic_path for warning_at.
>> (array_bounds_checker::check_mem_ref): Add one new parameter.
>> Use rich location with copy_history_diagnostic_path for warning_at.
>> (array_bounds_checker::check_addr_expr): Use rich location with
>> copy_history_diagnostic_path for warning_at.
>> (array_bounds_checker::check_array_bounds): Call check_mem_ref with
>> one more parameter.
>> * gimple-array-bounds.h: Update prototype for check_mem_ref.
>> * gimple-iterator.cc (gsi_remove): (gsi_remove): Remove the copy
>> history when removing the gimple.
>> * toplev.cc (toplev::finalize):  Call copy_history_finalize.
>> * tree-ssa-threadupdate.cc (set_copy_history_to_stmts_in_bb): New
>> function.
>> (back_jt_path_registry::duplicate_thread_path): Call the new function
>> to set copy history for every duplicated gimple.
>> * copy-history-diagnostic-path.cc: New file.
>> * copy-history-diagnostic-path.h: New file.
>> * diagnostic-copy-history.cc: New file.
>> * diagnostic-copy-history.h: New file.
>> 
>> gcc/testsuite/ChangeLog:
>> 
>> PR tree-optimization/109071
>> 
>> * gcc.dg/pr109071.c: New test.
>> ---
>> gcc/Makefile.in                     |   2 +
>> gcc/common.opt                      |   4 +
>> gcc/copy-history-diagnostic-path.cc |  86 +++++++++++++++
>> gcc/copy-history-diagnostic-path.h  |  87 +++++++++++++++
>> gcc/diagnostic-copy-history.cc      | 163 ++++++++++++++++++++++++++++
>> gcc/diagnostic-copy-history.h       |  80 ++++++++++++++
>> gcc/doc/invoke.texi                 |   7 ++
>> gcc/gimple-array-bounds.cc          |  94 +++++++++++++---
>> gcc/gimple-array-bounds.h           |   2 +-
>> gcc/gimple-iterator.cc              |   3 +
>> gcc/testsuite/gcc.dg/pr109071.c     |  43 ++++++++
>> gcc/toplev.cc                       |   3 +
>> gcc/tree-ssa-threadupdate.cc        |  42 +++++++
>> 13 files changed, 598 insertions(+), 18 deletions(-)
>> create mode 100644 gcc/copy-history-diagnostic-path.cc
>> create mode 100644 gcc/copy-history-diagnostic-path.h
>> create mode 100644 gcc/diagnostic-copy-history.cc
>> create mode 100644 gcc/diagnostic-copy-history.h
>> create mode 100644 gcc/testsuite/gcc.dg/pr109071.c
>> 
>> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
>> index 7d3ea27a6ab..f8bbfda845f 100644
>> --- a/gcc/Makefile.in
>> +++ b/gcc/Makefile.in
>> @@ -1436,6 +1436,8 @@ OBJS = \
>> df-problems.o \
>> df-scan.o \
>> dfp.o \
>> + diagnostic-copy-history.o \
>> + copy-history-diagnostic-path.o \
>> digraph.o \
>> dojump.o \
>> dominance.o \
>> diff --git a/gcc/common.opt b/gcc/common.opt
>> index a300470bbc5..714280390d9 100644
>> --- a/gcc/common.opt
>> +++ b/gcc/common.opt
>> @@ -1558,6 +1558,10 @@ fdiagnostics-minimum-margin-width=
>> Common Joined UInteger Var(diagnostics_minimum_margin_width) Init(6)
>> Set minimum width of left margin of source code when showing source.
>> 
>> +fdiagnostics-explain-harder
>> +Common Var(flag_diagnostics_explain_harder)
>> +Collect and print more context information for diagnostics.
>> +
>> fdisable-
>> Common Joined RejectNegative Var(common_deferred_options) Defer
>> -fdisable-[tree|rtl|ipa]-<pass>=range1+range2 Disable an optimization pass.
>> diff --git a/gcc/copy-history-diagnostic-path.cc b/gcc/copy-history-diagnostic-path.cc
>> new file mode 100644
>> index 00000000000..65e5c056165
>> --- /dev/null
>> +++ b/gcc/copy-history-diagnostic-path.cc
>> @@ -0,0 +1,86 @@
>> +/* Classes for implementing diagnostic paths for copy_history_t.
>> +   Copyright (C) 2024-2024 Free Software Foundation, Inc.
>> +   Contributed by Qing Zhao<qing.zhao@oracle.com>
>> +
>> +This file is part of GCC.
>> +
>> +GCC is free software; you can redistribute it and/or modify it under
>> +the terms of the GNU General Public License as published by the Free
>> +Software Foundation; either version 3, or (at your option) any later
>> +version.
>> +
>> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
>> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> +FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
>> +for more details.
>> +
>> +You should have received a copy of the GNU General Public License
>> +along with GCC; see the file COPYING3.  If not see
>> +<http://www.gnu.org/licenses/>.  */
>> +
>> +#include "config.h"
>> +#include "system.h"
>> +#include "coretypes.h"
>> +#include "function.h"
>> +#include "copy-history-diagnostic-path.h"
>> +
>> +bool
>> +copy_history_diagnostic_path::same_function_p (int event_idx_a,
>> +       int event_idx_b) const
>> +{
>> +  return (m_events[event_idx_a]->get_fndecl ()
>> +  == m_events[event_idx_b]->get_fndecl ());
>> +}
>> +
>> +/* Add an event to this path at LOC within function FNDECL at
>> +   stack depth DEPTH.
>> +
>> +   Use m_context's printer to format FMT, as the text of the new
>> +   event.  */
>> +
>> +void
>> +copy_history_diagnostic_path::add_event (location_t loc, tree fndecl, int depth,
>> + const char *fmt, ...)
>> +{
>> +  pretty_printer *pp = m_event_pp;
>> +  pp_clear_output_area (pp);
>> +
>> +  rich_location rich_loc (line_table, UNKNOWN_LOCATION);
>> +
>> +  va_list ap;
>> +
>> +  va_start (ap, fmt);
>> +
>> +  text_info ti (fmt, &ap, 0, nullptr, &rich_loc);
>> +  pp_format (pp, &ti);
>> +  pp_output_formatted_text (pp);
>> +
>> +  va_end (ap);
>> +
>> +  simple_diagnostic_event *new_event
>> +    = new simple_diagnostic_event (loc, fndecl, depth, pp_formatted_text (pp));
>> +  m_events.safe_push (new_event);
>> +
>> +  pp_clear_output_area (pp);
>> +
>> +  return;
>> +}
>> +
>> +/* Populate the diagnostic_path from the copy_history.  */
>> +void
>> +copy_history_diagnostic_path::populate_path_from_copy_history ()
>> +{
>> +  if (!m_copy_history)
>> +    return;
>> +
>> +  for (copy_history_t cur_ch = m_copy_history; cur_ch;
>> +       cur_ch = cur_ch->prev_copy)
>> +    add_event (cur_ch->condition, cfun->decl, 1,
>> +       "when the condition is evaluated to %s",
>> +       cur_ch->is_true_path ? "true" : "false");
>> +
>> +  /* Add an end of path warning event in the end of the path.  */
>> +  add_event (m_location, cfun->decl, 1,
>> +     "out of array bounds here");
>> +  return;
>> +}
>> diff --git a/gcc/copy-history-diagnostic-path.h b/gcc/copy-history-diagnostic-path.h
>> new file mode 100644
>> index 00000000000..377e9136070
>> --- /dev/null
>> +++ b/gcc/copy-history-diagnostic-path.h
>> @@ -0,0 +1,87 @@
>> +/* Classes for implementing diagnostic paths for copy_history_t.
>> +   Copyright (C) 2024-2024 Free Software Foundation, Inc.
>> +   Contributed by Qing Zhao<qing.zhao@oracle.com>
>> +
>> +This file is part of GCC.
>> +
>> +GCC is free software; you can redistribute it and/or modify it under
>> +the terms of the GNU General Public License as published by the Free
>> +Software Foundation; either version 3, or (at your option) any later
>> +version.
>> +
>> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
>> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> +FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
>> +for more details.
>> +
>> +You should have received a copy of the GNU General Public License
>> +along with GCC; see the file COPYING3.  If not see
>> +<http://www.gnu.org/licenses/>.  */
>> +
>> +#ifndef GCC_COPY_HISTORY_DIAGNOSTIC_PATH_H
>> +#define GCC_COPY_HISTORY_DIAGNOSTIC_PATH_H
>> +
>> +#include "diagnostic-path.h"
>> +#include "simple-diagnostic-path.h"
>> +#include "diagnostic-copy-history.h"
>> +
>> +/* An implementation of diagnostic_path for copy_history_t, as a
>> +   vector of simple_diagnostic_event instances.  */
>> +
>> +class copy_history_diagnostic_path : public diagnostic_path
>> +{
>> + public:
>> +  copy_history_diagnostic_path (pretty_printer *event_pp,
>> + copy_history_t cp_history,
>> + location_t loc)
>> +  : diagnostic_path (),
>> +    m_thread ("main"),
>> +    m_event_pp (event_pp),
>> +    m_copy_history (cp_history),
>> +    m_location (loc)
>> +  {}
>> +
>> +  unsigned num_events () const final override
>> +  {
>> +    return m_events.length ();
>> +  }
>> +  const diagnostic_event & get_event (int idx) const final override
>> +  {
>> +    return *m_events[idx];
>> +  }
>> +  unsigned num_threads () const final override
>> +  {
>> +    return 1;
>> +  }
>> +  const diagnostic_thread &
>> +  get_thread (diagnostic_thread_id_t) const final override
>> +  {
>> +    return m_thread;
>> +  }
>> +  bool
>> +  same_function_p (int event_idx_a,
>> +   int event_idx_b) const final override;
>> +
>> +  void add_event (location_t loc, tree fndecl, int depth,
>> +  const char *fmt, ...);
>> +
>> +  void populate_path_from_copy_history ();
>> +
>> + private:
>> +  simple_diagnostic_thread m_thread;
>> +
>> +  /* The events that have occurred along this path.  */
>> +  auto_delete_vec<simple_diagnostic_event> m_events;
>> +
>> +  pretty_printer *m_event_pp;
>> +
>> +  /* The copy_history associated with this path.  */
>> +  copy_history_t m_copy_history;
>> +
>> +  /* The location for the gimple statement where the
>> +     diagnostic message emitted.  */
>> +  location_t m_location;
>> +
>> +};
>> +
>> +#endif /* ! GCC_COPY_HISTORY_DIAGNOSTIC_PATH_H */
>> diff --git a/gcc/diagnostic-copy-history.cc b/gcc/diagnostic-copy-history.cc
>> new file mode 100644
>> index 00000000000..2733d137622
>> --- /dev/null
>> +++ b/gcc/diagnostic-copy-history.cc
>> @@ -0,0 +1,163 @@
>> +/* Functions to handle copy history.
>> +   Copyright (C) 2024-2024 Free Software Foundation, Inc.
>> +   Contributed by Qing Zhao <qing.zhao@oracle.com>
>> +
>> +   This file is part of GCC.
>> +
>> +   GCC is free software; you can redistribute it and/or modify it under
>> +   the terms of the GNU General Public License as published by the Free
>> +   Software Foundation; either version 3, or (at your option) any later
>> +   version.
>> +
>> +   GCC is distributed in the hope that it will be useful, but WITHOUT ANY
>> +   WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> +   FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
>> +   for more details.
>> +
>> +   You should have received a copy of the GNU General Public License
>> +   along with GCC; see the file COPYING3.  If not see
>> +   <http://www.gnu.org/licenses/>.  */
>> +
>> +#include "config.h"
>> +#include "system.h"
>> +#include "coretypes.h"
>> +#include "backend.h"
>> +#include "tree.h"
>> +#include "diagnostic-copy-history.h"
>> +
>> +/* A mapping from a gimple to a pointer to the copy history of it.  */
>> +static copy_history_map_t *copy_history_map;
>> +
>> +/* Obstack for copy history.  */
>> +static struct obstack copy_history_obstack;
>> +
>> +/* Create a new copy history.  */
>> +
>> +copy_history_t
>> +create_copy_history (location_t condition,
>> +     bool is_true_path,
>> +     enum copy_reason reason,
>> +     copy_history_t prev_copy)
>> +{
>> +  static bool initialized = false;
>> +
>> +  if (!initialized)
>> +    {
>> +      gcc_obstack_init (&copy_history_obstack);
>> +      initialized = true;
>> +    }
>> +
>> +  copy_history_t copy_history
>> +    = (copy_history_t) obstack_alloc (&copy_history_obstack,
>> +      sizeof (struct copy_history));
>> +  copy_history->condition = condition;
>> +  copy_history->is_true_path = is_true_path;
>> +  copy_history->reason = reason;
>> +  copy_history->prev_copy = prev_copy;
>> +  return copy_history;
>> +}
>> +
>> +/* Insert the copy history for the gimple STMT assuming the linked list
>> +   of CP_HISTORY does not have duplications.  It's the caller's
>> +   responsibility to make sure that the linked list of CP_HISTORY does
>> +   not have duplications.  */
>> +
>> +void
>> +insert_copy_history (gimple *stmt, copy_history_t cp_history)
>> +{
>> +  if (!copy_history_map)
>> +    copy_history_map = new copy_history_map_t;
>> +
>> +  copy_history_map->put (stmt, cp_history);
>> +  return;
>> +}
>> +
>> +/* Get the copy history for the gimple STMT, return NULL when there is
>> +   no associated copy history.  */
>> +
>> +copy_history_t
>> +get_copy_history (gimple *stmt)
>> +{
>> +  if (!copy_history_map)
>> +    return NULL;
>> +
>> +  if (const copy_history_t *cp_history_p = copy_history_map->get (stmt))
>> +    return *cp_history_p;
>> +
>> +  return NULL;
>> +}
>> +
>> +/* Remove the copy history for STMT.  */
>> +
>> +void
>> +remove_copy_history (gimple *stmt)
>> +{
>> +  if (!copy_history_map)
>> +    return;
>> +  copy_history_map->remove (stmt);
>> +  return;
>> +}
>> +
>> +/* Check whether the cond_location, is_true_path and reason existed
>> + * in the OLD_COPY_HISTORY.  */
>> +
>> +static bool
>> +is_copy_history_existed (location_t cond_location, bool is_true_path,
>> + enum copy_reason reason,
>> + copy_history_t old_copy_history)
>> +{
>> +  for (copy_history_t cur_ch = old_copy_history; cur_ch;
>> +       cur_ch = cur_ch->prev_copy)
>> +    if ((cur_ch->condition == cond_location)
>> + && (cur_ch->is_true_path == is_true_path)
>> + && (cur_ch->reason == reason))
>> +      return true;
>> +
>> +  return false;
>> +}
>> +
>> +/* Set copy history for the gimple STMT.  Return TRUE when a new copy
>> + * history is created and inserted.  Otherwise return FALSE.  */
>> +
>> +bool
>> +set_copy_history (gimple *stmt, location_t cond_location,
>> +  bool is_true_path, enum copy_reason reason)
>> +{
>> +
>> +  /* First, get the old copy history associated with this STMT.  */
>> +  copy_history_t old_cp_history = get_copy_history (stmt);
>> +
>> +  /* If the current copy history is not in the STMT's copy history linked
>> +     list yet, create the new copy history, put the old_copy_history as the
>> +     prev_copy of it.  */
>> +  copy_history_t new_cp_history = NULL;
>> +  if (!is_copy_history_existed (cond_location, is_true_path,
>> + reason, old_cp_history))
>> +    new_cp_history
>> +      = create_copy_history (cond_location, is_true_path,
>> +     reason, old_cp_history);
>> +
>> +  /* Insert the copy history into the hash map.  */
>> +  if (new_cp_history)
>> +    {
>> +      insert_copy_history (stmt, new_cp_history);
>> +      return true;
>> +    }
>> +
>> +  return false;
>> +}
>> +
>> +/* Reset all state for diagnostic-copy-history.cc so that we can rerun the
>> +   compiler within the same process.  For use by toplev::finalize.  */
>> +
>> +void
>> +copy_history_finalize (void)
>> +{
>> +  if (copy_history_map)
>> +    {
>> +      delete copy_history_map;
>> +      copy_history_map = NULL;
>> +    }
>> +  obstack_free (&copy_history_obstack, NULL);
>> +  return;
>> +}
>> diff --git a/gcc/diagnostic-copy-history.h b/gcc/diagnostic-copy-history.h
>> new file mode 100644
>> index 00000000000..0c2c55a2a81
>> --- /dev/null
>> +++ b/gcc/diagnostic-copy-history.h
>> @@ -0,0 +1,80 @@
>> +/* Copy history associated with a gimple statement to record its history
>> +   of duplications due to different transformations.
>> +   The copy history will be used to construct events for later diagnostic.
>> +
>> +   Copyright (C) 2024-2024 Free Software Foundation, Inc.
>> +   Contributed by Qing Zhao <qing.zhao@oracle.com>
>> +
>> +   This file is part of GCC.
>> +
>> +   GCC is free software; you can redistribute it and/or modify it under
>> +   the terms of the GNU General Public License as published by the Free
>> +   Software Foundation; either version 3, or (at your option) any later
>> +   version.
>> +
>> +   GCC is distributed in the hope that it will be useful, but WITHOUT ANY
>> +   WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> +   FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
>> +   for more details.
>> +
>> +   You should have received a copy of the GNU General Public License
>> +   along with GCC; see the file COPYING3.  If not see
>> +   <http://www.gnu.org/licenses/>.  */
>> +
>> +#ifndef DIAGNOSTIC_COPY_HISTORY_H_INCLUDED
>> +#define DIAGNOSTIC_COPY_HISTORY_H_INCLUDED
>> +
>> +#include "hash-map.h"
>> +#include "line-map.h"
>> +
>> +/* An enum for the reason why this copy is made.  Right now, there are
>> +   three reasons, we can add more if needed.  */
>> +enum copy_reason {
>> +  COPY_BY_THREAD_JUMP,
>> +  COPY_BY_LOOP_UNSWITCH,
>> +  COPY_BY_LOOP_UNROLL
>> +};
>> +
>> +/* This data structure records the information when a statement is
>> +   duplicated during different transformations.  Such information
>> +   will be used by the later diagnostic messages to report more
>> +   contexts of the warnings or errors.  */
>> +struct copy_history {
>> +  /* The location of the condition statement that triggered the code
>> +     duplication.  */
>> +  location_t condition;
>> +
>> +  /* Whether this copy is on the TRUE path of the condition.  */
>> +  bool is_true_path;
>> +
>> +  /* The reason for the code duplication.  */
>> +  enum copy_reason reason;
>> +
>> +  /* This statement itself might be a previous code duplication.  */
>> +  struct copy_history *prev_copy;
>> +};
>> +
>> +typedef struct copy_history *copy_history_t;
>> +
>> +/* Create a new copy history.  */
>> +extern copy_history_t create_copy_history (location_t, bool,
>> +   enum copy_reason, copy_history_t);
>> +
>> +typedef hash_map<gimple *, copy_history_t> copy_history_map_t;
>> +
>> +/* Get the copy history for the gimple STMT, return NULL when there is
>> +   no associated copy history.  */
>> +extern copy_history_t get_copy_history (gimple *);
>> +
>> +/* Remove the copy history for STMT from the copy_history_map.  */
>> +extern void remove_copy_history (gimple *);
>> +
>> +/* Set copy history for the gimple STMT.  */
>> +extern bool set_copy_history (gimple *, location_t,
>> +      bool, enum copy_reason);
>> +
>> +/* Reset all state for diagnostic-copy-history.cc so that we can rerun the
>> +   compiler within the same process.  For use by toplev::finalize.  */
>> +extern void copy_history_finalize (void);
>> +
>> +#endif // DIAGNOSTIC_COPY_HISTORY_H_INCLUDED
>> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
>> index b37c7af7a39..bb0c6a87523 100644
>> --- a/gcc/doc/invoke.texi
>> +++ b/gcc/doc/invoke.texi
>> @@ -321,6 +321,7 @@ Objective-C and Objective-C++ Dialects}.
>> -fdiagnostics-column-origin=@var{origin}
>> -fdiagnostics-escape-format=@r{[}unicode@r{|}bytes@r{]}
>> -fdiagnostics-text-art-charset=@r{[}none@r{|}ascii@r{|}unicode@r{|}emoji@r{]}}
>> +-fdiagnostics-explain-harder
>> 
>> @item Warning Options
>> @xref{Warning Options,,Options to Request or Suppress Warnings}.
>> @@ -5518,6 +5519,12 @@ left margin.
>> This option controls the minimum width of the left margin printed by
>> @option{-fdiagnostics-show-line-numbers}.  It defaults to 6.
>> 
>> +@opindex fdiagnostics-explain-harder
>> +@item -fdiagnostics-explain-harder
>> +With this option, the compiler collects more context information for
>> +diagnostics and emits them to the users to provide more hints on how
>> +the diagnostics come from.
>> +
>> @opindex fdiagnostics-parseable-fixits
>> @item -fdiagnostics-parseable-fixits
>> Emit fix-it hints in a machine-parseable format, suitable for consumption
>> diff --git a/gcc/gimple-array-bounds.cc b/gcc/gimple-array-bounds.cc
>> index 1637a2fc4f4..ba98ab3b413 100644
>> --- a/gcc/gimple-array-bounds.cc
>> +++ b/gcc/gimple-array-bounds.cc
>> @@ -31,6 +31,9 @@ along with GCC; see the file COPYING3.  If not see
>> #include "tree-dfa.h"
>> #include "fold-const.h"
>> #include "diagnostic-core.h"
>> +#include "simple-diagnostic-path.h"
>> +#include "diagnostic-copy-history.h"
>> +#include "copy-history-diagnostic-path.h"
>> #include "intl.h"
>> #include "tree-vrp.h"
>> #include "alloc-pool.h"
>> @@ -254,6 +257,25 @@ get_up_bounds_for_array_ref (tree ref, tree *decl,
>>  return;
>> }
>> 
>> +/* Build a rich location for LOCATION, and populate the diagonistic path
>> +   for it.  */
>> +
>> +static rich_location *
>> +build_rich_location_with_diagnostic_path (location_t location, gimple *stmt)
>> +{
>> +  /* Generate a rich location for this location.  */
>> +  rich_location *richloc = new rich_location (line_table, location);
>> +
>> +  copy_history_t cp_history = stmt ? get_copy_history (stmt) : NULL;
>> +  copy_history_diagnostic_path *path
>> +    = new copy_history_diagnostic_path (global_dc->printer,
>> + cp_history, location);
>> +  path->populate_path_from_copy_history ();
>> +
>> +  richloc->set_path (path);
>> +  return richloc;
>> +}
>> +
>> /* Given the LOW_SUB_ORG, LOW_SUB and UP_SUB, and the computed UP_BOUND
>>   and UP_BOUND_P1, check whether the array reference REF is out of bound.
>>   When out of bounds, set OUT_OF_BOUND to true.
>> @@ -262,6 +284,7 @@ get_up_bounds_for_array_ref (tree ref, tree *decl,
>> 
>> static bool
>> check_out_of_bounds_and_warn (location_t location, tree ref,
>> +      gimple *stmt,
>>     tree low_sub_org, tree low_sub, tree up_sub,
>>     tree up_bound, tree up_bound_p1,
>>     const irange *vr,
>> @@ -280,9 +303,13 @@ check_out_of_bounds_and_warn (location_t location, tree ref,
>>    {
>>      *out_of_bound = true;
>>      if (for_array_bound)
>> - warned = warning_at (location, OPT_Warray_bounds_,
>> + {
>> +  rich_location *richloc
>> +    = build_rich_location_with_diagnostic_path (location, stmt);
>> +  warned = warning_at (richloc, OPT_Warray_bounds_,
>>    "array subscript %E is outside array"
>>    " bounds of %qT", low_sub_org, artype);
>> + }
>>    }
>> 
>>  if (warned)
>> @@ -299,10 +326,14 @@ check_out_of_bounds_and_warn (location_t location, tree ref,
>> {
>> *out_of_bound = true;
>> if (for_array_bound)
>> -    warned = warning_at (location, OPT_Warray_bounds_,
>> +    {
>> +      rich_location *richloc
>> + = build_rich_location_with_diagnostic_path (location, stmt);
>> +      warned = warning_at (richloc, OPT_Warray_bounds_,
>> "array subscript [%E, %E] is outside "
>> "array bounds of %qT",
>> low_sub, up_sub, artype);
>> +    }
>> }
>>    }
>>  else if (up_bound
>> @@ -313,18 +344,26 @@ check_out_of_bounds_and_warn (location_t location, tree ref,
>>    {
>>      *out_of_bound = true;
>>      if (for_array_bound)
>> - warned = warning_at (location, OPT_Warray_bounds_,
>> + {
>> +  rich_location *richloc
>> +    = build_rich_location_with_diagnostic_path (location, stmt);
>> +  warned = warning_at (richloc, OPT_Warray_bounds_,
>>    "array subscript %E is above array bounds of %qT",
>>    up_sub, artype);
>> + }
>>    }
>>  else if (TREE_CODE (low_sub) == INTEGER_CST
>>  && tree_int_cst_lt (low_sub, low_bound))
>>    {
>>      *out_of_bound = true;
>>      if (for_array_bound)
>> - warned = warning_at (location, OPT_Warray_bounds_,
>> + {
>> +  rich_location *richloc
>> +    = build_rich_location_with_diagnostic_path (location, stmt);
>> +  warned = warning_at (richloc, OPT_Warray_bounds_,
>>    "array subscript %E is below array bounds of %qT",
>>    low_sub, artype);
>> + }
>>    }
>>  return warned;
>> }
>> @@ -388,21 +427,24 @@ array_bounds_checker::check_array_ref (location_t location, tree ref,
>> }
>>    }
>> 
>> -  warned = check_out_of_bounds_and_warn (location, ref,
>> +  warned = check_out_of_bounds_and_warn (location, ref, stmt,
>> low_sub_org, low_sub, up_sub,
>> up_bound, up_bound_p1, &vr,
>> ignore_off_by_one, warn_array_bounds,
>> &out_of_bound);
>> 
>> -
>>  if (!warned && sam == special_array_member::int_0)
>> -    warned = warning_at (location, OPT_Wzero_length_bounds,
>> +    {
>> +      rich_location *richloc
>> + = build_rich_location_with_diagnostic_path (location, stmt);
>> +      warned = warning_at (richloc, OPT_Wzero_length_bounds,
>> (TREE_CODE (low_sub) == INTEGER_CST
>> ? G_("array subscript %E is outside the bounds "
>>      "of an interior zero-length array %qT")
>> : G_("array subscript %qE is outside the bounds "
>>      "of an interior zero-length array %qT")),
>> low_sub, artype);
>> +    }
>> 
>>  if (warned && dump_file && (dump_flags & TDF_DETAILS))
>>    {
>> @@ -419,8 +461,10 @@ array_bounds_checker::check_array_ref (location_t location, tree ref,
>> || sam == special_array_member::trail_n)
>>      && DECL_NOT_FLEXARRAY (afield_decl))
>>    {
>> +      rich_location *richloc
>> + = build_rich_location_with_diagnostic_path (location, stmt);
>>      bool warned1
>> - = warning_at (location, OPT_Wstrict_flex_arrays,
>> + = warning_at (richloc, OPT_Wstrict_flex_arrays,
>>     "trailing array %qT should not be used as "
>>     "a flexible array member",
>>     artype);
>> @@ -478,6 +522,7 @@ array_bounds_checker::check_array_ref (location_t location, tree ref,
>> 
>> bool
>> array_bounds_checker::check_mem_ref (location_t location, tree ref,
>> +     gimple *stmt,
>>    bool ignore_off_by_one)
>> {
>>  if (warning_suppressed_p (ref, OPT_Warray_bounds_))
>> @@ -580,16 +625,24 @@ array_bounds_checker::check_mem_ref (location_t location, tree ref,
>>  if (lboob)
>>    {
>>      if (offrange[0] == offrange[1])
>> - warned = warning_at (location, OPT_Warray_bounds_,
>> + {
>> +  rich_location *richloc
>> +    = build_rich_location_with_diagnostic_path (location, stmt);
>> +  warned = warning_at (richloc, OPT_Warray_bounds_,
>>    "array subscript %wi is outside array bounds "
>>    "of %qT",
>>    offrange[0].to_shwi (), reftype);
>> + }
>>      else
>> - warned = warning_at (location, OPT_Warray_bounds_,
>> + {
>> +  rich_location *richloc
>> +    = build_rich_location_with_diagnostic_path (location, stmt);
>> +  warned = warning_at (richloc, OPT_Warray_bounds_,
>>    "array subscript [%wi, %wi] is outside "
>>    "array bounds of %qT",
>>    offrange[0].to_shwi (),
>>    offrange[1].to_shwi (), reftype);
>> + }
>>    }
>>  else if (uboob && !ignore_off_by_one)
>>    {
>> @@ -599,8 +652,9 @@ array_bounds_checker::check_mem_ref (location_t location, tree ref,
>>  it were an untyped array of bytes.  */
>> backtype = build_array_type_nelts (unsigned_char_type_node,
>>  aref.sizrng[1].to_uhwi ());
>> -
>> -      warned = warning_at (location, OPT_Warray_bounds_,
>> +      rich_location *richloc
>> + = build_rich_location_with_diagnostic_path (location, stmt);
>> +      warned = warning_at (richloc, OPT_Warray_bounds_,
>>  "array subscript %<%T[%wi]%> is partly "
>>  "outside array bounds of %qT",
>>  axstype, offrange[0].to_shwi (), backtype);
>> @@ -623,7 +677,9 @@ array_bounds_checker::check_mem_ref (location_t location, tree ref,
>>    {
>>      HOST_WIDE_INT tmpidx = (aref.offmax[i] / eltsize).to_shwi ();
>> 
>> -      if (warning_at (location, OPT_Warray_bounds_,
>> +      rich_location *richloc
>> + = build_rich_location_with_diagnostic_path (location, stmt);
>> +      if (warning_at (richloc, OPT_Warray_bounds_,
>>     "intermediate array offset %wi is outside array bounds "
>>     "of %qT", tmpidx, reftype))
>> {
>> @@ -656,7 +712,7 @@ array_bounds_checker::check_addr_expr (location_t location, tree t,
>> ignore_off_by_one = false;
>> }
>>      else if (TREE_CODE (t) == MEM_REF)
>> - warned = check_mem_ref (location, t, ignore_off_by_one);
>> + warned = check_mem_ref (location, t, stmt, ignore_off_by_one);
>> 
>>      if (warned)
>> suppress_warning (t, OPT_Warray_bounds_);
>> @@ -702,7 +758,9 @@ array_bounds_checker::check_addr_expr (location_t location, tree t,
>> dump_generic_expr (MSG_NOTE, TDF_SLIM, t);
>> fprintf (dump_file, "\n");
>> }
>> -      warned = warning_at (location, OPT_Warray_bounds_,
>> +      rich_location *richloc
>> + = build_rich_location_with_diagnostic_path (location, stmt);
>> +      warned = warning_at (richloc, OPT_Warray_bounds_,
>>  "array subscript %wi is below "
>>  "array bounds of %qT",
>>  idx.to_shwi (), TREE_TYPE (tem));
>> @@ -716,7 +774,9 @@ array_bounds_checker::check_addr_expr (location_t location, tree t,
>> dump_generic_expr (MSG_NOTE, TDF_SLIM, t);
>> fprintf (dump_file, "\n");
>> }
>> -      warned = warning_at (location, OPT_Warray_bounds_,
>> +      rich_location *richloc
>> + = build_rich_location_with_diagnostic_path (location, stmt);
>> +      warned = warning_at (richloc, OPT_Warray_bounds_,
>>  "array subscript %wu is above "
>>  "array bounds of %qT",
>>  idx.to_uhwi (), TREE_TYPE (tem));
>> @@ -811,7 +871,7 @@ array_bounds_checker::check_array_bounds (tree *tp, int *walk_subtree,
>>    warned = checker->check_array_ref (location, t, wi->stmt,
>>      false/*ignore_off_by_one*/);
>>  else if (TREE_CODE (t) == MEM_REF)
>> -    warned = checker->check_mem_ref (location, t,
>> +    warned = checker->check_mem_ref (location, t, wi->stmt,
>>    false /*ignore_off_by_one*/);
>>  else if (TREE_CODE (t) == ADDR_EXPR)
>>    {
>> diff --git a/gcc/gimple-array-bounds.h b/gcc/gimple-array-bounds.h
>> index aa7ca8e9730..2d1d48d1e94 100644
>> --- a/gcc/gimple-array-bounds.h
>> +++ b/gcc/gimple-array-bounds.h
>> @@ -33,7 +33,7 @@ public:
>> private:
>>  static tree check_array_bounds (tree *tp, int *walk_subtree, void *data);
>>  bool check_array_ref (location_t, tree, gimple *, bool ignore_off_by_one);
>> -  bool check_mem_ref (location_t, tree, bool ignore_off_by_one);
>> +  bool check_mem_ref (location_t, tree, gimple *, bool ignore_off_by_one);
>>  void check_addr_expr (location_t, tree, gimple *);
>>  void get_value_range (irange &r, const_tree op, gimple *);
>> 
>> diff --git a/gcc/gimple-iterator.cc b/gcc/gimple-iterator.cc
>> index 93646262eac..472a79360d7 100644
>> --- a/gcc/gimple-iterator.cc
>> +++ b/gcc/gimple-iterator.cc
>> @@ -33,6 +33,7 @@ along with GCC; see the file COPYING3.  If not see
>> #include "tree-ssa.h"
>> #include "value-prof.h"
>> #include "gimplify.h"
>> +#include "diagnostic-copy-history.h"
>> 
>> 
>> /* Mark the statement STMT as modified, and update it.  */
>> @@ -581,6 +582,8 @@ gsi_remove (gimple_stmt_iterator *i, bool remove_permanently)
>> cfun->debug_marker_count--;
>>      require_eh_edge_purge = remove_stmt_from_eh_lp (stmt);
>>      gimple_remove_stmt_histograms (cfun, stmt);
>> +      if (get_copy_history (stmt) != NULL)
>> + remove_copy_history (stmt);
>>    }
>> 
>>  /* Update the iterator and re-wire the links in I->SEQ.  */
>> diff --git a/gcc/testsuite/gcc.dg/pr109071.c b/gcc/testsuite/gcc.dg/pr109071.c
>> new file mode 100644
>> index 00000000000..95ae576451e
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/pr109071.c
>> @@ -0,0 +1,43 @@
>> +/* PR tree-optimization/109071 need more context for -Warray-bounds warnings
>> +   due to code duplication from jump threading.  */  
>> +/* { dg-options "-O2 -Wall -fdiagnostics-explain-harder" } */
>> +/* { dg-additional-options "-fdiagnostics-show-line-numbers -fdiagnostics-path-format=inline-events -fdiagnostics-show-caret" } */
>> +/* { dg-enable-nn-line-numbers "" } */
>> +
>> +extern void warn(void);
>> +static inline void assign(int val, int *regs, int index)
>> +{
>> +  if (index >= 4)
>> +    warn();
>> +  *regs = val;
>> +}
>> +struct nums {int vals[4];};
>> +
>> +void sparx5_set (int *ptr, struct nums *sg, int index)
>> +{
>> +  int *val = &sg->vals[index]; /* { dg-warning "is above array bounds" } */
>> +
>> +  assign(0,    ptr, index);
>> +  assign(*val, ptr, index);
>> +}
>> +/* { dg-begin-multiline-output "" }
>> +   NN |   int *val = &sg->vals[index];
>> +      |               ~~~~~~~~^~~~~~~
>> +  'sparx5_set': events 1-2
>> +   NN |   if (index >= 4)
>> +      |      ^
>> +      |      |
>> +      |      (1) when the condition is evaluated to true
>> +......
>> +   { dg-end-multiline-output "" } */
>> +
>> +/* { dg-begin-multiline-output "" }
>> +      |               ~~~~~~~~~~~~~~~
>> +      |                       |
>> +      |                       (2) out of array bounds here
>> +   { dg-end-multiline-output "" } */
>> +
>> +/* { dg-begin-multiline-output "" }
>> +   NN | struct nums {int vals[4];};
>> +      |                  ^~~~
>> +   { dg-end-multiline-output "" } */
>> diff --git a/gcc/toplev.cc b/gcc/toplev.cc
>> index f715f977b72..e9ae55f64be 100644
>> --- a/gcc/toplev.cc
>> +++ b/gcc/toplev.cc
>> @@ -41,6 +41,7 @@ along with GCC; see the file COPYING3.  If not see
>> #include "cgraph.h"
>> #include "coverage.h"
>> #include "diagnostic.h"
>> +#include "diagnostic-copy-history.h"
>> #include "varasm.h"
>> #include "tree-inline.h"
>> #include "realmpfr.h" /* For GMP/MPFR/MPC versions, in print_version.  */
>> @@ -2381,6 +2382,8 @@ toplev::finalize (void)
>>  tree_cc_finalize ();
>>  reginfo_cc_finalize ();
>> 
>> +  copy_history_finalize ();
>> +
>>  /* save_decoded_options uses opts_obstack, so these must
>>     be cleaned up together.  */
>>  obstack_free (&opts_obstack, NULL);
>> diff --git a/gcc/tree-ssa-threadupdate.cc b/gcc/tree-ssa-threadupdate.cc
>> index fa61ba9512b..9292f0c2aa1 100644
>> --- a/gcc/tree-ssa-threadupdate.cc
>> +++ b/gcc/tree-ssa-threadupdate.cc
>> @@ -36,6 +36,7 @@ along with GCC; see the file COPYING3.  If not see
>> #include "tree-cfg.h"
>> #include "tree-vectorizer.h"
>> #include "tree-pass.h"
>> +#include "diagnostic-copy-history.h"
>> 
>> /* Given a block B, update the CFG and SSA graph to reflect redirecting
>>   one or more in-edges to B to instead reach the destination of an
>> @@ -2371,6 +2372,37 @@ back_jt_path_registry::adjust_paths_after_duplication (unsigned curr_path_num)
>>    }
>> }
>> 
>> +/* Set copy history to all the stmts in the basic block BB based on
>> +   the entry edge ENTRY and whether this basic block is on the true
>> +   path of the condition in the destination of the edge ENTRY.
>> +   Return TRUE when copy history are set, otherwise return FALSE.  */
>> +
>> +static bool
>> +set_copy_history_to_stmts_in_bb (basic_block bb, edge entry, bool is_true_path)
>> +{
>> +  /* First, get the condition statement in the destination of the
>> +     edge ENTRY.  */
>> +  basic_block cond_block = entry->dest;
>> +  gimple *cond_stmt = NULL;
>> +  gimple_stmt_iterator gsi;
>> +  gsi = gsi_last_bb (cond_block);
>> +  /* if the cond_block ends with a conditional statement, get it.  */
>> +  if (!gsi_end_p (gsi)
>> +      && gsi_stmt (gsi)
>> +      && (gimple_code (gsi_stmt (gsi)) == GIMPLE_COND))
>> +    cond_stmt = gsi_stmt (gsi);
>> +
>> +  if (!cond_stmt)
>> +    return false;
>> +
>> +  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
>> +    set_copy_history (gsi_stmt (gsi),
>> +      gimple_location (cond_stmt),
>> +      is_true_path,
>> +      COPY_BY_THREAD_JUMP);
>> +  return true;
>> +}
>> +
>> /* Duplicates a jump-thread path of N_REGION basic blocks.
>>   The ENTRY edge is redirected to the duplicate of the region.
>> 
>> @@ -2419,6 +2451,16 @@ back_jt_path_registry::duplicate_thread_path (edge entry,
>>  copy_bbs (region, n_region, region_copy, &exit, 1, &exit_copy, loop,
>>   split_edge_bb_loc (entry), false);
>> 
>> +  /* Set the copy history for all the stmts in both original and copied
>> +     basic blocks.  The copied regions are in the true path.  */
>> +  if (flag_diagnostics_explain_harder)
>> +    for (i = 0; i < n_region; i++)
>> +      {
>> + set_copy_history_to_stmts_in_bb (region[i], entry, false);
>> + set_copy_history_to_stmts_in_bb (region_copy[i], entry, true);
>> +      }
>> +
>> +
>>  /* Fix up: copy_bbs redirects all edges pointing to copied blocks.  The
>>     following code ensures that all the edges exiting the jump-thread path are
>>     redirected back to the original code: these edges are exceptions
>> -- 
>> 2.31.1
>> 
>
Qing Zhao Aug. 12, 2024, 1:50 p.m. UTC | #3
Hi,  Richard,

Do we still need such improvement into GCC? Could you please take a look at the patch and let me know
Any comment or suggestions?

thanks.

Qing

The 3rd ping for the following patch:

https://gcc.gnu.org/pipermail/gcc-patches/2024-July/657150.html

> On Jul 29, 2024, at 11:32, Qing Zhao <qing.zhao@oracle.com> wrote:
> 
> The 2nd ping for the following patch:
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2024-July/657150.html
> 
> thanks.
> 
> Qing
> 
>> On Jul 22, 2024, at 09:01, Qing Zhao <qing.zhao@oracle.com> wrote:
>> 
>> Hi, Richard,
>> 
>> Could you please take a look at the patch and let me know any comment you have (especially on the middle-end part)?
>> 
>> David, let me know if you have further comment and suggestions. 
>> 
>> Thanks a lot.
>> 
>> Qing
>> 
>>> On Jul 12, 2024, at 10:03, Qing Zhao <qing.zhao@oracle.com> wrote:
>>> 
>>> due to code duplication from jump threading [PR109071]
>>> Control this with a new option -fdiagnostic-explain-harder.
>>> 
>>> Compared to V1, the major difference are: (address David's comments)
>>> 
>>> 1. Change the name of the option from:
>>> 
>>> -fdiagnostic-try-to-explain-harder 
>>> To
>>> -fdiagnostic-explain-harder 
>>> 
>>> 2. Sync the commit comments with the real output of the compilation message.
>>> 
>>> 3. Add one more event in the end of the path to repeat the out-of-bound
>>> issue.
>>> 
>>> 4. Fixed the unnecessary changes in Makefile.in.
>>> 
>>> 5. Add new copy_history_diagnostic_path.[cc|h] to implement a new
>>> class copy_history_diagnostic_path : public diagnostic_path
>>> 
>>> for copy_history_t. 
>>> 
>>> 6. Only building the rich locaiton and populating the path when warning_at
>>> is called.
>>> 
>>> There are two comments from David that I didn't addressed in this version:
>>> 
>>> 1. Make regenerate-opt-urls.
>>> will do this in a later version. 
>>> 
>>> 2. Add a ⚠️  emoji for the last event. 
>>> I didn't add this yet since I think the current message is clear enough.
>>> might not worth the effort to add this emoji (it's not that straightforward
>>> to add on). 
>>> 
>>> With this new version, the message emitted by GCC:
>>> 
>>> $gcc -O2 -Wall -fdiagnostics-explain-harder -c -o t.o t.c
>>> t.c: In function ‘sparx5_set’:
>>> t.c:12:23: warning: array subscript 4 is above array bounds of ‘int[4]’ [-Warray-bounds=]
>>> 12 |   int *val = &sg->vals[index];
>>>    |               ~~~~~~~~^~~~~~~
>>> ‘sparx5_set’: events 1-2
>>>  4 |   if (*index >= 4)
>>>    |      ^
>>>    |      |
>>>    |      (1) when the condition is evaluated to true
>>> ......
>>> 12 |   int *val = &sg->vals[index];
>>>    |               ~~~~~~~~~~~~~~~
>>>    |                       |
>>>    |                       (2) out of array bounds here
>>> t.c:8:18: note: while referencing ‘vals’
>>>  8 | struct nums {int vals[4];};
>>>    |                  ^~~~
>>> 
>>> Bootstrapped and regression tested on both aarch64 and x86. no issues.
>>> 
>>> Let me know any further comments and suggestions.
>>> 
>>> thanks.
>>> 
>>> Qing
>>> 
>>> ==================================================
>>> $ cat t.c
>>> extern void warn(void);
>>> static inline void assign(int val, int *regs, int *index)
>>> {
>>> if (*index >= 4)
>>>  warn();
>>> *regs = val;
>>> }
>>> struct nums {int vals[4];};
>>> 
>>> void sparx5_set (int *ptr, struct nums *sg, int index)
>>> {
>>> int *val = &sg->vals[index];
>>> 
>>> assign(0,    ptr, &index);
>>> assign(*val, ptr, &index);
>>> }
>>> 
>>> $ gcc -Wall -O2  -c -o t.o t.c
>>> t.c: In function ‘sparx5_set’:
>>> t.c:12:23: warning: array subscript 4 is above array bounds of ‘int[4]’ [-Warray-bounds=]
>>> 12 |   int *val = &sg->vals[index];
>>>    |               ~~~~~~~~^~~~~~~
>>> t.c:8:18: note: while referencing ‘vals’
>>>  8 | struct nums {int vals[4];};
>>>    |                  ^~~~
>>> 
>>> In the above, Although the warning is correct in theory, the warning message
>>> itself is confusing to the end-user since there is information that cannot
>>> be connected to the source code directly.
>>> 
>>> It will be a nice improvement to add more information in the warning message
>>> to report where such index value come from.
>>> 
>>> In order to achieve this, we add a new data structure copy_history to record
>>> the condition and the transformation that triggered the code duplication.
>>> Whenever there is a code duplication due to some specific transformations,
>>> such as jump threading, loop switching, etc, a copy_history structure is
>>> created and attached to the duplicated gimple statement.
>>> 
>>> During array out-of-bound checking or other warning checking, the copy_history
>>> that was attached to the gimple statement is used to form a sequence of
>>> diagnostic events that are added to the corresponding rich location to be used
>>> to report the warning message.
>>> 
>>> This behavior is controled by the new option -fdiagnostic-explain-harder
>>> which is off by default.
>>> 
>>> With this change, by adding -fdiagnostic-explain-harder,
>>> the warning message for the above testing case is now:
>>> 
>>> $ gcc -Wall -O2 -fdiagnostics-explain-harder -c -o t.o t.c
>>> t.c: In function ‘sparx5_set’:
>>> t.c:12:23: warning: array subscript 4 is above array bounds of ‘int[4]’ [-Warray-bounds=]
>>> 12 |   int *val = &sg->vals[index];
>>>    |               ~~~~~~~~^~~~~~~
>>> ‘sparx5_set’: events 1-2
>>>  4 |   if (*index >= 4)
>>>    |      ^
>>>    |      |
>>>    |      (1) when the condition is evaluated to true
>>> ......
>>> 12 |   int *val = &sg->vals[index];
>>>    |               ~~~~~~~~~~~~~~~
>>>    |                       |
>>>    |                       (2) out of array bounds here
>>> t.c:8:18: note: while referencing ‘vals’
>>>  8 | struct nums {int vals[4];};
>>>    |                  ^~~~
>>> 
>>> gcc/ChangeLog:
>>> 
>>> PR tree-optimization/109071
>>> 
>>> * Makefile.in (OBJS): Add diagnostic-copy-history.o
>>> and copy-history-diagnostic-path.o.
>>> * gcc/common.opt (fdiagnostics-explain-harder): New option.
>>> * gcc/doc/invoke.texi (fdiagnostics-explain-harder): Add
>>> documentation for the new option.
>>> * gimple-array-bounds.cc (build_rich_location_with_diagnostic_path):
>>> New function.
>>> (check_out_of_bounds_and_warn): Add one new parameter. Use rich
>>> location with copy_history_diagnostic_path for warning_at.
>>> (array_bounds_checker::check_array_ref): Use rich location with
>>> copy_history_diagnostic_path for warning_at.
>>> (array_bounds_checker::check_mem_ref): Add one new parameter.
>>> Use rich location with copy_history_diagnostic_path for warning_at.
>>> (array_bounds_checker::check_addr_expr): Use rich location with
>>> copy_history_diagnostic_path for warning_at.
>>> (array_bounds_checker::check_array_bounds): Call check_mem_ref with
>>> one more parameter.
>>> * gimple-array-bounds.h: Update prototype for check_mem_ref.
>>> * gimple-iterator.cc (gsi_remove): (gsi_remove): Remove the copy
>>> history when removing the gimple.
>>> * toplev.cc (toplev::finalize):  Call copy_history_finalize.
>>> * tree-ssa-threadupdate.cc (set_copy_history_to_stmts_in_bb): New
>>> function.
>>> (back_jt_path_registry::duplicate_thread_path): Call the new function
>>> to set copy history for every duplicated gimple.
>>> * copy-history-diagnostic-path.cc: New file.
>>> * copy-history-diagnostic-path.h: New file.
>>> * diagnostic-copy-history.cc: New file.
>>> * diagnostic-copy-history.h: New file.
>>> 
>>> gcc/testsuite/ChangeLog:
>>> 
>>> PR tree-optimization/109071
>>> 
>>> * gcc.dg/pr109071.c: New test.
>>> ---
>>> gcc/Makefile.in                     |   2 +
>>> gcc/common.opt                      |   4 +
>>> gcc/copy-history-diagnostic-path.cc |  86 +++++++++++++++
>>> gcc/copy-history-diagnostic-path.h  |  87 +++++++++++++++
>>> gcc/diagnostic-copy-history.cc      | 163 ++++++++++++++++++++++++++++
>>> gcc/diagnostic-copy-history.h       |  80 ++++++++++++++
>>> gcc/doc/invoke.texi                 |   7 ++
>>> gcc/gimple-array-bounds.cc          |  94 +++++++++++++---
>>> gcc/gimple-array-bounds.h           |   2 +-
>>> gcc/gimple-iterator.cc              |   3 +
>>> gcc/testsuite/gcc.dg/pr109071.c     |  43 ++++++++
>>> gcc/toplev.cc                       |   3 +
>>> gcc/tree-ssa-threadupdate.cc        |  42 +++++++
>>> 13 files changed, 598 insertions(+), 18 deletions(-)
>>> create mode 100644 gcc/copy-history-diagnostic-path.cc
>>> create mode 100644 gcc/copy-history-diagnostic-path.h
>>> create mode 100644 gcc/diagnostic-copy-history.cc
>>> create mode 100644 gcc/diagnostic-copy-history.h
>>> create mode 100644 gcc/testsuite/gcc.dg/pr109071.c
>>> 
>>> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
>>> index 7d3ea27a6ab..f8bbfda845f 100644
>>> --- a/gcc/Makefile.in
>>> +++ b/gcc/Makefile.in
>>> @@ -1436,6 +1436,8 @@ OBJS = \
>>> df-problems.o \
>>> df-scan.o \
>>> dfp.o \
>>> + diagnostic-copy-history.o \
>>> + copy-history-diagnostic-path.o \
>>> digraph.o \
>>> dojump.o \
>>> dominance.o \
>>> diff --git a/gcc/common.opt b/gcc/common.opt
>>> index a300470bbc5..714280390d9 100644
>>> --- a/gcc/common.opt
>>> +++ b/gcc/common.opt
>>> @@ -1558,6 +1558,10 @@ fdiagnostics-minimum-margin-width=
>>> Common Joined UInteger Var(diagnostics_minimum_margin_width) Init(6)
>>> Set minimum width of left margin of source code when showing source.
>>> 
>>> +fdiagnostics-explain-harder
>>> +Common Var(flag_diagnostics_explain_harder)
>>> +Collect and print more context information for diagnostics.
>>> +
>>> fdisable-
>>> Common Joined RejectNegative Var(common_deferred_options) Defer
>>> -fdisable-[tree|rtl|ipa]-<pass>=range1+range2 Disable an optimization pass.
>>> diff --git a/gcc/copy-history-diagnostic-path.cc b/gcc/copy-history-diagnostic-path.cc
>>> new file mode 100644
>>> index 00000000000..65e5c056165
>>> --- /dev/null
>>> +++ b/gcc/copy-history-diagnostic-path.cc
>>> @@ -0,0 +1,86 @@
>>> +/* Classes for implementing diagnostic paths for copy_history_t.
>>> +   Copyright (C) 2024-2024 Free Software Foundation, Inc.
>>> +   Contributed by Qing Zhao<qing.zhao@oracle.com>
>>> +
>>> +This file is part of GCC.
>>> +
>>> +GCC is free software; you can redistribute it and/or modify it under
>>> +the terms of the GNU General Public License as published by the Free
>>> +Software Foundation; either version 3, or (at your option) any later
>>> +version.
>>> +
>>> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
>>> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
>>> +FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
>>> +for more details.
>>> +
>>> +You should have received a copy of the GNU General Public License
>>> +along with GCC; see the file COPYING3.  If not see
>>> +<http://www.gnu.org/licenses/>.  */
>>> +
>>> +#include "config.h"
>>> +#include "system.h"
>>> +#include "coretypes.h"
>>> +#include "function.h"
>>> +#include "copy-history-diagnostic-path.h"
>>> +
>>> +bool
>>> +copy_history_diagnostic_path::same_function_p (int event_idx_a,
>>> +       int event_idx_b) const
>>> +{
>>> +  return (m_events[event_idx_a]->get_fndecl ()
>>> +  == m_events[event_idx_b]->get_fndecl ());
>>> +}
>>> +
>>> +/* Add an event to this path at LOC within function FNDECL at
>>> +   stack depth DEPTH.
>>> +
>>> +   Use m_context's printer to format FMT, as the text of the new
>>> +   event.  */
>>> +
>>> +void
>>> +copy_history_diagnostic_path::add_event (location_t loc, tree fndecl, int depth,
>>> + const char *fmt, ...)
>>> +{
>>> +  pretty_printer *pp = m_event_pp;
>>> +  pp_clear_output_area (pp);
>>> +
>>> +  rich_location rich_loc (line_table, UNKNOWN_LOCATION);
>>> +
>>> +  va_list ap;
>>> +
>>> +  va_start (ap, fmt);
>>> +
>>> +  text_info ti (fmt, &ap, 0, nullptr, &rich_loc);
>>> +  pp_format (pp, &ti);
>>> +  pp_output_formatted_text (pp);
>>> +
>>> +  va_end (ap);
>>> +
>>> +  simple_diagnostic_event *new_event
>>> +    = new simple_diagnostic_event (loc, fndecl, depth, pp_formatted_text (pp));
>>> +  m_events.safe_push (new_event);
>>> +
>>> +  pp_clear_output_area (pp);
>>> +
>>> +  return;
>>> +}
>>> +
>>> +/* Populate the diagnostic_path from the copy_history.  */
>>> +void
>>> +copy_history_diagnostic_path::populate_path_from_copy_history ()
>>> +{
>>> +  if (!m_copy_history)
>>> +    return;
>>> +
>>> +  for (copy_history_t cur_ch = m_copy_history; cur_ch;
>>> +       cur_ch = cur_ch->prev_copy)
>>> +    add_event (cur_ch->condition, cfun->decl, 1,
>>> +       "when the condition is evaluated to %s",
>>> +       cur_ch->is_true_path ? "true" : "false");
>>> +
>>> +  /* Add an end of path warning event in the end of the path.  */
>>> +  add_event (m_location, cfun->decl, 1,
>>> +     "out of array bounds here");
>>> +  return;
>>> +}
>>> diff --git a/gcc/copy-history-diagnostic-path.h b/gcc/copy-history-diagnostic-path.h
>>> new file mode 100644
>>> index 00000000000..377e9136070
>>> --- /dev/null
>>> +++ b/gcc/copy-history-diagnostic-path.h
>>> @@ -0,0 +1,87 @@
>>> +/* Classes for implementing diagnostic paths for copy_history_t.
>>> +   Copyright (C) 2024-2024 Free Software Foundation, Inc.
>>> +   Contributed by Qing Zhao<qing.zhao@oracle.com>
>>> +
>>> +This file is part of GCC.
>>> +
>>> +GCC is free software; you can redistribute it and/or modify it under
>>> +the terms of the GNU General Public License as published by the Free
>>> +Software Foundation; either version 3, or (at your option) any later
>>> +version.
>>> +
>>> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
>>> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
>>> +FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
>>> +for more details.
>>> +
>>> +You should have received a copy of the GNU General Public License
>>> +along with GCC; see the file COPYING3.  If not see
>>> +<http://www.gnu.org/licenses/>.  */
>>> +
>>> +#ifndef GCC_COPY_HISTORY_DIAGNOSTIC_PATH_H
>>> +#define GCC_COPY_HISTORY_DIAGNOSTIC_PATH_H
>>> +
>>> +#include "diagnostic-path.h"
>>> +#include "simple-diagnostic-path.h"
>>> +#include "diagnostic-copy-history.h"
>>> +
>>> +/* An implementation of diagnostic_path for copy_history_t, as a
>>> +   vector of simple_diagnostic_event instances.  */
>>> +
>>> +class copy_history_diagnostic_path : public diagnostic_path
>>> +{
>>> + public:
>>> +  copy_history_diagnostic_path (pretty_printer *event_pp,
>>> + copy_history_t cp_history,
>>> + location_t loc)
>>> +  : diagnostic_path (),
>>> +    m_thread ("main"),
>>> +    m_event_pp (event_pp),
>>> +    m_copy_history (cp_history),
>>> +    m_location (loc)
>>> +  {}
>>> +
>>> +  unsigned num_events () const final override
>>> +  {
>>> +    return m_events.length ();
>>> +  }
>>> +  const diagnostic_event & get_event (int idx) const final override
>>> +  {
>>> +    return *m_events[idx];
>>> +  }
>>> +  unsigned num_threads () const final override
>>> +  {
>>> +    return 1;
>>> +  }
>>> +  const diagnostic_thread &
>>> +  get_thread (diagnostic_thread_id_t) const final override
>>> +  {
>>> +    return m_thread;
>>> +  }
>>> +  bool
>>> +  same_function_p (int event_idx_a,
>>> +   int event_idx_b) const final override;
>>> +
>>> +  void add_event (location_t loc, tree fndecl, int depth,
>>> +  const char *fmt, ...);
>>> +
>>> +  void populate_path_from_copy_history ();
>>> +
>>> + private:
>>> +  simple_diagnostic_thread m_thread;
>>> +
>>> +  /* The events that have occurred along this path.  */
>>> +  auto_delete_vec<simple_diagnostic_event> m_events;
>>> +
>>> +  pretty_printer *m_event_pp;
>>> +
>>> +  /* The copy_history associated with this path.  */
>>> +  copy_history_t m_copy_history;
>>> +
>>> +  /* The location for the gimple statement where the
>>> +     diagnostic message emitted.  */
>>> +  location_t m_location;
>>> +
>>> +};
>>> +
>>> +#endif /* ! GCC_COPY_HISTORY_DIAGNOSTIC_PATH_H */
>>> diff --git a/gcc/diagnostic-copy-history.cc b/gcc/diagnostic-copy-history.cc
>>> new file mode 100644
>>> index 00000000000..2733d137622
>>> --- /dev/null
>>> +++ b/gcc/diagnostic-copy-history.cc
>>> @@ -0,0 +1,163 @@
>>> +/* Functions to handle copy history.
>>> +   Copyright (C) 2024-2024 Free Software Foundation, Inc.
>>> +   Contributed by Qing Zhao <qing.zhao@oracle.com>
>>> +
>>> +   This file is part of GCC.
>>> +
>>> +   GCC is free software; you can redistribute it and/or modify it under
>>> +   the terms of the GNU General Public License as published by the Free
>>> +   Software Foundation; either version 3, or (at your option) any later
>>> +   version.
>>> +
>>> +   GCC is distributed in the hope that it will be useful, but WITHOUT ANY
>>> +   WARRANTY; without even the implied warranty of MERCHANTABILITY or
>>> +   FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
>>> +   for more details.
>>> +
>>> +   You should have received a copy of the GNU General Public License
>>> +   along with GCC; see the file COPYING3.  If not see
>>> +   <http://www.gnu.org/licenses/>.  */
>>> +
>>> +#include "config.h"
>>> +#include "system.h"
>>> +#include "coretypes.h"
>>> +#include "backend.h"
>>> +#include "tree.h"
>>> +#include "diagnostic-copy-history.h"
>>> +
>>> +/* A mapping from a gimple to a pointer to the copy history of it.  */
>>> +static copy_history_map_t *copy_history_map;
>>> +
>>> +/* Obstack for copy history.  */
>>> +static struct obstack copy_history_obstack;
>>> +
>>> +/* Create a new copy history.  */
>>> +
>>> +copy_history_t
>>> +create_copy_history (location_t condition,
>>> +     bool is_true_path,
>>> +     enum copy_reason reason,
>>> +     copy_history_t prev_copy)
>>> +{
>>> +  static bool initialized = false;
>>> +
>>> +  if (!initialized)
>>> +    {
>>> +      gcc_obstack_init (&copy_history_obstack);
>>> +      initialized = true;
>>> +    }
>>> +
>>> +  copy_history_t copy_history
>>> +    = (copy_history_t) obstack_alloc (&copy_history_obstack,
>>> +      sizeof (struct copy_history));
>>> +  copy_history->condition = condition;
>>> +  copy_history->is_true_path = is_true_path;
>>> +  copy_history->reason = reason;
>>> +  copy_history->prev_copy = prev_copy;
>>> +  return copy_history;
>>> +}
>>> +
>>> +/* Insert the copy history for the gimple STMT assuming the linked list
>>> +   of CP_HISTORY does not have duplications.  It's the caller's
>>> +   responsibility to make sure that the linked list of CP_HISTORY does
>>> +   not have duplications.  */
>>> +
>>> +void
>>> +insert_copy_history (gimple *stmt, copy_history_t cp_history)
>>> +{
>>> +  if (!copy_history_map)
>>> +    copy_history_map = new copy_history_map_t;
>>> +
>>> +  copy_history_map->put (stmt, cp_history);
>>> +  return;
>>> +}
>>> +
>>> +/* Get the copy history for the gimple STMT, return NULL when there is
>>> +   no associated copy history.  */
>>> +
>>> +copy_history_t
>>> +get_copy_history (gimple *stmt)
>>> +{
>>> +  if (!copy_history_map)
>>> +    return NULL;
>>> +
>>> +  if (const copy_history_t *cp_history_p = copy_history_map->get (stmt))
>>> +    return *cp_history_p;
>>> +
>>> +  return NULL;
>>> +}
>>> +
>>> +/* Remove the copy history for STMT.  */
>>> +
>>> +void
>>> +remove_copy_history (gimple *stmt)
>>> +{
>>> +  if (!copy_history_map)
>>> +    return;
>>> +  copy_history_map->remove (stmt);
>>> +  return;
>>> +}
>>> +
>>> +/* Check whether the cond_location, is_true_path and reason existed
>>> + * in the OLD_COPY_HISTORY.  */
>>> +
>>> +static bool
>>> +is_copy_history_existed (location_t cond_location, bool is_true_path,
>>> + enum copy_reason reason,
>>> + copy_history_t old_copy_history)
>>> +{
>>> +  for (copy_history_t cur_ch = old_copy_history; cur_ch;
>>> +       cur_ch = cur_ch->prev_copy)
>>> +    if ((cur_ch->condition == cond_location)
>>> + && (cur_ch->is_true_path == is_true_path)
>>> + && (cur_ch->reason == reason))
>>> +      return true;
>>> +
>>> +  return false;
>>> +}
>>> +
>>> +/* Set copy history for the gimple STMT.  Return TRUE when a new copy
>>> + * history is created and inserted.  Otherwise return FALSE.  */
>>> +
>>> +bool
>>> +set_copy_history (gimple *stmt, location_t cond_location,
>>> +  bool is_true_path, enum copy_reason reason)
>>> +{
>>> +
>>> +  /* First, get the old copy history associated with this STMT.  */
>>> +  copy_history_t old_cp_history = get_copy_history (stmt);
>>> +
>>> +  /* If the current copy history is not in the STMT's copy history linked
>>> +     list yet, create the new copy history, put the old_copy_history as the
>>> +     prev_copy of it.  */
>>> +  copy_history_t new_cp_history = NULL;
>>> +  if (!is_copy_history_existed (cond_location, is_true_path,
>>> + reason, old_cp_history))
>>> +    new_cp_history
>>> +      = create_copy_history (cond_location, is_true_path,
>>> +     reason, old_cp_history);
>>> +
>>> +  /* Insert the copy history into the hash map.  */
>>> +  if (new_cp_history)
>>> +    {
>>> +      insert_copy_history (stmt, new_cp_history);
>>> +      return true;
>>> +    }
>>> +
>>> +  return false;
>>> +}
>>> +
>>> +/* Reset all state for diagnostic-copy-history.cc so that we can rerun the
>>> +   compiler within the same process.  For use by toplev::finalize.  */
>>> +
>>> +void
>>> +copy_history_finalize (void)
>>> +{
>>> +  if (copy_history_map)
>>> +    {
>>> +      delete copy_history_map;
>>> +      copy_history_map = NULL;
>>> +    }
>>> +  obstack_free (&copy_history_obstack, NULL);
>>> +  return;
>>> +}
>>> diff --git a/gcc/diagnostic-copy-history.h b/gcc/diagnostic-copy-history.h
>>> new file mode 100644
>>> index 00000000000..0c2c55a2a81
>>> --- /dev/null
>>> +++ b/gcc/diagnostic-copy-history.h
>>> @@ -0,0 +1,80 @@
>>> +/* Copy history associated with a gimple statement to record its history
>>> +   of duplications due to different transformations.
>>> +   The copy history will be used to construct events for later diagnostic.
>>> +
>>> +   Copyright (C) 2024-2024 Free Software Foundation, Inc.
>>> +   Contributed by Qing Zhao <qing.zhao@oracle.com>
>>> +
>>> +   This file is part of GCC.
>>> +
>>> +   GCC is free software; you can redistribute it and/or modify it under
>>> +   the terms of the GNU General Public License as published by the Free
>>> +   Software Foundation; either version 3, or (at your option) any later
>>> +   version.
>>> +
>>> +   GCC is distributed in the hope that it will be useful, but WITHOUT ANY
>>> +   WARRANTY; without even the implied warranty of MERCHANTABILITY or
>>> +   FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
>>> +   for more details.
>>> +
>>> +   You should have received a copy of the GNU General Public License
>>> +   along with GCC; see the file COPYING3.  If not see
>>> +   <http://www.gnu.org/licenses/>.  */
>>> +
>>> +#ifndef DIAGNOSTIC_COPY_HISTORY_H_INCLUDED
>>> +#define DIAGNOSTIC_COPY_HISTORY_H_INCLUDED
>>> +
>>> +#include "hash-map.h"
>>> +#include "line-map.h"
>>> +
>>> +/* An enum for the reason why this copy is made.  Right now, there are
>>> +   three reasons, we can add more if needed.  */
>>> +enum copy_reason {
>>> +  COPY_BY_THREAD_JUMP,
>>> +  COPY_BY_LOOP_UNSWITCH,
>>> +  COPY_BY_LOOP_UNROLL
>>> +};
>>> +
>>> +/* This data structure records the information when a statement is
>>> +   duplicated during different transformations.  Such information
>>> +   will be used by the later diagnostic messages to report more
>>> +   contexts of the warnings or errors.  */
>>> +struct copy_history {
>>> +  /* The location of the condition statement that triggered the code
>>> +     duplication.  */
>>> +  location_t condition;
>>> +
>>> +  /* Whether this copy is on the TRUE path of the condition.  */
>>> +  bool is_true_path;
>>> +
>>> +  /* The reason for the code duplication.  */
>>> +  enum copy_reason reason;
>>> +
>>> +  /* This statement itself might be a previous code duplication.  */
>>> +  struct copy_history *prev_copy;
>>> +};
>>> +
>>> +typedef struct copy_history *copy_history_t;
>>> +
>>> +/* Create a new copy history.  */
>>> +extern copy_history_t create_copy_history (location_t, bool,
>>> +   enum copy_reason, copy_history_t);
>>> +
>>> +typedef hash_map<gimple *, copy_history_t> copy_history_map_t;
>>> +
>>> +/* Get the copy history for the gimple STMT, return NULL when there is
>>> +   no associated copy history.  */
>>> +extern copy_history_t get_copy_history (gimple *);
>>> +
>>> +/* Remove the copy history for STMT from the copy_history_map.  */
>>> +extern void remove_copy_history (gimple *);
>>> +
>>> +/* Set copy history for the gimple STMT.  */
>>> +extern bool set_copy_history (gimple *, location_t,
>>> +      bool, enum copy_reason);
>>> +
>>> +/* Reset all state for diagnostic-copy-history.cc so that we can rerun the
>>> +   compiler within the same process.  For use by toplev::finalize.  */
>>> +extern void copy_history_finalize (void);
>>> +
>>> +#endif // DIAGNOSTIC_COPY_HISTORY_H_INCLUDED
>>> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
>>> index b37c7af7a39..bb0c6a87523 100644
>>> --- a/gcc/doc/invoke.texi
>>> +++ b/gcc/doc/invoke.texi
>>> @@ -321,6 +321,7 @@ Objective-C and Objective-C++ Dialects}.
>>> -fdiagnostics-column-origin=@var{origin}
>>> -fdiagnostics-escape-format=@r{[}unicode@r{|}bytes@r{]}
>>> -fdiagnostics-text-art-charset=@r{[}none@r{|}ascii@r{|}unicode@r{|}emoji@r{]}}
>>> +-fdiagnostics-explain-harder
>>> 
>>> @item Warning Options
>>> @xref{Warning Options,,Options to Request or Suppress Warnings}.
>>> @@ -5518,6 +5519,12 @@ left margin.
>>> This option controls the minimum width of the left margin printed by
>>> @option{-fdiagnostics-show-line-numbers}.  It defaults to 6.
>>> 
>>> +@opindex fdiagnostics-explain-harder
>>> +@item -fdiagnostics-explain-harder
>>> +With this option, the compiler collects more context information for
>>> +diagnostics and emits them to the users to provide more hints on how
>>> +the diagnostics come from.
>>> +
>>> @opindex fdiagnostics-parseable-fixits
>>> @item -fdiagnostics-parseable-fixits
>>> Emit fix-it hints in a machine-parseable format, suitable for consumption
>>> diff --git a/gcc/gimple-array-bounds.cc b/gcc/gimple-array-bounds.cc
>>> index 1637a2fc4f4..ba98ab3b413 100644
>>> --- a/gcc/gimple-array-bounds.cc
>>> +++ b/gcc/gimple-array-bounds.cc
>>> @@ -31,6 +31,9 @@ along with GCC; see the file COPYING3.  If not see
>>> #include "tree-dfa.h"
>>> #include "fold-const.h"
>>> #include "diagnostic-core.h"
>>> +#include "simple-diagnostic-path.h"
>>> +#include "diagnostic-copy-history.h"
>>> +#include "copy-history-diagnostic-path.h"
>>> #include "intl.h"
>>> #include "tree-vrp.h"
>>> #include "alloc-pool.h"
>>> @@ -254,6 +257,25 @@ get_up_bounds_for_array_ref (tree ref, tree *decl,
>>> return;
>>> }
>>> 
>>> +/* Build a rich location for LOCATION, and populate the diagonistic path
>>> +   for it.  */
>>> +
>>> +static rich_location *
>>> +build_rich_location_with_diagnostic_path (location_t location, gimple *stmt)
>>> +{
>>> +  /* Generate a rich location for this location.  */
>>> +  rich_location *richloc = new rich_location (line_table, location);
>>> +
>>> +  copy_history_t cp_history = stmt ? get_copy_history (stmt) : NULL;
>>> +  copy_history_diagnostic_path *path
>>> +    = new copy_history_diagnostic_path (global_dc->printer,
>>> + cp_history, location);
>>> +  path->populate_path_from_copy_history ();
>>> +
>>> +  richloc->set_path (path);
>>> +  return richloc;
>>> +}
>>> +
>>> /* Given the LOW_SUB_ORG, LOW_SUB and UP_SUB, and the computed UP_BOUND
>>>  and UP_BOUND_P1, check whether the array reference REF is out of bound.
>>>  When out of bounds, set OUT_OF_BOUND to true.
>>> @@ -262,6 +284,7 @@ get_up_bounds_for_array_ref (tree ref, tree *decl,
>>> 
>>> static bool
>>> check_out_of_bounds_and_warn (location_t location, tree ref,
>>> +      gimple *stmt,
>>>    tree low_sub_org, tree low_sub, tree up_sub,
>>>    tree up_bound, tree up_bound_p1,
>>>    const irange *vr,
>>> @@ -280,9 +303,13 @@ check_out_of_bounds_and_warn (location_t location, tree ref,
>>>   {
>>>     *out_of_bound = true;
>>>     if (for_array_bound)
>>> - warned = warning_at (location, OPT_Warray_bounds_,
>>> + {
>>> +  rich_location *richloc
>>> +    = build_rich_location_with_diagnostic_path (location, stmt);
>>> +  warned = warning_at (richloc, OPT_Warray_bounds_,
>>>   "array subscript %E is outside array"
>>>   " bounds of %qT", low_sub_org, artype);
>>> + }
>>>   }
>>> 
>>> if (warned)
>>> @@ -299,10 +326,14 @@ check_out_of_bounds_and_warn (location_t location, tree ref,
>>> {
>>> *out_of_bound = true;
>>> if (for_array_bound)
>>> -    warned = warning_at (location, OPT_Warray_bounds_,
>>> +    {
>>> +      rich_location *richloc
>>> + = build_rich_location_with_diagnostic_path (location, stmt);
>>> +      warned = warning_at (richloc, OPT_Warray_bounds_,
>>> "array subscript [%E, %E] is outside "
>>> "array bounds of %qT",
>>> low_sub, up_sub, artype);
>>> +    }
>>> }
>>>   }
>>> else if (up_bound
>>> @@ -313,18 +344,26 @@ check_out_of_bounds_and_warn (location_t location, tree ref,
>>>   {
>>>     *out_of_bound = true;
>>>     if (for_array_bound)
>>> - warned = warning_at (location, OPT_Warray_bounds_,
>>> + {
>>> +  rich_location *richloc
>>> +    = build_rich_location_with_diagnostic_path (location, stmt);
>>> +  warned = warning_at (richloc, OPT_Warray_bounds_,
>>>   "array subscript %E is above array bounds of %qT",
>>>   up_sub, artype);
>>> + }
>>>   }
>>> else if (TREE_CODE (low_sub) == INTEGER_CST
>>> && tree_int_cst_lt (low_sub, low_bound))
>>>   {
>>>     *out_of_bound = true;
>>>     if (for_array_bound)
>>> - warned = warning_at (location, OPT_Warray_bounds_,
>>> + {
>>> +  rich_location *richloc
>>> +    = build_rich_location_with_diagnostic_path (location, stmt);
>>> +  warned = warning_at (richloc, OPT_Warray_bounds_,
>>>   "array subscript %E is below array bounds of %qT",
>>>   low_sub, artype);
>>> + }
>>>   }
>>> return warned;
>>> }
>>> @@ -388,21 +427,24 @@ array_bounds_checker::check_array_ref (location_t location, tree ref,
>>> }
>>>   }
>>> 
>>> -  warned = check_out_of_bounds_and_warn (location, ref,
>>> +  warned = check_out_of_bounds_and_warn (location, ref, stmt,
>>> low_sub_org, low_sub, up_sub,
>>> up_bound, up_bound_p1, &vr,
>>> ignore_off_by_one, warn_array_bounds,
>>> &out_of_bound);
>>> 
>>> -
>>> if (!warned && sam == special_array_member::int_0)
>>> -    warned = warning_at (location, OPT_Wzero_length_bounds,
>>> +    {
>>> +      rich_location *richloc
>>> + = build_rich_location_with_diagnostic_path (location, stmt);
>>> +      warned = warning_at (richloc, OPT_Wzero_length_bounds,
>>> (TREE_CODE (low_sub) == INTEGER_CST
>>> ? G_("array subscript %E is outside the bounds "
>>>     "of an interior zero-length array %qT")
>>> : G_("array subscript %qE is outside the bounds "
>>>     "of an interior zero-length array %qT")),
>>> low_sub, artype);
>>> +    }
>>> 
>>> if (warned && dump_file && (dump_flags & TDF_DETAILS))
>>>   {
>>> @@ -419,8 +461,10 @@ array_bounds_checker::check_array_ref (location_t location, tree ref,
>>> || sam == special_array_member::trail_n)
>>>     && DECL_NOT_FLEXARRAY (afield_decl))
>>>   {
>>> +      rich_location *richloc
>>> + = build_rich_location_with_diagnostic_path (location, stmt);
>>>     bool warned1
>>> - = warning_at (location, OPT_Wstrict_flex_arrays,
>>> + = warning_at (richloc, OPT_Wstrict_flex_arrays,
>>>    "trailing array %qT should not be used as "
>>>    "a flexible array member",
>>>    artype);
>>> @@ -478,6 +522,7 @@ array_bounds_checker::check_array_ref (location_t location, tree ref,
>>> 
>>> bool
>>> array_bounds_checker::check_mem_ref (location_t location, tree ref,
>>> +     gimple *stmt,
>>>   bool ignore_off_by_one)
>>> {
>>> if (warning_suppressed_p (ref, OPT_Warray_bounds_))
>>> @@ -580,16 +625,24 @@ array_bounds_checker::check_mem_ref (location_t location, tree ref,
>>> if (lboob)
>>>   {
>>>     if (offrange[0] == offrange[1])
>>> - warned = warning_at (location, OPT_Warray_bounds_,
>>> + {
>>> +  rich_location *richloc
>>> +    = build_rich_location_with_diagnostic_path (location, stmt);
>>> +  warned = warning_at (richloc, OPT_Warray_bounds_,
>>>   "array subscript %wi is outside array bounds "
>>>   "of %qT",
>>>   offrange[0].to_shwi (), reftype);
>>> + }
>>>     else
>>> - warned = warning_at (location, OPT_Warray_bounds_,
>>> + {
>>> +  rich_location *richloc
>>> +    = build_rich_location_with_diagnostic_path (location, stmt);
>>> +  warned = warning_at (richloc, OPT_Warray_bounds_,
>>>   "array subscript [%wi, %wi] is outside "
>>>   "array bounds of %qT",
>>>   offrange[0].to_shwi (),
>>>   offrange[1].to_shwi (), reftype);
>>> + }
>>>   }
>>> else if (uboob && !ignore_off_by_one)
>>>   {
>>> @@ -599,8 +652,9 @@ array_bounds_checker::check_mem_ref (location_t location, tree ref,
>>> it were an untyped array of bytes.  */
>>> backtype = build_array_type_nelts (unsigned_char_type_node,
>>> aref.sizrng[1].to_uhwi ());
>>> -
>>> -      warned = warning_at (location, OPT_Warray_bounds_,
>>> +      rich_location *richloc
>>> + = build_rich_location_with_diagnostic_path (location, stmt);
>>> +      warned = warning_at (richloc, OPT_Warray_bounds_,
>>> "array subscript %<%T[%wi]%> is partly "
>>> "outside array bounds of %qT",
>>> axstype, offrange[0].to_shwi (), backtype);
>>> @@ -623,7 +677,9 @@ array_bounds_checker::check_mem_ref (location_t location, tree ref,
>>>   {
>>>     HOST_WIDE_INT tmpidx = (aref.offmax[i] / eltsize).to_shwi ();
>>> 
>>> -      if (warning_at (location, OPT_Warray_bounds_,
>>> +      rich_location *richloc
>>> + = build_rich_location_with_diagnostic_path (location, stmt);
>>> +      if (warning_at (richloc, OPT_Warray_bounds_,
>>>    "intermediate array offset %wi is outside array bounds "
>>>    "of %qT", tmpidx, reftype))
>>> {
>>> @@ -656,7 +712,7 @@ array_bounds_checker::check_addr_expr (location_t location, tree t,
>>> ignore_off_by_one = false;
>>> }
>>>     else if (TREE_CODE (t) == MEM_REF)
>>> - warned = check_mem_ref (location, t, ignore_off_by_one);
>>> + warned = check_mem_ref (location, t, stmt, ignore_off_by_one);
>>> 
>>>     if (warned)
>>> suppress_warning (t, OPT_Warray_bounds_);
>>> @@ -702,7 +758,9 @@ array_bounds_checker::check_addr_expr (location_t location, tree t,
>>> dump_generic_expr (MSG_NOTE, TDF_SLIM, t);
>>> fprintf (dump_file, "\n");
>>> }
>>> -      warned = warning_at (location, OPT_Warray_bounds_,
>>> +      rich_location *richloc
>>> + = build_rich_location_with_diagnostic_path (location, stmt);
>>> +      warned = warning_at (richloc, OPT_Warray_bounds_,
>>> "array subscript %wi is below "
>>> "array bounds of %qT",
>>> idx.to_shwi (), TREE_TYPE (tem));
>>> @@ -716,7 +774,9 @@ array_bounds_checker::check_addr_expr (location_t location, tree t,
>>> dump_generic_expr (MSG_NOTE, TDF_SLIM, t);
>>> fprintf (dump_file, "\n");
>>> }
>>> -      warned = warning_at (location, OPT_Warray_bounds_,
>>> +      rich_location *richloc
>>> + = build_rich_location_with_diagnostic_path (location, stmt);
>>> +      warned = warning_at (richloc, OPT_Warray_bounds_,
>>> "array subscript %wu is above "
>>> "array bounds of %qT",
>>> idx.to_uhwi (), TREE_TYPE (tem));
>>> @@ -811,7 +871,7 @@ array_bounds_checker::check_array_bounds (tree *tp, int *walk_subtree,
>>>   warned = checker->check_array_ref (location, t, wi->stmt,
>>>     false/*ignore_off_by_one*/);
>>> else if (TREE_CODE (t) == MEM_REF)
>>> -    warned = checker->check_mem_ref (location, t,
>>> +    warned = checker->check_mem_ref (location, t, wi->stmt,
>>>   false /*ignore_off_by_one*/);
>>> else if (TREE_CODE (t) == ADDR_EXPR)
>>>   {
>>> diff --git a/gcc/gimple-array-bounds.h b/gcc/gimple-array-bounds.h
>>> index aa7ca8e9730..2d1d48d1e94 100644
>>> --- a/gcc/gimple-array-bounds.h
>>> +++ b/gcc/gimple-array-bounds.h
>>> @@ -33,7 +33,7 @@ public:
>>> private:
>>> static tree check_array_bounds (tree *tp, int *walk_subtree, void *data);
>>> bool check_array_ref (location_t, tree, gimple *, bool ignore_off_by_one);
>>> -  bool check_mem_ref (location_t, tree, bool ignore_off_by_one);
>>> +  bool check_mem_ref (location_t, tree, gimple *, bool ignore_off_by_one);
>>> void check_addr_expr (location_t, tree, gimple *);
>>> void get_value_range (irange &r, const_tree op, gimple *);
>>> 
>>> diff --git a/gcc/gimple-iterator.cc b/gcc/gimple-iterator.cc
>>> index 93646262eac..472a79360d7 100644
>>> --- a/gcc/gimple-iterator.cc
>>> +++ b/gcc/gimple-iterator.cc
>>> @@ -33,6 +33,7 @@ along with GCC; see the file COPYING3.  If not see
>>> #include "tree-ssa.h"
>>> #include "value-prof.h"
>>> #include "gimplify.h"
>>> +#include "diagnostic-copy-history.h"
>>> 
>>> 
>>> /* Mark the statement STMT as modified, and update it.  */
>>> @@ -581,6 +582,8 @@ gsi_remove (gimple_stmt_iterator *i, bool remove_permanently)
>>> cfun->debug_marker_count--;
>>>     require_eh_edge_purge = remove_stmt_from_eh_lp (stmt);
>>>     gimple_remove_stmt_histograms (cfun, stmt);
>>> +      if (get_copy_history (stmt) != NULL)
>>> + remove_copy_history (stmt);
>>>   }
>>> 
>>> /* Update the iterator and re-wire the links in I->SEQ.  */
>>> diff --git a/gcc/testsuite/gcc.dg/pr109071.c b/gcc/testsuite/gcc.dg/pr109071.c
>>> new file mode 100644
>>> index 00000000000..95ae576451e
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.dg/pr109071.c
>>> @@ -0,0 +1,43 @@
>>> +/* PR tree-optimization/109071 need more context for -Warray-bounds warnings
>>> +   due to code duplication from jump threading.  */  
>>> +/* { dg-options "-O2 -Wall -fdiagnostics-explain-harder" } */
>>> +/* { dg-additional-options "-fdiagnostics-show-line-numbers -fdiagnostics-path-format=inline-events -fdiagnostics-show-caret" } */
>>> +/* { dg-enable-nn-line-numbers "" } */
>>> +
>>> +extern void warn(void);
>>> +static inline void assign(int val, int *regs, int index)
>>> +{
>>> +  if (index >= 4)
>>> +    warn();
>>> +  *regs = val;
>>> +}
>>> +struct nums {int vals[4];};
>>> +
>>> +void sparx5_set (int *ptr, struct nums *sg, int index)
>>> +{
>>> +  int *val = &sg->vals[index]; /* { dg-warning "is above array bounds" } */
>>> +
>>> +  assign(0,    ptr, index);
>>> +  assign(*val, ptr, index);
>>> +}
>>> +/* { dg-begin-multiline-output "" }
>>> +   NN |   int *val = &sg->vals[index];
>>> +      |               ~~~~~~~~^~~~~~~
>>> +  'sparx5_set': events 1-2
>>> +   NN |   if (index >= 4)
>>> +      |      ^
>>> +      |      |
>>> +      |      (1) when the condition is evaluated to true
>>> +......
>>> +   { dg-end-multiline-output "" } */
>>> +
>>> +/* { dg-begin-multiline-output "" }
>>> +      |               ~~~~~~~~~~~~~~~
>>> +      |                       |
>>> +      |                       (2) out of array bounds here
>>> +   { dg-end-multiline-output "" } */
>>> +
>>> +/* { dg-begin-multiline-output "" }
>>> +   NN | struct nums {int vals[4];};
>>> +      |                  ^~~~
>>> +   { dg-end-multiline-output "" } */
>>> diff --git a/gcc/toplev.cc b/gcc/toplev.cc
>>> index f715f977b72..e9ae55f64be 100644
>>> --- a/gcc/toplev.cc
>>> +++ b/gcc/toplev.cc
>>> @@ -41,6 +41,7 @@ along with GCC; see the file COPYING3.  If not see
>>> #include "cgraph.h"
>>> #include "coverage.h"
>>> #include "diagnostic.h"
>>> +#include "diagnostic-copy-history.h"
>>> #include "varasm.h"
>>> #include "tree-inline.h"
>>> #include "realmpfr.h" /* For GMP/MPFR/MPC versions, in print_version.  */
>>> @@ -2381,6 +2382,8 @@ toplev::finalize (void)
>>> tree_cc_finalize ();
>>> reginfo_cc_finalize ();
>>> 
>>> +  copy_history_finalize ();
>>> +
>>> /* save_decoded_options uses opts_obstack, so these must
>>>    be cleaned up together.  */
>>> obstack_free (&opts_obstack, NULL);
>>> diff --git a/gcc/tree-ssa-threadupdate.cc b/gcc/tree-ssa-threadupdate.cc
>>> index fa61ba9512b..9292f0c2aa1 100644
>>> --- a/gcc/tree-ssa-threadupdate.cc
>>> +++ b/gcc/tree-ssa-threadupdate.cc
>>> @@ -36,6 +36,7 @@ along with GCC; see the file COPYING3.  If not see
>>> #include "tree-cfg.h"
>>> #include "tree-vectorizer.h"
>>> #include "tree-pass.h"
>>> +#include "diagnostic-copy-history.h"
>>> 
>>> /* Given a block B, update the CFG and SSA graph to reflect redirecting
>>>  one or more in-edges to B to instead reach the destination of an
>>> @@ -2371,6 +2372,37 @@ back_jt_path_registry::adjust_paths_after_duplication (unsigned curr_path_num)
>>>   }
>>> }
>>> 
>>> +/* Set copy history to all the stmts in the basic block BB based on
>>> +   the entry edge ENTRY and whether this basic block is on the true
>>> +   path of the condition in the destination of the edge ENTRY.
>>> +   Return TRUE when copy history are set, otherwise return FALSE.  */
>>> +
>>> +static bool
>>> +set_copy_history_to_stmts_in_bb (basic_block bb, edge entry, bool is_true_path)
>>> +{
>>> +  /* First, get the condition statement in the destination of the
>>> +     edge ENTRY.  */
>>> +  basic_block cond_block = entry->dest;
>>> +  gimple *cond_stmt = NULL;
>>> +  gimple_stmt_iterator gsi;
>>> +  gsi = gsi_last_bb (cond_block);
>>> +  /* if the cond_block ends with a conditional statement, get it.  */
>>> +  if (!gsi_end_p (gsi)
>>> +      && gsi_stmt (gsi)
>>> +      && (gimple_code (gsi_stmt (gsi)) == GIMPLE_COND))
>>> +    cond_stmt = gsi_stmt (gsi);
>>> +
>>> +  if (!cond_stmt)
>>> +    return false;
>>> +
>>> +  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
>>> +    set_copy_history (gsi_stmt (gsi),
>>> +      gimple_location (cond_stmt),
>>> +      is_true_path,
>>> +      COPY_BY_THREAD_JUMP);
>>> +  return true;
>>> +}
>>> +
>>> /* Duplicates a jump-thread path of N_REGION basic blocks.
>>>  The ENTRY edge is redirected to the duplicate of the region.
>>> 
>>> @@ -2419,6 +2451,16 @@ back_jt_path_registry::duplicate_thread_path (edge entry,
>>> copy_bbs (region, n_region, region_copy, &exit, 1, &exit_copy, loop,
>>>  split_edge_bb_loc (entry), false);
>>> 
>>> +  /* Set the copy history for all the stmts in both original and copied
>>> +     basic blocks.  The copied regions are in the true path.  */
>>> +  if (flag_diagnostics_explain_harder)
>>> +    for (i = 0; i < n_region; i++)
>>> +      {
>>> + set_copy_history_to_stmts_in_bb (region[i], entry, false);
>>> + set_copy_history_to_stmts_in_bb (region_copy[i], entry, true);
>>> +      }
>>> +
>>> +
>>> /* Fix up: copy_bbs redirects all edges pointing to copied blocks.  The
>>>    following code ensures that all the edges exiting the jump-thread path are
>>>    redirected back to the original code: these edges are exceptions
>>> -- 
>>> 2.31.1
>>> 
>> 
>
Qing Zhao Sept. 3, 2024, 1:55 p.m. UTC | #4
Hi, Richard, 

I’d like to ping this patch again. 

For the convenience, the original 2nd version of the patch is at:
https://gcc.gnu.org/pipermail/gcc-patches/2024-July/657150.html

The diagnostic part has been reviewed by David. Could you please take a look at
the middle end implementation and let me know whether it’s okay for committing?

It has been waiting for the middle-end review for 2 months already. The implementation is based on what you suggested during the discussion of this problem. 

Thanks.

Qing

> On Aug 12, 2024, at 09:50, Qing Zhao <qing.zhao@oracle.com> wrote:
> 
> Hi,  Richard,
> 
> Do we still need such improvement into GCC? Could you please take a look at the patch and let me know
> Any comment or suggestions?
> 
> thanks.
> 
> Qing
> 
> The 3rd ping for the following patch:
> 
> https://gcc.gnu.org/pipermail/gcc-patches/2024-July/657150.html
> 
>> On Jul 29, 2024, at 11:32, Qing Zhao <qing.zhao@oracle.com> wrote:
>> 
>> The 2nd ping for the following patch:
>> 
>> https://gcc.gnu.org/pipermail/gcc-patches/2024-July/657150.html
>> 
>> thanks.
>> 
>> Qing
>> 
>>> On Jul 22, 2024, at 09:01, Qing Zhao <qing.zhao@oracle.com> wrote:
>>> 
>>> Hi, Richard,
>>> 
>>> Could you please take a look at the patch and let me know any comment you have (especially on the middle-end part)?
>>> 
>>> David, let me know if you have further comment and suggestions. 
>>> 
>>> Thanks a lot.
>>> 
>>> Qing
>>> 
>>>> On Jul 12, 2024, at 10:03, Qing Zhao <qing.zhao@oracle.com> wrote:
>>>> 
>>>> due to code duplication from jump threading [PR109071]
>>>> Control this with a new option -fdiagnostic-explain-harder.
>>>> 
>>>> Compared to V1, the major difference are: (address David's comments)
>>>> 
>>>> 1. Change the name of the option from:
>>>> 
>>>> -fdiagnostic-try-to-explain-harder 
>>>> To
>>>> -fdiagnostic-explain-harder 
>>>> 
>>>> 2. Sync the commit comments with the real output of the compilation message.
>>>> 
>>>> 3. Add one more event in the end of the path to repeat the out-of-bound
>>>> issue.
>>>> 
>>>> 4. Fixed the unnecessary changes in Makefile.in.
>>>> 
>>>> 5. Add new copy_history_diagnostic_path.[cc|h] to implement a new
>>>> class copy_history_diagnostic_path : public diagnostic_path
>>>> 
>>>> for copy_history_t. 
>>>> 
>>>> 6. Only building the rich locaiton and populating the path when warning_at
>>>> is called.
>>>> 
>>>> There are two comments from David that I didn't addressed in this version:
>>>> 
>>>> 1. Make regenerate-opt-urls.
>>>> will do this in a later version. 
>>>> 
>>>> 2. Add a ⚠️  emoji for the last event. 
>>>> I didn't add this yet since I think the current message is clear enough.
>>>> might not worth the effort to add this emoji (it's not that straightforward
>>>> to add on). 
>>>> 
>>>> With this new version, the message emitted by GCC:
>>>> 
>>>> $gcc -O2 -Wall -fdiagnostics-explain-harder -c -o t.o t.c
>>>> t.c: In function ‘sparx5_set’:
>>>> t.c:12:23: warning: array subscript 4 is above array bounds of ‘int[4]’ [-Warray-bounds=]
>>>> 12 |   int *val = &sg->vals[index];
>>>>   |               ~~~~~~~~^~~~~~~
>>>> ‘sparx5_set’: events 1-2
>>>> 4 |   if (*index >= 4)
>>>>   |      ^
>>>>   |      |
>>>>   |      (1) when the condition is evaluated to true
>>>> ......
>>>> 12 |   int *val = &sg->vals[index];
>>>>   |               ~~~~~~~~~~~~~~~
>>>>   |                       |
>>>>   |                       (2) out of array bounds here
>>>> t.c:8:18: note: while referencing ‘vals’
>>>> 8 | struct nums {int vals[4];};
>>>>   |                  ^~~~
>>>> 
>>>> Bootstrapped and regression tested on both aarch64 and x86. no issues.
>>>> 
>>>> Let me know any further comments and suggestions.
>>>> 
>>>> thanks.
>>>> 
>>>> Qing
>>>> 
>>>> ==================================================
>>>> $ cat t.c
>>>> extern void warn(void);
>>>> static inline void assign(int val, int *regs, int *index)
>>>> {
>>>> if (*index >= 4)
>>>> warn();
>>>> *regs = val;
>>>> }
>>>> struct nums {int vals[4];};
>>>> 
>>>> void sparx5_set (int *ptr, struct nums *sg, int index)
>>>> {
>>>> int *val = &sg->vals[index];
>>>> 
>>>> assign(0,    ptr, &index);
>>>> assign(*val, ptr, &index);
>>>> }
>>>> 
>>>> $ gcc -Wall -O2  -c -o t.o t.c
>>>> t.c: In function ‘sparx5_set’:
>>>> t.c:12:23: warning: array subscript 4 is above array bounds of ‘int[4]’ [-Warray-bounds=]
>>>> 12 |   int *val = &sg->vals[index];
>>>>   |               ~~~~~~~~^~~~~~~
>>>> t.c:8:18: note: while referencing ‘vals’
>>>> 8 | struct nums {int vals[4];};
>>>>   |                  ^~~~
>>>> 
>>>> In the above, Although the warning is correct in theory, the warning message
>>>> itself is confusing to the end-user since there is information that cannot
>>>> be connected to the source code directly.
>>>> 
>>>> It will be a nice improvement to add more information in the warning message
>>>> to report where such index value come from.
>>>> 
>>>> In order to achieve this, we add a new data structure copy_history to record
>>>> the condition and the transformation that triggered the code duplication.
>>>> Whenever there is a code duplication due to some specific transformations,
>>>> such as jump threading, loop switching, etc, a copy_history structure is
>>>> created and attached to the duplicated gimple statement.
>>>> 
>>>> During array out-of-bound checking or other warning checking, the copy_history
>>>> that was attached to the gimple statement is used to form a sequence of
>>>> diagnostic events that are added to the corresponding rich location to be used
>>>> to report the warning message.
>>>> 
>>>> This behavior is controled by the new option -fdiagnostic-explain-harder
>>>> which is off by default.
>>>> 
>>>> With this change, by adding -fdiagnostic-explain-harder,
>>>> the warning message for the above testing case is now:
>>>> 
>>>> $ gcc -Wall -O2 -fdiagnostics-explain-harder -c -o t.o t.c
>>>> t.c: In function ‘sparx5_set’:
>>>> t.c:12:23: warning: array subscript 4 is above array bounds of ‘int[4]’ [-Warray-bounds=]
>>>> 12 |   int *val = &sg->vals[index];
>>>>   |               ~~~~~~~~^~~~~~~
>>>> ‘sparx5_set’: events 1-2
>>>> 4 |   if (*index >= 4)
>>>>   |      ^
>>>>   |      |
>>>>   |      (1) when the condition is evaluated to true
>>>> ......
>>>> 12 |   int *val = &sg->vals[index];
>>>>   |               ~~~~~~~~~~~~~~~
>>>>   |                       |
>>>>   |                       (2) out of array bounds here
>>>> t.c:8:18: note: while referencing ‘vals’
>>>> 8 | struct nums {int vals[4];};
>>>>   |                  ^~~~
>>>> 
>>>> gcc/ChangeLog:
>>>> 
>>>> PR tree-optimization/109071
>>>> 
>>>> * Makefile.in (OBJS): Add diagnostic-copy-history.o
>>>> and copy-history-diagnostic-path.o.
>>>> * gcc/common.opt (fdiagnostics-explain-harder): New option.
>>>> * gcc/doc/invoke.texi (fdiagnostics-explain-harder): Add
>>>> documentation for the new option.
>>>> * gimple-array-bounds.cc (build_rich_location_with_diagnostic_path):
>>>> New function.
>>>> (check_out_of_bounds_and_warn): Add one new parameter. Use rich
>>>> location with copy_history_diagnostic_path for warning_at.
>>>> (array_bounds_checker::check_array_ref): Use rich location with
>>>> copy_history_diagnostic_path for warning_at.
>>>> (array_bounds_checker::check_mem_ref): Add one new parameter.
>>>> Use rich location with copy_history_diagnostic_path for warning_at.
>>>> (array_bounds_checker::check_addr_expr): Use rich location with
>>>> copy_history_diagnostic_path for warning_at.
>>>> (array_bounds_checker::check_array_bounds): Call check_mem_ref with
>>>> one more parameter.
>>>> * gimple-array-bounds.h: Update prototype for check_mem_ref.
>>>> * gimple-iterator.cc (gsi_remove): (gsi_remove): Remove the copy
>>>> history when removing the gimple.
>>>> * toplev.cc (toplev::finalize):  Call copy_history_finalize.
>>>> * tree-ssa-threadupdate.cc (set_copy_history_to_stmts_in_bb): New
>>>> function.
>>>> (back_jt_path_registry::duplicate_thread_path): Call the new function
>>>> to set copy history for every duplicated gimple.
>>>> * copy-history-diagnostic-path.cc: New file.
>>>> * copy-history-diagnostic-path.h: New file.
>>>> * diagnostic-copy-history.cc: New file.
>>>> * diagnostic-copy-history.h: New file.
>>>> 
>>>> gcc/testsuite/ChangeLog:
>>>> 
>>>> PR tree-optimization/109071
>>>> 
>>>> * gcc.dg/pr109071.c: New test.
>>>> ---
>>>> gcc/Makefile.in                     |   2 +
>>>> gcc/common.opt                      |   4 +
>>>> gcc/copy-history-diagnostic-path.cc |  86 +++++++++++++++
>>>> gcc/copy-history-diagnostic-path.h  |  87 +++++++++++++++
>>>> gcc/diagnostic-copy-history.cc      | 163 ++++++++++++++++++++++++++++
>>>> gcc/diagnostic-copy-history.h       |  80 ++++++++++++++
>>>> gcc/doc/invoke.texi                 |   7 ++
>>>> gcc/gimple-array-bounds.cc          |  94 +++++++++++++---
>>>> gcc/gimple-array-bounds.h           |   2 +-
>>>> gcc/gimple-iterator.cc              |   3 +
>>>> gcc/testsuite/gcc.dg/pr109071.c     |  43 ++++++++
>>>> gcc/toplev.cc                       |   3 +
>>>> gcc/tree-ssa-threadupdate.cc        |  42 +++++++
>>>> 13 files changed, 598 insertions(+), 18 deletions(-)
>>>> create mode 100644 gcc/copy-history-diagnostic-path.cc
>>>> create mode 100644 gcc/copy-history-diagnostic-path.h
>>>> create mode 100644 gcc/diagnostic-copy-history.cc
>>>> create mode 100644 gcc/diagnostic-copy-history.h
>>>> create mode 100644 gcc/testsuite/gcc.dg/pr109071.c
>>>> 
>>>> diff --git a/gcc/Makefile.in b/gcc/Makefile.in
>>>> index 7d3ea27a6ab..f8bbfda845f 100644
>>>> --- a/gcc/Makefile.in
>>>> +++ b/gcc/Makefile.in
>>>> @@ -1436,6 +1436,8 @@ OBJS = \
>>>> df-problems.o \
>>>> df-scan.o \
>>>> dfp.o \
>>>> + diagnostic-copy-history.o \
>>>> + copy-history-diagnostic-path.o \
>>>> digraph.o \
>>>> dojump.o \
>>>> dominance.o \
>>>> diff --git a/gcc/common.opt b/gcc/common.opt
>>>> index a300470bbc5..714280390d9 100644
>>>> --- a/gcc/common.opt
>>>> +++ b/gcc/common.opt
>>>> @@ -1558,6 +1558,10 @@ fdiagnostics-minimum-margin-width=
>>>> Common Joined UInteger Var(diagnostics_minimum_margin_width) Init(6)
>>>> Set minimum width of left margin of source code when showing source.
>>>> 
>>>> +fdiagnostics-explain-harder
>>>> +Common Var(flag_diagnostics_explain_harder)
>>>> +Collect and print more context information for diagnostics.
>>>> +
>>>> fdisable-
>>>> Common Joined RejectNegative Var(common_deferred_options) Defer
>>>> -fdisable-[tree|rtl|ipa]-<pass>=range1+range2 Disable an optimization pass.
>>>> diff --git a/gcc/copy-history-diagnostic-path.cc b/gcc/copy-history-diagnostic-path.cc
>>>> new file mode 100644
>>>> index 00000000000..65e5c056165
>>>> --- /dev/null
>>>> +++ b/gcc/copy-history-diagnostic-path.cc
>>>> @@ -0,0 +1,86 @@
>>>> +/* Classes for implementing diagnostic paths for copy_history_t.
>>>> +   Copyright (C) 2024-2024 Free Software Foundation, Inc.
>>>> +   Contributed by Qing Zhao<qing.zhao@oracle.com>
>>>> +
>>>> +This file is part of GCC.
>>>> +
>>>> +GCC is free software; you can redistribute it and/or modify it under
>>>> +the terms of the GNU General Public License as published by the Free
>>>> +Software Foundation; either version 3, or (at your option) any later
>>>> +version.
>>>> +
>>>> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
>>>> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
>>>> +FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
>>>> +for more details.
>>>> +
>>>> +You should have received a copy of the GNU General Public License
>>>> +along with GCC; see the file COPYING3.  If not see
>>>> +<http://www.gnu.org/licenses/>.  */
>>>> +
>>>> +#include "config.h"
>>>> +#include "system.h"
>>>> +#include "coretypes.h"
>>>> +#include "function.h"
>>>> +#include "copy-history-diagnostic-path.h"
>>>> +
>>>> +bool
>>>> +copy_history_diagnostic_path::same_function_p (int event_idx_a,
>>>> +       int event_idx_b) const
>>>> +{
>>>> +  return (m_events[event_idx_a]->get_fndecl ()
>>>> +  == m_events[event_idx_b]->get_fndecl ());
>>>> +}
>>>> +
>>>> +/* Add an event to this path at LOC within function FNDECL at
>>>> +   stack depth DEPTH.
>>>> +
>>>> +   Use m_context's printer to format FMT, as the text of the new
>>>> +   event.  */
>>>> +
>>>> +void
>>>> +copy_history_diagnostic_path::add_event (location_t loc, tree fndecl, int depth,
>>>> + const char *fmt, ...)
>>>> +{
>>>> +  pretty_printer *pp = m_event_pp;
>>>> +  pp_clear_output_area (pp);
>>>> +
>>>> +  rich_location rich_loc (line_table, UNKNOWN_LOCATION);
>>>> +
>>>> +  va_list ap;
>>>> +
>>>> +  va_start (ap, fmt);
>>>> +
>>>> +  text_info ti (fmt, &ap, 0, nullptr, &rich_loc);
>>>> +  pp_format (pp, &ti);
>>>> +  pp_output_formatted_text (pp);
>>>> +
>>>> +  va_end (ap);
>>>> +
>>>> +  simple_diagnostic_event *new_event
>>>> +    = new simple_diagnostic_event (loc, fndecl, depth, pp_formatted_text (pp));
>>>> +  m_events.safe_push (new_event);
>>>> +
>>>> +  pp_clear_output_area (pp);
>>>> +
>>>> +  return;
>>>> +}
>>>> +
>>>> +/* Populate the diagnostic_path from the copy_history.  */
>>>> +void
>>>> +copy_history_diagnostic_path::populate_path_from_copy_history ()
>>>> +{
>>>> +  if (!m_copy_history)
>>>> +    return;
>>>> +
>>>> +  for (copy_history_t cur_ch = m_copy_history; cur_ch;
>>>> +       cur_ch = cur_ch->prev_copy)
>>>> +    add_event (cur_ch->condition, cfun->decl, 1,
>>>> +       "when the condition is evaluated to %s",
>>>> +       cur_ch->is_true_path ? "true" : "false");
>>>> +
>>>> +  /* Add an end of path warning event in the end of the path.  */
>>>> +  add_event (m_location, cfun->decl, 1,
>>>> +     "out of array bounds here");
>>>> +  return;
>>>> +}
>>>> diff --git a/gcc/copy-history-diagnostic-path.h b/gcc/copy-history-diagnostic-path.h
>>>> new file mode 100644
>>>> index 00000000000..377e9136070
>>>> --- /dev/null
>>>> +++ b/gcc/copy-history-diagnostic-path.h
>>>> @@ -0,0 +1,87 @@
>>>> +/* Classes for implementing diagnostic paths for copy_history_t.
>>>> +   Copyright (C) 2024-2024 Free Software Foundation, Inc.
>>>> +   Contributed by Qing Zhao<qing.zhao@oracle.com>
>>>> +
>>>> +This file is part of GCC.
>>>> +
>>>> +GCC is free software; you can redistribute it and/or modify it under
>>>> +the terms of the GNU General Public License as published by the Free
>>>> +Software Foundation; either version 3, or (at your option) any later
>>>> +version.
>>>> +
>>>> +GCC is distributed in the hope that it will be useful, but WITHOUT ANY
>>>> +WARRANTY; without even the implied warranty of MERCHANTABILITY or
>>>> +FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
>>>> +for more details.
>>>> +
>>>> +You should have received a copy of the GNU General Public License
>>>> +along with GCC; see the file COPYING3.  If not see
>>>> +<http://www.gnu.org/licenses/>.  */
>>>> +
>>>> +#ifndef GCC_COPY_HISTORY_DIAGNOSTIC_PATH_H
>>>> +#define GCC_COPY_HISTORY_DIAGNOSTIC_PATH_H
>>>> +
>>>> +#include "diagnostic-path.h"
>>>> +#include "simple-diagnostic-path.h"
>>>> +#include "diagnostic-copy-history.h"
>>>> +
>>>> +/* An implementation of diagnostic_path for copy_history_t, as a
>>>> +   vector of simple_diagnostic_event instances.  */
>>>> +
>>>> +class copy_history_diagnostic_path : public diagnostic_path
>>>> +{
>>>> + public:
>>>> +  copy_history_diagnostic_path (pretty_printer *event_pp,
>>>> + copy_history_t cp_history,
>>>> + location_t loc)
>>>> +  : diagnostic_path (),
>>>> +    m_thread ("main"),
>>>> +    m_event_pp (event_pp),
>>>> +    m_copy_history (cp_history),
>>>> +    m_location (loc)
>>>> +  {}
>>>> +
>>>> +  unsigned num_events () const final override
>>>> +  {
>>>> +    return m_events.length ();
>>>> +  }
>>>> +  const diagnostic_event & get_event (int idx) const final override
>>>> +  {
>>>> +    return *m_events[idx];
>>>> +  }
>>>> +  unsigned num_threads () const final override
>>>> +  {
>>>> +    return 1;
>>>> +  }
>>>> +  const diagnostic_thread &
>>>> +  get_thread (diagnostic_thread_id_t) const final override
>>>> +  {
>>>> +    return m_thread;
>>>> +  }
>>>> +  bool
>>>> +  same_function_p (int event_idx_a,
>>>> +   int event_idx_b) const final override;
>>>> +
>>>> +  void add_event (location_t loc, tree fndecl, int depth,
>>>> +  const char *fmt, ...);
>>>> +
>>>> +  void populate_path_from_copy_history ();
>>>> +
>>>> + private:
>>>> +  simple_diagnostic_thread m_thread;
>>>> +
>>>> +  /* The events that have occurred along this path.  */
>>>> +  auto_delete_vec<simple_diagnostic_event> m_events;
>>>> +
>>>> +  pretty_printer *m_event_pp;
>>>> +
>>>> +  /* The copy_history associated with this path.  */
>>>> +  copy_history_t m_copy_history;
>>>> +
>>>> +  /* The location for the gimple statement where the
>>>> +     diagnostic message emitted.  */
>>>> +  location_t m_location;
>>>> +
>>>> +};
>>>> +
>>>> +#endif /* ! GCC_COPY_HISTORY_DIAGNOSTIC_PATH_H */
>>>> diff --git a/gcc/diagnostic-copy-history.cc b/gcc/diagnostic-copy-history.cc
>>>> new file mode 100644
>>>> index 00000000000..2733d137622
>>>> --- /dev/null
>>>> +++ b/gcc/diagnostic-copy-history.cc
>>>> @@ -0,0 +1,163 @@
>>>> +/* Functions to handle copy history.
>>>> +   Copyright (C) 2024-2024 Free Software Foundation, Inc.
>>>> +   Contributed by Qing Zhao <qing.zhao@oracle.com>
>>>> +
>>>> +   This file is part of GCC.
>>>> +
>>>> +   GCC is free software; you can redistribute it and/or modify it under
>>>> +   the terms of the GNU General Public License as published by the Free
>>>> +   Software Foundation; either version 3, or (at your option) any later
>>>> +   version.
>>>> +
>>>> +   GCC is distributed in the hope that it will be useful, but WITHOUT ANY
>>>> +   WARRANTY; without even the implied warranty of MERCHANTABILITY or
>>>> +   FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
>>>> +   for more details.
>>>> +
>>>> +   You should have received a copy of the GNU General Public License
>>>> +   along with GCC; see the file COPYING3.  If not see
>>>> +   <http://www.gnu.org/licenses/>.  */
>>>> +
>>>> +#include "config.h"
>>>> +#include "system.h"
>>>> +#include "coretypes.h"
>>>> +#include "backend.h"
>>>> +#include "tree.h"
>>>> +#include "diagnostic-copy-history.h"
>>>> +
>>>> +/* A mapping from a gimple to a pointer to the copy history of it.  */
>>>> +static copy_history_map_t *copy_history_map;
>>>> +
>>>> +/* Obstack for copy history.  */
>>>> +static struct obstack copy_history_obstack;
>>>> +
>>>> +/* Create a new copy history.  */
>>>> +
>>>> +copy_history_t
>>>> +create_copy_history (location_t condition,
>>>> +     bool is_true_path,
>>>> +     enum copy_reason reason,
>>>> +     copy_history_t prev_copy)
>>>> +{
>>>> +  static bool initialized = false;
>>>> +
>>>> +  if (!initialized)
>>>> +    {
>>>> +      gcc_obstack_init (&copy_history_obstack);
>>>> +      initialized = true;
>>>> +    }
>>>> +
>>>> +  copy_history_t copy_history
>>>> +    = (copy_history_t) obstack_alloc (&copy_history_obstack,
>>>> +      sizeof (struct copy_history));
>>>> +  copy_history->condition = condition;
>>>> +  copy_history->is_true_path = is_true_path;
>>>> +  copy_history->reason = reason;
>>>> +  copy_history->prev_copy = prev_copy;
>>>> +  return copy_history;
>>>> +}
>>>> +
>>>> +/* Insert the copy history for the gimple STMT assuming the linked list
>>>> +   of CP_HISTORY does not have duplications.  It's the caller's
>>>> +   responsibility to make sure that the linked list of CP_HISTORY does
>>>> +   not have duplications.  */
>>>> +
>>>> +void
>>>> +insert_copy_history (gimple *stmt, copy_history_t cp_history)
>>>> +{
>>>> +  if (!copy_history_map)
>>>> +    copy_history_map = new copy_history_map_t;
>>>> +
>>>> +  copy_history_map->put (stmt, cp_history);
>>>> +  return;
>>>> +}
>>>> +
>>>> +/* Get the copy history for the gimple STMT, return NULL when there is
>>>> +   no associated copy history.  */
>>>> +
>>>> +copy_history_t
>>>> +get_copy_history (gimple *stmt)
>>>> +{
>>>> +  if (!copy_history_map)
>>>> +    return NULL;
>>>> +
>>>> +  if (const copy_history_t *cp_history_p = copy_history_map->get (stmt))
>>>> +    return *cp_history_p;
>>>> +
>>>> +  return NULL;
>>>> +}
>>>> +
>>>> +/* Remove the copy history for STMT.  */
>>>> +
>>>> +void
>>>> +remove_copy_history (gimple *stmt)
>>>> +{
>>>> +  if (!copy_history_map)
>>>> +    return;
>>>> +  copy_history_map->remove (stmt);
>>>> +  return;
>>>> +}
>>>> +
>>>> +/* Check whether the cond_location, is_true_path and reason existed
>>>> + * in the OLD_COPY_HISTORY.  */
>>>> +
>>>> +static bool
>>>> +is_copy_history_existed (location_t cond_location, bool is_true_path,
>>>> + enum copy_reason reason,
>>>> + copy_history_t old_copy_history)
>>>> +{
>>>> +  for (copy_history_t cur_ch = old_copy_history; cur_ch;
>>>> +       cur_ch = cur_ch->prev_copy)
>>>> +    if ((cur_ch->condition == cond_location)
>>>> + && (cur_ch->is_true_path == is_true_path)
>>>> + && (cur_ch->reason == reason))
>>>> +      return true;
>>>> +
>>>> +  return false;
>>>> +}
>>>> +
>>>> +/* Set copy history for the gimple STMT.  Return TRUE when a new copy
>>>> + * history is created and inserted.  Otherwise return FALSE.  */
>>>> +
>>>> +bool
>>>> +set_copy_history (gimple *stmt, location_t cond_location,
>>>> +  bool is_true_path, enum copy_reason reason)
>>>> +{
>>>> +
>>>> +  /* First, get the old copy history associated with this STMT.  */
>>>> +  copy_history_t old_cp_history = get_copy_history (stmt);
>>>> +
>>>> +  /* If the current copy history is not in the STMT's copy history linked
>>>> +     list yet, create the new copy history, put the old_copy_history as the
>>>> +     prev_copy of it.  */
>>>> +  copy_history_t new_cp_history = NULL;
>>>> +  if (!is_copy_history_existed (cond_location, is_true_path,
>>>> + reason, old_cp_history))
>>>> +    new_cp_history
>>>> +      = create_copy_history (cond_location, is_true_path,
>>>> +     reason, old_cp_history);
>>>> +
>>>> +  /* Insert the copy history into the hash map.  */
>>>> +  if (new_cp_history)
>>>> +    {
>>>> +      insert_copy_history (stmt, new_cp_history);
>>>> +      return true;
>>>> +    }
>>>> +
>>>> +  return false;
>>>> +}
>>>> +
>>>> +/* Reset all state for diagnostic-copy-history.cc so that we can rerun the
>>>> +   compiler within the same process.  For use by toplev::finalize.  */
>>>> +
>>>> +void
>>>> +copy_history_finalize (void)
>>>> +{
>>>> +  if (copy_history_map)
>>>> +    {
>>>> +      delete copy_history_map;
>>>> +      copy_history_map = NULL;
>>>> +    }
>>>> +  obstack_free (&copy_history_obstack, NULL);
>>>> +  return;
>>>> +}
>>>> diff --git a/gcc/diagnostic-copy-history.h b/gcc/diagnostic-copy-history.h
>>>> new file mode 100644
>>>> index 00000000000..0c2c55a2a81
>>>> --- /dev/null
>>>> +++ b/gcc/diagnostic-copy-history.h
>>>> @@ -0,0 +1,80 @@
>>>> +/* Copy history associated with a gimple statement to record its history
>>>> +   of duplications due to different transformations.
>>>> +   The copy history will be used to construct events for later diagnostic.
>>>> +
>>>> +   Copyright (C) 2024-2024 Free Software Foundation, Inc.
>>>> +   Contributed by Qing Zhao <qing.zhao@oracle.com>
>>>> +
>>>> +   This file is part of GCC.
>>>> +
>>>> +   GCC is free software; you can redistribute it and/or modify it under
>>>> +   the terms of the GNU General Public License as published by the Free
>>>> +   Software Foundation; either version 3, or (at your option) any later
>>>> +   version.
>>>> +
>>>> +   GCC is distributed in the hope that it will be useful, but WITHOUT ANY
>>>> +   WARRANTY; without even the implied warranty of MERCHANTABILITY or
>>>> +   FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
>>>> +   for more details.
>>>> +
>>>> +   You should have received a copy of the GNU General Public License
>>>> +   along with GCC; see the file COPYING3.  If not see
>>>> +   <http://www.gnu.org/licenses/>.  */
>>>> +
>>>> +#ifndef DIAGNOSTIC_COPY_HISTORY_H_INCLUDED
>>>> +#define DIAGNOSTIC_COPY_HISTORY_H_INCLUDED
>>>> +
>>>> +#include "hash-map.h"
>>>> +#include "line-map.h"
>>>> +
>>>> +/* An enum for the reason why this copy is made.  Right now, there are
>>>> +   three reasons, we can add more if needed.  */
>>>> +enum copy_reason {
>>>> +  COPY_BY_THREAD_JUMP,
>>>> +  COPY_BY_LOOP_UNSWITCH,
>>>> +  COPY_BY_LOOP_UNROLL
>>>> +};
>>>> +
>>>> +/* This data structure records the information when a statement is
>>>> +   duplicated during different transformations.  Such information
>>>> +   will be used by the later diagnostic messages to report more
>>>> +   contexts of the warnings or errors.  */
>>>> +struct copy_history {
>>>> +  /* The location of the condition statement that triggered the code
>>>> +     duplication.  */
>>>> +  location_t condition;
>>>> +
>>>> +  /* Whether this copy is on the TRUE path of the condition.  */
>>>> +  bool is_true_path;
>>>> +
>>>> +  /* The reason for the code duplication.  */
>>>> +  enum copy_reason reason;
>>>> +
>>>> +  /* This statement itself might be a previous code duplication.  */
>>>> +  struct copy_history *prev_copy;
>>>> +};
>>>> +
>>>> +typedef struct copy_history *copy_history_t;
>>>> +
>>>> +/* Create a new copy history.  */
>>>> +extern copy_history_t create_copy_history (location_t, bool,
>>>> +   enum copy_reason, copy_history_t);
>>>> +
>>>> +typedef hash_map<gimple *, copy_history_t> copy_history_map_t;
>>>> +
>>>> +/* Get the copy history for the gimple STMT, return NULL when there is
>>>> +   no associated copy history.  */
>>>> +extern copy_history_t get_copy_history (gimple *);
>>>> +
>>>> +/* Remove the copy history for STMT from the copy_history_map.  */
>>>> +extern void remove_copy_history (gimple *);
>>>> +
>>>> +/* Set copy history for the gimple STMT.  */
>>>> +extern bool set_copy_history (gimple *, location_t,
>>>> +      bool, enum copy_reason);
>>>> +
>>>> +/* Reset all state for diagnostic-copy-history.cc so that we can rerun the
>>>> +   compiler within the same process.  For use by toplev::finalize.  */
>>>> +extern void copy_history_finalize (void);
>>>> +
>>>> +#endif // DIAGNOSTIC_COPY_HISTORY_H_INCLUDED
>>>> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
>>>> index b37c7af7a39..bb0c6a87523 100644
>>>> --- a/gcc/doc/invoke.texi
>>>> +++ b/gcc/doc/invoke.texi
>>>> @@ -321,6 +321,7 @@ Objective-C and Objective-C++ Dialects}.
>>>> -fdiagnostics-column-origin=@var{origin}
>>>> -fdiagnostics-escape-format=@r{[}unicode@r{|}bytes@r{]}
>>>> -fdiagnostics-text-art-charset=@r{[}none@r{|}ascii@r{|}unicode@r{|}emoji@r{]}}
>>>> +-fdiagnostics-explain-harder
>>>> 
>>>> @item Warning Options
>>>> @xref{Warning Options,,Options to Request or Suppress Warnings}.
>>>> @@ -5518,6 +5519,12 @@ left margin.
>>>> This option controls the minimum width of the left margin printed by
>>>> @option{-fdiagnostics-show-line-numbers}.  It defaults to 6.
>>>> 
>>>> +@opindex fdiagnostics-explain-harder
>>>> +@item -fdiagnostics-explain-harder
>>>> +With this option, the compiler collects more context information for
>>>> +diagnostics and emits them to the users to provide more hints on how
>>>> +the diagnostics come from.
>>>> +
>>>> @opindex fdiagnostics-parseable-fixits
>>>> @item -fdiagnostics-parseable-fixits
>>>> Emit fix-it hints in a machine-parseable format, suitable for consumption
>>>> diff --git a/gcc/gimple-array-bounds.cc b/gcc/gimple-array-bounds.cc
>>>> index 1637a2fc4f4..ba98ab3b413 100644
>>>> --- a/gcc/gimple-array-bounds.cc
>>>> +++ b/gcc/gimple-array-bounds.cc
>>>> @@ -31,6 +31,9 @@ along with GCC; see the file COPYING3.  If not see
>>>> #include "tree-dfa.h"
>>>> #include "fold-const.h"
>>>> #include "diagnostic-core.h"
>>>> +#include "simple-diagnostic-path.h"
>>>> +#include "diagnostic-copy-history.h"
>>>> +#include "copy-history-diagnostic-path.h"
>>>> #include "intl.h"
>>>> #include "tree-vrp.h"
>>>> #include "alloc-pool.h"
>>>> @@ -254,6 +257,25 @@ get_up_bounds_for_array_ref (tree ref, tree *decl,
>>>> return;
>>>> }
>>>> 
>>>> +/* Build a rich location for LOCATION, and populate the diagonistic path
>>>> +   for it.  */
>>>> +
>>>> +static rich_location *
>>>> +build_rich_location_with_diagnostic_path (location_t location, gimple *stmt)
>>>> +{
>>>> +  /* Generate a rich location for this location.  */
>>>> +  rich_location *richloc = new rich_location (line_table, location);
>>>> +
>>>> +  copy_history_t cp_history = stmt ? get_copy_history (stmt) : NULL;
>>>> +  copy_history_diagnostic_path *path
>>>> +    = new copy_history_diagnostic_path (global_dc->printer,
>>>> + cp_history, location);
>>>> +  path->populate_path_from_copy_history ();
>>>> +
>>>> +  richloc->set_path (path);
>>>> +  return richloc;
>>>> +}
>>>> +
>>>> /* Given the LOW_SUB_ORG, LOW_SUB and UP_SUB, and the computed UP_BOUND
>>>> and UP_BOUND_P1, check whether the array reference REF is out of bound.
>>>> When out of bounds, set OUT_OF_BOUND to true.
>>>> @@ -262,6 +284,7 @@ get_up_bounds_for_array_ref (tree ref, tree *decl,
>>>> 
>>>> static bool
>>>> check_out_of_bounds_and_warn (location_t location, tree ref,
>>>> +      gimple *stmt,
>>>>   tree low_sub_org, tree low_sub, tree up_sub,
>>>>   tree up_bound, tree up_bound_p1,
>>>>   const irange *vr,
>>>> @@ -280,9 +303,13 @@ check_out_of_bounds_and_warn (location_t location, tree ref,
>>>>  {
>>>>    *out_of_bound = true;
>>>>    if (for_array_bound)
>>>> - warned = warning_at (location, OPT_Warray_bounds_,
>>>> + {
>>>> +  rich_location *richloc
>>>> +    = build_rich_location_with_diagnostic_path (location, stmt);
>>>> +  warned = warning_at (richloc, OPT_Warray_bounds_,
>>>>  "array subscript %E is outside array"
>>>>  " bounds of %qT", low_sub_org, artype);
>>>> + }
>>>>  }
>>>> 
>>>> if (warned)
>>>> @@ -299,10 +326,14 @@ check_out_of_bounds_and_warn (location_t location, tree ref,
>>>> {
>>>> *out_of_bound = true;
>>>> if (for_array_bound)
>>>> -    warned = warning_at (location, OPT_Warray_bounds_,
>>>> +    {
>>>> +      rich_location *richloc
>>>> + = build_rich_location_with_diagnostic_path (location, stmt);
>>>> +      warned = warning_at (richloc, OPT_Warray_bounds_,
>>>> "array subscript [%E, %E] is outside "
>>>> "array bounds of %qT",
>>>> low_sub, up_sub, artype);
>>>> +    }
>>>> }
>>>>  }
>>>> else if (up_bound
>>>> @@ -313,18 +344,26 @@ check_out_of_bounds_and_warn (location_t location, tree ref,
>>>>  {
>>>>    *out_of_bound = true;
>>>>    if (for_array_bound)
>>>> - warned = warning_at (location, OPT_Warray_bounds_,
>>>> + {
>>>> +  rich_location *richloc
>>>> +    = build_rich_location_with_diagnostic_path (location, stmt);
>>>> +  warned = warning_at (richloc, OPT_Warray_bounds_,
>>>>  "array subscript %E is above array bounds of %qT",
>>>>  up_sub, artype);
>>>> + }
>>>>  }
>>>> else if (TREE_CODE (low_sub) == INTEGER_CST
>>>> && tree_int_cst_lt (low_sub, low_bound))
>>>>  {
>>>>    *out_of_bound = true;
>>>>    if (for_array_bound)
>>>> - warned = warning_at (location, OPT_Warray_bounds_,
>>>> + {
>>>> +  rich_location *richloc
>>>> +    = build_rich_location_with_diagnostic_path (location, stmt);
>>>> +  warned = warning_at (richloc, OPT_Warray_bounds_,
>>>>  "array subscript %E is below array bounds of %qT",
>>>>  low_sub, artype);
>>>> + }
>>>>  }
>>>> return warned;
>>>> }
>>>> @@ -388,21 +427,24 @@ array_bounds_checker::check_array_ref (location_t location, tree ref,
>>>> }
>>>>  }
>>>> 
>>>> -  warned = check_out_of_bounds_and_warn (location, ref,
>>>> +  warned = check_out_of_bounds_and_warn (location, ref, stmt,
>>>> low_sub_org, low_sub, up_sub,
>>>> up_bound, up_bound_p1, &vr,
>>>> ignore_off_by_one, warn_array_bounds,
>>>> &out_of_bound);
>>>> 
>>>> -
>>>> if (!warned && sam == special_array_member::int_0)
>>>> -    warned = warning_at (location, OPT_Wzero_length_bounds,
>>>> +    {
>>>> +      rich_location *richloc
>>>> + = build_rich_location_with_diagnostic_path (location, stmt);
>>>> +      warned = warning_at (richloc, OPT_Wzero_length_bounds,
>>>> (TREE_CODE (low_sub) == INTEGER_CST
>>>> ? G_("array subscript %E is outside the bounds "
>>>>    "of an interior zero-length array %qT")
>>>> : G_("array subscript %qE is outside the bounds "
>>>>    "of an interior zero-length array %qT")),
>>>> low_sub, artype);
>>>> +    }
>>>> 
>>>> if (warned && dump_file && (dump_flags & TDF_DETAILS))
>>>>  {
>>>> @@ -419,8 +461,10 @@ array_bounds_checker::check_array_ref (location_t location, tree ref,
>>>> || sam == special_array_member::trail_n)
>>>>    && DECL_NOT_FLEXARRAY (afield_decl))
>>>>  {
>>>> +      rich_location *richloc
>>>> + = build_rich_location_with_diagnostic_path (location, stmt);
>>>>    bool warned1
>>>> - = warning_at (location, OPT_Wstrict_flex_arrays,
>>>> + = warning_at (richloc, OPT_Wstrict_flex_arrays,
>>>>   "trailing array %qT should not be used as "
>>>>   "a flexible array member",
>>>>   artype);
>>>> @@ -478,6 +522,7 @@ array_bounds_checker::check_array_ref (location_t location, tree ref,
>>>> 
>>>> bool
>>>> array_bounds_checker::check_mem_ref (location_t location, tree ref,
>>>> +     gimple *stmt,
>>>>  bool ignore_off_by_one)
>>>> {
>>>> if (warning_suppressed_p (ref, OPT_Warray_bounds_))
>>>> @@ -580,16 +625,24 @@ array_bounds_checker::check_mem_ref (location_t location, tree ref,
>>>> if (lboob)
>>>>  {
>>>>    if (offrange[0] == offrange[1])
>>>> - warned = warning_at (location, OPT_Warray_bounds_,
>>>> + {
>>>> +  rich_location *richloc
>>>> +    = build_rich_location_with_diagnostic_path (location, stmt);
>>>> +  warned = warning_at (richloc, OPT_Warray_bounds_,
>>>>  "array subscript %wi is outside array bounds "
>>>>  "of %qT",
>>>>  offrange[0].to_shwi (), reftype);
>>>> + }
>>>>    else
>>>> - warned = warning_at (location, OPT_Warray_bounds_,
>>>> + {
>>>> +  rich_location *richloc
>>>> +    = build_rich_location_with_diagnostic_path (location, stmt);
>>>> +  warned = warning_at (richloc, OPT_Warray_bounds_,
>>>>  "array subscript [%wi, %wi] is outside "
>>>>  "array bounds of %qT",
>>>>  offrange[0].to_shwi (),
>>>>  offrange[1].to_shwi (), reftype);
>>>> + }
>>>>  }
>>>> else if (uboob && !ignore_off_by_one)
>>>>  {
>>>> @@ -599,8 +652,9 @@ array_bounds_checker::check_mem_ref (location_t location, tree ref,
>>>> it were an untyped array of bytes.  */
>>>> backtype = build_array_type_nelts (unsigned_char_type_node,
>>>> aref.sizrng[1].to_uhwi ());
>>>> -
>>>> -      warned = warning_at (location, OPT_Warray_bounds_,
>>>> +      rich_location *richloc
>>>> + = build_rich_location_with_diagnostic_path (location, stmt);
>>>> +      warned = warning_at (richloc, OPT_Warray_bounds_,
>>>> "array subscript %<%T[%wi]%> is partly "
>>>> "outside array bounds of %qT",
>>>> axstype, offrange[0].to_shwi (), backtype);
>>>> @@ -623,7 +677,9 @@ array_bounds_checker::check_mem_ref (location_t location, tree ref,
>>>>  {
>>>>    HOST_WIDE_INT tmpidx = (aref.offmax[i] / eltsize).to_shwi ();
>>>> 
>>>> -      if (warning_at (location, OPT_Warray_bounds_,
>>>> +      rich_location *richloc
>>>> + = build_rich_location_with_diagnostic_path (location, stmt);
>>>> +      if (warning_at (richloc, OPT_Warray_bounds_,
>>>>   "intermediate array offset %wi is outside array bounds "
>>>>   "of %qT", tmpidx, reftype))
>>>> {
>>>> @@ -656,7 +712,7 @@ array_bounds_checker::check_addr_expr (location_t location, tree t,
>>>> ignore_off_by_one = false;
>>>> }
>>>>    else if (TREE_CODE (t) == MEM_REF)
>>>> - warned = check_mem_ref (location, t, ignore_off_by_one);
>>>> + warned = check_mem_ref (location, t, stmt, ignore_off_by_one);
>>>> 
>>>>    if (warned)
>>>> suppress_warning (t, OPT_Warray_bounds_);
>>>> @@ -702,7 +758,9 @@ array_bounds_checker::check_addr_expr (location_t location, tree t,
>>>> dump_generic_expr (MSG_NOTE, TDF_SLIM, t);
>>>> fprintf (dump_file, "\n");
>>>> }
>>>> -      warned = warning_at (location, OPT_Warray_bounds_,
>>>> +      rich_location *richloc
>>>> + = build_rich_location_with_diagnostic_path (location, stmt);
>>>> +      warned = warning_at (richloc, OPT_Warray_bounds_,
>>>> "array subscript %wi is below "
>>>> "array bounds of %qT",
>>>> idx.to_shwi (), TREE_TYPE (tem));
>>>> @@ -716,7 +774,9 @@ array_bounds_checker::check_addr_expr (location_t location, tree t,
>>>> dump_generic_expr (MSG_NOTE, TDF_SLIM, t);
>>>> fprintf (dump_file, "\n");
>>>> }
>>>> -      warned = warning_at (location, OPT_Warray_bounds_,
>>>> +      rich_location *richloc
>>>> + = build_rich_location_with_diagnostic_path (location, stmt);
>>>> +      warned = warning_at (richloc, OPT_Warray_bounds_,
>>>> "array subscript %wu is above "
>>>> "array bounds of %qT",
>>>> idx.to_uhwi (), TREE_TYPE (tem));
>>>> @@ -811,7 +871,7 @@ array_bounds_checker::check_array_bounds (tree *tp, int *walk_subtree,
>>>>  warned = checker->check_array_ref (location, t, wi->stmt,
>>>>    false/*ignore_off_by_one*/);
>>>> else if (TREE_CODE (t) == MEM_REF)
>>>> -    warned = checker->check_mem_ref (location, t,
>>>> +    warned = checker->check_mem_ref (location, t, wi->stmt,
>>>>  false /*ignore_off_by_one*/);
>>>> else if (TREE_CODE (t) == ADDR_EXPR)
>>>>  {
>>>> diff --git a/gcc/gimple-array-bounds.h b/gcc/gimple-array-bounds.h
>>>> index aa7ca8e9730..2d1d48d1e94 100644
>>>> --- a/gcc/gimple-array-bounds.h
>>>> +++ b/gcc/gimple-array-bounds.h
>>>> @@ -33,7 +33,7 @@ public:
>>>> private:
>>>> static tree check_array_bounds (tree *tp, int *walk_subtree, void *data);
>>>> bool check_array_ref (location_t, tree, gimple *, bool ignore_off_by_one);
>>>> -  bool check_mem_ref (location_t, tree, bool ignore_off_by_one);
>>>> +  bool check_mem_ref (location_t, tree, gimple *, bool ignore_off_by_one);
>>>> void check_addr_expr (location_t, tree, gimple *);
>>>> void get_value_range (irange &r, const_tree op, gimple *);
>>>> 
>>>> diff --git a/gcc/gimple-iterator.cc b/gcc/gimple-iterator.cc
>>>> index 93646262eac..472a79360d7 100644
>>>> --- a/gcc/gimple-iterator.cc
>>>> +++ b/gcc/gimple-iterator.cc
>>>> @@ -33,6 +33,7 @@ along with GCC; see the file COPYING3.  If not see
>>>> #include "tree-ssa.h"
>>>> #include "value-prof.h"
>>>> #include "gimplify.h"
>>>> +#include "diagnostic-copy-history.h"
>>>> 
>>>> 
>>>> /* Mark the statement STMT as modified, and update it.  */
>>>> @@ -581,6 +582,8 @@ gsi_remove (gimple_stmt_iterator *i, bool remove_permanently)
>>>> cfun->debug_marker_count--;
>>>>    require_eh_edge_purge = remove_stmt_from_eh_lp (stmt);
>>>>    gimple_remove_stmt_histograms (cfun, stmt);
>>>> +      if (get_copy_history (stmt) != NULL)
>>>> + remove_copy_history (stmt);
>>>>  }
>>>> 
>>>> /* Update the iterator and re-wire the links in I->SEQ.  */
>>>> diff --git a/gcc/testsuite/gcc.dg/pr109071.c b/gcc/testsuite/gcc.dg/pr109071.c
>>>> new file mode 100644
>>>> index 00000000000..95ae576451e
>>>> --- /dev/null
>>>> +++ b/gcc/testsuite/gcc.dg/pr109071.c
>>>> @@ -0,0 +1,43 @@
>>>> +/* PR tree-optimization/109071 need more context for -Warray-bounds warnings
>>>> +   due to code duplication from jump threading.  */  
>>>> +/* { dg-options "-O2 -Wall -fdiagnostics-explain-harder" } */
>>>> +/* { dg-additional-options "-fdiagnostics-show-line-numbers -fdiagnostics-path-format=inline-events -fdiagnostics-show-caret" } */
>>>> +/* { dg-enable-nn-line-numbers "" } */
>>>> +
>>>> +extern void warn(void);
>>>> +static inline void assign(int val, int *regs, int index)
>>>> +{
>>>> +  if (index >= 4)
>>>> +    warn();
>>>> +  *regs = val;
>>>> +}
>>>> +struct nums {int vals[4];};
>>>> +
>>>> +void sparx5_set (int *ptr, struct nums *sg, int index)
>>>> +{
>>>> +  int *val = &sg->vals[index]; /* { dg-warning "is above array bounds" } */
>>>> +
>>>> +  assign(0,    ptr, index);
>>>> +  assign(*val, ptr, index);
>>>> +}
>>>> +/* { dg-begin-multiline-output "" }
>>>> +   NN |   int *val = &sg->vals[index];
>>>> +      |               ~~~~~~~~^~~~~~~
>>>> +  'sparx5_set': events 1-2
>>>> +   NN |   if (index >= 4)
>>>> +      |      ^
>>>> +      |      |
>>>> +      |      (1) when the condition is evaluated to true
>>>> +......
>>>> +   { dg-end-multiline-output "" } */
>>>> +
>>>> +/* { dg-begin-multiline-output "" }
>>>> +      |               ~~~~~~~~~~~~~~~
>>>> +      |                       |
>>>> +      |                       (2) out of array bounds here
>>>> +   { dg-end-multiline-output "" } */
>>>> +
>>>> +/* { dg-begin-multiline-output "" }
>>>> +   NN | struct nums {int vals[4];};
>>>> +      |                  ^~~~
>>>> +   { dg-end-multiline-output "" } */
>>>> diff --git a/gcc/toplev.cc b/gcc/toplev.cc
>>>> index f715f977b72..e9ae55f64be 100644
>>>> --- a/gcc/toplev.cc
>>>> +++ b/gcc/toplev.cc
>>>> @@ -41,6 +41,7 @@ along with GCC; see the file COPYING3.  If not see
>>>> #include "cgraph.h"
>>>> #include "coverage.h"
>>>> #include "diagnostic.h"
>>>> +#include "diagnostic-copy-history.h"
>>>> #include "varasm.h"
>>>> #include "tree-inline.h"
>>>> #include "realmpfr.h" /* For GMP/MPFR/MPC versions, in print_version.  */
>>>> @@ -2381,6 +2382,8 @@ toplev::finalize (void)
>>>> tree_cc_finalize ();
>>>> reginfo_cc_finalize ();
>>>> 
>>>> +  copy_history_finalize ();
>>>> +
>>>> /* save_decoded_options uses opts_obstack, so these must
>>>>   be cleaned up together.  */
>>>> obstack_free (&opts_obstack, NULL);
>>>> diff --git a/gcc/tree-ssa-threadupdate.cc b/gcc/tree-ssa-threadupdate.cc
>>>> index fa61ba9512b..9292f0c2aa1 100644
>>>> --- a/gcc/tree-ssa-threadupdate.cc
>>>> +++ b/gcc/tree-ssa-threadupdate.cc
>>>> @@ -36,6 +36,7 @@ along with GCC; see the file COPYING3.  If not see
>>>> #include "tree-cfg.h"
>>>> #include "tree-vectorizer.h"
>>>> #include "tree-pass.h"
>>>> +#include "diagnostic-copy-history.h"
>>>> 
>>>> /* Given a block B, update the CFG and SSA graph to reflect redirecting
>>>> one or more in-edges to B to instead reach the destination of an
>>>> @@ -2371,6 +2372,37 @@ back_jt_path_registry::adjust_paths_after_duplication (unsigned curr_path_num)
>>>>  }
>>>> }
>>>> 
>>>> +/* Set copy history to all the stmts in the basic block BB based on
>>>> +   the entry edge ENTRY and whether this basic block is on the true
>>>> +   path of the condition in the destination of the edge ENTRY.
>>>> +   Return TRUE when copy history are set, otherwise return FALSE.  */
>>>> +
>>>> +static bool
>>>> +set_copy_history_to_stmts_in_bb (basic_block bb, edge entry, bool is_true_path)
>>>> +{
>>>> +  /* First, get the condition statement in the destination of the
>>>> +     edge ENTRY.  */
>>>> +  basic_block cond_block = entry->dest;
>>>> +  gimple *cond_stmt = NULL;
>>>> +  gimple_stmt_iterator gsi;
>>>> +  gsi = gsi_last_bb (cond_block);
>>>> +  /* if the cond_block ends with a conditional statement, get it.  */
>>>> +  if (!gsi_end_p (gsi)
>>>> +      && gsi_stmt (gsi)
>>>> +      && (gimple_code (gsi_stmt (gsi)) == GIMPLE_COND))
>>>> +    cond_stmt = gsi_stmt (gsi);
>>>> +
>>>> +  if (!cond_stmt)
>>>> +    return false;
>>>> +
>>>> +  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
>>>> +    set_copy_history (gsi_stmt (gsi),
>>>> +      gimple_location (cond_stmt),
>>>> +      is_true_path,
>>>> +      COPY_BY_THREAD_JUMP);
>>>> +  return true;
>>>> +}
>>>> +
>>>> /* Duplicates a jump-thread path of N_REGION basic blocks.
>>>> The ENTRY edge is redirected to the duplicate of the region.
>>>> 
>>>> @@ -2419,6 +2451,16 @@ back_jt_path_registry::duplicate_thread_path (edge entry,
>>>> copy_bbs (region, n_region, region_copy, &exit, 1, &exit_copy, loop,
>>>> split_edge_bb_loc (entry), false);
>>>> 
>>>> +  /* Set the copy history for all the stmts in both original and copied
>>>> +     basic blocks.  The copied regions are in the true path.  */
>>>> +  if (flag_diagnostics_explain_harder)
>>>> +    for (i = 0; i < n_region; i++)
>>>> +      {
>>>> + set_copy_history_to_stmts_in_bb (region[i], entry, false);
>>>> + set_copy_history_to_stmts_in_bb (region_copy[i], entry, true);
>>>> +      }
>>>> +
>>>> +
>>>> /* Fix up: copy_bbs redirects all edges pointing to copied blocks.  The
>>>>   following code ensures that all the edges exiting the jump-thread path are
>>>>   redirected back to the original code: these edges are exceptions
>>>> -- 
>>>> 2.31.1
>>>> 
>>> 
>> 
>
diff mbox series

Patch

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 7d3ea27a6ab..f8bbfda845f 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1436,6 +1436,8 @@  OBJS = \
 	df-problems.o \
 	df-scan.o \
 	dfp.o \
+	diagnostic-copy-history.o \
+	copy-history-diagnostic-path.o \
 	digraph.o \
 	dojump.o \
 	dominance.o \
diff --git a/gcc/common.opt b/gcc/common.opt
index a300470bbc5..714280390d9 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1558,6 +1558,10 @@  fdiagnostics-minimum-margin-width=
 Common Joined UInteger Var(diagnostics_minimum_margin_width) Init(6)
 Set minimum width of left margin of source code when showing source.
 
+fdiagnostics-explain-harder
+Common Var(flag_diagnostics_explain_harder)
+Collect and print more context information for diagnostics.
+
 fdisable-
 Common Joined RejectNegative Var(common_deferred_options) Defer
 -fdisable-[tree|rtl|ipa]-<pass>=range1+range2	Disable an optimization pass.
diff --git a/gcc/copy-history-diagnostic-path.cc b/gcc/copy-history-diagnostic-path.cc
new file mode 100644
index 00000000000..65e5c056165
--- /dev/null
+++ b/gcc/copy-history-diagnostic-path.cc
@@ -0,0 +1,86 @@ 
+/* Classes for implementing diagnostic paths for copy_history_t.
+   Copyright (C) 2024-2024 Free Software Foundation, Inc.
+   Contributed by Qing Zhao<qing.zhao@oracle.com>
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "function.h"
+#include "copy-history-diagnostic-path.h"
+
+bool
+copy_history_diagnostic_path::same_function_p (int event_idx_a,
+					       int event_idx_b) const
+{
+  return (m_events[event_idx_a]->get_fndecl ()
+	  == m_events[event_idx_b]->get_fndecl ());
+}
+
+/* Add an event to this path at LOC within function FNDECL at
+   stack depth DEPTH.
+
+   Use m_context's printer to format FMT, as the text of the new
+   event.  */
+
+void
+copy_history_diagnostic_path::add_event (location_t loc, tree fndecl, int depth,
+					 const char *fmt, ...)
+{
+  pretty_printer *pp = m_event_pp;
+  pp_clear_output_area (pp);
+
+  rich_location rich_loc (line_table, UNKNOWN_LOCATION);
+
+  va_list ap;
+
+  va_start (ap, fmt);
+
+  text_info ti (fmt, &ap, 0, nullptr, &rich_loc);
+  pp_format (pp, &ti);
+  pp_output_formatted_text (pp);
+
+  va_end (ap);
+
+  simple_diagnostic_event *new_event
+    = new simple_diagnostic_event (loc, fndecl, depth, pp_formatted_text (pp));
+  m_events.safe_push (new_event);
+
+  pp_clear_output_area (pp);
+
+  return;
+}
+
+/* Populate the diagnostic_path from the copy_history.  */
+void
+copy_history_diagnostic_path::populate_path_from_copy_history ()
+{
+  if (!m_copy_history)
+    return;
+
+  for (copy_history_t cur_ch = m_copy_history; cur_ch;
+       cur_ch = cur_ch->prev_copy)
+    add_event (cur_ch->condition, cfun->decl, 1,
+	       "when the condition is evaluated to %s",
+	       cur_ch->is_true_path ? "true" : "false");
+
+  /* Add an end of path warning event in the end of the path.  */
+  add_event (m_location, cfun->decl, 1,
+	     "out of array bounds here");
+  return;
+}
diff --git a/gcc/copy-history-diagnostic-path.h b/gcc/copy-history-diagnostic-path.h
new file mode 100644
index 00000000000..377e9136070
--- /dev/null
+++ b/gcc/copy-history-diagnostic-path.h
@@ -0,0 +1,87 @@ 
+/* Classes for implementing diagnostic paths for copy_history_t.
+   Copyright (C) 2024-2024 Free Software Foundation, Inc.
+   Contributed by Qing Zhao<qing.zhao@oracle.com>
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef GCC_COPY_HISTORY_DIAGNOSTIC_PATH_H
+#define GCC_COPY_HISTORY_DIAGNOSTIC_PATH_H
+
+#include "diagnostic-path.h"
+#include "simple-diagnostic-path.h"
+#include "diagnostic-copy-history.h"
+
+/* An implementation of diagnostic_path for copy_history_t, as a
+   vector of simple_diagnostic_event instances.  */
+
+class copy_history_diagnostic_path : public diagnostic_path
+{
+ public:
+  copy_history_diagnostic_path (pretty_printer *event_pp,
+				copy_history_t cp_history,
+				location_t loc)
+  : diagnostic_path (),
+    m_thread ("main"),
+    m_event_pp (event_pp),
+    m_copy_history (cp_history),
+    m_location (loc)
+  {}
+
+  unsigned num_events () const final override
+  {
+    return m_events.length ();
+  }
+  const diagnostic_event & get_event (int idx) const final override
+  {
+    return *m_events[idx];
+  }
+  unsigned num_threads () const final override
+  {
+    return 1;
+  }
+  const diagnostic_thread &
+  get_thread (diagnostic_thread_id_t) const final override
+  {
+    return m_thread;
+  }
+  bool
+  same_function_p (int event_idx_a,
+		   int event_idx_b) const final override;
+
+  void add_event (location_t loc, tree fndecl, int depth,
+		  const char *fmt, ...);
+
+  void populate_path_from_copy_history ();
+
+ private:
+  simple_diagnostic_thread m_thread;
+
+  /* The events that have occurred along this path.  */
+  auto_delete_vec<simple_diagnostic_event> m_events;
+
+  pretty_printer *m_event_pp;
+
+  /* The copy_history associated with this path.  */
+  copy_history_t m_copy_history;
+
+  /* The location for the gimple statement where the
+     diagnostic message emitted.  */
+  location_t m_location;
+
+};
+
+#endif /* ! GCC_COPY_HISTORY_DIAGNOSTIC_PATH_H */
diff --git a/gcc/diagnostic-copy-history.cc b/gcc/diagnostic-copy-history.cc
new file mode 100644
index 00000000000..2733d137622
--- /dev/null
+++ b/gcc/diagnostic-copy-history.cc
@@ -0,0 +1,163 @@ 
+/* Functions to handle copy history.
+   Copyright (C) 2024-2024 Free Software Foundation, Inc.
+   Contributed by Qing Zhao <qing.zhao@oracle.com>
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it under
+   the terms of the GNU General Public License as published by the Free
+   Software Foundation; either version 3, or (at your option) any later
+   version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or
+   FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+   for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   <http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "tree.h"
+#include "diagnostic-copy-history.h"
+
+/* A mapping from a gimple to a pointer to the copy history of it.  */
+static copy_history_map_t *copy_history_map;
+
+/* Obstack for copy history.  */
+static struct obstack copy_history_obstack;
+
+/* Create a new copy history.  */
+
+copy_history_t
+create_copy_history (location_t condition,
+		     bool is_true_path,
+		     enum copy_reason reason,
+		     copy_history_t prev_copy)
+{
+  static bool initialized = false;
+
+  if (!initialized)
+    {
+      gcc_obstack_init (&copy_history_obstack);
+      initialized = true;
+    }
+
+  copy_history_t copy_history
+    = (copy_history_t) obstack_alloc (&copy_history_obstack,
+				      sizeof (struct copy_history));
+  copy_history->condition = condition;
+  copy_history->is_true_path = is_true_path;
+  copy_history->reason = reason;
+  copy_history->prev_copy = prev_copy;
+  return copy_history;
+}
+
+/* Insert the copy history for the gimple STMT assuming the linked list
+   of CP_HISTORY does not have duplications.  It's the caller's
+   responsibility to make sure that the linked list of CP_HISTORY does
+   not have duplications.  */
+
+void
+insert_copy_history (gimple *stmt, copy_history_t cp_history)
+{
+  if (!copy_history_map)
+    copy_history_map = new copy_history_map_t;
+
+  copy_history_map->put (stmt, cp_history);
+  return;
+}
+
+/* Get the copy history for the gimple STMT, return NULL when there is
+   no associated copy history.  */
+
+copy_history_t
+get_copy_history (gimple *stmt)
+{
+  if (!copy_history_map)
+    return NULL;
+
+  if (const copy_history_t *cp_history_p = copy_history_map->get (stmt))
+    return *cp_history_p;
+
+  return NULL;
+}
+
+/* Remove the copy history for STMT.  */
+
+void
+remove_copy_history (gimple *stmt)
+{
+  if (!copy_history_map)
+    return;
+  copy_history_map->remove (stmt);
+  return;
+}
+
+/* Check whether the cond_location, is_true_path and reason existed
+ * in the OLD_COPY_HISTORY.  */
+
+static bool
+is_copy_history_existed (location_t cond_location, bool is_true_path,
+			 enum copy_reason reason,
+			 copy_history_t old_copy_history)
+{
+  for (copy_history_t cur_ch = old_copy_history; cur_ch;
+       cur_ch = cur_ch->prev_copy)
+    if ((cur_ch->condition == cond_location)
+	&& (cur_ch->is_true_path == is_true_path)
+	&& (cur_ch->reason == reason))
+      return true;
+
+  return false;
+}
+
+/* Set copy history for the gimple STMT.  Return TRUE when a new copy
+ * history is created and inserted.  Otherwise return FALSE.  */
+
+bool
+set_copy_history (gimple *stmt, location_t cond_location,
+		  bool is_true_path, enum copy_reason reason)
+{
+
+  /* First, get the old copy history associated with this STMT.  */
+  copy_history_t old_cp_history = get_copy_history (stmt);
+
+  /* If the current copy history is not in the STMT's copy history linked
+     list yet, create the new copy history, put the old_copy_history as the
+     prev_copy of it.  */
+  copy_history_t new_cp_history = NULL;
+  if (!is_copy_history_existed (cond_location, is_true_path,
+				reason, old_cp_history))
+    new_cp_history
+      = create_copy_history (cond_location, is_true_path,
+			     reason, old_cp_history);
+
+  /* Insert the copy history into the hash map.  */
+  if (new_cp_history)
+    {
+      insert_copy_history (stmt, new_cp_history);
+      return true;
+    }
+
+  return false;
+}
+
+/* Reset all state for diagnostic-copy-history.cc so that we can rerun the
+   compiler within the same process.  For use by toplev::finalize.  */
+
+void
+copy_history_finalize (void)
+{
+  if (copy_history_map)
+    {
+      delete copy_history_map;
+      copy_history_map = NULL;
+    }
+  obstack_free (&copy_history_obstack, NULL);
+  return;
+}
diff --git a/gcc/diagnostic-copy-history.h b/gcc/diagnostic-copy-history.h
new file mode 100644
index 00000000000..0c2c55a2a81
--- /dev/null
+++ b/gcc/diagnostic-copy-history.h
@@ -0,0 +1,80 @@ 
+/* Copy history associated with a gimple statement to record its history
+   of duplications due to different transformations.
+   The copy history will be used to construct events for later diagnostic.
+
+   Copyright (C) 2024-2024 Free Software Foundation, Inc.
+   Contributed by Qing Zhao <qing.zhao@oracle.com>
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it under
+   the terms of the GNU General Public License as published by the Free
+   Software Foundation; either version 3, or (at your option) any later
+   version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+   WARRANTY; without even the implied warranty of MERCHANTABILITY or
+   FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+   for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   <http://www.gnu.org/licenses/>.  */
+
+#ifndef DIAGNOSTIC_COPY_HISTORY_H_INCLUDED
+#define DIAGNOSTIC_COPY_HISTORY_H_INCLUDED
+
+#include "hash-map.h"
+#include "line-map.h"
+
+/* An enum for the reason why this copy is made.  Right now, there are
+   three reasons, we can add more if needed.  */
+enum copy_reason {
+  COPY_BY_THREAD_JUMP,
+  COPY_BY_LOOP_UNSWITCH,
+  COPY_BY_LOOP_UNROLL
+};
+
+/* This data structure records the information when a statement is
+   duplicated during different transformations.  Such information
+   will be used by the later diagnostic messages to report more
+   contexts of the warnings or errors.  */
+struct copy_history {
+  /* The location of the condition statement that triggered the code
+     duplication.  */
+  location_t condition;
+
+  /* Whether this copy is on the TRUE path of the condition.  */
+  bool is_true_path;
+
+  /* The reason for the code duplication.  */
+  enum copy_reason reason;
+
+  /* This statement itself might be a previous code duplication.  */
+  struct copy_history *prev_copy;
+};
+
+typedef struct copy_history *copy_history_t;
+
+/* Create a new copy history.  */
+extern copy_history_t create_copy_history (location_t, bool,
+					   enum copy_reason, copy_history_t);
+
+typedef hash_map<gimple *, copy_history_t> copy_history_map_t;
+
+/* Get the copy history for the gimple STMT, return NULL when there is
+   no associated copy history.  */
+extern copy_history_t get_copy_history (gimple *);
+
+/* Remove the copy history for STMT from the copy_history_map.  */
+extern void remove_copy_history (gimple *);
+
+/* Set copy history for the gimple STMT.  */
+extern bool set_copy_history (gimple *, location_t,
+			      bool, enum copy_reason);
+
+/* Reset all state for diagnostic-copy-history.cc so that we can rerun the
+   compiler within the same process.  For use by toplev::finalize.  */
+extern void copy_history_finalize (void);
+
+#endif // DIAGNOSTIC_COPY_HISTORY_H_INCLUDED
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index b37c7af7a39..bb0c6a87523 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -321,6 +321,7 @@  Objective-C and Objective-C++ Dialects}.
 -fdiagnostics-column-origin=@var{origin}
 -fdiagnostics-escape-format=@r{[}unicode@r{|}bytes@r{]}
 -fdiagnostics-text-art-charset=@r{[}none@r{|}ascii@r{|}unicode@r{|}emoji@r{]}}
+-fdiagnostics-explain-harder
 
 @item Warning Options
 @xref{Warning Options,,Options to Request or Suppress Warnings}.
@@ -5518,6 +5519,12 @@  left margin.
 This option controls the minimum width of the left margin printed by
 @option{-fdiagnostics-show-line-numbers}.  It defaults to 6.
 
+@opindex fdiagnostics-explain-harder
+@item -fdiagnostics-explain-harder
+With this option, the compiler collects more context information for
+diagnostics and emits them to the users to provide more hints on how
+the diagnostics come from.
+
 @opindex fdiagnostics-parseable-fixits
 @item -fdiagnostics-parseable-fixits
 Emit fix-it hints in a machine-parseable format, suitable for consumption
diff --git a/gcc/gimple-array-bounds.cc b/gcc/gimple-array-bounds.cc
index 1637a2fc4f4..ba98ab3b413 100644
--- a/gcc/gimple-array-bounds.cc
+++ b/gcc/gimple-array-bounds.cc
@@ -31,6 +31,9 @@  along with GCC; see the file COPYING3.  If not see
 #include "tree-dfa.h"
 #include "fold-const.h"
 #include "diagnostic-core.h"
+#include "simple-diagnostic-path.h"
+#include "diagnostic-copy-history.h"
+#include "copy-history-diagnostic-path.h"
 #include "intl.h"
 #include "tree-vrp.h"
 #include "alloc-pool.h"
@@ -254,6 +257,25 @@  get_up_bounds_for_array_ref (tree ref, tree *decl,
   return;
 }
 
+/* Build a rich location for LOCATION, and populate the diagonistic path
+   for it.  */
+
+static rich_location *
+build_rich_location_with_diagnostic_path (location_t location, gimple *stmt)
+{
+  /* Generate a rich location for this location.  */
+  rich_location *richloc = new rich_location (line_table, location);
+
+  copy_history_t cp_history = stmt ? get_copy_history (stmt) : NULL;
+  copy_history_diagnostic_path *path
+    = new copy_history_diagnostic_path (global_dc->printer,
+					cp_history, location);
+  path->populate_path_from_copy_history ();
+
+  richloc->set_path (path);
+  return richloc;
+}
+
 /* Given the LOW_SUB_ORG, LOW_SUB and UP_SUB, and the computed UP_BOUND
    and UP_BOUND_P1, check whether the array reference REF is out of bound.
    When out of bounds, set OUT_OF_BOUND to true.
@@ -262,6 +284,7 @@  get_up_bounds_for_array_ref (tree ref, tree *decl,
 
 static bool
 check_out_of_bounds_and_warn (location_t location, tree ref,
+			      gimple *stmt,
 			      tree low_sub_org, tree low_sub, tree up_sub,
 			      tree up_bound, tree up_bound_p1,
 			      const irange *vr,
@@ -280,9 +303,13 @@  check_out_of_bounds_and_warn (location_t location, tree ref,
     {
       *out_of_bound = true;
       if (for_array_bound)
-	warned = warning_at (location, OPT_Warray_bounds_,
+	{
+	  rich_location *richloc
+	    = build_rich_location_with_diagnostic_path (location, stmt);
+	  warned = warning_at (richloc, OPT_Warray_bounds_,
 			     "array subscript %E is outside array"
 			     " bounds of %qT", low_sub_org, artype);
+	}
     }
 
   if (warned)
@@ -299,10 +326,14 @@  check_out_of_bounds_and_warn (location_t location, tree ref,
 	{
 	  *out_of_bound = true;
 	  if (for_array_bound)
-	    warned = warning_at (location, OPT_Warray_bounds_,
+	    {
+	      rich_location *richloc
+		= build_rich_location_with_diagnostic_path (location, stmt);
+	      warned = warning_at (richloc, OPT_Warray_bounds_,
 				 "array subscript [%E, %E] is outside "
 				 "array bounds of %qT",
 				 low_sub, up_sub, artype);
+	    }
 	}
     }
   else if (up_bound
@@ -313,18 +344,26 @@  check_out_of_bounds_and_warn (location_t location, tree ref,
     {
       *out_of_bound = true;
       if (for_array_bound)
-	warned = warning_at (location, OPT_Warray_bounds_,
+	{
+	  rich_location *richloc
+	    = build_rich_location_with_diagnostic_path (location, stmt);
+	  warned = warning_at (richloc, OPT_Warray_bounds_,
 			     "array subscript %E is above array bounds of %qT",
 			     up_sub, artype);
+	}
     }
   else if (TREE_CODE (low_sub) == INTEGER_CST
 	   && tree_int_cst_lt (low_sub, low_bound))
     {
       *out_of_bound = true;
       if (for_array_bound)
-	warned = warning_at (location, OPT_Warray_bounds_,
+	{
+	  rich_location *richloc
+	    = build_rich_location_with_diagnostic_path (location, stmt);
+	  warned = warning_at (richloc, OPT_Warray_bounds_,
 			     "array subscript %E is below array bounds of %qT",
 			     low_sub, artype);
+	}
     }
   return warned;
 }
@@ -388,21 +427,24 @@  array_bounds_checker::check_array_ref (location_t location, tree ref,
 	}
     }
 
-  warned = check_out_of_bounds_and_warn (location, ref,
+  warned = check_out_of_bounds_and_warn (location, ref, stmt,
 					 low_sub_org, low_sub, up_sub,
 					 up_bound, up_bound_p1, &vr,
 					 ignore_off_by_one, warn_array_bounds,
 					 &out_of_bound);
 
-
   if (!warned && sam == special_array_member::int_0)
-    warned = warning_at (location, OPT_Wzero_length_bounds,
+    {
+      rich_location *richloc
+	= build_rich_location_with_diagnostic_path (location, stmt);
+      warned = warning_at (richloc, OPT_Wzero_length_bounds,
 			 (TREE_CODE (low_sub) == INTEGER_CST
 			  ? G_("array subscript %E is outside the bounds "
 			       "of an interior zero-length array %qT")
 			  : G_("array subscript %qE is outside the bounds "
 			       "of an interior zero-length array %qT")),
 			 low_sub, artype);
+    }
 
   if (warned && dump_file && (dump_flags & TDF_DETAILS))
     {
@@ -419,8 +461,10 @@  array_bounds_checker::check_array_ref (location_t location, tree ref,
 	  || sam == special_array_member::trail_n)
       && DECL_NOT_FLEXARRAY (afield_decl))
     {
+      rich_location *richloc
+	= build_rich_location_with_diagnostic_path (location, stmt);
       bool warned1
-	= warning_at (location, OPT_Wstrict_flex_arrays,
+	= warning_at (richloc, OPT_Wstrict_flex_arrays,
 		      "trailing array %qT should not be used as "
 		      "a flexible array member",
 		      artype);
@@ -478,6 +522,7 @@  array_bounds_checker::check_array_ref (location_t location, tree ref,
 
 bool
 array_bounds_checker::check_mem_ref (location_t location, tree ref,
+				     gimple *stmt,
 				     bool ignore_off_by_one)
 {
   if (warning_suppressed_p (ref, OPT_Warray_bounds_))
@@ -580,16 +625,24 @@  array_bounds_checker::check_mem_ref (location_t location, tree ref,
   if (lboob)
     {
       if (offrange[0] == offrange[1])
-	warned = warning_at (location, OPT_Warray_bounds_,
+	{
+	  rich_location *richloc
+	    = build_rich_location_with_diagnostic_path (location, stmt);
+	  warned = warning_at (richloc, OPT_Warray_bounds_,
 			     "array subscript %wi is outside array bounds "
 			     "of %qT",
 			     offrange[0].to_shwi (), reftype);
+	}
       else
-	warned = warning_at (location, OPT_Warray_bounds_,
+	{
+	  rich_location *richloc
+	    = build_rich_location_with_diagnostic_path (location, stmt);
+	  warned = warning_at (richloc, OPT_Warray_bounds_,
 			     "array subscript [%wi, %wi] is outside "
 			     "array bounds of %qT",
 			     offrange[0].to_shwi (),
 			     offrange[1].to_shwi (), reftype);
+	}
     }
   else if (uboob && !ignore_off_by_one)
     {
@@ -599,8 +652,9 @@  array_bounds_checker::check_mem_ref (location_t location, tree ref,
 	   it were an untyped array of bytes.  */
 	backtype = build_array_type_nelts (unsigned_char_type_node,
 					   aref.sizrng[1].to_uhwi ());
-
-      warned = warning_at (location, OPT_Warray_bounds_,
+      rich_location *richloc
+	= build_rich_location_with_diagnostic_path (location, stmt);
+      warned = warning_at (richloc, OPT_Warray_bounds_,
 			   "array subscript %<%T[%wi]%> is partly "
 			   "outside array bounds of %qT",
 			   axstype, offrange[0].to_shwi (), backtype);
@@ -623,7 +677,9 @@  array_bounds_checker::check_mem_ref (location_t location, tree ref,
     {
       HOST_WIDE_INT tmpidx = (aref.offmax[i] / eltsize).to_shwi ();
 
-      if (warning_at (location, OPT_Warray_bounds_,
+      rich_location *richloc
+	= build_rich_location_with_diagnostic_path (location, stmt);
+      if (warning_at (richloc, OPT_Warray_bounds_,
 		      "intermediate array offset %wi is outside array bounds "
 		      "of %qT", tmpidx, reftype))
 	{
@@ -656,7 +712,7 @@  array_bounds_checker::check_addr_expr (location_t location, tree t,
 	  ignore_off_by_one = false;
 	}
       else if (TREE_CODE (t) == MEM_REF)
-	warned = check_mem_ref (location, t, ignore_off_by_one);
+	warned = check_mem_ref (location, t, stmt, ignore_off_by_one);
 
       if (warned)
 	suppress_warning (t, OPT_Warray_bounds_);
@@ -702,7 +758,9 @@  array_bounds_checker::check_addr_expr (location_t location, tree t,
 	  dump_generic_expr (MSG_NOTE, TDF_SLIM, t);
 	  fprintf (dump_file, "\n");
 	}
-      warned = warning_at (location, OPT_Warray_bounds_,
+      rich_location *richloc
+	= build_rich_location_with_diagnostic_path (location, stmt);
+      warned = warning_at (richloc, OPT_Warray_bounds_,
 			   "array subscript %wi is below "
 			   "array bounds of %qT",
 			   idx.to_shwi (), TREE_TYPE (tem));
@@ -716,7 +774,9 @@  array_bounds_checker::check_addr_expr (location_t location, tree t,
 	  dump_generic_expr (MSG_NOTE, TDF_SLIM, t);
 	  fprintf (dump_file, "\n");
 	}
-      warned = warning_at (location, OPT_Warray_bounds_,
+      rich_location *richloc
+	= build_rich_location_with_diagnostic_path (location, stmt);
+      warned = warning_at (richloc, OPT_Warray_bounds_,
 			   "array subscript %wu is above "
 			   "array bounds of %qT",
 			   idx.to_uhwi (), TREE_TYPE (tem));
@@ -811,7 +871,7 @@  array_bounds_checker::check_array_bounds (tree *tp, int *walk_subtree,
     warned = checker->check_array_ref (location, t, wi->stmt,
 				       false/*ignore_off_by_one*/);
   else if (TREE_CODE (t) == MEM_REF)
-    warned = checker->check_mem_ref (location, t,
+    warned = checker->check_mem_ref (location, t, wi->stmt,
 				     false /*ignore_off_by_one*/);
   else if (TREE_CODE (t) == ADDR_EXPR)
     {
diff --git a/gcc/gimple-array-bounds.h b/gcc/gimple-array-bounds.h
index aa7ca8e9730..2d1d48d1e94 100644
--- a/gcc/gimple-array-bounds.h
+++ b/gcc/gimple-array-bounds.h
@@ -33,7 +33,7 @@  public:
 private:
   static tree check_array_bounds (tree *tp, int *walk_subtree, void *data);
   bool check_array_ref (location_t, tree, gimple *, bool ignore_off_by_one);
-  bool check_mem_ref (location_t, tree, bool ignore_off_by_one);
+  bool check_mem_ref (location_t, tree, gimple *, bool ignore_off_by_one);
   void check_addr_expr (location_t, tree, gimple *);
   void get_value_range (irange &r, const_tree op, gimple *);
 
diff --git a/gcc/gimple-iterator.cc b/gcc/gimple-iterator.cc
index 93646262eac..472a79360d7 100644
--- a/gcc/gimple-iterator.cc
+++ b/gcc/gimple-iterator.cc
@@ -33,6 +33,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "tree-ssa.h"
 #include "value-prof.h"
 #include "gimplify.h"
+#include "diagnostic-copy-history.h"
 
 
 /* Mark the statement STMT as modified, and update it.  */
@@ -581,6 +582,8 @@  gsi_remove (gimple_stmt_iterator *i, bool remove_permanently)
 	cfun->debug_marker_count--;
       require_eh_edge_purge = remove_stmt_from_eh_lp (stmt);
       gimple_remove_stmt_histograms (cfun, stmt);
+      if (get_copy_history (stmt) != NULL)
+	remove_copy_history (stmt);
     }
 
   /* Update the iterator and re-wire the links in I->SEQ.  */
diff --git a/gcc/testsuite/gcc.dg/pr109071.c b/gcc/testsuite/gcc.dg/pr109071.c
new file mode 100644
index 00000000000..95ae576451e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr109071.c
@@ -0,0 +1,43 @@ 
+/* PR tree-optimization/109071 need more context for -Warray-bounds warnings
+   due to code duplication from jump threading.  */  
+/* { dg-options "-O2 -Wall -fdiagnostics-explain-harder" } */
+/* { dg-additional-options "-fdiagnostics-show-line-numbers -fdiagnostics-path-format=inline-events -fdiagnostics-show-caret" } */
+/* { dg-enable-nn-line-numbers "" } */
+
+extern void warn(void);
+static inline void assign(int val, int *regs, int index)
+{
+  if (index >= 4)
+    warn();
+  *regs = val;
+}
+struct nums {int vals[4];};
+
+void sparx5_set (int *ptr, struct nums *sg, int index)
+{
+  int *val = &sg->vals[index]; /* { dg-warning "is above array bounds" } */
+
+  assign(0,    ptr, index);
+  assign(*val, ptr, index);
+}
+/* { dg-begin-multiline-output "" }
+   NN |   int *val = &sg->vals[index];
+      |               ~~~~~~~~^~~~~~~
+  'sparx5_set': events 1-2
+   NN |   if (index >= 4)
+      |      ^
+      |      |
+      |      (1) when the condition is evaluated to true
+......
+   { dg-end-multiline-output "" } */
+
+/* { dg-begin-multiline-output "" }
+      |               ~~~~~~~~~~~~~~~
+      |                       |
+      |                       (2) out of array bounds here
+   { dg-end-multiline-output "" } */
+
+/* { dg-begin-multiline-output "" }
+   NN | struct nums {int vals[4];};
+      |                  ^~~~
+   { dg-end-multiline-output "" } */
diff --git a/gcc/toplev.cc b/gcc/toplev.cc
index f715f977b72..e9ae55f64be 100644
--- a/gcc/toplev.cc
+++ b/gcc/toplev.cc
@@ -41,6 +41,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "cgraph.h"
 #include "coverage.h"
 #include "diagnostic.h"
+#include "diagnostic-copy-history.h"
 #include "varasm.h"
 #include "tree-inline.h"
 #include "realmpfr.h"	/* For GMP/MPFR/MPC versions, in print_version.  */
@@ -2381,6 +2382,8 @@  toplev::finalize (void)
   tree_cc_finalize ();
   reginfo_cc_finalize ();
 
+  copy_history_finalize ();
+
   /* save_decoded_options uses opts_obstack, so these must
      be cleaned up together.  */
   obstack_free (&opts_obstack, NULL);
diff --git a/gcc/tree-ssa-threadupdate.cc b/gcc/tree-ssa-threadupdate.cc
index fa61ba9512b..9292f0c2aa1 100644
--- a/gcc/tree-ssa-threadupdate.cc
+++ b/gcc/tree-ssa-threadupdate.cc
@@ -36,6 +36,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "tree-cfg.h"
 #include "tree-vectorizer.h"
 #include "tree-pass.h"
+#include "diagnostic-copy-history.h"
 
 /* Given a block B, update the CFG and SSA graph to reflect redirecting
    one or more in-edges to B to instead reach the destination of an
@@ -2371,6 +2372,37 @@  back_jt_path_registry::adjust_paths_after_duplication (unsigned curr_path_num)
     }
 }
 
+/* Set copy history to all the stmts in the basic block BB based on
+   the entry edge ENTRY and whether this basic block is on the true
+   path of the condition in the destination of the edge ENTRY.
+   Return TRUE when copy history are set, otherwise return FALSE.  */
+
+static bool
+set_copy_history_to_stmts_in_bb (basic_block bb, edge entry, bool is_true_path)
+{
+  /* First, get the condition statement in the destination of the
+     edge ENTRY.  */
+  basic_block cond_block = entry->dest;
+  gimple *cond_stmt = NULL;
+  gimple_stmt_iterator gsi;
+  gsi = gsi_last_bb (cond_block);
+  /* if the cond_block ends with a conditional statement, get it.  */
+  if (!gsi_end_p (gsi)
+      && gsi_stmt (gsi)
+      && (gimple_code (gsi_stmt (gsi)) == GIMPLE_COND))
+    cond_stmt = gsi_stmt (gsi);
+
+  if (!cond_stmt)
+    return false;
+
+  for (gsi = gsi_start_bb (bb); !gsi_end_p (gsi); gsi_next (&gsi))
+    set_copy_history (gsi_stmt (gsi),
+		      gimple_location (cond_stmt),
+		      is_true_path,
+		      COPY_BY_THREAD_JUMP);
+  return true;
+}
+
 /* Duplicates a jump-thread path of N_REGION basic blocks.
    The ENTRY edge is redirected to the duplicate of the region.
 
@@ -2419,6 +2451,16 @@  back_jt_path_registry::duplicate_thread_path (edge entry,
   copy_bbs (region, n_region, region_copy, &exit, 1, &exit_copy, loop,
 	    split_edge_bb_loc (entry), false);
 
+  /* Set the copy history for all the stmts in both original and copied
+     basic blocks.  The copied regions are in the true path.  */
+  if (flag_diagnostics_explain_harder)
+    for (i = 0; i < n_region; i++)
+      {
+	set_copy_history_to_stmts_in_bb (region[i], entry, false);
+	set_copy_history_to_stmts_in_bb (region_copy[i], entry, true);
+      }
+
+
   /* Fix up: copy_bbs redirects all edges pointing to copied blocks.  The
      following code ensures that all the edges exiting the jump-thread path are
      redirected back to the original code: these edges are exceptions