From patchwork Thu Aug 29 22:58:10 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Malcolm <dmalcolm@redhat.com>
X-Patchwork-Id: 1978640
Return-Path: <gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org>
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@legolas.ozlabs.org
Authentication-Results: legolas.ozlabs.org;
	dkim=pass (1024-bit key;
 unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256
 header.s=mimecast20190719 header.b=A0Gs6Vq6;
	dkim-atps=neutral
Authentication-Results: legolas.ozlabs.org;
 spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org
 (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org;
 envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org;
 receiver=patchwork.ozlabs.org)
Received: from server2.sourceware.org (server2.sourceware.org
 [IPv6:2620:52:3:1:0:246e:9693:128c])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384)
	(No client certificate requested)
	by legolas.ozlabs.org (Postfix) with ESMTPS id 4WvxY56Xxmz1yfn
	for <incoming@patchwork.ozlabs.org>; Fri, 30 Aug 2024 08:59:05 +1000 (AEST)
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id B39DF3861831
	for <incoming@patchwork.ozlabs.org>; Thu, 29 Aug 2024 22:59:02 +0000 (GMT)
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
Received: from us-smtp-delivery-124.mimecast.com
 (us-smtp-delivery-124.mimecast.com [170.10.133.124])
 by sourceware.org (Postfix) with ESMTP id 919D2385C6C8
 for <gcc-patches@gcc.gnu.org>; Thu, 29 Aug 2024 22:58:24 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 919D2385C6C8
Authentication-Results: sourceware.org;
 dmarc=pass (p=none dis=none) header.from=redhat.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com
ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 919D2385C6C8
Authentication-Results: server2.sourceware.org;
 arc=none smtp.remote-ip=170.10.133.124
ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1724972309; cv=none;
 b=t4xXx20de6rOH3hAOmED6GlGHWJROfXoF0xCT8Bjb7HxrXRkGL1XHDDWiOQHUOFw5higD7ruT5Fm/O0fujtovdcuv2RRJup/34BSTFWLq3F9KZDnVc3v5Rkxebr01YCUsccZTcrkoKDiTxkysrmY+P1wvS+Qav2rRvNA2JCIypg=
ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key;
 t=1724972309; c=relaxed/simple;
 bh=y1XP7QAgYnlIKhUHZ12MaQ+4Q8ABeyiGv9lhRED1CW8=;
 h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version;
 b=vp7UZRn0sna28+Kbenq1NrYdahB7ncRla6VUma7576i7+UeymvoN6agRCsizCTugyrGSmWPjRN1eR2cpNw3nzIQ0bvNPwdHy9fWK3QbTOtdH91kw+8kdCZo9uzcP1Uim5mmlk97pWeRBNnN5JB7+WBsMTfbAdeXIejbWATrVJCI=
ARC-Authentication-Results: i=1; server2.sourceware.org
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;
 s=mimecast20190719; t=1724972304;
 h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
 to:to:cc:cc:mime-version:mime-version:content-type:content-type:
 content-transfer-encoding:content-transfer-encoding;
 bh=k7ZMnq1fi5KG9MoFdV4Cwb/gYqpMm67Pw1uf6ruPR/Y=;
 b=A0Gs6Vq6ZXl6k5qHVR5LswB8fvVPjgQIejEM96L9fWRd84qSmT+KlX0GV+d9Jii7rjdc1l
 Pu/nqZ88kgSVPyvsiPt9wrPo+ektmAtK/wSVnzhO72vVWleGK4etrWEd9JO2TF6++1u8sJ
 qkfGHF+AO4sgqSaAXBEWyUc7SdrVaMw=
Received: from mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com
 (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by
 relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3,
 cipher=TLS_AES_256_GCM_SHA384) id us-mta-690-7s4mc3Z7P86URIYzqhTxFQ-1; Thu,
 29 Aug 2024 18:58:21 -0400
X-MC-Unique: 7s4mc3Z7P86URIYzqhTxFQ-1
Received: from mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com
 (mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.17])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest
 SHA256)
 (No client certificate requested)
 by mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS
 id 588F619560A6
 for <gcc-patches@gcc.gnu.org>; Thu, 29 Aug 2024 22:58:20 +0000 (UTC)
Received: from t14s.localdomain.com (unknown [10.22.16.43])
 by mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP
 id C7C5B19560A3; Thu, 29 Aug 2024 22:58:17 +0000 (UTC)
From: David Malcolm <dmalcolm@redhat.com>
To: gcc-patches@gcc.gnu.org
Cc: David Malcolm <dmalcolm@redhat.com>
Subject: [pushed 1/4] Use std::unique_ptr for optinfo_item
Date: Thu, 29 Aug 2024 18:58:10 -0400
Message-Id: <20240829225813.2567570-1-dmalcolm@redhat.com>
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 3.0 on 10.30.177.17
X-Mimecast-Spam-Score: 0
X-Mimecast-Originator: redhat.com
X-Spam-Status: No, score=-11.1 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH,
 DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0,
 KAM_SHORT,
 RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL,
 SCC_5_SHORT_WORD_LINES, SPF_HELO_NONE, SPF_NONE, TXREP,
 T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.30
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org

As preliminary work towards an overhaul of how optinfo_items
interact with dump_pretty_printer, replace uses of optinfo_item * with
std::unique_ptr<optinfo_item> to make ownership clearer.

No functional change intended.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Successfully built stage1 cc1 on all configs via config-list.mk.
Pushed to trunk as r15-3309-g464a3d2fe53362.
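
For readers unfamiliar with the pattern, here is a minimal standalone
sketch of the ownership flow this patch moves to.  It is not GCC code:
"item", "sink", "emit" and "dump" are hypothetical stand-ins for
optinfo_item, optinfo, dump_context::emit_item and the dump_* callers,
and std::make_unique (C++14) stands in for the make_unique helper from
make-unique.h.  Emitting borrows the item by const reference; the item
is then either moved into its new owner or freed automatically when the
unique_ptr goes out of scope, so no "else delete item;" branch is needed.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for optinfo_item: just carries some text.
struct item
{
  explicit item (std::string text) : m_text (std::move (text)) {}
  const std::string &get_text () const { return m_text; }
  std::string m_text;
};

// Hypothetical stand-in for optinfo: takes ownership of added items.
struct sink
{
  void add_item (std::unique_ptr<item> it)
  {
    assert (it);
    m_items.push_back (std::move (it));
  }
  std::vector<std::unique_ptr<item>> m_items;
};

// Borrowing use: reads the item; ownership stays with the caller.
std::string emit (const item &it)
{
  return it.get_text ();
}

// Make an item, emit it, then hand it to DEST if there is one.
// If DEST is null the item is destroyed when IT goes out of scope.
std::string dump (std::string text, sink *dest)
{
  auto it = std::make_unique<item> (std::move (text));
  std::string out = emit (*it);
  if (dest)
    dest->add_item (std::move (it));
  return out;
}
```

Calling dump ("x", &s) leaves one more item owned by s; calling
dump ("x", nullptr) leaves nothing behind to free by hand.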

gcc/ChangeLog:
	* config/aarch64/aarch64.cc: Define INCLUDE_MEMORY.
	* config/arm/arm.cc: Likewise.
	* config/i386/i386.cc: Likewise.
	* config/loongarch/loongarch.cc: Likewise.
	* config/riscv/riscv-vector-costs.cc: Likewise.
	* config/riscv/riscv.cc: Likewise.
	* config/rs6000/rs6000.cc: Likewise.
	* dump-context.h (dump_context::emit_item): Convert "item" param
	from * to const &.
	(dump_pretty_printer::stash_item): Convert "item" param from
	optinfo_item * to std::unique_ptr<optinfo_item>.
	(dump_pretty_printer::emit_item): Likewise.
	* dumpfile.cc: Include "make-unique.h".
	(make_item_for_dump_gimple_stmt): Replace uses of optinfo_item *
	with std::unique_ptr<optinfo_item>.
	(dump_context::dump_gimple_stmt): Likewise.
	(make_item_for_dump_gimple_expr): Likewise.
	(dump_context::dump_gimple_expr): Likewise.
	(make_item_for_dump_generic_expr): Likewise.
	(dump_context::dump_generic_expr): Likewise.
	(make_item_for_dump_symtab_node): Likewise.
	(dump_pretty_printer::emit_items): Likewise.
	(dump_pretty_printer::emit_any_pending_textual_chunks): Likewise.
	(dump_pretty_printer::emit_item): Likewise.
	(dump_pretty_printer::stash_item): Likewise.
	(dump_pretty_printer::decode_format): Likewise.
	(dump_context::dump_printf_va): Fix overlong line.
	(make_item_for_dump_dec): Replace uses of optinfo_item * with
	std::unique_ptr<optinfo_item>.
	(dump_context::dump_dec): Likewise.
	(dump_context::dump_symtab_node): Likewise.
	(dump_context::begin_scope): Likewise.
	(dump_context::emit_item): Likewise.
	* gimple-loop-interchange.cc: Define INCLUDE_MEMORY.
	* gimple-loop-jam.cc: Likewise.
	* gimple-loop-versioning.cc: Likewise.
	* graphite-dependences.cc: Likewise.
	* graphite-isl-ast-to-gimple.cc: Likewise.
	* graphite-optimize-isl.cc: Likewise.
	* graphite-poly.cc: Likewise.
	* graphite-scop-detection.cc: Likewise.
	* graphite-sese-to-poly.cc: Likewise.
	* graphite.cc: Likewise.
	* opt-problem.cc: Likewise.
	* optinfo.cc (optinfo::add_item): Convert "item" param from
	optinfo_item * to std::unique_ptr<optinfo_item>.
	(optinfo::emit_for_opt_problem): Update for change to
	dump_context::emit_item.
	* optinfo.h: Add #error to fail immediately if INCLUDE_MEMORY
	wasn't defined, rather than fail to find std::unique_ptr.
	(optinfo::add_item): Convert "item" param from optinfo_item * to
	std::unique_ptr<optinfo_item>.
	* sese.cc: Define INCLUDE_MEMORY.
	* targhooks.cc: Likewise.
	* tree-data-ref.cc: Likewise.
	* tree-if-conv.cc: Likewise.
	* tree-loop-distribution.cc: Likewise.
	* tree-parloops.cc: Likewise.
	* tree-predcom.cc: Likewise.
	* tree-ssa-live.cc: Likewise.
	* tree-ssa-loop-ivcanon.cc: Likewise.
	* tree-ssa-loop-ivopts.cc: Likewise.
	* tree-ssa-loop-prefetch.cc: Likewise.
	* tree-ssa-loop-unswitch.cc: Likewise.
	* tree-ssa-phiopt.cc: Likewise.
	* tree-ssa-threadbackward.cc: Likewise.
	* tree-ssa-threadupdate.cc: Likewise.
	* tree-vect-data-refs.cc: Likewise.
	* tree-vect-generic.cc: Likewise.
	* tree-vect-loop-manip.cc: Likewise.
	* tree-vect-loop.cc: Likewise.
	* tree-vect-patterns.cc: Likewise.
	* tree-vect-slp-patterns.cc: Likewise.
	* tree-vect-slp.cc: Likewise.
	* tree-vect-stmts.cc: Likewise.
	* tree-vectorizer.cc: Likewise.

gcc/testsuite/ChangeLog:
	* gcc.dg/plugin/dump_plugin.c: Define INCLUDE_MEMORY.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
---
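Note on the many one-line "#define INCLUDE_MEMORY" changes: the sketch
below illustrates the fail-fast check described in the optinfo.h
ChangeLog entry above, collapsed into a single file.  It is an
assumption-laden illustration, not the actual header text: the comments
mark which part stands in for the .cc file, for system.h and for
optinfo.h, and the #error wording and the "widget" type are made up.
The point is that a missing define is reported by one clear #error
rather than by a later failure to find std::unique_ptr.

```cpp
#include <cassert>

// Stands in for the "#define INCLUDE_MEMORY" line added to each .cc file.
#define INCLUDE_MEMORY

// Stands in for system.h: only pull in <memory> when asked to.
#ifdef INCLUDE_MEMORY
# include <memory>
#endif

// Stands in for the check in optinfo.h: fail immediately with a clear
// message if the define above was forgotten.
#ifndef INCLUDE_MEMORY
# error "INCLUDE_MEMORY must be defined before including system.h"
#endif

// Hypothetical user of std::unique_ptr, which is only declared because
// <memory> was included above.
struct widget
{
  int value;
};

std::unique_ptr<widget> make_widget (int value)
{
  return std::unique_ptr<widget> (new widget {value});
}
```

Deleting the "#define INCLUDE_MEMORY" line from the sketch makes the
build stop at the #error.
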
 gcc/config/aarch64/aarch64.cc             |   1 +
 gcc/config/arm/arm.cc                     |   1 +
 gcc/config/i386/i386.cc                   |   1 +
 gcc/config/loongarch/loongarch.cc         |   1 +
 gcc/config/riscv/riscv-vector-costs.cc    |   1 +
 gcc/config/riscv/riscv.cc                 |   1 +
 gcc/config/rs6000/rs6000.cc               |   1 +
 gcc/dump-context.h                        |   7 +-
 gcc/dumpfile.cc                           | 156 +++++++++++-----------
 gcc/gimple-loop-interchange.cc            |   1 +
 gcc/gimple-loop-jam.cc                    |   1 +
 gcc/gimple-loop-versioning.cc             |   1 +
 gcc/graphite-dependences.cc               |   1 +
 gcc/graphite-isl-ast-to-gimple.cc         |   1 +
 gcc/graphite-optimize-isl.cc              |   1 +
 gcc/graphite-poly.cc                      |   1 +
 gcc/graphite-scop-detection.cc            |   1 +
 gcc/graphite-sese-to-poly.cc              |   1 +
 gcc/graphite.cc                           |   1 +
 gcc/opt-problem.cc                        |   1 +
 gcc/optinfo.cc                            |   8 +-
 gcc/optinfo.h                             |  11 +-
 gcc/sese.cc                               |   1 +
 gcc/targhooks.cc                          |   1 +
 gcc/testsuite/gcc.dg/plugin/dump_plugin.c |   1 +
 gcc/tree-data-ref.cc                      |   1 +
 gcc/tree-if-conv.cc                       |   1 +
 gcc/tree-loop-distribution.cc             |   1 +
 gcc/tree-parloops.cc                      |   1 +
 gcc/tree-predcom.cc                       |   1 +
 gcc/tree-ssa-live.cc                      |   1 +
 gcc/tree-ssa-loop-ivcanon.cc              |   1 +
 gcc/tree-ssa-loop-ivopts.cc               |   1 +
 gcc/tree-ssa-loop-prefetch.cc             |   1 +
 gcc/tree-ssa-loop-unswitch.cc             |   1 +
 gcc/tree-ssa-phiopt.cc                    |   1 +
 gcc/tree-ssa-threadbackward.cc            |   1 +
 gcc/tree-ssa-threadupdate.cc              |   1 +
 gcc/tree-vect-data-refs.cc                |   1 +
 gcc/tree-vect-generic.cc                  |   1 +
 gcc/tree-vect-loop-manip.cc               |   1 +
 gcc/tree-vect-loop.cc                     |   1 +
 gcc/tree-vect-patterns.cc                 |   1 +
 gcc/tree-vect-slp-patterns.cc             |   1 +
 gcc/tree-vect-slp.cc                      |   1 +
 gcc/tree-vect-stmts.cc                    |   1 +
 gcc/tree-vectorizer.cc                    |   1 +
 47 files changed, 137 insertions(+), 88 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index bfd7bcdef7cb..765189317be2 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -22,6 +22,7 @@
 
 #define INCLUDE_STRING
 #define INCLUDE_ALGORITHM
+#define INCLUDE_MEMORY
 #define INCLUDE_VECTOR
 #include "config.h"
 #include "system.h"
diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
index 92cd168e6593..458e1af1fba4 100644
--- a/gcc/config/arm/arm.cc
+++ b/gcc/config/arm/arm.cc
@@ -23,6 +23,7 @@
 #define IN_TARGET_CODE 1
 
 #include "config.h"
+#define INCLUDE_MEMORY
 #define INCLUDE_STRING
 #include "system.h"
 #include "coretypes.h"
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index f044826269ca..ece33f16ebc5 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -17,6 +17,7 @@ You should have received a copy of the GNU General Public License
 along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
+#define INCLUDE_MEMORY
 #define INCLUDE_STRING
 #define IN_TARGET_CODE 1
 
diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index c7a02103ef51..f956ee4b119b 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -21,6 +21,7 @@ along with GCC; see the file COPYING3.  If not see
 
 #define IN_TARGET_CODE 1
 
+#define INCLUDE_MEMORY
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
diff --git a/gcc/config/riscv/riscv-vector-costs.cc b/gcc/config/riscv/riscv-vector-costs.cc
index a80e167597be..25570bd40040 100644
--- a/gcc/config/riscv/riscv-vector-costs.cc
+++ b/gcc/config/riscv/riscv-vector-costs.cc
@@ -20,6 +20,7 @@ along with GCC; see the file COPYING3.  If not see
 
 #define IN_TARGET_CODE 1
 
+#define INCLUDE_MEMORY
 #define INCLUDE_STRING
 #include "config.h"
 #include "system.h"
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 1f60d8f9711c..3f42b98914fe 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -21,6 +21,7 @@ along with GCC; see the file COPYING3.  If not see
 
 #define IN_TARGET_CODE 1
 
+#define INCLUDE_MEMORY
 #define INCLUDE_STRING
 #include "config.h"
 #include "system.h"
diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
index f2bd9edea8a1..9efd29b43f79 100644
--- a/gcc/config/rs6000/rs6000.cc
+++ b/gcc/config/rs6000/rs6000.cc
@@ -22,6 +22,7 @@
 #define IN_TARGET_CODE 1
 
 #include "config.h"
+#define INCLUDE_MEMORY
 #include "system.h"
 #include "coretypes.h"
 #include "backend.h"
diff --git a/gcc/dump-context.h b/gcc/dump-context.h
index b2aed2853a37..5992956380b1 100644
--- a/gcc/dump-context.h
+++ b/gcc/dump-context.h
@@ -120,7 +120,7 @@ class dump_context
   void end_any_optinfo ();
 
   void emit_optinfo (const optinfo *info);
-  void emit_item (optinfo_item *item, dump_flags_t dump_kind);
+  void emit_item (const optinfo_item &item, dump_flags_t dump_kind);
 
   bool apply_dump_filter_p (dump_flags_t dump_kind, dump_flags_t filter) const;
 
@@ -186,11 +186,12 @@ private:
   bool decode_format (text_info *text, const char *spec,
 		      const char **buffer_ptr);
 
-  void stash_item (const char **buffer_ptr, optinfo_item *item);
+  void stash_item (const char **buffer_ptr,
+		   std::unique_ptr<optinfo_item> item);
 
   void emit_any_pending_textual_chunks (optinfo *dest);
 
-  void emit_item (optinfo_item *item, optinfo *dest);
+  void emit_item (std::unique_ptr<optinfo_item> item, optinfo *dest);
 
   dump_context *m_context;
   dump_flags_t m_dump_kind;
diff --git a/gcc/dumpfile.cc b/gcc/dumpfile.cc
index 6353c0857449..2971c69bb0a1 100644
--- a/gcc/dumpfile.cc
+++ b/gcc/dumpfile.cc
@@ -41,6 +41,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "optinfo-emit-json.h"
 #include "stringpool.h" /* for get_identifier.  */
 #include "spellcheck.h"
+#include "make-unique.h"
 
 /* If non-NULL, return one past-the-end of the matching SUBPART of
    the WHOLE string.  */
@@ -628,7 +629,7 @@ dump_context::dump_loc_immediate (dump_flags_t dump_kind,
 
 /* Make an item for the given dump call, equivalent to print_gimple_stmt.  */
 
-static optinfo_item *
+static std::unique_ptr<optinfo_item>
 make_item_for_dump_gimple_stmt (gimple *stmt, int spc, dump_flags_t dump_flags)
 {
   pretty_printer pp;
@@ -636,9 +637,10 @@ make_item_for_dump_gimple_stmt (gimple *stmt, int spc, dump_flags_t dump_flags)
   pp_gimple_stmt_1 (&pp, stmt, spc, dump_flags);
   pp_newline (&pp);
 
-  optinfo_item *item
-    = new optinfo_item (OPTINFO_ITEM_KIND_GIMPLE, gimple_location (stmt),
-			xstrdup (pp_formatted_text (&pp)));
+  std::unique_ptr<optinfo_item> item
+    = make_unique<optinfo_item> (OPTINFO_ITEM_KIND_GIMPLE,
+				 gimple_location (stmt),
+				 xstrdup (pp_formatted_text (&pp)));
   return item;
 }
 
@@ -650,17 +652,15 @@ dump_context::dump_gimple_stmt (const dump_metadata_t &metadata,
 				dump_flags_t extra_dump_flags,
 				gimple *gs, int spc)
 {
-  optinfo_item *item
+  auto item
     = make_item_for_dump_gimple_stmt (gs, spc, dump_flags | extra_dump_flags);
-  emit_item (item, metadata.get_dump_flags ());
+  emit_item (*item.get (), metadata.get_dump_flags ());
 
   if (optinfo_enabled_p ())
     {
       optinfo &info = ensure_pending_optinfo (metadata);
-      info.add_item (item);
+      info.add_item (std::move (item));
     }
-  else
-    delete item;
 }
 
 /* Similar to dump_gimple_stmt, except additionally print source location.  */
@@ -677,7 +677,7 @@ dump_context::dump_gimple_stmt_loc (const dump_metadata_t &metadata,
 
 /* Make an item for the given dump call, equivalent to print_gimple_expr.  */
 
-static optinfo_item *
+static std::unique_ptr<optinfo_item>
 make_item_for_dump_gimple_expr (gimple *stmt, int spc, dump_flags_t dump_flags)
 {
   dump_flags |= TDF_RHS_ONLY;
@@ -685,9 +685,10 @@ make_item_for_dump_gimple_expr (gimple *stmt, int spc, dump_flags_t dump_flags)
   pp_needs_newline (&pp) = true;
   pp_gimple_stmt_1 (&pp, stmt, spc, dump_flags);
 
-  optinfo_item *item
-    = new optinfo_item (OPTINFO_ITEM_KIND_GIMPLE, gimple_location (stmt),
-			xstrdup (pp_formatted_text (&pp)));
+  std::unique_ptr<optinfo_item> item
+    = make_unique<optinfo_item> (OPTINFO_ITEM_KIND_GIMPLE,
+				 gimple_location (stmt),
+				 xstrdup (pp_formatted_text (&pp)));
   return item;
 }
 
@@ -700,17 +701,15 @@ dump_context::dump_gimple_expr (const dump_metadata_t &metadata,
 				dump_flags_t extra_dump_flags,
 				gimple *gs, int spc)
 {
-  optinfo_item *item
+  std::unique_ptr<optinfo_item> item
     = make_item_for_dump_gimple_expr (gs, spc, dump_flags | extra_dump_flags);
-  emit_item (item, metadata.get_dump_flags ());
+  emit_item (*item.get (), metadata.get_dump_flags ());
 
   if (optinfo_enabled_p ())
     {
       optinfo &info = ensure_pending_optinfo (metadata);
-      info.add_item (item);
+      info.add_item (std::move (item));
     }
-  else
-    delete item;
 }
 
 /* Similar to dump_gimple_expr, except additionally print source location.  */
@@ -728,7 +727,7 @@ dump_context::dump_gimple_expr_loc (const dump_metadata_t &metadata,
 
 /* Make an item for the given dump call, equivalent to print_generic_expr.  */
 
-static optinfo_item *
+static std::unique_ptr<optinfo_item>
 make_item_for_dump_generic_expr (tree node, dump_flags_t dump_flags)
 {
   pretty_printer pp;
@@ -740,9 +739,9 @@ make_item_for_dump_generic_expr (tree node, dump_flags_t dump_flags)
   if (EXPR_HAS_LOCATION (node))
     loc = EXPR_LOCATION (node);
 
-  optinfo_item *item
-    = new optinfo_item (OPTINFO_ITEM_KIND_TREE, loc,
-			xstrdup (pp_formatted_text (&pp)));
+  std::unique_ptr<optinfo_item> item
+    = make_unique<optinfo_item> (OPTINFO_ITEM_KIND_TREE, loc,
+				 xstrdup (pp_formatted_text (&pp)));
   return item;
 }
 
@@ -754,17 +753,15 @@ dump_context::dump_generic_expr (const dump_metadata_t &metadata,
 				 dump_flags_t extra_dump_flags,
 				 tree t)
 {
-  optinfo_item *item
+  std::unique_ptr<optinfo_item> item
     = make_item_for_dump_generic_expr (t, dump_flags | extra_dump_flags);
-  emit_item (item, metadata.get_dump_flags ());
+  emit_item (*item.get (), metadata.get_dump_flags ());
 
   if (optinfo_enabled_p ())
     {
       optinfo &info = ensure_pending_optinfo (metadata);
-      info.add_item (item);
+      info.add_item (std::move (item));
     }
-  else
-    delete item;
 }
 
 
@@ -783,13 +780,13 @@ dump_context::dump_generic_expr_loc (const dump_metadata_t &metadata,
 
 /* Make an item for the given dump call.  */
 
-static optinfo_item *
+static std::unique_ptr<optinfo_item>
 make_item_for_dump_symtab_node (symtab_node *node)
 {
   location_t loc = DECL_SOURCE_LOCATION (node->decl);
-  optinfo_item *item
-    = new optinfo_item (OPTINFO_ITEM_KIND_SYMTAB_NODE, loc,
-			xstrdup (node->dump_name ()));
+  std::unique_ptr<optinfo_item> item
+    = make_unique<optinfo_item> (OPTINFO_ITEM_KIND_SYMTAB_NODE, loc,
+				 xstrdup (node->dump_name ()));
   return item;
 }
 
@@ -834,7 +831,9 @@ dump_pretty_printer::emit_items (optinfo *dest)
 	{
 	  emit_any_pending_textual_chunks (dest);
 	  /* This chunk has a stashed item: use it.  */
-	  emit_item (m_stashed_items[stashed_item_idx++].item, dest);
+	  std::unique_ptr <optinfo_item> item
+	    (m_stashed_items[stashed_item_idx++].item);
+	  emit_item (std::move (item), dest);
 	}
       else
 	/* This chunk is purely textual.  Print it (to
@@ -866,10 +865,10 @@ dump_pretty_printer::emit_any_pending_textual_chunks (optinfo *dest)
     return;
 
   char *formatted_text = xstrdup (pp_formatted_text (this));
-  optinfo_item *item
-    = new optinfo_item (OPTINFO_ITEM_KIND_TEXT, UNKNOWN_LOCATION,
-			formatted_text);
-  emit_item (item, dest);
+  std::unique_ptr<optinfo_item> item
+    = make_unique<optinfo_item> (OPTINFO_ITEM_KIND_TEXT, UNKNOWN_LOCATION,
+				 formatted_text);
+  emit_item (std::move (item), dest);
 
   /* Clear the pending text by unwinding formatted_text back to the start
      of the buffer (without deallocating).  */
@@ -881,25 +880,25 @@ dump_pretty_printer::emit_any_pending_textual_chunks (optinfo *dest)
    to DEST; otherwise delete ITEM.  */
 
 void
-dump_pretty_printer::emit_item (optinfo_item *item, optinfo *dest)
+dump_pretty_printer::emit_item (std::unique_ptr<optinfo_item> item,
+				optinfo *dest)
 {
-  m_context->emit_item (item, m_dump_kind);
+  m_context->emit_item (*item.get (), m_dump_kind);
   if (dest)
-    dest->add_item (item);
-  else
-    delete item;
+    dest->add_item (std::move (item));
 }
 
 /* Record that ITEM (generated in phase 2 of formatting) is to be used for
    the chunk at BUFFER_PTR in phase 3 (by emit_items).  */
 
 void
-dump_pretty_printer::stash_item (const char **buffer_ptr, optinfo_item *item)
+dump_pretty_printer::stash_item (const char **buffer_ptr,
+				 std::unique_ptr<optinfo_item> item)
 {
   gcc_assert (buffer_ptr);
-  gcc_assert (item);
+  gcc_assert (item.get ());
 
-  m_stashed_items.safe_push (stashed_item (buffer_ptr, item));
+  m_stashed_items.safe_push (stashed_item (buffer_ptr, item.release ()));
 }
 
 /* pp_format_decoder callback for dump_pretty_printer, and thus for
@@ -953,8 +952,8 @@ dump_pretty_printer::decode_format (text_info *text, const char *spec,
 	cgraph_node *node = va_arg (*text->m_args_ptr, cgraph_node *);
 
 	/* Make an item for the node, and stash it.  */
-	optinfo_item *item = make_item_for_dump_symtab_node (node);
-	stash_item (buffer_ptr, item);
+	auto item = make_item_for_dump_symtab_node (node);
+	stash_item (buffer_ptr, std::move (item));
 	return true;
       }
 
@@ -963,8 +962,8 @@ dump_pretty_printer::decode_format (text_info *text, const char *spec,
 	gimple *stmt = va_arg (*text->m_args_ptr, gimple *);
 
 	/* Make an item for the stmt, and stash it.  */
-	optinfo_item *item = make_item_for_dump_gimple_expr (stmt, 0, TDF_SLIM);
-	stash_item (buffer_ptr, item);
+	auto item = make_item_for_dump_gimple_expr (stmt, 0, TDF_SLIM);
+	stash_item (buffer_ptr, std::move (item));
 	return true;
       }
 
@@ -973,8 +972,8 @@ dump_pretty_printer::decode_format (text_info *text, const char *spec,
 	gimple *stmt = va_arg (*text->m_args_ptr, gimple *);
 
 	/* Make an item for the stmt, and stash it.  */
-	optinfo_item *item = make_item_for_dump_gimple_stmt (stmt, 0, TDF_SLIM);
-	stash_item (buffer_ptr, item);
+	auto item = make_item_for_dump_gimple_stmt (stmt, 0, TDF_SLIM);
+	stash_item (buffer_ptr, std::move (item));
 	return true;
       }
 
@@ -983,8 +982,8 @@ dump_pretty_printer::decode_format (text_info *text, const char *spec,
 	tree t = va_arg (*text->m_args_ptr, tree);
 
 	/* Make an item for the tree, and stash it.  */
-	optinfo_item *item = make_item_for_dump_generic_expr (t, TDF_SLIM);
-	stash_item (buffer_ptr, item);
+	auto item = make_item_for_dump_generic_expr (t, TDF_SLIM);
+	stash_item (buffer_ptr, std::move (item));
 	return true;
       }
 
@@ -996,7 +995,8 @@ dump_pretty_printer::decode_format (text_info *text, const char *spec,
 /* Output a formatted message using FORMAT on appropriate dump streams.  */
 
 void
-dump_context::dump_printf_va (const dump_metadata_t &metadata, const char *format,
+dump_context::dump_printf_va (const dump_metadata_t &metadata,
+			      const char *format,
 			      va_list *ap)
 {
   dump_pretty_printer pp (this, metadata.get_dump_flags ());
@@ -1031,7 +1031,7 @@ dump_context::dump_printf_loc_va (const dump_metadata_t &metadata,
 /* Make an item for the given dump call, equivalent to print_dec.  */
 
 template<unsigned int N, typename C>
-static optinfo_item *
+static std::unique_ptr<optinfo_item>
 make_item_for_dump_dec (const poly_int<N, C> &value)
 {
   STATIC_ASSERT (poly_coeff_traits<C>::signedness >= 0);
@@ -1051,9 +1051,9 @@ make_item_for_dump_dec (const poly_int<N, C> &value)
 	}
     }
 
-  optinfo_item *item
-    = new optinfo_item (OPTINFO_ITEM_KIND_TEXT, UNKNOWN_LOCATION,
-			xstrdup (pp_formatted_text (&pp)));
+  auto item
+    = make_unique<optinfo_item> (OPTINFO_ITEM_KIND_TEXT, UNKNOWN_LOCATION,
+				 xstrdup (pp_formatted_text (&pp)));
   return item;
 }
 
@@ -1061,35 +1061,33 @@ make_item_for_dump_dec (const poly_int<N, C> &value)
 
 template<unsigned int N, typename C>
 void
-dump_context::dump_dec (const dump_metadata_t &metadata, const poly_int<N, C> &value)
+dump_context::dump_dec (const dump_metadata_t &metadata,
+			const poly_int<N, C> &value)
 {
-  optinfo_item *item = make_item_for_dump_dec (value);
-  emit_item (item, metadata.get_dump_flags ());
+  auto item = make_item_for_dump_dec (value);
+  emit_item (*item.get (), metadata.get_dump_flags ());
 
   if (optinfo_enabled_p ())
     {
       optinfo &info = ensure_pending_optinfo (metadata);
-      info.add_item (item);
+      info.add_item (std::move (item));
     }
-  else
-    delete item;
 }
 
 /* Output the name of NODE on appropriate dump streams.  */
 
 void
-dump_context::dump_symtab_node (const dump_metadata_t &metadata, symtab_node *node)
+dump_context::dump_symtab_node (const dump_metadata_t &metadata,
+				symtab_node *node)
 {
-  optinfo_item *item = make_item_for_dump_symtab_node (node);
-  emit_item (item, metadata.get_dump_flags ());
+  auto item = make_item_for_dump_symtab_node (node);
+  emit_item (*item.get (), metadata.get_dump_flags ());
 
   if (optinfo_enabled_p ())
     {
       optinfo &info = ensure_pending_optinfo (metadata);
-      info.add_item (item);
+      info.add_item (std::move (item));
     }
-  else
-    delete item;
 }
 
 /* Get the current dump scope-nesting depth.
@@ -1132,10 +1130,10 @@ dump_context::begin_scope (const char *name,
   pretty_printer pp;
   pp_printf (&pp, "%s %s %s", "===", name, "===");
   pp_newline (&pp);
-  optinfo_item *item
-    = new optinfo_item (OPTINFO_ITEM_KIND_TEXT, UNKNOWN_LOCATION,
-			xstrdup (pp_formatted_text (&pp)));
-  emit_item (item, MSG_NOTE);
+  std::unique_ptr<optinfo_item> item
+    = make_unique<optinfo_item> (OPTINFO_ITEM_KIND_TEXT, UNKNOWN_LOCATION,
+				 xstrdup (pp_formatted_text (&pp)));
+  emit_item (*item.get (), MSG_NOTE);
 
   if (optinfo_enabled_p ())
     {
@@ -1143,11 +1141,9 @@ dump_context::begin_scope (const char *name,
 	= begin_next_optinfo (dump_metadata_t (MSG_NOTE, impl_location),
 			      user_location);
       info.m_kind = OPTINFO_KIND_SCOPE;
-      info.add_item (item);
+      info.add_item (std::move (item));
       end_any_optinfo ();
     }
-  else
-    delete item;
 }
 
 /* Pop a nested dump scope.  */
@@ -1226,17 +1222,17 @@ dump_context::emit_optinfo (const optinfo *info)
    consolidation into optinfo instances).  */
 
 void
-dump_context::emit_item (optinfo_item *item, dump_flags_t dump_kind)
+dump_context::emit_item (const optinfo_item &item, dump_flags_t dump_kind)
 {
   if (dump_file && apply_dump_filter_p (dump_kind, pflags))
-    fprintf (dump_file, "%s", item->get_text ());
+    fprintf (dump_file, "%s", item.get_text ());
 
   if (alt_dump_file && apply_dump_filter_p (dump_kind, alt_flags))
-    fprintf (alt_dump_file, "%s", item->get_text ());
+    fprintf (alt_dump_file, "%s", item.get_text ());
 
   /* Support for temp_dump_context in selftests.  */
   if (m_test_pp && apply_dump_filter_p (dump_kind, m_test_pp_flags))
-    pp_string (m_test_pp, item->get_text ());
+    pp_string (m_test_pp, item.get_text ());
 }
 
 /* The current singleton dump_context, and its default.  */
diff --git a/gcc/gimple-loop-interchange.cc b/gcc/gimple-loop-interchange.cc
index b4228044a387..a4ea818bbdfd 100644
--- a/gcc/gimple-loop-interchange.cc
+++ b/gcc/gimple-loop-interchange.cc
@@ -19,6 +19,7 @@ along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
 #include "config.h"
+#define INCLUDE_MEMORY
 #include "system.h"
 #include "coretypes.h"
 #include "backend.h"
diff --git a/gcc/gimple-loop-jam.cc b/gcc/gimple-loop-jam.cc
index 306d5ceaef48..bf01e0ba6467 100644
--- a/gcc/gimple-loop-jam.cc
+++ b/gcc/gimple-loop-jam.cc
@@ -18,6 +18,7 @@ along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
 #include "config.h"
+#define INCLUDE_MEMORY
 #include "system.h"
 #include "coretypes.h"
 #include "tree-pass.h"
diff --git a/gcc/gimple-loop-versioning.cc b/gcc/gimple-loop-versioning.cc
index adea207659be..107b00200247 100644
--- a/gcc/gimple-loop-versioning.cc
+++ b/gcc/gimple-loop-versioning.cc
@@ -18,6 +18,7 @@ along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
 #include "config.h"
+#define INCLUDE_MEMORY
 #include "system.h"
 #include "coretypes.h"
 #include "backend.h"
diff --git a/gcc/graphite-dependences.cc b/gcc/graphite-dependences.cc
index a35f71254f84..41e1114173bf 100644
--- a/gcc/graphite-dependences.cc
+++ b/gcc/graphite-dependences.cc
@@ -20,6 +20,7 @@ along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
 #define INCLUDE_ISL
+#define INCLUDE_MEMORY
 
 #include "config.h"
 
diff --git a/gcc/graphite-isl-ast-to-gimple.cc b/gcc/graphite-isl-ast-to-gimple.cc
index a27402ba6b70..ff539b14045c 100644
--- a/gcc/graphite-isl-ast-to-gimple.cc
+++ b/gcc/graphite-isl-ast-to-gimple.cc
@@ -19,6 +19,7 @@ along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
 #define INCLUDE_ISL
+#define INCLUDE_MEMORY
 
 #include "config.h"
 
diff --git a/gcc/graphite-optimize-isl.cc b/gcc/graphite-optimize-isl.cc
index 2222dd685425..ff3dd6b7ab63 100644
--- a/gcc/graphite-optimize-isl.cc
+++ b/gcc/graphite-optimize-isl.cc
@@ -19,6 +19,7 @@ along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
 #define INCLUDE_ISL
+#define INCLUDE_MEMORY
 
 #include "config.h"
 
diff --git a/gcc/graphite-poly.cc b/gcc/graphite-poly.cc
index c78ff986c325..76aba03d6332 100644
--- a/gcc/graphite-poly.cc
+++ b/gcc/graphite-poly.cc
@@ -20,6 +20,7 @@ along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
 #define INCLUDE_ISL
+#define INCLUDE_MEMORY
 
 #include "config.h"
 
diff --git a/gcc/graphite-scop-detection.cc b/gcc/graphite-scop-detection.cc
index 9e44f100a1df..de7c111118ba 100644
--- a/gcc/graphite-scop-detection.cc
+++ b/gcc/graphite-scop-detection.cc
@@ -20,6 +20,7 @@ along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
 #define INCLUDE_ISL
+#define INCLUDE_MEMORY
 
 #include "config.h"
 
diff --git a/gcc/graphite-sese-to-poly.cc b/gcc/graphite-sese-to-poly.cc
index 5ce898505a30..1e7818a9ca3e 100644
--- a/gcc/graphite-sese-to-poly.cc
+++ b/gcc/graphite-sese-to-poly.cc
@@ -19,6 +19,7 @@ along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
 #define INCLUDE_ISL
+#define INCLUDE_MEMORY
 
 #include "config.h"
 
diff --git a/gcc/graphite.cc b/gcc/graphite.cc
index 80e6a5da6b29..65b970f9444e 100644
--- a/gcc/graphite.cc
+++ b/gcc/graphite.cc
@@ -28,6 +28,7 @@ along with GCC; see the file COPYING3.  If not see
    the related work.  */
 
 #define INCLUDE_ISL
+#define INCLUDE_MEMORY
 
 #include "config.h"
 #include "system.h"
diff --git a/gcc/opt-problem.cc b/gcc/opt-problem.cc
index f40f48196dea..d76ddaf57adf 100644
--- a/gcc/opt-problem.cc
+++ b/gcc/opt-problem.cc
@@ -19,6 +19,7 @@ along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
 #include "config.h"
+#define INCLUDE_MEMORY
 #include "system.h"
 #include "coretypes.h"
 #include "backend.h"
diff --git a/gcc/optinfo.cc b/gcc/optinfo.cc
index 7a8256171744..48a270cbfea8 100644
--- a/gcc/optinfo.cc
+++ b/gcc/optinfo.cc
@@ -84,10 +84,10 @@ optinfo::~optinfo ()
 /* Add ITEM to this optinfo.  */
 
 void
-optinfo::add_item (optinfo_item *item)
+optinfo::add_item (std::unique_ptr<optinfo_item> item)
 {
-  gcc_assert (item);
-  m_items.safe_push (item);
+  gcc_assert (item.get ());
+  m_items.safe_push (item.release ());
 }
 
 /* Get MSG_* flags corresponding to KIND.  */
@@ -123,7 +123,7 @@ optinfo::emit_for_opt_problem () const
   unsigned i;
   optinfo_item *item;
   FOR_EACH_VEC_ELT (m_items, i, item)
-    dump_context::get ().emit_item (item, dump_kind);
+    dump_context::get ().emit_item (*item, dump_kind);
 
   /* Re-emit to "non-immediate" destinations.  */
   dump_context::get ().emit_optinfo (this);
diff --git a/gcc/optinfo.h b/gcc/optinfo.h
index 986ef75756fc..db92294f3cab 100644
--- a/gcc/optinfo.h
+++ b/gcc/optinfo.h
@@ -21,6 +21,15 @@ along with GCC; see the file COPYING3.  If not see
 #ifndef GCC_OPTINFO_H
 #define GCC_OPTINFO_H
 
+/* This header uses std::unique_ptr, but <memory> can't be directly
+   included due to issues with macros.  Hence <memory> must be included
+   from system.h by defining INCLUDE_MEMORY in any source file using
+   optinfo.h.  */
+
+#ifndef INCLUDE_MEMORY
+# error "You must define INCLUDE_MEMORY before including system.h to use optinfo.h"
+#endif
+
 /* An "optinfo" is a bundle of information describing part of an
    optimization, which can be emitted to zero or more of several
    destinations, such as:
@@ -119,7 +128,7 @@ class optinfo
   location_t get_location_t () const { return m_loc.get_location_t (); }
   profile_count get_count () const { return m_loc.get_count (); }
 
-  void add_item (optinfo_item *item);
+  void add_item (std::unique_ptr<optinfo_item> item);
 
   void emit_for_opt_problem () const;
 
diff --git a/gcc/sese.cc b/gcc/sese.cc
index e5c460571c55..e9b17fafa011 100644
--- a/gcc/sese.cc
+++ b/gcc/sese.cc
@@ -20,6 +20,7 @@ along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
 #include "config.h"
+#define INCLUDE_MEMORY
 #include "system.h"
 #include "coretypes.h"
 #include "backend.h"
diff --git a/gcc/targhooks.cc b/gcc/targhooks.cc
index 793932a77c60..dc040df9fcd1 100644
--- a/gcc/targhooks.cc
+++ b/gcc/targhooks.cc
@@ -47,6 +47,7 @@ along with GCC; see the file COPYING3.  If not see
    comment can thus be removed at that point.  */
 
 #include "config.h"
+#define INCLUDE_MEMORY
 #include "system.h"
 #include "coretypes.h"
 #include "target.h"
diff --git a/gcc/testsuite/gcc.dg/plugin/dump_plugin.c b/gcc/testsuite/gcc.dg/plugin/dump_plugin.c
index 12573d66af21..fc76d69adb1d 100644
--- a/gcc/testsuite/gcc.dg/plugin/dump_plugin.c
+++ b/gcc/testsuite/gcc.dg/plugin/dump_plugin.c
@@ -1,5 +1,6 @@
 /* Plugin for testing dumpfile.c.  */
 
+#define INCLUDE_MEMORY
 #include "gcc-plugin.h"
 #include "config.h"
 #include "system.h"
diff --git a/gcc/tree-data-ref.cc b/gcc/tree-data-ref.cc
index bd61069b6316..48798f458b80 100644
--- a/gcc/tree-data-ref.cc
+++ b/gcc/tree-data-ref.cc
@@ -74,6 +74,7 @@ along with GCC; see the file COPYING3.  If not see
 */
 
 #define INCLUDE_ALGORITHM
+#define INCLUDE_MEMORY
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
diff --git a/gcc/tree-if-conv.cc b/gcc/tree-if-conv.cc
index 57992b6decaf..25248c138083 100644
--- a/gcc/tree-if-conv.cc
+++ b/gcc/tree-if-conv.cc
@@ -81,6 +81,7 @@ along with GCC; see the file COPYING3.  If not see
 */
 
 #include "config.h"
+#define INCLUDE_MEMORY
 #include "system.h"
 #include "coretypes.h"
 #include "backend.h"
diff --git a/gcc/tree-loop-distribution.cc b/gcc/tree-loop-distribution.cc
index f87393ee94d6..86e39e4575d9 100644
--- a/gcc/tree-loop-distribution.cc
+++ b/gcc/tree-loop-distribution.cc
@@ -90,6 +90,7 @@ along with GCC; see the file COPYING3.  If not see
 	data reuse.  */
 
 #include "config.h"
+#define INCLUDE_MEMORY
 #include "system.h"
 #include "coretypes.h"
 #include "backend.h"
diff --git a/gcc/tree-parloops.cc b/gcc/tree-parloops.cc
index 888a834faf91..f4468658732b 100644
--- a/gcc/tree-parloops.cc
+++ b/gcc/tree-parloops.cc
@@ -20,6 +20,7 @@ along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
 #include "config.h"
+#define INCLUDE_MEMORY
 #include "system.h"
 #include "coretypes.h"
 #include "backend.h"
diff --git a/gcc/tree-predcom.cc b/gcc/tree-predcom.cc
index 9844fee1e974..eed878b8f54b 100644
--- a/gcc/tree-predcom.cc
+++ b/gcc/tree-predcom.cc
@@ -205,6 +205,7 @@ along with GCC; see the file COPYING3.  If not see
    i * i with ii_last + 2 * i + 1), to generalize strength reduction.  */
 
 #include "config.h"
+#define INCLUDE_MEMORY
 #include "system.h"
 #include "coretypes.h"
 #include "backend.h"
diff --git a/gcc/tree-ssa-live.cc b/gcc/tree-ssa-live.cc
index 60dfc05dcd94..0739faa022ef 100644
--- a/gcc/tree-ssa-live.cc
+++ b/gcc/tree-ssa-live.cc
@@ -19,6 +19,7 @@ along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
 #include "config.h"
+#define INCLUDE_MEMORY
 #include "system.h"
 #include "coretypes.h"
 #include "backend.h"
diff --git a/gcc/tree-ssa-loop-ivcanon.cc b/gcc/tree-ssa-loop-ivcanon.cc
index 5ef24a919176..a8d25ad0efc4 100644
--- a/gcc/tree-ssa-loop-ivcanon.cc
+++ b/gcc/tree-ssa-loop-ivcanon.cc
@@ -36,6 +36,7 @@ along with GCC; see the file COPYING3.  If not see
        info).  */
 
 #include "config.h"
+#define INCLUDE_MEMORY
 #include "system.h"
 #include "coretypes.h"
 #include "backend.h"
diff --git a/gcc/tree-ssa-loop-ivopts.cc b/gcc/tree-ssa-loop-ivopts.cc
index c3218a3e8eed..dfe1b2541562 100644
--- a/gcc/tree-ssa-loop-ivopts.cc
+++ b/gcc/tree-ssa-loop-ivopts.cc
@@ -90,6 +90,7 @@ along with GCC; see the file COPYING3.  If not see
       profitable.  */
 
 #include "config.h"
+#define INCLUDE_MEMORY
 #include "system.h"
 #include "coretypes.h"
 #include "backend.h"
diff --git a/gcc/tree-ssa-loop-prefetch.cc b/gcc/tree-ssa-loop-prefetch.cc
index bb5d5dec7795..52ea3bad07eb 100644
--- a/gcc/tree-ssa-loop-prefetch.cc
+++ b/gcc/tree-ssa-loop-prefetch.cc
@@ -18,6 +18,7 @@ along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
 #include "config.h"
+#define INCLUDE_MEMORY
 #include "system.h"
 #include "coretypes.h"
 #include "backend.h"
diff --git a/gcc/tree-ssa-loop-unswitch.cc b/gcc/tree-ssa-loop-unswitch.cc
index 14b0df1aefe7..7601d91e8070 100644
--- a/gcc/tree-ssa-loop-unswitch.cc
+++ b/gcc/tree-ssa-loop-unswitch.cc
@@ -18,6 +18,7 @@ along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
 #include "config.h"
+#define INCLUDE_MEMORY
 #include "system.h"
 #include "coretypes.h"
 #include "backend.h"
diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
index f05ca727503b..3754fc05c8f2 100644
--- a/gcc/tree-ssa-phiopt.cc
+++ b/gcc/tree-ssa-phiopt.cc
@@ -18,6 +18,7 @@ along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
 #include "config.h"
+#define INCLUDE_MEMORY
 #include "system.h"
 #include "coretypes.h"
 #include "backend.h"
diff --git a/gcc/tree-ssa-threadbackward.cc b/gcc/tree-ssa-threadbackward.cc
index ea8d7b882d08..4bc72ec23755 100644
--- a/gcc/tree-ssa-threadbackward.cc
+++ b/gcc/tree-ssa-threadbackward.cc
@@ -18,6 +18,7 @@ along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
 #include "config.h"
+#define INCLUDE_MEMORY
 #include "system.h"
 #include "coretypes.h"
 #include "backend.h"
diff --git a/gcc/tree-ssa-threadupdate.cc b/gcc/tree-ssa-threadupdate.cc
index fa61ba9512b7..c88cc1d6aac9 100644
--- a/gcc/tree-ssa-threadupdate.cc
+++ b/gcc/tree-ssa-threadupdate.cc
@@ -18,6 +18,7 @@ along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
 #include "config.h"
+#define INCLUDE_MEMORY
 #include "system.h"
 #include "coretypes.h"
 #include "backend.h"
diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
index 5b0d548f8479..fe7fdec4ba0d 100644
--- a/gcc/tree-vect-data-refs.cc
+++ b/gcc/tree-vect-data-refs.cc
@@ -20,6 +20,7 @@ along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
 #include "config.h"
+#define INCLUDE_MEMORY
 #include "system.h"
 #include "coretypes.h"
 #include "backend.h"
diff --git a/gcc/tree-vect-generic.cc b/gcc/tree-vect-generic.cc
index 4bcab71c1683..3041fb8fcf23 100644
--- a/gcc/tree-vect-generic.cc
+++ b/gcc/tree-vect-generic.cc
@@ -18,6 +18,7 @@ along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
 #include "config.h"
+#define INCLUDE_MEMORY
 #include "system.h"
 #include "coretypes.h"
 #include "backend.h"
diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
index 57dbcbe862cd..cb7843f6f72e 100644
--- a/gcc/tree-vect-loop-manip.cc
+++ b/gcc/tree-vect-loop-manip.cc
@@ -20,6 +20,7 @@ along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
 #include "config.h"
+#define INCLUDE_MEMORY
 #include "system.h"
 #include "coretypes.h"
 #include "backend.h"
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 6456220cdc9b..1fb7bbd4d258 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -20,6 +20,7 @@ along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
 #define INCLUDE_ALGORITHM
+#define INCLUDE_MEMORY
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index f52de2b6972d..bbb86fb4677d 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -19,6 +19,7 @@ along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
 #include "config.h"
+#define INCLUDE_MEMORY
 #include "system.h"
 #include "coretypes.h"
 #include "backend.h"
diff --git a/gcc/tree-vect-slp-patterns.cc b/gcc/tree-vect-slp-patterns.cc
index 4a582ec9512e..8adae8a6ec0d 100644
--- a/gcc/tree-vect-slp-patterns.cc
+++ b/gcc/tree-vect-slp-patterns.cc
@@ -18,6 +18,7 @@ along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
 #include "config.h"
+#define INCLUDE_MEMORY
 #include "system.h"
 #include "coretypes.h"
 #include "backend.h"
diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index 43ecd2689701..4e84dc778761 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -21,6 +21,7 @@ along with GCC; see the file COPYING3.  If not see
 
 #include "config.h"
 #define INCLUDE_ALGORITHM
+#define INCLUDE_MEMORY
 #include "system.h"
 #include "coretypes.h"
 #include "backend.h"
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 385e63163c24..6333d8e30ccd 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -20,6 +20,7 @@ along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
 #include "config.h"
+#define INCLUDE_MEMORY
 #include "system.h"
 #include "coretypes.h"
 #include "backend.h"
diff --git a/gcc/tree-vectorizer.cc b/gcc/tree-vectorizer.cc
index 1fb4fb36ed44..0efabcbb2580 100644
--- a/gcc/tree-vectorizer.cc
+++ b/gcc/tree-vectorizer.cc
@@ -55,6 +55,7 @@ along with GCC; see the file COPYING3.  If not see
 */
 
 #include "config.h"
+#define INCLUDE_MEMORY
 #include "system.h"
 #include "coretypes.h"
 #include "backend.h"

From patchwork Thu Aug 29 22:58:11 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Malcolm <dmalcolm@redhat.com>
X-Patchwork-Id: 1978639
Return-Path: <gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org>
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@legolas.ozlabs.org
Authentication-Results: legolas.ozlabs.org;
	dkim=pass (1024-bit key;
 unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256
 header.s=mimecast20190719 header.b=OeuYteb9;
	dkim-atps=neutral
Authentication-Results: legolas.ozlabs.org;
 spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org
 (client-ip=8.43.85.97; helo=server2.sourceware.org;
 envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org;
 receiver=patchwork.ozlabs.org)
Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384)
	(No client certificate requested)
	by legolas.ozlabs.org (Postfix) with ESMTPS id 4WvxY318Btz1yfn
	for <incoming@patchwork.ozlabs.org>; Fri, 30 Aug 2024 08:59:01 +1000 (AEST)
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id A0C33386180B
	for <incoming@patchwork.ozlabs.org>; Thu, 29 Aug 2024 22:58:57 +0000 (GMT)
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
Received: from us-smtp-delivery-124.mimecast.com
 (us-smtp-delivery-124.mimecast.com [170.10.129.124])
 by sourceware.org (Postfix) with ESMTP id A1AF4385EC1B
 for <gcc-patches@gcc.gnu.org>; Thu, 29 Aug 2024 22:58:24 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org A1AF4385EC1B
Authentication-Results: sourceware.org;
 dmarc=pass (p=none dis=none) header.from=redhat.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com
ARC-Filter: OpenARC Filter v1.0.0 sourceware.org A1AF4385EC1B
Authentication-Results: server2.sourceware.org;
 arc=none smtp.remote-ip=170.10.129.124
ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1724972306; cv=none;
 b=WhMdZpXLF7BgwPZiiQ1VKczpDR6nuRp0u7nBD4w0gE6wprsqeJAyHz8hxyNSzO+Im8yI2n6VBljXAJoVQXh+TCpeAHsPnWfUqgfeLHjnHN6xTlE0EiJI+UUA4j8MVGVC6rlc4dcvVgt6ZtnXilyN9u+Js3w1l/XiAQZ0b4os7HA=
ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key;
 t=1724972306; c=relaxed/simple;
 bh=mrgG9LEK3ANKjHfyeToA3nXvbjUL8YV6ksw1NWYaqH0=;
 h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version;
 b=jSJbsEbzjIQz8+EjsDHJp6JqufL53G2RSLcotQE66m4KziCXfaOFVQWIepBcaEMkberFUpuT1LD/QToKBR39YmW/2gtFdLOPE8a6+aUiIpcELN3hciU5xwVHuvNxBpXwL0Buqc612KY4HIJCd3jzy3glSqUcgDRed4nMwUAM6ME=
ARC-Authentication-Results: i=1; server2.sourceware.org
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;
 s=mimecast20190719; t=1724972304;
 h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
 to:to:cc:cc:mime-version:mime-version:content-type:content-type:
 content-transfer-encoding:content-transfer-encoding:
 in-reply-to:in-reply-to:references:references;
 bh=sd1T7QSHFBfQwY1E42//QfcagqS2xQEmKfcOuLoOaSo=;
 b=OeuYteb9uTi2c2EhxW6nh5T1l2djXgncY1f+CLxnmpIimmK4wgjAWt6nuSfddX4Ga8xEVI
 fX5jMndVEdU71MVXY27E0vc6CPBH6Tk0pXk+Tt7i0sinHhnLIRuSRYAYr20/EK7nnvUqCh
 gAnG0Nr3AAuMz1X6oO8JDnKJNei4iX8=
Received: from mx-prod-mc-02.mail-002.prod.us-west-2.aws.redhat.com
 (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by
 relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3,
 cipher=TLS_AES_256_GCM_SHA384) id us-mta-125-53hWaKg3P8qCNcvprf8vdw-1; Thu,
 29 Aug 2024 18:58:22 -0400
X-MC-Unique: 53hWaKg3P8qCNcvprf8vdw-1
Received: from mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com
 (mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.17])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest
 SHA256)
 (No client certificate requested)
 by mx-prod-mc-02.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS
 id 1AD361955D4F
 for <gcc-patches@gcc.gnu.org>; Thu, 29 Aug 2024 22:58:22 +0000 (UTC)
Received: from t14s.localdomain.com (unknown [10.22.16.43])
 by mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP
 id 2173719560A3; Thu, 29 Aug 2024 22:58:20 +0000 (UTC)
From: David Malcolm <dmalcolm@redhat.com>
To: gcc-patches@gcc.gnu.org
Cc: David Malcolm <dmalcolm@redhat.com>
Subject: [pushed 2/4] pretty-print: move class chunk_info into its own header
Date: Thu, 29 Aug 2024 18:58:11 -0400
Message-Id: <20240829225813.2567570-2-dmalcolm@redhat.com>
In-Reply-To: <20240829225813.2567570-1-dmalcolm@redhat.com>
References: <20240829225813.2567570-1-dmalcolm@redhat.com>
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 3.0 on 10.30.177.17
X-Mimecast-Spam-Score: 0
X-Mimecast-Originator: redhat.com
X-Spam-Status: No, score=-11.6 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH,
 DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0,
 KAM_SHORT,
 RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H3, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE,
 SPF_NONE, TXREP,
 T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.30
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org

No functional change intended.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r15-3310-g68a0ca66972a06.

gcc/cp/ChangeLog:
	* error.cc: Include "pretty-print-format-impl.h".

gcc/ChangeLog:
	* dumpfile.cc: Include "pretty-print-format-impl.h".
	* pretty-print-format-impl.h: New file, based on material from
	pretty-print.h.
	* pretty-print.cc: Include "pretty-print-format-impl.h".
	* pretty-print.h (chunk_info): Replace full declaration with
	a forward decl, moving full decl to pretty-print-format-impl.h.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
---
 gcc/cp/error.cc                |  1 +
 gcc/dumpfile.cc                |  1 +
 gcc/pretty-print-format-impl.h | 70 ++++++++++++++++++++++++++++++++++
 gcc/pretty-print.cc            |  1 +
 gcc/pretty-print.h             | 45 +---------------------
 5 files changed, 74 insertions(+), 44 deletions(-)
 create mode 100644 gcc/pretty-print-format-impl.h

diff --git a/gcc/cp/error.cc b/gcc/cp/error.cc
index 879e5a115cfe..3cc0dd1cdfa9 100644
--- a/gcc/cp/error.cc
+++ b/gcc/cp/error.cc
@@ -38,6 +38,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "c-family/c-type-mismatch.h"
 #include "cp-name-hint.h"
 #include "attribs.h"
+#include "pretty-print-format-impl.h"
 
 #define pp_separate_with_comma(PP) pp_cxx_separate_with (PP, ',')
 #define pp_separate_with_semicolon(PP) pp_cxx_separate_with (PP, ';')
diff --git a/gcc/dumpfile.cc b/gcc/dumpfile.cc
index 2971c69bb0a1..eb245059210a 100644
--- a/gcc/dumpfile.cc
+++ b/gcc/dumpfile.cc
@@ -42,6 +42,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "stringpool.h" /* for get_identifier.  */
 #include "spellcheck.h"
 #include "make-unique.h"
+#include "pretty-print-format-impl.h"
 
 /* If non-NULL, return one past-the-end of the matching SUBPART of
    the WHOLE string.  */
diff --git a/gcc/pretty-print-format-impl.h b/gcc/pretty-print-format-impl.h
new file mode 100644
index 000000000000..e05ad388963d
--- /dev/null
+++ b/gcc/pretty-print-format-impl.h
@@ -0,0 +1,70 @@
+/* Implementation detail of pp_format.
+   Copyright (C) 2002-2024 Free Software Foundation, Inc.
+   Contributed by Gabriel Dos Reis <gdr@integrable-solutions.net>
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef GCC_PRETTY_PRINT_FORMAT_IMPL_H
+#define GCC_PRETTY_PRINT_FORMAT_IMPL_H
+
+#include "pretty-print.h"
+
+/* The chunk_info data structure forms a stack of the results from the
+   first phase of formatting (pp_format) which have not yet been
+   output (pp_output_formatted_text).  A stack is necessary because
+   the diagnostic starter may decide to generate its own output by way
+   of the formatter.  */
+class chunk_info
+{
+  friend class pretty_printer;
+  friend class pp_markup::context;
+
+public:
+  const char * const *get_args () const { return m_args; }
+  quoting_info *get_quoting_info () const { return m_quotes; }
+
+  void append_formatted_chunk (const char *content);
+
+  void pop_from_output_buffer (output_buffer &buf);
+
+private:
+  void on_begin_quote (const output_buffer &buf,
+		       unsigned chunk_idx,
+		       const urlifier *urlifier);
+
+  void on_end_quote (pretty_printer *pp,
+		     output_buffer &buf,
+		     unsigned chunk_idx,
+		     const urlifier *urlifier);
+
+  /* Pointer to previous chunk on the stack.  */
+  chunk_info *m_prev;
+
+  /* Array of chunks to output.  Each chunk is a NUL-terminated string.
+     In the first phase of formatting, even-numbered chunks are
+     to be output verbatim, odd-numbered chunks are format specifiers.
+     The second phase replaces all odd-numbered chunks with formatted
+     text, and the third phase simply emits all the chunks in sequence
+     with appropriate line-wrapping.  */
+  const char *m_args[PP_NL_ARGMAX * 2];
+
+  /* If non-null, information on quoted text runs within the chunks
+     for use by a urlifier.  */
+  quoting_info *m_quotes;
+};
+
+#endif /* GCC_PRETTY_PRINT_FORMAT_IMPL_H */
diff --git a/gcc/pretty-print.cc b/gcc/pretty-print.cc
index 1d91da828212..810c629ef116 100644
--- a/gcc/pretty-print.cc
+++ b/gcc/pretty-print.cc
@@ -24,6 +24,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "coretypes.h"
 #include "intl.h"
 #include "pretty-print.h"
+#include "pretty-print-format-impl.h"
 #include "pretty-print-markup.h"
 #include "pretty-print-urlifier.h"
 #include "diagnostic-color.h"
diff --git a/gcc/pretty-print.h b/gcc/pretty-print.h
index d28814a84b89..ea81706b5d8a 100644
--- a/gcc/pretty-print.h
+++ b/gcc/pretty-print.h
@@ -69,6 +69,7 @@ enum diagnostic_prefixing_rule_t
   DIAGNOSTICS_SHOW_PREFIX_EVERY_LINE = 0x2
 };
 
+class chunk_info;
 class quoting_info;
 class output_buffer;
 class urlifier;
@@ -77,50 +78,6 @@ namespace pp_markup {
   class context;
 } // namespace pp_markup
 
-/* The chunk_info data structure forms a stack of the results from the
-   first phase of formatting (pp_format) which have not yet been
-   output (pp_output_formatted_text).  A stack is necessary because
-   the diagnostic starter may decide to generate its own output by way
-   of the formatter.  */
-class chunk_info
-{
-  friend class pretty_printer;
-  friend class pp_markup::context;
-
-public:
-  const char * const *get_args () const { return m_args; }
-  quoting_info *get_quoting_info () const { return m_quotes; }
-
-  void append_formatted_chunk (const char *content);
-
-  void pop_from_output_buffer (output_buffer &buf);
-
-private:
-  void on_begin_quote (const output_buffer &buf,
-		       unsigned chunk_idx,
-		       const urlifier *urlifier);
-
-  void on_end_quote (pretty_printer *pp,
-		     output_buffer &buf,
-		     unsigned chunk_idx,
-		     const urlifier *urlifier);
-
-  /* Pointer to previous chunk on the stack.  */
-  chunk_info *m_prev;
-
-  /* Array of chunks to output.  Each chunk is a NUL-terminated string.
-     In the first phase of formatting, even-numbered chunks are
-     to be output verbatim, odd-numbered chunks are format specifiers.
-     The second phase replaces all odd-numbered chunks with formatted
-     text, and the third phase simply emits all the chunks in sequence
-     with appropriate line-wrapping.  */
-  const char *m_args[PP_NL_ARGMAX * 2];
-
-  /* If non-null, information on quoted text runs within the chunks
-     for use by a urlifier.  */
-  quoting_info *m_quotes;
-};
-
 /* The output buffer datatype.  This is best seen as an abstract datatype
    whose fields should not be accessed directly by clients.  */
 class output_buffer

From patchwork Thu Aug 29 22:58:12 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Malcolm <dmalcolm@redhat.com>
X-Patchwork-Id: 1978642
Return-Path: <gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org>
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@legolas.ozlabs.org
Authentication-Results: legolas.ozlabs.org;
	dkim=pass (1024-bit key;
 unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256
 header.s=mimecast20190719 header.b=c96rUs0Y;
	dkim-atps=neutral
Authentication-Results: legolas.ozlabs.org;
 spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org
 (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org;
 envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org;
 receiver=patchwork.ozlabs.org)
Received: from server2.sourceware.org (server2.sourceware.org
 [IPv6:2620:52:3:1:0:246e:9693:128c])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384)
	(No client certificate requested)
	by legolas.ozlabs.org (Postfix) with ESMTPS id 4WvxZQ6nSJz1yZ9
	for <incoming@patchwork.ozlabs.org>; Fri, 30 Aug 2024 09:00:14 +1000 (AEST)
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id B9EFA386181E
	for <incoming@patchwork.ozlabs.org>; Thu, 29 Aug 2024 23:00:12 +0000 (GMT)
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
Received: from us-smtp-delivery-124.mimecast.com
 (us-smtp-delivery-124.mimecast.com [170.10.133.124])
 by sourceware.org (Postfix) with ESMTP id 03BA0385EC25
 for <gcc-patches@gcc.gnu.org>; Thu, 29 Aug 2024 22:58:27 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 03BA0385EC25
Authentication-Results: sourceware.org;
 dmarc=pass (p=none dis=none) header.from=redhat.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com
ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 03BA0385EC25
Authentication-Results: server2.sourceware.org;
 arc=none smtp.remote-ip=170.10.133.124
ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1724972315; cv=none;
 b=CkcO0lcxhptzLCUa+9Ti3SJHU7UXFuITLm/AVYYJ/gyn7mOko42JR4tGxmsMBMpKNfsVbGrf6rOiDAiolmOQIPzKZHE2s4heK+/qm3l3fQAXTkOojHxb/OiIs+aaVQhqbPhZM0j6/J7Lw1jEUmaChcEBhn33pN0/UYGhxoG8Wzs=
ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key;
 t=1724972315; c=relaxed/simple;
 bh=XFPrlND5/HDjG5Qpjn6VtB3PJTE8/rHIDRLWpWiL/Gc=;
 h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version;
 b=A99b24FDiqDE35WM6zG3OLj5fpayw+ewR+8nvUMH8Adlbc9zo1U+tDA5v+0hXfKbSfDw0tJuSy0eCAQyX+K+05WZBE8KmKxgt2cLWMyp50ROiAAnFAZ4vd/phFtIvhI+Tz2Uy05wp1O+ESwJbkOhUpD+N+NCkIZOWitNlnwgZHE=
ARC-Authentication-Results: i=1; server2.sourceware.org
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;
 s=mimecast20190719; t=1724972307;
 h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
 to:to:cc:cc:mime-version:mime-version:content-type:content-type:
 content-transfer-encoding:content-transfer-encoding:
 in-reply-to:in-reply-to:references:references;
 bh=0wXiJ85tkQClLDGBateHOQp38cZhYsbaMNJOg391XSM=;
 b=c96rUs0YYhFwhFM1xCCuqon7nWuVcZ7tEGWjXxDP+Urdv+I3IkVMgEcH08AIshWFuG0Xz1
 WIiDz9THsnxzgjyNMCDHiYKUZJkaBQhsaLOF24IMFc9cYMJam3qVHzga3H9Tn7wEUvLuDH
 eA+KVAbtT0/V1x8VDqnSo3/rKXBf71E=
Received: from mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com
 (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by
 relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3,
 cipher=TLS_AES_256_GCM_SHA384) id us-mta-136-qW5PORSSPtKQOxZ0zg6stA-1; Thu,
 29 Aug 2024 18:58:25 -0400
X-MC-Unique: qW5PORSSPtKQOxZ0zg6stA-1
Received: from mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com
 (mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.17])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest
 SHA256)
 (No client certificate requested)
 by mx-prod-mc-03.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS
 id BF21419560B1
 for <gcc-patches@gcc.gnu.org>; Thu, 29 Aug 2024 22:58:24 +0000 (UTC)
Received: from t14s.localdomain.com (unknown [10.22.16.43])
 by mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP
 id 2DFE219560A3; Thu, 29 Aug 2024 22:58:22 +0000 (UTC)
From: David Malcolm <dmalcolm@redhat.com>
To: gcc-patches@gcc.gnu.org
Cc: David Malcolm <dmalcolm@redhat.com>
Subject: [pushed 3/4] pretty-print: reimplement pp_format with a new struct
 pp_token
Date: Thu, 29 Aug 2024 18:58:12 -0400
Message-Id: <20240829225813.2567570-3-dmalcolm@redhat.com>
In-Reply-To: <20240829225813.2567570-1-dmalcolm@redhat.com>
References: <20240829225813.2567570-1-dmalcolm@redhat.com>
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 3.0 on 10.30.177.17
X-Mimecast-Spam-Score: 0
X-Mimecast-Originator: redhat.com
X-Spam-Status: No, score=-11.6 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH,
 DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0,
 KAM_SHORT,
 RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE,
 SPF_NONE, TXREP,
 T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.30
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org

The following patch rewrites the internals of pp_format.

A pretty_printer's output_buffer maintains a stack of chunk_info
instances, each one responsible for handling a call to pp_format, where
having a stack allows us to support re-entrant calls to pp_format on the
same pretty_printer.

Previously a chunk_info merely stored buffers of accumulated text
per unformatted run and per formatted argument.

This led to various special-casing for handling:

- urlifiers, needing class quoting_info to handle awkward cases where
  the run of quoted text could be split between stages 1 and 2
  of formatting

- dumpfiles, where the optinfo machinery could lead to objects being
  stashed during formatting for later replay to JSON optimization
  records

- in the C++ frontend, the format codes %H and %I can't be processed
  until we've seen both, leading to awkward code to manipulate the
  text buffers

Further, supporting URLs in messages in SARIF output (PR other/116419)
would add additional manipulations of text buffers, since our internal
pp_begin_url API gives the URL at the beginning of the wrapped text,
whereas SARIF's format for embedded URLs has the URL *after* the wrapped
text.  Also when handling "%@" we wouldn't necessarily know the URL of
an event ID until later, requiring further nasty special-case
manipulation of text buffers.

This patch rewrites pretty-print formatting by introducing a new
intermediate representation during formatting: pp_token and
pp_token_list.  Rather than simply accumulating a buffer of "char" in
the chunk_obstack during formatting, we now also accumulate a
pp_token_list, a doubly-linked list of pp_token, which can be:
- text buffers
- begin/end colorization
- begin/end quote
- begin/end URL
- "custom data" tokens

Working at the level of tokens rather than just text buffers allows the
various awkward special cases above to be replaced with uniform logic.
For example, all "urlification" is now done in phase 3 of formatting,
in one place, by looking for [..., BEGIN_QUOTE, TEXT, END_QUOTE, ...]
and injecting BEGIN_URL and END_URL wrapper tokens when the urlifier
has a URL for TEXT.  Doing so greatly simplifies the urlifier code,
allowing the removal of class quoting_info.
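
As an illustration only, the pattern match described above can be sketched
outside of GCC as follows.  This is not the code from the patch: "token",
"token_list" and "urlifier_fn" are hypothetical simplified stand-ins for
pp_token, pp_token_list and the urlifier class, and a std::list replaces
the obstack-allocated doubly-linked list.  Placing the URL tokens inside
the quotes is also an assumption of the sketch.

```cpp
#include <functional>
#include <iterator>
#include <list>
#include <string>

// Hypothetical, simplified stand-ins for pp_token / pp_token_list.
enum class kind { text, begin_quote, end_quote, begin_url, end_url };

struct token
{
  kind m_kind;
  std::string m_value; // text for kind::text, URL for kind::begin_url
};

using token_list = std::list<token>;

// Hypothetical urlifier: returns a URL for STR, or "" if it has none.
using urlifier_fn = std::function<std::string (const std::string &)>;

// Look for [..., BEGIN_QUOTE, TEXT, END_QUOTE, ...] and wrap TEXT in
// BEGIN_URL/END_URL tokens when the urlifier has a URL for it.
void
apply_urlifier (token_list &tokens, const urlifier_fn &get_url)
{
  for (auto iter = tokens.begin (); iter != tokens.end (); ++iter)
    {
      if (iter->m_kind != kind::begin_quote)
	continue;
      auto text = std::next (iter);
      if (text == tokens.end () || text->m_kind != kind::text)
	continue;
      auto end_quote = std::next (text);
      if (end_quote == tokens.end ()
	  || end_quote->m_kind != kind::end_quote)
	continue;
      const std::string url = get_url (text->m_value);
      if (url.empty ())
	continue;
      // std::list::insert inserts before the given position and leaves
      // existing iterators valid.
      tokens.insert (text, token {kind::begin_url, url});
      tokens.insert (end_quote, token {kind::end_url, ""});
      // Resume scanning after the END_QUOTE we just handled.
      iter = end_quote;
    }
}
```

With a urlifier that only knows about "-foo", the list
[TEXT, BEGIN_QUOTE, TEXT("-foo"), END_QUOTE] becomes
[TEXT, BEGIN_QUOTE, BEGIN_URL, TEXT("-foo"), END_URL, END_QUOTE]; quoted
runs that were split across several text tokens would need merging first.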

The tokens and token lists are allocated on the chunk_obstack, and so
there's no additional heap activity required, with the memory reclaimed
when the chunk_obstack is freed after phase 3 of formatting.

New kinds of pp_token can be added as needed to support output formats.
For example, the followup patch adds a token for "%@" for event IDs, to
better support SARIF output.
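
To illustrate why a token stream suits output formats that place the URL
after the wrapped text, here is a minimal standalone sketch.  It is not
taken from this patch or from the SARIF sink: "token" and
"render_url_after_text" are hypothetical names, the "[text](url)" shape
is assumed from SARIF's embedded-link syntax, and escaping of special
characters is deliberately omitted.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for pp_token.
enum class kind { text, begin_url, end_url };

struct token
{
  kind m_kind;
  std::string m_value; // text for kind::text, URL for kind::begin_url
};

// Render TOKENS so that each URL follows the text it wraps, as
// "[text](url)".  The URL arrives on the BEGIN_URL token but is only
// written out once END_URL is seen.
std::string
render_url_after_text (const std::vector<token> &tokens)
{
  std::string result;
  std::string pending_url;
  bool in_url = false;
  for (const token &tok : tokens)
    switch (tok.m_kind)
      {
      case kind::text:
	result += tok.m_value;
	break;
      case kind::begin_url:
	assert (!in_url); // nested URLs are not handled by this sketch
	in_url = true;
	pending_url = tok.m_value;
	result += "[";
	break;
      case kind::end_url:
	assert (in_url); // every END_URL needs a matching BEGIN_URL
	in_url = false;
	result += "](" + pending_url + ")";
	break;
      }
  assert (!in_url);
  return result;
}
```

Because the URL is carried on a token rather than already written into a
text buffer, the renderer can defer it without any buffer surgery.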

No functional change intended.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Lightly tested with valgrind.
Pushed to trunk as r15-3311-ge31b6176996567.

gcc/c/ChangeLog:
	* c-objc-common.cc (c_tree_printer): Convert final param from
	const char ** to pp_token_list &.

gcc/cp/ChangeLog:
	* error.cc: Include "make-unique.h".
	(deferred_printed_type::m_buffer_ptr): Replace with...
	(deferred_printed_type::m_printed_text): ...this and...
	(deferred_printed_type::m_token_list): ...this.
	(deferred_printed_type::deferred_printed_type): Update ctors for
	above changes.
	(deferred_printed_type::set_text_for_token_list): New.
	(append_formatted_chunk): Pass chunk_obstack to
	append_formatted_chunk.
	(add_quotes): Delete.
	(cxx_format_postprocessor::handle): Reimplement to call
	deferred_printed_type::set_text_for_token_list, rather than store
	buffer pointers.
	(defer_phase_2_of_type_diff): Replace param "buffer_ptr"
	with "formatted_token_list".  Reimplement by storing
	a pointer to formatted_token_list so that the postprocessor can
	put its text there.
	(cp_printer): Convert param "buffer_ptr" to
	"formatted_token_list".  Update calls to
	defer_phase_2_of_type_diff accordingly.

gcc/ChangeLog:
	* diagnostic.cc (diagnostic_context::report_diagnostic): Don't
	pass m_urlifier to pp_format, as urlification now happens in
	phase 3.
	* dump-context.h (class dump_pretty_printer): Update leading
	comment.
	(dump_pretty_printer::emit_items): Drop decl.
	(dump_pretty_printer::set_optinfo): New.
	(class dump_pretty_printer::stashed_item): Delete class.
	(class dump_pretty_printer::custom_token_printer): New class.
	(dump_pretty_printer::format_decoder_cb): Convert param from
	const char ** to pp_token_list &.
	(dump_pretty_printer::decode_format): Likewise.
	(dump_pretty_printer::stash_item): Likewise.
	(dump_pretty_printer::emit_any_pending_textual_chunks): Drop decl.
	(dump_pretty_printer::m_stashed_items): Delete field.
	(dump_pretty_printer::m_token_printer): New member data.
	* dumpfile.cc (struct wrapped_optinfo_item): New.
	(dump_pretty_printer::dump_pretty_printer): Update for dropping
	of field m_stashed_items and new field m_token_printer.
	(dump_pretty_printer::emit_items): Delete; we now use
	pp_output_formatted_text.
	(dump_pretty_printer::emit_any_pending_textual_chunks): Delete.
	(dump_pretty_printer::stash_item): Convert param from
	const char ** to pp_token_list &.
	(dump_pretty_printer::format_decoder_cb): Likewise.
	(dump_pretty_printer::decode_format): Likewise.
	(dump_pretty_printer::custom_token_printer::print_tokens): New.
	(dump_pretty_printer::custom_token_printer::emit_any_pending_textual_chunks):
	New.
	(dump_context::dump_printf_va): Call set_optinfo on the
	dump_pretty_printer.  Replace call to emit_items with a call to
	pp_output_formatted_text.
	* opt-problem.cc (opt_problem::opt_problem): Replace call to
	emit_items with call to set_optinfo and call to
	pp_output_formatted_text.
	* pretty-print-format-impl.h (struct pp_token): New.
	(struct pp_token_text): New.
	(is_a_helper <pp_token_text *>::test): New.
	(is_a_helper <const pp_token_text *>::test): New.
	(struct pp_token_begin_color): New.
	(is_a_helper <pp_token_begin_color *>::test): New.
	(is_a_helper <const pp_token_begin_color *>::test): New.
	(struct pp_token_end_color): New.
	(struct pp_token_begin_quote): New.
	(struct pp_token_end_quote): New.
	(struct pp_token_begin_url): New.
	(is_a_helper <pp_token_begin_url*>::test): New.
	(is_a_helper <const pp_token_begin_url*>::test): New.
	(struct pp_token_end_url): New.
	(struct pp_token_custom_data): New.
	(is_a_helper <pp_token_custom_data *>::test): New.
	(is_a_helper <const pp_token_custom_data *>::test): New.
	(class pp_token_list): New.
	(chunk_info::get_args): Drop.
	(chunk_info::get_quoting_info): Drop.
	(chunk_info::get_token_lists): New accessor.
	(chunk_info::append_formatted_chunk): Add obstack & param.
	(chunk_info::dump): New decls.
	(chunk_info::m_args): Convert element type from const char * to
	pp_token_list *.  Rewrite/update comment.
	(chunk_info::m_quotes): Drop field.
	* pretty-print-markup.h (class pp_token_list): New forward decl.
	(pp_markup::context::context): Drop urlifier param; add
	formatted_token_list param.
	(pp_markup::context::push_back_any_text): New decl.
	(pp_markup::context::m_urlifier): Drop field.
	(pp_markup::context::m_formatted_token_list): New field.
	* pretty-print-urlifier.h: Update comment.
	* pretty-print.cc: Define INCLUDE_MEMORY.  Include
	"make-unique.h".
	(default_token_printer): New forward decl.
	(obstack_append_string): Delete.
	(urlify_quoted_string): Delete.
	(pp_token::pp_token): New.
	(pp_token::dump): New.
	(allocate_object): New.
	(class quoting_info): Delete.
	(pp_token::operator new): New.
	(pp_token::operator delete): New.
	(pp_token_list::operator new): New.
	(pp_token_list::operator delete): New.
	(pp_token_list::pp_token_list): New.
	(pp_token_list::~pp_token_list): New.
	(pp_token_list::push_back_text): New.
	(pp_token_list::push_back): New.
	(pp_token_list::push_back_list): New.
	(pp_token_list::pop_front): New.
	(pp_token_list::remove_token): New.
	(pp_token_list::insert_after): New.
	(pp_token_list::replace_custom_tokens): New.
	(pp_token_list::merge_consecutive_text_tokens): New.
	(pp_token_list::apply_urlifier): New.
	(pp_token_list::dump): New.
	(chunk_info::append_formatted_chunk): Add obstack & param and use
	it to reimplement in terms of token lists.
	(chunk_info::pop_from_output_buffer): Drop m_quotes.
	(chunk_info::on_begin_quote): Delete.
	(chunk_info::dump): New.
	(chunk_info::on_end_quote): Delete.
	(push_back_any_text): New.
	(pretty_printer::format): Drop "urlifier" param and quoting_info
	logic.  Convert "formatters" and "args" from const char ** to
	pp_token_list **.  Reimplement so that rather than just
	accumulating a text buffer in the chunk_obstack for each arg,
	instead also accumulate a pp_token_list and pp_tokens for each
	arg.
	(auto_obstack::operator obstack &): New.
	(quoting_info::handle_phase_3): Delete.
	(pp_output_formatted_text): Reimplement in terms of manipulations
	of pp_token_lists, rather than char buffers.  Call
	default_token_printer, or m_token_printer's print_tokens vfunc.
	(default_token_printer): New.
	(pretty_printer::pretty_printer): Initialize m_token_printer in
	both ctors.
	(pp_markup::context::begin_quote): Reimplement to use token list.
	(pp_markup::context::end_quote): Likewise.
	(pp_markup::context::begin_highlight_color): Likewise.
	(pp_markup::context::end_highlight_color): Likewise.
	(pp_markup::context::push_back_any_text): New.
	(selftest::test_merge_consecutive_text_tokens): New.
	(selftest::test_custom_tokens_1): New.
	(selftest::test_custom_tokens_2): New.
	(selftest::pp_printf_with_urlifier): Drop "urlifier" param from
	call to pp_format.
	(selftest::test_urlification): Add test of the example from
	pretty-print-format-impl.h.
	(selftest::pretty_print_cc_tests): Call the new selftest
	functions.
	* pretty-print.h (class quoting_info): Drop forward decl.
	(class pp_token_list): New forward decl.
	(printer_fn): Convert final param from const char ** to
	pp_token_list &.
	(class token_printer): New.
	(class pretty_printer): Add pp_output_formatted_text as friend.
	(pretty_printer::set_token_printer): New.
	(pretty_printer::format): Drop urlifier param as this now happens
	in phase 3.
	(pretty_printer::m_format_decoder): Update comment.
	(pretty_printer::m_token_printer): New field.
	(pp_format): Drop urlifier param.
	* tree-diagnostic.cc (default_tree_printer): Convert final param
	from const char ** to pp_token_list &.
	* tree-diagnostic.h: Likewise for decl.

gcc/fortran/ChangeLog:
	* error.cc (gfc_format_decoder): Convert final param from
	const char **buffer_ptr to pp_token_list &formatted_token_list,
	and update call to default_tree_printer accordingly.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
---
 gcc/c/c-objc-common.cc         |    4 +-
 gcc/cp/error.cc                |  105 +--
 gcc/diagnostic.cc              |    2 +-
 gcc/dump-context.h             |   40 +-
 gcc/dumpfile.cc                |  217 +++---
 gcc/fortran/error.cc           |    5 +-
 gcc/opt-problem.cc             |    3 +-
 gcc/pretty-print-format-impl.h |  407 ++++++++++-
 gcc/pretty-print-markup.h      |   10 +-
 gcc/pretty-print-urlifier.h    |    2 +-
 gcc/pretty-print.cc            | 1176 ++++++++++++++++++++++----------
 gcc/pretty-print.h             |   43 +-
 gcc/tree-diagnostic.cc         |    2 +-
 gcc/tree-diagnostic.h          |    2 +-
 14 files changed, 1483 insertions(+), 535 deletions(-)

diff --git a/gcc/c/c-objc-common.cc b/gcc/c/c-objc-common.cc
index fde9ae6ad667..9d39fcd4e442 100644
--- a/gcc/c/c-objc-common.cc
+++ b/gcc/c/c-objc-common.cc
@@ -34,7 +34,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "dwarf2.h"
 
 static bool c_tree_printer (pretty_printer *, text_info *, const char *,
-			    int, bool, bool, bool, bool *, const char **);
+			    int, bool, bool, bool, bool *, pp_token_list &);
 
 /* Info for C language features which can be queried through
    __has_{feature,extension}.  */
@@ -318,7 +318,7 @@ pp_markup::element_quoted_type::print_type (pp_markup::context &ctxt)
 static bool
 c_tree_printer (pretty_printer *pp, text_info *text, const char *spec,
 		int precision, bool wide, bool set_locus, bool hash,
-		bool *quoted, const char **)
+		bool *quoted, pp_token_list &)
 {
   tree t = NULL_TREE;
   // FIXME: the next cast should be a dynamic_cast, when it is permitted.
diff --git a/gcc/cp/error.cc b/gcc/cp/error.cc
index 3cc0dd1cdfa9..420fad26b7b7 100644
--- a/gcc/cp/error.cc
+++ b/gcc/cp/error.cc
@@ -39,6 +39,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "cp-name-hint.h"
 #include "attribs.h"
 #include "pretty-print-format-impl.h"
+#include "make-unique.h"
 
 #define pp_separate_with_comma(PP) pp_cxx_separate_with (PP, ',')
 #define pp_separate_with_semicolon(PP) pp_cxx_separate_with (PP, ';')
@@ -110,7 +111,7 @@ static void cp_print_error_function (diagnostic_context *,
 				     const diagnostic_info *);
 
 static bool cp_printer (pretty_printer *, text_info *, const char *,
-			int, bool, bool, bool, bool *, const char **);
+			int, bool, bool, bool, bool *, pp_token_list &);
 
 /* Color names for highlighting "%qH" vs "%qI" values,
    and ranges corresponding to them.  */
@@ -124,22 +125,50 @@ class deferred_printed_type
 {
 public:
   deferred_printed_type ()
-  : m_tree (NULL_TREE), m_buffer_ptr (NULL), m_verbose (false), m_quote (false)
+  : m_tree (NULL_TREE),
+    m_printed_text (),
+    m_token_list (nullptr),
+    m_verbose (false), m_quote (false)
   {}
 
-  deferred_printed_type (tree type, const char **buffer_ptr, bool verbose,
+  deferred_printed_type (tree type,
+			 pp_token_list &token_list,
+			 bool verbose,
 			 bool quote)
-  : m_tree (type), m_buffer_ptr (buffer_ptr), m_verbose (verbose),
+  : m_tree (type),
+    m_printed_text (),
+    m_token_list (&token_list),
+    m_verbose (verbose),
     m_quote (quote)
   {
     gcc_assert (type);
-    gcc_assert (buffer_ptr);
+  }
+
+  void set_text_for_token_list (const char *text, bool quote)
+  {
+    /* Replace the contents of m_token_list with a text token for TEXT,
+       possibly wrapped by BEGIN_QUOTE/END_QUOTE (if QUOTE is true).
+       This allows us to ignore any {BEGIN,END}_QUOTE tokens added
+       by %qH and %qI, and instead use the quoting from type_to_string,
+       and its logic for "aka".  */
+    while (m_token_list->m_first)
+      m_token_list->pop_front ();
+
+    if (quote)
+      m_token_list->push_back<pp_token_begin_quote> ();
+
+    // TEXT is gc-allocated, so we can borrow it
+    m_token_list->push_back_text (label_text::borrow (text));
+
+    if (quote)
+      m_token_list->push_back<pp_token_end_quote> ();
   }
 
   /* The tree is not GTY-marked: they are only non-NULL within a
      call to pp_format.  */
   tree m_tree;
-  const char **m_buffer_ptr;
+  label_text m_printed_text;
+  pp_token_list *m_token_list;
   bool m_verbose;
   bool m_quote;
 };
@@ -4402,26 +4431,7 @@ append_formatted_chunk (pretty_printer *pp, const char *content)
 {
   output_buffer *buffer = pp_buffer (pp);
   chunk_info *chunk_array = buffer->cur_chunk_array;
-  chunk_array->append_formatted_chunk (content);
-}
-
-/* Create a copy of CONTENT, with quotes added, and,
-   potentially, with colorization.
-   No escaped is performed on CONTENT.
-   The result is in a GC-allocated buffer. */
-
-static const char *
-add_quotes (const char *content, bool show_color)
-{
-  pretty_printer tmp_pp;
-  pp_show_color (&tmp_pp) = show_color;
-
-  /* We have to use "%<%s%>" rather than "%qs" here in order to avoid
-     quoting colorization bytes within the results and using either
-     pp_quote or pp_begin_quote doesn't work the same.  */
-  pp_printf (&tmp_pp, "%<%s%>", content);
-
-  return pp_ggc_formatted_text (&tmp_pp);
+  chunk_array->append_formatted_chunk (buffer->chunk_obstack, content);
 }
 
 #if __GNUC__ >= 10
@@ -4429,8 +4439,8 @@ add_quotes (const char *content, bool show_color)
 #endif
 
 /* If we had %H and %I, and hence deferred printing them,
-   print them now, storing the result into the chunk_info
-   for pp_format.  Quote them if 'q' was provided.
+   print them now, storing the result into custom_token_value
+   for the custom pp_token.  Quote them if 'q' was provided.
    Also print the difference in tree form, adding it as
    an additional chunk.  */
 
@@ -4448,13 +4458,13 @@ cxx_format_postprocessor::handle (pretty_printer *pp)
 	= show_highlight_colors ? highlight_colors::percent_i : nullptr;
       /* Avoid reentrancy issues by working with a copy of
 	 m_type_a and m_type_b, resetting them now.  */
-      deferred_printed_type type_a = m_type_a;
-      deferred_printed_type type_b = m_type_b;
+      deferred_printed_type type_a = std::move (m_type_a);
+      deferred_printed_type type_b = std::move (m_type_b);
       m_type_a = deferred_printed_type ();
       m_type_b = deferred_printed_type ();
 
-      gcc_assert (type_a.m_buffer_ptr);
-      gcc_assert (type_b.m_buffer_ptr);
+      gcc_assert (type_a.m_token_list);
+      gcc_assert (type_b.m_token_list);
 
       bool show_color = pp_show_color (pp);
 
@@ -4495,13 +4505,8 @@ cxx_format_postprocessor::handle (pretty_printer *pp)
 					percent_i);
 	}
 
-      if (type_a.m_quote)
-	type_a_text = add_quotes (type_a_text, show_color);
-      *type_a.m_buffer_ptr = type_a_text;
-
-       if (type_b.m_quote)
-	type_b_text = add_quotes (type_b_text, show_color);
-      *type_b.m_buffer_ptr = type_b_text;
+      type_a.set_text_for_token_list (type_a_text, type_a.m_quote);
+      type_b.set_text_for_token_list (type_b_text, type_b.m_quote);
    }
 }
 
@@ -4526,9 +4531,12 @@ cxx_format_postprocessor::handle (pretty_printer *pp)
    pretty_printer's m_format_postprocessor hook.
 
    This is called in phase 2 of pp_format, when it is accumulating
-   a series of formatted chunks.  We stash the location of the chunk
-   we're meant to have written to, so that we can write to it in the
-   m_format_postprocessor hook.
+   a series of pp_token lists.  Since we have to interact with the
+   fiddly quoting logic for "aka", we store the pp_token_list *
+   and in the m_format_postprocessor hook we generate text for the type
+   (possibly with quotes and colors), then replace all tokens in that token list
+   (such as [BEGIN_QUOTE, END_QUOTE]) with a text token containing the
+   freshly generated text.
 
    We also need to stash whether a 'q' prefix was provided (the QUOTE
    param)  so that we can add the quotes when writing out the delayed
@@ -4536,12 +4544,13 @@ cxx_format_postprocessor::handle (pretty_printer *pp)
 
 static void
 defer_phase_2_of_type_diff (deferred_printed_type *deferred,
-			    tree type, const char **buffer_ptr,
+			    tree type,
+			    pp_token_list &formatted_token_list,
 			    bool verbose, bool quote)
 {
   gcc_assert (deferred->m_tree == NULL_TREE);
-  gcc_assert (deferred->m_buffer_ptr == NULL);
-  *deferred = deferred_printed_type (type, buffer_ptr, verbose, quote);
+  *deferred = deferred_printed_type (type, formatted_token_list,
+				     verbose, quote);
 }
 
 /* Implementation of pp_markup::element_quoted_type::print_type
@@ -4578,7 +4587,7 @@ pp_markup::element_quoted_type::print_type (pp_markup::context &ctxt)
 static bool
 cp_printer (pretty_printer *pp, text_info *text, const char *spec,
 	    int precision, bool wide, bool set_locus, bool verbose,
-	    bool *quoted, const char **buffer_ptr)
+	    bool *quoted, pp_token_list &formatted_token_list)
 {
   gcc_assert (pp_format_postprocessor (pp));
   cxx_format_postprocessor *postprocessor
@@ -4618,11 +4627,11 @@ cp_printer (pretty_printer *pp, text_info *text, const char *spec,
     case 'F': result = fndecl_to_string (next_tree, verbose);	break;
     case 'H':
       defer_phase_2_of_type_diff (&postprocessor->m_type_a, next_tree,
-				  buffer_ptr, verbose, *quoted);
+				  formatted_token_list, verbose, *quoted);
       return true;
     case 'I':
       defer_phase_2_of_type_diff (&postprocessor->m_type_b, next_tree,
-				  buffer_ptr, verbose, *quoted);
+				  formatted_token_list, verbose, *quoted);
       return true;
     case 'L': result = language_to_string (next_lang);		break;
     case 'O': result = op_to_string (false, next_tcode);	break;
diff --git a/gcc/diagnostic.cc b/gcc/diagnostic.cc
index 381a050ab4c9..a80e16b542df 100644
--- a/gcc/diagnostic.cc
+++ b/gcc/diagnostic.cc
@@ -1407,7 +1407,7 @@ diagnostic_context::report_diagnostic (diagnostic_info *diagnostic)
     m_output_format->on_begin_group ();
   m_diagnostic_groups.m_emission_count++;
 
-  pp_format (this->printer, &diagnostic->message, m_urlifier);
+  pp_format (this->printer, &diagnostic->message);
   /* Call vfunc in the output format.  This is responsible for
      phase 3 of formatting, and for printing the result.  */
   m_output_format->on_report_diagnostic (*diagnostic, orig_diag_kind);
diff --git a/gcc/dump-context.h b/gcc/dump-context.h
index 5992956380b1..e90c4ee1d6ae 100644
--- a/gcc/dump-context.h
+++ b/gcc/dump-context.h
@@ -154,48 +154,52 @@ class dump_context
 };
 
 /* A subclass of pretty_printer for implementing dump_context::dump_printf_va.
-   In particular, the formatted chunks are captured as optinfo_item instances,
-   thus retaining metadata about the entities being dumped (e.g. source
-   locations), rather than just as plain text.  */
+   In particular, the formatted chunks are captured as optinfo_item instances
+   as pp_token_custom_data, thus retaining metadata about the entities being
+   dumped (e.g. source locations), rather than just as plain text.
+   These custom items are retained through to the end of stage 3 of formatted
+   printing; the printer uses a custom token_printer subclass to emit them to
+   the active optinfo (if any).  */
 
 class dump_pretty_printer : public pretty_printer
 {
 public:
   dump_pretty_printer (dump_context *context, dump_flags_t dump_kind);
 
-  void emit_items (optinfo *dest);
+  void set_optinfo (optinfo *info) { m_token_printer.m_optinfo = info; }
 
 private:
-  /* Information on an optinfo_item that was generated during phase 2 of
-     formatting.  */
-  class stashed_item
+  struct custom_token_printer : public token_printer
   {
-  public:
-    stashed_item (const char **buffer_ptr_, optinfo_item *item_)
-      : buffer_ptr (buffer_ptr_), item (item_) {}
-    const char **buffer_ptr;
-    optinfo_item *item;
+    custom_token_printer (dump_pretty_printer &dump_pp)
+    : m_dump_pp (dump_pp),
+      m_optinfo (nullptr)
+    {}
+    void print_tokens (pretty_printer *pp,
+		       const pp_token_list &tokens) final override;
+    void emit_any_pending_textual_chunks ();
+
+    dump_pretty_printer &m_dump_pp;
+    optinfo *m_optinfo;
   };
 
   static bool format_decoder_cb (pretty_printer *pp, text_info *text,
 				 const char *spec, int /*precision*/,
 				 bool /*wide*/, bool /*set_locus*/,
 				 bool /*verbose*/, bool */*quoted*/,
-				 const char **buffer_ptr);
+				 pp_token_list &formatted_tok_list);
 
   bool decode_format (text_info *text, const char *spec,
-		      const char **buffer_ptr);
+		      pp_token_list &formatted_tok_list);
 
-  void stash_item (const char **buffer_ptr,
+  void stash_item (pp_token_list &formatted_tok_list,
 		   std::unique_ptr<optinfo_item> item);
 
-  void emit_any_pending_textual_chunks (optinfo *dest);
-
   void emit_item (std::unique_ptr<optinfo_item> item, optinfo *dest);
 
   dump_context *m_context;
   dump_flags_t m_dump_kind;
-  auto_vec<stashed_item> m_stashed_items;
+  custom_token_printer m_token_printer;
 };
 
 /* An RAII-style class for use in debug dumpers for temporarily using a
diff --git a/gcc/dumpfile.cc b/gcc/dumpfile.cc
index eb245059210a..da3671829a21 100644
--- a/gcc/dumpfile.cc
+++ b/gcc/dumpfile.cc
@@ -791,90 +791,39 @@ make_item_for_dump_symtab_node (symtab_node *node)
   return item;
 }
 
-/* dump_pretty_printer's ctor.  */
-
-dump_pretty_printer::dump_pretty_printer (dump_context *context,
-					  dump_flags_t dump_kind)
-: pretty_printer (), m_context (context), m_dump_kind (dump_kind),
-  m_stashed_items ()
+struct wrapped_optinfo_item : public pp_token_custom_data::value
 {
-  pp_format_decoder (this) = format_decoder_cb;
-}
-
-/* Phase 3 of formatting; compare with pp_output_formatted_text.
-
-   Emit optinfo_item instances for the various formatted chunks from phases
-   1 and 2 (i.e. pp_format).
-
-   Some chunks may already have had their items built (during decode_format).
-   These chunks have been stashed into m_stashed_items; we emit them here.
-
-   For all other purely textual chunks, they are printed into
-   buffer->formatted_obstack, and then emitted as a textual optinfo_item.
-   This consolidates multiple adjacent text chunks into a single text
-   optinfo_item.  */
-
-void
-dump_pretty_printer::emit_items (optinfo *dest)
-{
-  output_buffer *buffer = pp_buffer (this);
-  chunk_info *chunk_array = buffer->cur_chunk_array;
-  const char * const *args = chunk_array->get_args ();
-
-  gcc_assert (buffer->obstack == &buffer->formatted_obstack);
-  gcc_assert (buffer->line_length == 0);
-
-  unsigned stashed_item_idx = 0;
-  for (unsigned chunk = 0; args[chunk]; chunk++)
-    {
-      if (stashed_item_idx < m_stashed_items.length ()
-	  && args[chunk] == *m_stashed_items[stashed_item_idx].buffer_ptr)
-	{
-	  emit_any_pending_textual_chunks (dest);
-	  /* This chunk has a stashed item: use it.  */
-	  std::unique_ptr <optinfo_item> item
-	    (m_stashed_items[stashed_item_idx++].item);
-	  emit_item (std::move (item), dest);
-	}
-      else
-	/* This chunk is purely textual.  Print it (to
-	   buffer->formatted_obstack), so that we can consolidate adjacent
-	   chunks into one textual optinfo_item.  */
-	pp_string (this, args[chunk]);
-    }
+  wrapped_optinfo_item (std::unique_ptr<optinfo_item> item)
+  : m_optinfo_item (std::move (item))
+  {
+    gcc_assert (m_optinfo_item.get ());
+  }
 
-  emit_any_pending_textual_chunks (dest);
+  void dump (FILE *out) const final override
+  {
+    fprintf (out, "OPTINFO(\"%s\")", m_optinfo_item->get_text ());
+  }
 
-  /* Ensure that we consumed all of stashed_items.  */
-  gcc_assert (stashed_item_idx == m_stashed_items.length ());
+  bool as_standard_tokens (pp_token_list &) final override
+  {
+    /* Keep as a custom token.  */
+    return false;
+  }
 
-  chunk_array->pop_from_output_buffer (*buffer);
-}
+  std::unique_ptr<optinfo_item> m_optinfo_item;
+};
 
-/* Subroutine of dump_pretty_printer::emit_items
-   for consolidating multiple adjacent pure-text chunks into single
-   optinfo_items (in phase 3).  */
+/* dump_pretty_printer's ctor.  */
 
-void
-dump_pretty_printer::emit_any_pending_textual_chunks (optinfo *dest)
+dump_pretty_printer::dump_pretty_printer (dump_context *context,
+					  dump_flags_t dump_kind)
+: pretty_printer (),
+  m_context (context),
+  m_dump_kind (dump_kind),
+  m_token_printer (*this)
 {
-  output_buffer *const buffer = pp_buffer (this);
-  gcc_assert (buffer->obstack == &buffer->formatted_obstack);
-
-  /* Don't emit an item if the pending text is empty.  */
-  if (output_buffer_last_position_in_text (buffer) == NULL)
-    return;
-
-  char *formatted_text = xstrdup (pp_formatted_text (this));
-  std::unique_ptr<optinfo_item> item
-    = make_unique<optinfo_item> (OPTINFO_ITEM_KIND_TEXT, UNKNOWN_LOCATION,
-				 formatted_text);
-  emit_item (std::move (item), dest);
-
-  /* Clear the pending text by unwinding formatted_text back to the start
-     of the buffer (without deallocating).  */
-  obstack_free (&buffer->formatted_obstack,
-		buffer->formatted_obstack.object_base);
+  pp_format_decoder (this) = format_decoder_cb;
+  set_token_printer (&m_token_printer);
 }
 
 /* Emit ITEM and take ownership of it.  If DEST is non-NULL, add ITEM
@@ -889,17 +838,18 @@ dump_pretty_printer::emit_item (std::unique_ptr<optinfo_item> item,
     dest->add_item (std::move (item));
 }
 
-/* Record that ITEM (generated in phase 2 of formatting) is to be used for
-   the chunk at BUFFER_PTR in phase 3 (by emit_items).  */
+/* Append a custom pp_token for ITEM (generated in phase 2 of formatting)
+   into FORMATTED_TOK_LIST, so that it can be emitted in phase 3.  */
 
 void
-dump_pretty_printer::stash_item (const char **buffer_ptr,
+dump_pretty_printer::stash_item (pp_token_list &formatted_tok_list,
 				 std::unique_ptr<optinfo_item> item)
 {
-  gcc_assert (buffer_ptr);
   gcc_assert (item.get ());
 
-  m_stashed_items.safe_push (stashed_item (buffer_ptr, item.release ()));
+  auto custom_data
+    = ::make_unique<wrapped_optinfo_item> (std::move (item));
+  formatted_tok_list.push_back<pp_token_custom_data> (std::move (custom_data));
 }
 
 /* pp_format_decoder callback for dump_pretty_printer, and thus for
@@ -912,10 +862,10 @@ dump_pretty_printer::format_decoder_cb (pretty_printer *pp, text_info *text,
 					const char *spec, int /*precision*/,
 					bool /*wide*/, bool /*set_locus*/,
 					bool /*verbose*/, bool */*quoted*/,
-					const char **buffer_ptr)
+					pp_token_list &formatted_tok_list)
 {
   dump_pretty_printer *opp = static_cast <dump_pretty_printer *> (pp);
-  return opp->decode_format (text, spec, buffer_ptr);
+  return opp->decode_format (text, spec, formatted_tok_list);
 }
 
 /* Format decoder for dump_pretty_printer, and thus for dump_printf and
@@ -942,7 +892,7 @@ dump_pretty_printer::format_decoder_cb (pretty_printer *pp, text_info *text,
 
 bool
 dump_pretty_printer::decode_format (text_info *text, const char *spec,
-				       const char **buffer_ptr)
+				    pp_token_list &formatted_tok_list)
 {
   /* Various format codes that imply making an optinfo_item and stashed it
      for later use (to capture metadata, rather than plain text).  */
@@ -954,7 +904,7 @@ dump_pretty_printer::decode_format (text_info *text, const char *spec,
 
 	/* Make an item for the node, and stash it.  */
 	auto item = make_item_for_dump_symtab_node (node);
-	stash_item (buffer_ptr, std::move (item));
+	stash_item (formatted_tok_list, std::move (item));
 	return true;
       }
 
@@ -964,7 +914,7 @@ dump_pretty_printer::decode_format (text_info *text, const char *spec,
 
 	/* Make an item for the stmt, and stash it.  */
 	auto item = make_item_for_dump_gimple_expr (stmt, 0, TDF_SLIM);
-	stash_item (buffer_ptr, std::move (item));
+	stash_item (formatted_tok_list, std::move (item));
 	return true;
       }
 
@@ -974,7 +924,7 @@ dump_pretty_printer::decode_format (text_info *text, const char *spec,
 
 	/* Make an item for the stmt, and stash it.  */
 	auto item = make_item_for_dump_gimple_stmt (stmt, 0, TDF_SLIM);
-	stash_item (buffer_ptr, std::move (item));
+	stash_item (formatted_tok_list, std::move (item));
 	return true;
       }
 
@@ -984,7 +934,7 @@ dump_pretty_printer::decode_format (text_info *text, const char *spec,
 
 	/* Make an item for the tree, and stash it.  */
 	auto item = make_item_for_dump_generic_expr (t, TDF_SLIM);
-	stash_item (buffer_ptr, std::move (item));
+	stash_item (formatted_tok_list, std::move (item));
 	return true;
       }
 
@@ -993,6 +943,87 @@ dump_pretty_printer::decode_format (text_info *text, const char *spec,
     }
 }
 
+void
+dump_pretty_printer::custom_token_printer::
+print_tokens (pretty_printer *pp,
+	      const pp_token_list &tokens)
+{
+  /* Accumulate text whilst emitting items.  */
+  for (auto iter = tokens.m_first; iter; iter = iter->m_next)
+    switch (iter->m_kind)
+      {
+      default:
+	gcc_unreachable ();
+
+      case pp_token::kind::text:
+	{
+	  pp_token_text *sub = as_a <pp_token_text *> (iter);
+	  gcc_assert (sub->m_value.get ());
+	  pp_string (pp, sub->m_value.get ());
+	}
+	break;
+
+      case pp_token::kind::begin_color:
+      case pp_token::kind::end_color:
+	/* No-op for dumpfiles.  */
+	break;
+
+      case pp_token::kind::begin_quote:
+	pp_begin_quote (pp, pp_show_color (pp));
+	break;
+      case pp_token::kind::end_quote:
+	pp_end_quote (pp, pp_show_color (pp));
+	break;
+
+      case pp_token::kind::begin_url:
+      case pp_token::kind::end_url:
+	/* No-op for dumpfiles.  */
+	break;
+
+      case pp_token::kind::custom_data:
+	{
+	  emit_any_pending_textual_chunks ();
+	  pp_token_custom_data *sub = as_a <pp_token_custom_data *> (iter);
+	  gcc_assert (sub->m_value.get ());
+	  wrapped_optinfo_item *custom_data
+	    = static_cast<wrapped_optinfo_item *> (sub->m_value.get ());
+	  m_dump_pp.emit_item (std::move (custom_data->m_optinfo_item),
+			       m_optinfo);
+	}
+	break;
+      }
+
+  emit_any_pending_textual_chunks ();
+}
+
+/* Subroutine of dump_pretty_printer::custom_token_printer::print_tokens
+   for consolidating multiple adjacent pure-text chunks into single
+   optinfo_items (in phase 3).  */
+
+void
+dump_pretty_printer::custom_token_printer::
+emit_any_pending_textual_chunks ()
+{
+  dump_pretty_printer *pp = &m_dump_pp;
+  output_buffer *const buffer = pp_buffer (pp);
+  gcc_assert (buffer->obstack == &buffer->formatted_obstack);
+
+  /* Don't emit an item if the pending text is empty.  */
+  if (output_buffer_last_position_in_text (buffer) == nullptr)
+    return;
+
+  char *formatted_text = xstrdup (pp_formatted_text (pp));
+  std::unique_ptr<optinfo_item> item
+    = make_unique<optinfo_item> (OPTINFO_ITEM_KIND_TEXT, UNKNOWN_LOCATION,
+				 formatted_text);
+  pp->emit_item (std::move (item), m_optinfo);
+
+  /* Clear the pending text by unwinding formatted_text back to the start
+     of the buffer (without deallocating).  */
+  obstack_free (&buffer->formatted_obstack,
+		buffer->formatted_obstack.object_base);
+}
+
 /* Output a formatted message using FORMAT on appropriate dump streams.  */
 
 void
@@ -1007,14 +1038,16 @@ dump_context::dump_printf_va (const dump_metadata_t &metadata,
   /* Phases 1 and 2, using pp_format.  */
   pp_format (&pp, &text);
 
-  /* Phase 3.  */
+  /* Phase 3: update the custom token_printer with any active optinfo.  */
   if (optinfo_enabled_p ())
     {
       optinfo &info = ensure_pending_optinfo (metadata);
-      pp.emit_items (&info);
+      pp.set_optinfo (&info);
     }
   else
-    pp.emit_items (NULL);
+    pp.set_optinfo (nullptr);
+
+  pp_output_formatted_text (&pp, nullptr);
 }
 
 /* Similar to dump_printf, except source location is also printed, and
diff --git a/gcc/fortran/error.cc b/gcc/fortran/error.cc
index e89667613b18..a5884620e301 100644
--- a/gcc/fortran/error.cc
+++ b/gcc/fortran/error.cc
@@ -1125,7 +1125,7 @@ gfc_notify_std (int std, const char *gmsgid, ...)
 static bool
 gfc_format_decoder (pretty_printer *pp, text_info *text, const char *spec,
 		    int precision, bool wide, bool set_locus, bool hash,
-		    bool *quoted, const char **buffer_ptr)
+		    bool *quoted, pp_token_list &formatted_token_list)
 {
   switch (*spec)
     {
@@ -1170,7 +1170,8 @@ gfc_format_decoder (pretty_printer *pp, text_info *text, const char *spec,
 	 etc. diagnostics can use the FE printer while the FE is still
 	 active.  */
       return default_tree_printer (pp, text, spec, precision, wide,
-				   set_locus, hash, quoted, buffer_ptr);
+				   set_locus, hash, quoted,
+				   formatted_token_list);
     }
 }
 
diff --git a/gcc/opt-problem.cc b/gcc/opt-problem.cc
index d76ddaf57adf..fc29333c331a 100644
--- a/gcc/opt-problem.cc
+++ b/gcc/opt-problem.cc
@@ -71,7 +71,8 @@ opt_problem::opt_problem (const dump_location_t &loc,
 
     /* Phase 3: dump the items to the "immediate" dump destinations,
        and storing them into m_optinfo for later retrieval.  */
-    pp.emit_items (&m_optinfo);
+    pp.set_optinfo (&m_optinfo);
+    pp_output_formatted_text (&pp, nullptr);
   }
 }
 
diff --git a/gcc/pretty-print-format-impl.h b/gcc/pretty-print-format-impl.h
index e05ad388963d..cffdd461a33d 100644
--- a/gcc/pretty-print-format-impl.h
+++ b/gcc/pretty-print-format-impl.h
@@ -23,6 +23,308 @@ along with GCC; see the file COPYING3.  If not see
 
 #include "pretty-print.h"
 
+/* A struct representing a pending item to be printed within
+   pp_format.
+
+   These can represent:
+   - a run of text within one of the output_buffer's obstacks
+   - begin/end named color
+   - open/close quote
+   - begin/end URL
+   - custom data (for the formatter, for the pretty_printer,
+     or the output format)
+
+   These are built into pp_token_list instances.
+
+   Doing so allows for interaction between:
+
+   - pretty_printer formatting codes (such as C++'s %H and %I,
+   which can't be printed until we've seen both)
+
+   - output formats, such as text vs SARIF (so each can handle URLs
+   and event IDs in its own way)
+
+   - optimization records, where we want to stash data into the
+   formatted messages
+
+   - urlifiers: these can be run in phase 3 of formatting
+
+   without needing lots of fragile logic on char pointers.
+
+   To avoid needing lots of heap allocation/deallocation, pp_token
+   instances are allocated in the pretty_printer's chunk_obstack:
+   they must not outlive phase 3 of formatting of the given
+   chunk_info level.  */
+
+struct pp_token
+{
+public:
+  enum class kind
+  {
+    text,
+
+    begin_color,
+    end_color,
+
+    begin_quote,
+    end_quote,
+
+    begin_url,
+    end_url,
+
+    custom_data,
+
+    NUM_KINDS
+  };
+
+  pp_token (enum kind k);
+
+  pp_token (const pp_token &) = delete;
+  pp_token (pp_token &&) = delete;
+
+  virtual ~pp_token () = default;
+
+  pp_token &operator= (const pp_token &) = delete;
+  pp_token &operator= (pp_token &&) = delete;
+
+  void dump (FILE *out) const;
+  void DEBUG_FUNCTION dump () const { dump (stderr); }
+
+  static void *operator new (size_t sz, obstack &s);
+  static void operator delete (void *);
+
+  enum kind m_kind;
+
+  // Intrusive doubly-linked list
+  pp_token *m_prev;
+  pp_token *m_next;
+};
+
+/* Subclasses of pp_token for the various kinds of token.  */
+
+struct pp_token_text : public pp_token
+{
+  pp_token_text (label_text &&value)
+  : pp_token (kind::text),
+    m_value (std::move (value))
+  {
+    gcc_assert (m_value.get ());
+  }
+
+  label_text m_value;
+};
+
+template <>
+template <>
+inline bool
+is_a_helper <pp_token_text *>::test (pp_token *tok)
+{
+  return tok->m_kind == pp_token::kind::text;
+}
+
+template <>
+template <>
+inline bool
+is_a_helper <const pp_token_text *>::test (const pp_token *tok)
+{
+  return tok->m_kind == pp_token::kind::text;
+}
+
+struct pp_token_begin_color : public pp_token
+{
+  pp_token_begin_color (label_text &&value)
+  : pp_token (kind::begin_color),
+    m_value (std::move (value))
+  {
+    gcc_assert (m_value.get ());
+  }
+
+  label_text m_value;
+};
+
+template <>
+template <>
+inline bool
+is_a_helper <pp_token_begin_color *>::test (pp_token *tok)
+{
+  return tok->m_kind == pp_token::kind::begin_color;
+}
+
+template <>
+template <>
+inline bool
+is_a_helper <const pp_token_begin_color *>::test (const pp_token *tok)
+{
+  return tok->m_kind == pp_token::kind::begin_color;
+}
+
+struct pp_token_end_color : public pp_token
+{
+  pp_token_end_color ()
+  : pp_token (kind::end_color)
+  {
+  }
+};
+
+struct pp_token_begin_quote : public pp_token
+{
+  pp_token_begin_quote ()
+  : pp_token (kind::begin_quote)
+  {
+  }
+};
+
+struct pp_token_end_quote : public pp_token
+{
+  pp_token_end_quote ()
+  : pp_token (kind::end_quote)
+  {
+  }
+};
+
+struct pp_token_begin_url : public pp_token
+{
+  pp_token_begin_url (label_text &&value)
+  : pp_token (kind::begin_url),
+    m_value (std::move (value))
+  {
+    gcc_assert (m_value.get ());
+  }
+
+  label_text m_value;
+};
+
+template <>
+template <>
+inline bool
+is_a_helper <pp_token_begin_url*>::test (pp_token *tok)
+{
+  return tok->m_kind == pp_token::kind::begin_url;
+}
+
+template <>
+template <>
+inline bool
+is_a_helper <const pp_token_begin_url*>::test (const pp_token *tok)
+{
+  return tok->m_kind == pp_token::kind::begin_url;
+}
+
+struct pp_token_end_url : public pp_token
+{
+  pp_token_end_url ()
+    : pp_token (kind::end_url)
+  {
+  }
+};
+
+struct pp_token_custom_data : public pp_token
+{
+  class value
+  {
+  public:
+    virtual ~value () {}
+    virtual void dump (FILE *out) const = 0;
+
+    /* Hook for lowering a custom_data token to standard tokens.
+       Return true and write to OUT if possible.
+       Return false for custom_data that is to be handled by
+       the token_printer.  */
+    virtual bool as_standard_tokens (pp_token_list &out) = 0;
+  };
+
+  pp_token_custom_data (std::unique_ptr<value> val)
+  : pp_token (kind::custom_data),
+    m_value (std::move (val))
+  {
+    gcc_assert (m_value.get ());
+  }
+
+  std::unique_ptr<value> m_value;
+};
+
+template <>
+template <>
+inline bool
+is_a_helper <pp_token_custom_data *>::test (pp_token *tok)
+{
+  return tok->m_kind == pp_token::kind::custom_data;
+}
+
+template <>
+template <>
+inline bool
+is_a_helper <const pp_token_custom_data *>::test (const pp_token *tok)
+{
+  return tok->m_kind == pp_token::kind::custom_data;
+}
+
+/* A list of pp_token, with ownership of the tokens, using
+   a particular obstack to allocate its tokens.  These are
+   also allocated on the obstack during formatting (or, occasionally,
+   the stack).  */
+
+class pp_token_list
+{
+public:
+  // Allocate a new pp_token_list within S.
+  static pp_token_list *make (obstack &s)
+  {
+    return new (s) pp_token_list (s);
+  }
+  static void *operator new (size_t sz, obstack &s);
+  static void operator delete (void *);
+
+  pp_token_list (obstack &s);
+  pp_token_list (const pp_token_list &) = delete;
+  pp_token_list (pp_token_list &&);
+
+  ~pp_token_list ();
+
+  pp_token_list &operator= (const pp_token_list &) = delete;
+  pp_token_list &operator= (pp_token_list &&) = delete;
+
+  /* Make a pp_token of the given subclass, using the relevant obstack to
+     provide the memory.  The pp_token must therefore not outlive the current
+     chunk_info level during formatting.  */
+  template<typename Subclass, typename... Args>
+  std::unique_ptr<pp_token>
+  make_token (Args&&... args)
+  {
+    return std::unique_ptr<pp_token>
+      (new (m_obstack) Subclass (std::forward<Args> (args)...));
+  }
+
+  template<typename Subclass, typename... Args>
+  void push_back (Args&&... args)
+  {
+    auto tok = make_token<Subclass> (std::forward<Args> (args)...);
+    push_back (std::move (tok));
+  }
+  void push_back_text (label_text &&text);
+  void push_back (std::unique_ptr<pp_token> tok);
+  void push_back_list (pp_token_list &&list);
+
+  std::unique_ptr<pp_token> pop_front ();
+
+  std::unique_ptr<pp_token> remove_token (pp_token *tok);
+
+  void insert_after (std::unique_ptr<pp_token> new_tok,
+		     pp_token *relative_tok);
+
+  void replace_custom_tokens ();
+  void merge_consecutive_text_tokens ();
+  void apply_urlifier (const urlifier &urlifier);
+
+  void dump (FILE *out) const;
+  void DEBUG_FUNCTION dump () const { dump (stderr); }
+
+  obstack &m_obstack;
+
+  pp_token *m_first;
+  pp_token *m_end;
+};
+
 /* The chunk_info data structure forms a stack of the results from the
    first phase of formatting (pp_format) which have not yet been
    output (pp_output_formatted_text).  A stack is necessary because
@@ -34,13 +336,15 @@ class chunk_info
   friend class pp_markup::context;
 
 public:
-  const char * const *get_args () const { return m_args; }
-  quoting_info *get_quoting_info () const { return m_quotes; }
+  pp_token_list * const * get_token_lists () const { return m_args; }
 
-  void append_formatted_chunk (const char *content);
+  void append_formatted_chunk (obstack &s, const char *content);
 
   void pop_from_output_buffer (output_buffer &buf);
 
+  void dump (FILE *out) const;
+  void DEBUG_FUNCTION dump () const { dump (stderr); }
+
 private:
   void on_begin_quote (const output_buffer &buf,
 		       unsigned chunk_idx,
@@ -54,17 +358,100 @@ private:
   /* Pointer to previous chunk on the stack.  */
   chunk_info *m_prev;
 
-  /* Array of chunks to output.  Each chunk is a NUL-terminated string.
+  /* Array of chunks to output.  Each chunk is a doubly-linked list of
+     pp_token.
+
+     The chunks can be printed via chunk_info::dump ().
+
      In the first phase of formatting, even-numbered chunks are
      to be output verbatim, odd-numbered chunks are format specifiers.
+     For example, given:
+       pp_format (pp,
+		  "foo: %i, bar: %s, opt: %qs",
+		  42, "baz", "-foption");
+
+     after phase 1 we might have:
+       (gdb) call buffer->cur_chunk_array->dump()
+       0: [TEXT("foo: ")]
+       1: [TEXT("i")]
+       2: [TEXT(", bar: ")]
+       3: [TEXT("s")]
+       4: [TEXT(", opt: ")]
+       5: [TEXT("qs")]
+
      The second phase replaces all odd-numbered chunks with formatted
-     text, and the third phase simply emits all the chunks in sequence
-     with appropriate line-wrapping.  */
-  const char *m_args[PP_NL_ARGMAX * 2];
+     token lists.  In the above example, after phase 2 we might have:
+       (gdb) call pp->m_buffer->cur_chunk_array->dump()
+       0: [TEXT("foo: ")]
+       1: [TEXT("42")]
+       2: [TEXT(", bar: ")]
+       3: [TEXT("baz")]
+       4: [TEXT(", opt: ")]
+       5: [BEGIN_QUOTE, TEXT("-foption"), END_QUOTE]
+     For example the %qs has become the three tokens:
+       [BEGIN_QUOTE, TEXT("-foption"), END_QUOTE]
+
+     The third phase (in pp_output_formatted_text):
+
+     (1) merges the tokens from all the chunks into one list,
+     giving e.g.
+      (gdb) call tokens.dump()
+      [TEXT("foo: "), TEXT("42"), TEXT(", bar: "), TEXT("baz"),
+       TEXT(", opt: "), BEGIN_QUOTE, TEXT("-foption"), END_QUOTE]
+
+     (2) lowers some custom tokens into non-custom tokens
+
+     (3) merges consecutive text tokens, giving e.g.:
+      (gdb) call tokens.dump()
+      [TEXT("foo: 42, bar: baz, opt: "),
+       BEGIN_QUOTE, TEXT("-foption"), END_QUOTE]
+
+     (4) if provided with a urlifier, tries to apply it to quoted text,
+     giving e.g.:
+      (gdb) call tokens.dump()
+      [TEXT("foo: 42, bar: baz, opt: "), BEGIN_QUOTE,
+       BEGIN_URL("http://example.com"), TEXT("-foption"), END_URL, END_QUOTE]
+
+     (5) emits all tokens in sequence with appropriate line-wrapping.  This
+     can be overridden via the pretty_printer's token_printer, allowing for
+     output formats to e.g. override how URLs are handled, or to handle
+     custom_data that wasn't lowered in (2) above, e.g. for handling JSON
+     output of optimization records.  */
+  pp_token_list *m_args[PP_NL_ARGMAX * 2];
+
+  /* The pp_tokens, pp_token_lists, and the accumulated text buffers are
+     allocated within the output_buffer's chunk_obstack.  In the above
+     example, the in-memory layout of the chunk_obstack might look like
+     this after phase 1:
+
+      + pp_token_list for chunk 0 (m_first: *)   <--- START of chunk_info level
+      |                                     |
+      + "foo: \0"  <-------------\          |
+      |                          |          |
+      + pp_token_text (borrowed: *) <-------/
+      |
+      + pp_token_list for chunk 1
+      |
+      + "i\0" <------------------\
+      |                          |
+      + pp_token_text (borrowed: *)
+      |
+      +  ...etc for chunks 2 to 4...
+      |
+      + pp_token_list for chunk 5
+      |
+      + "qs\0" <-----------------\
+      |                          |
+      + pp_token_text (borrowed: *)
+      |
+      |
+      V
+     obstack grows this way
 
-  /* If non-null, information on quoted text runs within the chunks
-     for use by a urlifier.  */
-  quoting_info *m_quotes;
+     At each stage, allocations of additional text buffers, tokens, and lists
+     grow forwards in the obstack (though the internal pointers in linked
+     lists might point backwards to earlier objects within the same
+     chunk_info level).  */
 };
 
 #endif /* GCC_PRETTY_PRINT_FORMAT_IMPL_H */
diff --git a/gcc/pretty-print-markup.h b/gcc/pretty-print-markup.h
index b35632a79da9..ce2c5e9dbbe9 100644
--- a/gcc/pretty-print-markup.h
+++ b/gcc/pretty-print-markup.h
@@ -22,6 +22,8 @@ along with GCC; see the file COPYING3.  If not see
 
 #include "diagnostic-color.h"
 
+class pp_token_list;
+
 namespace pp_markup {
 
 class context
@@ -31,12 +33,12 @@ public:
 	   output_buffer &buf,
 	   unsigned chunk_idx,
 	   bool &quoted,
-	   const urlifier *urlifier)
+	   pp_token_list *formatted_token_list)
   : m_pp (pp),
     m_buf (buf),
     m_chunk_idx (chunk_idx),
     m_quoted (quoted),
-    m_urlifier (urlifier)
+    m_formatted_token_list (formatted_token_list)
   {
   }
 
@@ -46,11 +48,13 @@ public:
   void begin_highlight_color (const char *color_name);
   void end_highlight_color ();
 
+  void push_back_any_text ();
+
   pretty_printer &m_pp;
   output_buffer &m_buf;
   unsigned m_chunk_idx;
   bool &m_quoted;
-  const urlifier *m_urlifier;
+  pp_token_list *m_formatted_token_list;
 };
 
 /* Abstract base class for use in pp_format for handling "%e".
diff --git a/gcc/pretty-print-urlifier.h b/gcc/pretty-print-urlifier.h
index 3e63e62c41e1..3feb80921bc9 100644
--- a/gcc/pretty-print-urlifier.h
+++ b/gcc/pretty-print-urlifier.h
@@ -20,7 +20,7 @@ along with GCC; see the file COPYING3.  If not see
 #ifndef GCC_PRETTY_PRINT_URLIFIER_H
 #define GCC_PRETTY_PRINT_URLIFIER_H
 
-/* Abstract base class for optional use in pp_format for adding URLs
+/* Abstract base class for optional use in pretty-printing for adding URLs
    to quoted text strings.  */
 
 class urlifier
diff --git a/gcc/pretty-print.cc b/gcc/pretty-print.cc
index 810c629ef116..d2c0a197680c 100644
--- a/gcc/pretty-print.cc
+++ b/gcc/pretty-print.cc
@@ -19,6 +19,7 @@ along with GCC; see the file COPYING3.  If not see
 <http://www.gnu.org/licenses/>.  */
 
 #include "config.h"
+#define INCLUDE_MEMORY
 #define INCLUDE_VECTOR
 #include "system.h"
 #include "coretypes.h"
@@ -30,6 +31,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "diagnostic-color.h"
 #include "diagnostic-event-id.h"
 #include "diagnostic-highlight-colors.h"
+#include "make-unique.h"
 #include "selftest.h"
 
 #if HAVE_ICONV
@@ -710,6 +712,10 @@ static int
 decode_utf8_char (const unsigned char *, size_t len, unsigned int *);
 static void pp_quoted_string (pretty_printer *, const char *, size_t = -1);
 
+static void
+default_token_printer (pretty_printer *pp,
+		       const pp_token_list &tokens);
+
 /* Overwrite the given location/range within this text_info's rich_location.
    For use e.g. when implementing "+" in client format decoders.  */
 
@@ -1063,196 +1069,408 @@ pp_indent (pretty_printer *pp)
 
 static const char *get_end_url_string (pretty_printer *);
 
-/* Append STR to OSTACK, without a null-terminator.  */
+/* struct pp_token.  */
 
-static void
-obstack_append_string (obstack *ostack, const char *str)
+pp_token::pp_token (enum kind k)
+: m_kind (k),
+  m_prev (nullptr),
+  m_next (nullptr)
 {
-  obstack_grow (ostack, str, strlen (str));
 }
 
-/* Append STR to OSTACK, without a null-terminator.  */
-
-static void
-obstack_append_string (obstack *ostack, const char *str, size_t len)
-{
-  obstack_grow (ostack, str, len);
-}
-
-/* Given quoted text within the buffer OBSTACK
-   at the half-open interval [QUOTED_TEXT_START_IDX, QUOTED_TEXT_END_IDX),
-   potentially use URLIFIER (if non-null) to see if there's a URL for the
-   quoted text.
-
-   If so, replace the quoted part of the text in the buffer with a URLified
-   version of the text, using PP's settings.
-
-   For example, given this is the buffer:
-     "this is a test `hello worldTRAILING-CONTENT"
-     .................^~~~~~~~~~~
-   with the quoted text starting at the 'h' of "hello world", the buffer
-   becomes:
-     "this is a test `BEGIN_URL(URL)hello worldEND(URL)TRAILING-CONTENT"
-     .................^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-     .................-----------replacement-----------
-
-   Return the new offset into the buffer of the quoted text endpoint i.e.
-   the offset of "TRAILING-CONTENT" in the above.  */
-
-static size_t
-urlify_quoted_string (pretty_printer *pp,
-		      obstack *obstack,
-		      const urlifier *urlifier,
-		      size_t quoted_text_start_idx,
-		      size_t quoted_text_end_idx)
-{
-  if (!pp->supports_urls_p ())
-    return quoted_text_end_idx;
-  if (!urlifier)
-    return quoted_text_end_idx;
-
-  const size_t quoted_len = quoted_text_end_idx - quoted_text_start_idx;
-  if (quoted_len == 0)
-    /* Empty quoted string; do nothing.  */
-    return quoted_text_end_idx;
-  const char *start = (obstack->object_base + quoted_text_start_idx);
-  char *url = urlifier->get_url_for_quoted_text (start, quoted_len);
-  if (!url)
-    /* No URL for this quoted text; do nothing.  */
-    return quoted_text_end_idx;
-
-  /* Stash a copy of the remainder of the chunk.  */
-  char *text = xstrndup (start,
-			 obstack_object_size (obstack) - quoted_text_start_idx);
-
-  /* Replace quoted text...  */
-  obstack->next_free = obstack->object_base + quoted_text_start_idx;
-
-  /*  ...with URLified version of the text.  */
-  /* Begin URL.  */
-  switch (pp->get_url_format ())
+void
+pp_token::dump (FILE *out) const
+{
+  switch (m_kind)
     {
     default:
-    case URL_FORMAT_NONE:
       gcc_unreachable ();
-    case URL_FORMAT_ST:
-      obstack_append_string (obstack, "\33]8;;");
-      obstack_append_string (obstack, url);
-      obstack_append_string (obstack, "\33\\");
+    case kind::text:
+      {
+	const pp_token_text *sub = as_a <const pp_token_text *> (this);
+	gcc_assert (sub->m_value.get ());
+	fprintf (out, "TEXT(\"%s\")", sub->m_value.get ());
+      }
       break;
-    case URL_FORMAT_BEL:
-      obstack_append_string (obstack, "\33]8;;");
-      obstack_append_string (obstack, url);
-      obstack_append_string (obstack, "\a");
+    case kind::begin_color:
+      {
+	const pp_token_begin_color *sub
+	  = as_a <const pp_token_begin_color *> (this);
+	gcc_assert (sub->m_value.get ());
+	fprintf (out, "BEGIN_COLOR(\"%s\")", sub->m_value.get ());
+	break;
+      }
+    case kind::end_color:
+      fprintf (out, "END_COLOR");
+      break;
+    case kind::begin_quote:
+      fprintf (out, "BEGIN_QUOTE");
+      break;
+    case kind::end_quote:
+      fprintf (out, "END_QUOTE");
+      break;
+    case kind::begin_url:
+      {
+	const pp_token_begin_url *sub
+	  = as_a <const pp_token_begin_url *> (this);
+	gcc_assert (sub->m_value.get ());
+	fprintf (out, "BEGIN_URL(\"%s\")", sub->m_value.get ());
+      }
+      break;
+    case kind::end_url:
+      fprintf (out, "END_URL");
+      break;
+    case kind::custom_data:
+      {
+	const pp_token_custom_data *sub
+	  = as_a <const pp_token_custom_data *> (this);
+	gcc_assert (sub->m_value.get ());
+	fprintf (out, "CUSTOM(");
+	sub->m_value->dump (out);
+	fprintf (out, ")");
+      }
       break;
     }
-  /* Add back the quoted part of the text.  */
-  obstack_append_string (obstack, text, quoted_len);
-  /* End URL.  */
-  obstack_append_string (obstack,
-			 get_end_url_string (pp));
+}
 
-  size_t new_end_idx = obstack_object_size (obstack);
+/* Allocate SZ bytes within S, which must not be half-way through
+   building another object.  */
 
-  /* Add back the remainder of the text after the quoted part.  */
-  obstack_append_string (obstack, text + quoted_len);
-  free (text);
-  free (url);
-  return new_end_idx;
+static void *
+allocate_object (size_t sz, obstack &s)
+{
+  /* We must not be half-way through an object.  */
+  gcc_assert (obstack_base (&s) == obstack_next_free (&s));
+
+  obstack_blank (&s, sz);
+  void *buf = obstack_finish (&s);
+  return buf;
 }
 
-/* A class for tracking quoted text within a buffer for
-   use by a urlifier.  */
+/* Make room for a pp_token instance within obstack S.  */
 
-class quoting_info
+void *
+pp_token::operator new (size_t sz, obstack &s)
 {
-public:
-  /* Called when quoted text is begun in phase 1 or 2.  */
-  void on_begin_quote (const output_buffer &buf,
-		       unsigned chunk_idx)
-  {
-    /* Stash location of start of quoted string.  */
-    size_t byte_offset = obstack_object_size (&buf.chunk_obstack);
-    m_loc_last_open_quote = location (chunk_idx, byte_offset);
-  }
+  return allocate_object (sz, s);
+}
 
-  /* Called when quoted text is ended in phase 1 or 2.  */
-  void on_end_quote (pretty_printer *pp,
-		     output_buffer &buf,
-		     unsigned chunk_idx,
-		     const urlifier &urlifier)
-  {
-    /* If possible, do urlification now.  */
-    if (chunk_idx == m_loc_last_open_quote.m_chunk_idx)
-      {
-	urlify_quoted_string (pp,
-			      &buf.chunk_obstack,
-			      &urlifier,
-			      m_loc_last_open_quote.m_byte_offset,
-			      obstack_object_size (&buf.chunk_obstack));
-	m_loc_last_open_quote = location ();
-	return;
-      }
-    /* Otherwise the quoted text straddles multiple chunks.
-       Stash the location of end of quoted string for use in phase 3.  */
-    size_t byte_offset = obstack_object_size (&buf.chunk_obstack);
-    m_phase_3_quotes.push_back (run (m_loc_last_open_quote,
-				     location (chunk_idx, byte_offset)));
-    m_loc_last_open_quote = location ();
-  }
+void
+pp_token::operator delete (void *)
+{
+  /* No-op: pp_tokens are allocated within obstacks, so
+     the memory will be reclaimed when the obstack is freed.  */
+}
 
-  bool has_phase_3_quotes_p () const
-  {
-    return m_phase_3_quotes.size () > 0;
-  }
-  void handle_phase_3 (pretty_printer *pp,
-		       const urlifier &urlifier);
+/* class pp_token_list.  */
 
-private:
-  struct location
-  {
-    location ()
-    : m_chunk_idx (UINT_MAX),
-      m_byte_offset (SIZE_MAX)
+/* Make room for a pp_token_list instance within obstack S.  */
+
+void *
+pp_token_list::operator new (size_t sz, obstack &s)
+{
+  return allocate_object (sz, s);
+}
+
+void
+pp_token_list::operator delete (void *)
+{
+  /* No-op: pp_token_list instances allocated within obstacks don't
+     need their own deallocation; the memory will be reclaimed when
+     the obstack is freed.  */
+}
+
+pp_token_list::pp_token_list (obstack &s)
+: m_obstack (s),
+  m_first (nullptr),
+  m_end (nullptr)
+{
+}
+
+pp_token_list::pp_token_list (pp_token_list &&other)
+: m_obstack (other.m_obstack),
+  m_first (other.m_first),
+  m_end (other.m_end)
+{
+  other.m_first = nullptr;
+  other.m_end = nullptr;
+}
+
+pp_token_list::~pp_token_list ()
+{
+  for (auto iter = m_first; iter; )
     {
+      pp_token *next = iter->m_next;
+      delete iter;
+      iter = next;
     }
+}
+
+void
+pp_token_list::push_back_text (label_text &&text)
+{
+  if (text.get ()[0] == '\0')
+    return; // pushing empty string is a no-op
+  push_back<pp_token_text> (std::move (text));
+}
 
-    location (unsigned chunk_idx,
-	      size_t byte_offset)
-    : m_chunk_idx (chunk_idx),
-      m_byte_offset (byte_offset)
+void
+pp_token_list::push_back (std::unique_ptr<pp_token> tok)
+{
+  if (!m_first)
     {
+      gcc_assert (m_end == nullptr);
+      m_first = tok.get ();
+      m_end = tok.get ();
     }
+  else
+    {
+      gcc_assert (m_end != nullptr);
+      m_end->m_next = tok.get ();
+      tok->m_prev = m_end;
+      m_end = tok.get ();
+    }
+  tok.release ();
+}
 
-    unsigned m_chunk_idx;
-    size_t m_byte_offset;
-  };
+void
+pp_token_list::push_back_list (pp_token_list &&list)
+{
+  while (auto tok = list.pop_front ())
+    push_back (std::move (tok));
+}
 
-  struct run
-  {
-    run (location start, location end)
-    : m_start (start), m_end (end)
+std::unique_ptr<pp_token>
+pp_token_list::pop_front ()
+{
+  pp_token *result = m_first;
+  if (result == nullptr)
+    return nullptr;
+
+  gcc_assert (result->m_prev == nullptr);
+  m_first = result->m_next;
+  if (result->m_next)
     {
+      gcc_assert (result != m_end);
+      m_first->m_prev = nullptr;
     }
+  else
+    {
+      gcc_assert (result == m_end);
+      m_end = nullptr;
+    }
+  result->m_next = nullptr;
+  return std::unique_ptr<pp_token> (result);
+}
 
-    location m_start;
-    location m_end;
-  };
+std::unique_ptr<pp_token>
+pp_token_list::remove_token (pp_token *tok)
+{
+  gcc_assert (tok);
+  if (tok->m_prev)
+    {
+      gcc_assert (tok != m_first);
+      tok->m_prev->m_next = tok->m_next;
+    }
+  else
+    {
+      gcc_assert (tok == m_first);
+      m_first = tok->m_next;
+    }
+  if (tok->m_next)
+    {
+      gcc_assert (tok != m_end);
+      tok->m_next->m_prev = tok->m_prev;
+    }
+  else
+    {
+      gcc_assert (tok == m_end);
+      m_end = tok->m_prev;
+    }
+  tok->m_prev = nullptr;
+  tok->m_next = nullptr;
+  gcc_assert (m_first != tok);
+  gcc_assert (m_end != tok);
+  return std::unique_ptr<pp_token> (tok);
+}
+
+/* Insert NEW_TOK_UP after RELATIVE_TOK, taking ownership of it.  */
+
+void
+pp_token_list::insert_after (std::unique_ptr<pp_token> new_tok_up,
+			     pp_token *relative_tok)
+{
+  pp_token *new_tok = new_tok_up.release ();
+
+  gcc_assert (new_tok);
+  gcc_assert (new_tok->m_prev == nullptr);
+  gcc_assert (new_tok->m_next == nullptr);
+  gcc_assert (relative_tok);
+
+  if (relative_tok->m_next)
+    {
+      gcc_assert (relative_tok != m_end);
+      relative_tok->m_next->m_prev = new_tok;
+    }
+  else
+    {
+      gcc_assert (relative_tok == m_end);
+      m_end = new_tok;
+    }
+  new_tok->m_prev = relative_tok;
+  new_tok->m_next = relative_tok->m_next;
+  relative_tok->m_next = new_tok;
+}
+
+void
+pp_token_list::replace_custom_tokens ()
+{
+  pp_token *iter = m_first;
+  while (iter)
+    {
+      pp_token *next  = iter->m_next;
+      if (iter->m_kind == pp_token::kind::custom_data)
+	{
+	  pp_token_list tok_list (m_obstack);
+	  pp_token_custom_data *sub = as_a <pp_token_custom_data *> (iter);
+	  if (sub->m_value->as_standard_tokens (tok_list))
+	    {
+	      while (auto tok = tok_list.pop_front ())
+		{
+		  /* The resulting token list must not contain any
+		     custom data.  */
+		  gcc_assert (tok->m_kind != pp_token::kind::custom_data);
+		  insert_after (std::move (tok), iter);
+		}
+	      remove_token (iter);
+	    }
+	}
+      iter = next;
+    }
+}
+
+/* Merge any runs of consecutive text tokens within this list
+   into individual text tokens.  */
+
+void
+pp_token_list::merge_consecutive_text_tokens ()
+{
+  pp_token *start_of_run = m_first;
+  while (start_of_run)
+    {
+      if (start_of_run->m_kind != pp_token::kind::text)
+	{
+	  start_of_run = start_of_run->m_next;
+	  continue;
+	}
+      pp_token *end_of_run = start_of_run;
+      while (end_of_run->m_next
+	     && end_of_run->m_next->m_kind == pp_token::kind::text)
+	end_of_run = end_of_run->m_next;
+      if (end_of_run != start_of_run)
+	{
+	  /* start_of_run through end_of_run are a run of consecutive
+	     text tokens.  */
+
+	  /* Calculate size of buffer for merged text.  */
+	  size_t sz = 0;
+	  for (auto iter = start_of_run; iter != end_of_run->m_next;
+	       iter = iter->m_next)
+	    {
+	      pp_token_text *iter_text = static_cast<pp_token_text *> (iter);
+	      sz += strlen (iter_text->m_value.get ());
+	    }
+
+	  /* Allocate and populate buffer for merged text
+	     (within m_obstack).  */
+	  char * const buf = (char *)allocate_object (sz + 1, m_obstack);
+	  char *p = buf;
+	  for (auto iter = start_of_run; iter != end_of_run->m_next;
+	       iter = iter->m_next)
+	    {
+	      pp_token_text *iter_text = static_cast<pp_token_text *> (iter);
+	      size_t iter_sz = strlen (iter_text->m_value.get ());
+	      memcpy (p, iter_text->m_value.get (), iter_sz);
+	      p += iter_sz;
+	    }
+	  *p = '\0';
+
+	  /* Replace start_of_run's buffer pointer with the new buffer.  */
+	  static_cast<pp_token_text *> (start_of_run)->m_value
+	    = label_text::borrow (buf);
+
+	  /* Remove all the other text tokens in the run.  */
+	  pp_token * const next = end_of_run->m_next;
+	  while (start_of_run->m_next != next)
+	    remove_token (start_of_run->m_next);
+	  start_of_run = next;
+	}
+      else
+	start_of_run = end_of_run->m_next;
+    }
+}
+
+/* Apply URLIFIER to this token list.
+   Find BEGIN_QUOTE, TEXT, END_QUOTE triples, and if URLIFIER has a URL
+   for the value of TEXT, then wrap TEXT in a {BEGIN,END}_URL pair.  */
+
+void
+pp_token_list::apply_urlifier (const urlifier &urlifier)
+{
+  for (pp_token *iter = m_first; iter; )
+    {
+      if (iter->m_kind == pp_token::kind::begin_quote
+	  && iter->m_next
+	  && iter->m_next->m_kind == pp_token::kind::text
+	  && iter->m_next->m_next
+	  && iter->m_next->m_next->m_kind == pp_token::kind::end_quote)
+	{
+	  pp_token *begin_quote = iter;
+	  pp_token_text *text = as_a <pp_token_text *> (begin_quote->m_next);
+	  pp_token *end_quote = text->m_next;
+	  if (char *url = urlifier.get_url_for_quoted_text
+			    (text->m_value.get (),
+			     strlen (text->m_value.get ())))
+	    {
+	      auto begin_url
+		= make_token<pp_token_begin_url> (label_text::take (url));
+	      auto end_url = make_token<pp_token_end_url> ();
+	      insert_after (std::move (begin_url), begin_quote);
+	      insert_after (std::move (end_url), text);
+	    }
+	  iter = end_quote->m_next;
+	}
+      else
+	iter = iter->m_next;
+    }
+}
+
+void
+pp_token_list::dump (FILE *out) const
+{
+  fprintf (out, "[");
+  for (auto iter = m_first; iter; iter = iter->m_next)
+    {
+      iter->dump (out);
+      if (iter->m_next)
+	fprintf (out, ", ");
+    }
+  fprintf (out, "]\n");
+}
 
-  location m_loc_last_open_quote;
-  std::vector<run> m_phase_3_quotes;
-};
 
 /* Adds a chunk to the end of formatted output, so that it
    will be printed by pp_output_formatted_text.  */
 
 void
-chunk_info::append_formatted_chunk (const char *content)
+chunk_info::append_formatted_chunk (obstack &s, const char *content)
 {
   unsigned int chunk_idx;
   for (chunk_idx = 0; m_args[chunk_idx]; chunk_idx++)
     ;
-  m_args[chunk_idx++] = content;
+  pp_token_list *tokens = pp_token_list::make (s);
+  tokens->push_back_text (label_text::borrow (content));
+  m_args[chunk_idx++] = tokens;
   m_args[chunk_idx] = nullptr;
 }
 
@@ -1262,34 +1480,33 @@ chunk_info::append_formatted_chunk (const char *content)
 void
 chunk_info::pop_from_output_buffer (output_buffer &buf)
 {
-  delete m_quotes;
   buf.cur_chunk_array = m_prev;
   obstack_free (&buf.chunk_obstack, this);
 }
 
 void
-chunk_info::on_begin_quote (const output_buffer &buf,
-			    unsigned chunk_idx,
-			    const urlifier *urlifier)
+chunk_info::dump (FILE *out) const
 {
-  if (!urlifier)
-    return;
-  if (!m_quotes)
-    m_quotes = new quoting_info ();
-  m_quotes->on_begin_quote (buf, chunk_idx);
+  for (size_t idx = 0; m_args[idx]; ++idx)
+    {
+      fprintf (out, "%i: ", (int)idx);
+      m_args[idx]->dump (out);
+    }
 }
 
-void
-chunk_info::on_end_quote (pretty_printer *pp,
-			  output_buffer &buf,
-			  unsigned chunk_idx,
-			  const urlifier *urlifier)
+/* Finish any text accumulating within CUR_OBSTACK,
+   terminating it.
+   Push a text pp_token to the end of TOK_LIST that borrows
+   the finished text in CUR_OBSTACK.  */
+
+static void
+push_back_any_text (pp_token_list *tok_list,
+		    obstack *cur_obstack)
 {
-  if (!urlifier)
-    return;
-  if (!m_quotes)
-    m_quotes = new quoting_info ();
-  m_quotes->on_end_quote (pp, buf, chunk_idx, *urlifier);
+  obstack_1grow (cur_obstack, '\0');
+  tok_list->push_back_text
+    (label_text::borrow (XOBFINISH (cur_obstack,
+				    const char *)));
 }
 
 /* The following format specifiers are recognized as being client independent:
@@ -1339,36 +1556,22 @@ chunk_info::on_end_quote (pretty_printer *pp,
 /* Implementation of pp_format.
    Formatting phases 1 and 2: render TEXT->format_spec plus
    text->m_args_ptr into a series of chunks in pp_buffer (PP)->args[].
-   Phase 3 is in pp_output_formatted_text.
-
-   If URLIFIER is non-NULL, then use it to add URLs for quoted
-   strings, so that e.g.
-     "before %<quoted%> after"
-   with a URLIFIER that has a URL for "quoted" might be emitted as:
-     "before `BEGIN_URL(http://example.com)quotedEND_URL' after"
-   This is handled here for message fragments that are:
-   - quoted entirely in phase 1 (e.g. "%<this is quoted%>"), or
-   - quoted entirely in phase 2 (e.g. "%qs"),
-   Quoted fragments that use a mixture of both phases
-   (e.g. "%<this is a mixture: %s %>")
-   are stashed into the output_buffer's m_quotes for use in phase 3.  */
+   Phase 3 is in pp_output_formatted_text.  */
 
 void
-pretty_printer::format (text_info *text,
-			const urlifier *urlifier)
+pretty_printer::format (text_info *text)
 {
   output_buffer * const buffer = m_buffer;
 
   unsigned int chunk = 0, argno;
-  const char **formatters[PP_NL_ARGMAX];
+  pp_token_list **formatters[PP_NL_ARGMAX];
 
   /* Allocate a new chunk structure.  */
   chunk_info *new_chunk_array = XOBNEW (&buffer->chunk_obstack, chunk_info);
 
   new_chunk_array->m_prev = buffer->cur_chunk_array;
-  new_chunk_array->m_quotes = nullptr;
   buffer->cur_chunk_array = new_chunk_array;
-  const char **args = new_chunk_array->m_args;
+  pp_token_list **args = new_chunk_array->m_args;
 
   /* Formatting phase 1: split up TEXT->format_spec into chunks in
      pp_buffer (PP)->args[].  Even-numbered chunks are to be output
@@ -1380,6 +1583,8 @@ pretty_printer::format (text_info *text,
 
   unsigned int curarg = 0;
   bool any_unnumbered = false, any_numbered = false;
+  pp_token_list *cur_token_list;
+  args[chunk++] = cur_token_list = pp_token_list::make (buffer->chunk_obstack);
   for (const char *p = text->m_format_spec; *p; )
     {
       while (*p != '\0' && *p != '%')
@@ -1403,44 +1608,39 @@ pretty_printer::format (text_info *text,
 
 	case '<':
 	  {
-	    obstack_grow (&buffer->chunk_obstack,
-			  open_quote, strlen (open_quote));
-	    const char *colorstr = colorize_start (m_show_color, "quote");
-	    obstack_grow (&buffer->chunk_obstack, colorstr, strlen (colorstr));
+	    push_back_any_text (cur_token_list, &buffer->chunk_obstack);
+	    cur_token_list->push_back<pp_token_begin_quote> ();
 	    p++;
-
-	    buffer->cur_chunk_array->on_begin_quote (*buffer, chunk, urlifier);
 	    continue;
 	  }
 
 	case '>':
 	  {
-	    buffer->cur_chunk_array->on_end_quote (this, *buffer, chunk, urlifier);
-
-	    const char *colorstr = colorize_stop (m_show_color);
-	    obstack_grow (&buffer->chunk_obstack, colorstr, strlen (colorstr));
+	    push_back_any_text (cur_token_list, &buffer->chunk_obstack);
+	    cur_token_list->push_back<pp_token_end_quote> ();
+	    p++;
+	    continue;
 	  }
-	  /* FALLTHRU */
 	case '\'':
-	  obstack_grow (&buffer->chunk_obstack,
-			close_quote, strlen (close_quote));
-	  p++;
+	  {
+	    push_back_any_text (cur_token_list, &buffer->chunk_obstack);
+	    cur_token_list->push_back<pp_token_end_quote> ();
+	    p++;
+	  }
 	  continue;
 
 	case '}':
 	  {
-	    const char *endurlstr = get_end_url_string (this);
-	    obstack_grow (&buffer->chunk_obstack, endurlstr,
-			  strlen (endurlstr));
+	    push_back_any_text (cur_token_list, &buffer->chunk_obstack);
+	    cur_token_list->push_back<pp_token_end_url> ();
+	    p++;
 	  }
-	  p++;
 	  continue;
 
 	case 'R':
 	  {
-	    const char *colorstr = colorize_stop (m_show_color);
-	    obstack_grow (&buffer->chunk_obstack, colorstr,
-			  strlen (colorstr));
+	    push_back_any_text (cur_token_list, &buffer->chunk_obstack);
+	    cur_token_list->push_back<pp_token_end_color> ();
 	    p++;
 	    continue;
 	  }
@@ -1455,11 +1655,14 @@ pretty_printer::format (text_info *text,
 
 	default:
 	  /* Handled in phase 2.  Terminate the plain chunk here.  */
-	  obstack_1grow (&buffer->chunk_obstack, '\0');
-	  args[chunk++] = XOBFINISH (&buffer->chunk_obstack, const char *);
+	  push_back_any_text (cur_token_list, &buffer->chunk_obstack);
 	  break;
 	}
 
+      /* Start a new token list for the formatting args.  */
+      args[chunk] = cur_token_list
+	= pp_token_list::make (buffer->chunk_obstack);
+
       if (ISDIGIT (*p))
 	{
 	  char *end;
@@ -1479,7 +1682,7 @@ pretty_printer::format (text_info *text,
 	}
       gcc_assert (argno < PP_NL_ARGMAX);
       gcc_assert (!formatters[argno]);
-      formatters[argno] = &args[chunk];
+      formatters[argno] = &args[chunk++];
       do
 	{
 	  obstack_1grow (&buffer->chunk_obstack, *p);
@@ -1531,17 +1734,24 @@ pretty_printer::format (text_info *text,
 	    }
 	}
       if (*p == '\0')
-	break;
+	{
+	  push_back_any_text (cur_token_list, &buffer->chunk_obstack);
+	  break;
+	}
 
       obstack_1grow (&buffer->chunk_obstack, '\0');
+      push_back_any_text (cur_token_list, &buffer->chunk_obstack);
+
+      /* Start a new token list for the next (non-formatted) text.  */
       gcc_assert (chunk < PP_NL_ARGMAX * 2);
-      args[chunk++] = XOBFINISH (&buffer->chunk_obstack, const char *);
+      args[chunk++] = cur_token_list
+	= pp_token_list::make (buffer->chunk_obstack);
     }
 
   obstack_1grow (&buffer->chunk_obstack, '\0');
+  push_back_any_text (cur_token_list, &buffer->chunk_obstack);
   gcc_assert (chunk < PP_NL_ARGMAX * 2);
-  args[chunk++] = XOBFINISH (&buffer->chunk_obstack, const char *);
-  args[chunk] = 0;
+  args[chunk] = nullptr;
 
   /* Set output to the argument obstack, and switch line-wrapping and
      prefixing off.  */
@@ -1549,6 +1759,15 @@ pretty_printer::format (text_info *text,
   const int old_line_length = buffer->line_length;
   const pp_wrapping_mode_t old_wrapping_mode = pp_set_verbatim_wrapping (this);
 
+  /* Note that you can debug the state of the chunk arrays here using
+       (gdb) call buffer->cur_chunk_array->dump()
+     which, given e.g. "foo: %s bar: %s" might print:
+       0: [TEXT("foo: ")]
+       1: [TEXT("s")]
+       2: [TEXT(" bar: ")]
+       3: [TEXT("s")]
+  */
+
   /* Second phase.  Replace each formatter with the formatted text it
      corresponds to.  */
 
@@ -1562,10 +1781,20 @@ pretty_printer::format (text_info *text,
 
       const char *p;
 
+      /* We expect a single text token containing the formatter.  */
+      pp_token_list *tok_list = *(formatters[argno]);
+      gcc_assert (tok_list);
+      gcc_assert (tok_list->m_first == tok_list->m_end);
+      gcc_assert (tok_list->m_first->m_kind == pp_token::kind::text);
+
+      /* Accumulate the value of the formatted text into here.  */
+      pp_token_list *formatted_tok_list
+	= pp_token_list::make (buffer->chunk_obstack);
+
       /* We do not attempt to enforce any ordering on the modifier
 	 characters.  */
 
-      for (p = *formatters[argno];; p++)
+      for (p = as_a <pp_token_text *> (tok_list->m_first)->m_value.get ();; p++)
 	{
 	  switch (*p)
 	    {
@@ -1612,16 +1841,18 @@ pretty_printer::format (text_info *text,
 
       if (quote)
 	{
-	  pp_begin_quote (this, m_show_color);
-	  buffer->cur_chunk_array->on_begin_quote (*buffer, chunk, urlifier);
+	  push_back_any_text (formatted_tok_list, &buffer->chunk_obstack);
+	  formatted_tok_list->push_back<pp_token_begin_quote> ();
 	}
 
       switch (*p)
 	{
 	case 'r':
-	  pp_string (this, colorize_start (m_show_color,
-					 va_arg (*text->m_args_ptr,
-						 const char *)));
+	  {
+	    const char *color = va_arg (*text->m_args_ptr, const char *);
+	    formatted_tok_list->push_back<pp_token_begin_color>
+	      (label_text::borrow (color));
+	  }
 	  break;
 
 	case 'c':
@@ -1763,7 +1994,11 @@ pretty_printer::format (text_info *text,
 	  break;
 
 	case '{':
-	  begin_url (va_arg (*text->m_args_ptr, const char *));
+	  {
+	    const char *url = va_arg (*text->m_args_ptr, const char *);
+	    formatted_tok_list->push_back<pp_token_begin_url>
+	      (label_text::borrow (url));
+	  }
 	  break;
 
 	case 'e':
@@ -1772,7 +2007,7 @@ pretty_printer::format (text_info *text,
 	      = va_arg (*text->m_args_ptr, pp_element *);
 	    pp_markup::context ctxt (*this, *buffer, chunk,
 				     quote, /* by reference */
-				     urlifier);
+				     formatted_tok_list);
 	    element->add_to_phase_2 (ctxt);
 	  }
 	  break;
@@ -1787,22 +2022,23 @@ pretty_printer::format (text_info *text,
 	       (e.g. when printing "'TYPEDEF' aka 'TYPE'" in the C family
 	       of frontends).  */
 	    gcc_assert (pp_format_decoder (this));
+	    gcc_assert (formatted_tok_list);
 	    ok = m_format_decoder (this, text, p,
 				   precision, wide, plus, hash, &quote,
-				   formatters[argno]);
+				   *formatted_tok_list);
 	    gcc_assert (ok);
 	  }
 	}
 
       if (quote)
 	{
-	  buffer->cur_chunk_array->on_end_quote (this, *buffer,
-						 chunk, urlifier);
-	  pp_end_quote (this, m_show_color);
+	  push_back_any_text (formatted_tok_list, &buffer->chunk_obstack);
+	  formatted_tok_list->push_back<pp_token_end_quote> ();
 	}
 
-      obstack_1grow (&buffer->chunk_obstack, '\0');
-      *formatters[argno] = XOBFINISH (&buffer->chunk_obstack, const char *);
+      push_back_any_text (formatted_tok_list, &buffer->chunk_obstack);
+      delete *formatters[argno];
+      *formatters[argno] = formatted_tok_list;
     }
 
   if (CHECKING_P)
@@ -1833,6 +2069,8 @@ struct auto_obstack
     obstack_free (&m_obstack, NULL);
   }
 
+  operator obstack & () { return m_obstack; }
+
   void grow (const void *src, size_t length)
   {
     obstack_grow (&m_obstack, src, length);
@@ -1851,130 +2089,105 @@ struct auto_obstack
   obstack m_obstack;
 };
 
-/* Subroutine of pp_output_formatted_text for the awkward case where
-   quoted text straddles multiple chunks.
-
-   Flush PP's buffer's chunks to PP's output buffer, whilst inserting
-   URLs for any quoted text that should be URLified.
-
-   For example, given:
-   |  pp_format (pp,
-   |            "unrecognized option %qs; did you mean %<-%s%>",
-   |            "foo", "foption");
-   we would have these chunks:
-   |  chunk 0: "unrecognized option "
-   |  chunk 1: "`foo'" (already checked for urlification)
-   |  chunk 2: "; did you mean `-"
-   |                           ^*
-   |  chunk 3: "foption"
-   |            *******
-   |  chunk 4: "'"
-   |            ^
-   and this quoting_info would have recorded the open quote near the end
-   of chunk 2 and close quote at the start of chunk 4; this function would
-   check the combination of the end of chunk 2 and all of chunk 3 ("-foption")
-   for urlification.  */
+/* Formatting phase 3: print the token lists built up by pp_format to PP.
+   If URLIFIER is non-null then use it on any quoted text to potentially
+   add URLs.  */
 
 void
-quoting_info::handle_phase_3 (pretty_printer *pp,
-			      const urlifier &urlifier)
+pp_output_formatted_text (pretty_printer *pp,
+			  const urlifier *urlifier)
 {
-  unsigned int chunk;
   output_buffer * const buffer = pp_buffer (pp);
+  gcc_assert (buffer->obstack == &buffer->formatted_obstack);
+
   chunk_info *chunk_array = buffer->cur_chunk_array;
-  const char * const *args = chunk_array->get_args ();
-  quoting_info *quoting = chunk_array->get_quoting_info ();
-
-  /* We need to construct the string into an intermediate buffer
-     for this case, since using pp_string can introduce prefixes
-     and line-wrapping, and omit whitespace at the start of lines.  */
-  auto_obstack combined_buf;
-
-  /* Iterate simultaneously through both
-     - the chunks and
-     - the runs of quoted characters
-     Accumulate text from the chunks into combined_buf, and handle
-     runs of quoted characters when handling the chunks they
-     correspond to.  */
-  size_t start_of_run_byte_offset = 0;
-  std::vector<quoting_info::run>::const_iterator iter_run
-    = quoting->m_phase_3_quotes.begin ();
-  std::vector<quoting_info::run>::const_iterator end_runs
-    = quoting->m_phase_3_quotes.end ();
-  for (chunk = 0; args[chunk]; chunk++)
-    {
-      size_t start_of_chunk_idx = combined_buf.object_size ();
+  pp_token_list * const *token_lists = chunk_array->get_token_lists ();
 
-      combined_buf.grow (args[chunk], strlen (args[chunk]));
+  {
+    /* Consolidate into one token list.  */
+    pp_token_list tokens (buffer->chunk_obstack);
+    for (unsigned chunk = 0; token_lists[chunk]; chunk++)
+      {
+	tokens.push_back_list (std::move (*token_lists[chunk]));
+	delete token_lists[chunk];
+      }
 
-      if (iter_run != end_runs
-	  && chunk == iter_run->m_end.m_chunk_idx)
-	{
-	  /* A run is ending; consider for it urlification.  */
-	  const size_t end_of_run_byte_offset
-	    = start_of_chunk_idx + iter_run->m_end.m_byte_offset;
-	  const size_t end_offset
-	    = urlify_quoted_string (pp,
-				    &combined_buf.m_obstack,
-				    &urlifier,
-				    start_of_run_byte_offset,
-				    end_of_run_byte_offset);
-
-	  /* If URLification occurred it will have grown the buffer.
-	     We need to update start_of_chunk_idx so that offsets
-	     relative to it are still correct, for the case where
-	     we have a chunk that both ends a quoted run and starts
-	     another quoted run.  */
-	  gcc_assert (end_offset >= end_of_run_byte_offset);
-	  start_of_chunk_idx += end_offset - end_of_run_byte_offset;
-
-	  iter_run++;
-	}
-      if (iter_run != end_runs
-	  && chunk == iter_run->m_start.m_chunk_idx)
-	{
-	  /* Note where the run starts w.r.t. the composed buffer.  */
-	  start_of_run_byte_offset
-	    = start_of_chunk_idx + iter_run->m_start.m_byte_offset;
-	}
-    }
+    tokens.replace_custom_tokens ();
+
+    tokens.merge_consecutive_text_tokens ();
+
+    if (urlifier)
+      tokens.apply_urlifier (*urlifier);
+
+    /* This is the third phase; the first two phases are done in pp_format.
+       Now we actually print the tokens.  */
+    if (pp->m_token_printer)
+      pp->m_token_printer->print_tokens (pp, tokens);
+    else
+      default_token_printer (pp, tokens);
 
-  /* Now print to PP.  */
-  const char *start
-    = static_cast <const char *> (combined_buf.object_base ());
-  pp_maybe_wrap_text (pp, start, start + combined_buf.object_size ());
+  /* Close the scope here to ensure that "tokens" above is fully cleaned up
+     before popping the current chunk_info, since the latter will pop
+     the chunk_obstack, and "tokens" may be using blocks within
+     the current chunk_info's chunk_obstack level.  */
+  }
+
+  chunk_array->pop_from_output_buffer (*buffer);
 }
 
-/* Format of a message pointed to by TEXT.
-   If URLIFIER is non-null then use it on any quoted text that was not
-   handled in phases 1 or 2 to potentially add URLs.  */
+/* Default implementation of token printing.  */
 
-void
-pp_output_formatted_text (pretty_printer *pp,
-			  const urlifier *urlifier)
+static void
+default_token_printer (pretty_printer *pp,
+		       const pp_token_list &tokens)
 {
-  unsigned int chunk;
-  output_buffer * const buffer = pp_buffer (pp);
-  chunk_info *chunk_array = buffer->cur_chunk_array;
-  const char * const *args = chunk_array->get_args ();
-  quoting_info *quoting = chunk_array->get_quoting_info ();
+  /* Convert to text, possibly with colorization, URLs, etc.  */
+  for (auto iter = tokens.m_first; iter; iter = iter->m_next)
+    switch (iter->m_kind)
+      {
+      default:
+	gcc_unreachable ();
 
-  gcc_assert (buffer->obstack == &buffer->formatted_obstack);
+      case pp_token::kind::text:
+	{
+	  pp_token_text *sub = as_a <pp_token_text *> (iter);
+	  pp_string (pp, sub->m_value.get ());
+	}
+	break;
+
+      case pp_token::kind::begin_color:
+	{
+	  pp_token_begin_color *sub = as_a <pp_token_begin_color *> (iter);
+	  pp_string (pp, colorize_start (pp_show_color (pp),
+					 sub->m_value.get ()));
+	}
+	break;
+      case pp_token::kind::end_color:
+	pp_string (pp, colorize_stop (pp_show_color (pp)));
+	break;
 
-  /* This is a third phase, first 2 phases done in pp_format_args.
-     Now we actually print it.  */
+      case pp_token::kind::begin_quote:
+	pp_begin_quote (pp, pp_show_color (pp));
+	break;
+      case pp_token::kind::end_quote:
+	pp_end_quote (pp, pp_show_color (pp));
+	break;
 
-  /* If we have any deferred urlification, handle it now.  */
-  if (urlifier
-      && pp->supports_urls_p ()
-      && quoting
-      && quoting->has_phase_3_quotes_p ())
-    quoting->handle_phase_3 (pp, *urlifier);
-  else
-    for (chunk = 0; args[chunk]; chunk++)
-      pp_string (pp, args[chunk]);
+      case pp_token::kind::begin_url:
+	{
+	  pp_token_begin_url *sub = as_a <pp_token_begin_url *> (iter);
+	  pp_begin_url (pp, sub->m_value.get ());
+	}
+	break;
+      case pp_token::kind::end_url:
+	pp_end_url (pp);
+	break;
 
-  chunk_array->pop_from_output_buffer (*buffer);
+      case pp_token::kind::custom_data:
+	/* These should have been eliminated by replace_custom_tokens.  */
+	gcc_unreachable ();
+	break;
+      }
 }
 
 /* Helper subroutine of output_verbatim and verbatim. Do the appropriate
@@ -2113,6 +2326,7 @@ pretty_printer::pretty_printer (int maximum_length)
     m_wrapping (),
     m_format_decoder (nullptr),
     m_format_postprocessor (NULL),
+    m_token_printer (nullptr),
     m_emitted_prefix (false),
     m_need_newline (false),
     m_translate_identifiers (true),
@@ -2138,6 +2352,7 @@ pretty_printer::pretty_printer (const pretty_printer &other)
   m_wrapping (other.m_wrapping),
   m_format_decoder (other.m_format_decoder),
   m_format_postprocessor (NULL),
+  m_token_printer (other.m_token_printer),
   m_emitted_prefix (other.m_emitted_prefix),
   m_need_newline (other.m_need_newline),
   m_translate_identifiers (other.m_translate_identifiers),
@@ -2743,8 +2958,9 @@ void
 pp_markup::context::begin_quote ()
 {
   gcc_assert (!m_quoted);
-  pp_begin_quote (&m_pp, pp_show_color (&m_pp));
-  m_buf.cur_chunk_array->on_begin_quote (m_buf, m_chunk_idx, m_urlifier);
+  gcc_assert (m_formatted_token_list);
+  push_back_any_text ();
+  m_formatted_token_list->push_back<pp_token_begin_quote> ();
   m_quoted = true;
 }
 
@@ -2755,8 +2971,9 @@ pp_markup::context::end_quote ()
      printing a type emitting "TYPEDEF' {aka `TYPE'}".  */
   if (!m_quoted)
     return;
-  m_buf.cur_chunk_array->on_end_quote (&m_pp, m_buf, m_chunk_idx, m_urlifier);
-  pp_end_quote (&m_pp, pp_show_color (&m_pp));
+  gcc_assert (m_formatted_token_list);
+  push_back_any_text ();
+  m_formatted_token_list->push_back<pp_token_end_quote> ();
   m_quoted = false;
 }
 
@@ -2765,7 +2982,10 @@ pp_markup::context::begin_highlight_color (const char *color_name)
 {
   if (!pp_show_highlight_colors (&m_pp))
     return;
-  pp_string (&m_pp, colorize_start (pp_show_color (&m_pp), color_name));
+
+  push_back_any_text ();
+  m_formatted_token_list->push_back <pp_token_begin_color>
+    (label_text::borrow (color_name));
 }
 
 void
@@ -2773,10 +2993,20 @@ pp_markup::context::end_highlight_color ()
 {
   if (!pp_show_highlight_colors (&m_pp))
     return;
-  const char *colorstr = colorize_stop (pp_show_color (&m_pp));
-  obstack_grow (&m_buf.chunk_obstack, colorstr, strlen (colorstr));
+
+  push_back_any_text ();
+  m_formatted_token_list->push_back<pp_token_end_color> ();
 }
 
+void
+pp_markup::context::push_back_any_text ()
+{
+  obstack *cur_obstack = m_buf.obstack;
+  obstack_1grow (cur_obstack, '\0');
+  m_formatted_token_list->push_back_text
+    (label_text::borrow (XOBFINISH (cur_obstack,
+				    const char *)));
+}
 
 /* Color names for expressing "expected" vs "actual" values.  */
 const char *const highlight_colors::expected = "highlight-a";
@@ -3039,6 +3269,245 @@ test_pp_format ()
 		    1776, "second");
 }
 
+static void
+test_merge_consecutive_text_tokens ()
+{
+  auto_obstack s;
+  pp_token_list list (s);
+  list.push_back_text (label_text::borrow ("hello"));
+  list.push_back_text (label_text::borrow (" "));
+  list.push_back_text (label_text::take (xstrdup ("world")));
+  list.push_back_text (label_text::borrow ("!"));
+
+  list.merge_consecutive_text_tokens ();
+  // We expect a single text token, with concatenated text
+  ASSERT_EQ (list.m_first, list.m_end);
+  pp_token *tok = list.m_first;
+  ASSERT_NE (tok, nullptr);
+  ASSERT_EQ (tok->m_kind, pp_token::kind::text);
+  ASSERT_STREQ (as_a <pp_token_text *> (tok)->m_value.get (), "hello world!");
+}
+
+/* Verify that we can create custom tokens that can be lowered
+   in phase 3.  */
+
+static void
+test_custom_tokens_1 ()
+{
+  struct custom_token_adder : public pp_element
+  {
+  public:
+    struct value : public pp_token_custom_data::value
+    {
+      value (custom_token_adder &adder)
+      : m_adder (adder)
+      {
+	m_adder.m_num_living_values++;
+      }
+      value (const value &other)
+      : m_adder (other.m_adder)
+      {
+	m_adder.m_num_living_values++;
+      }
+      value (value &&other)
+      : m_adder (other.m_adder)
+      {
+	m_adder.m_num_living_values++;
+      }
+      value &operator= (const value &other) = delete;
+      value &operator= (value &&other) = delete;
+      ~value ()
+      {
+	m_adder.m_num_living_values--;
+      }
+
+      void dump (FILE *out) const final override
+      {
+	fprintf (out, "\"%s\"", m_adder.m_name);
+      }
+
+      bool as_standard_tokens (pp_token_list &out) final override
+      {
+	ASSERT_TRUE (m_adder.m_num_living_values > 0);
+	out.push_back<pp_token_text> (label_text::borrow (m_adder.m_name));
+	return true;
+      }
+
+      custom_token_adder &m_adder;
+    };
+
+    custom_token_adder (const char *name)
+    : m_name (name),
+      m_num_living_values (0)
+    {
+    }
+
+    void add_to_phase_2 (pp_markup::context &ctxt) final override
+    {
+      auto val_ptr = make_unique<value> (*this);
+      ctxt.m_formatted_token_list->push_back<pp_token_custom_data>
+	(std::move (val_ptr));
+    }
+
+    const char *m_name;
+    int m_num_living_values;
+  };
+
+  custom_token_adder e1 ("foo");
+  custom_token_adder e2 ("bar");
+  ASSERT_EQ (e1.m_num_living_values, 0);
+  ASSERT_EQ (e2.m_num_living_values, 0);
+
+  pretty_printer pp;
+  pp_printf (&pp, "before %e middle %e after", &e1, &e2);
+
+  /* Verify that instances were cleaned up.  */
+  ASSERT_EQ (e1.m_num_living_values, 0);
+  ASSERT_EQ (e2.m_num_living_values, 0);
+
+  ASSERT_STREQ (pp_formatted_text (&pp),
+		"before foo middle bar after");
+}
+
+/* Verify that we can create custom tokens that aren't lowered
+   in phase 3, but instead are handled by a custom token_printer.
+   Use this to verify the inputs seen by such token_printers.  */
+
+static void
+test_custom_tokens_2 ()
+{
+  struct custom_token_adder : public pp_element
+  {
+    struct value : public pp_token_custom_data::value
+    {
+    public:
+      value (custom_token_adder &adder)
+      : m_adder (adder)
+      {
+	m_adder.m_num_living_values++;
+      }
+      value (const value &other)
+      : m_adder (other.m_adder)
+      {
+	m_adder.m_num_living_values++;
+      }
+      value (value &&other)
+      : m_adder (other.m_adder)
+      {
+	m_adder.m_num_living_values++;
+      }
+      value &operator= (const value &other) = delete;
+      value &operator= (value &&other) = delete;
+      ~value ()
+      {
+	m_adder.m_num_living_values--;
+      }
+
+      void dump (FILE *out) const final override
+      {
+	fprintf (out, "\"%s\"", m_adder.m_name);
+      }
+
+      bool as_standard_tokens (pp_token_list &) final override
+      {
+	return false;
+      }
+
+      custom_token_adder &m_adder;
+    };
+
+    custom_token_adder (const char *name)
+    : m_name (name),
+      m_num_living_values (0)
+    {
+    }
+
+    void add_to_phase_2 (pp_markup::context &ctxt) final override
+    {
+      auto val_ptr = make_unique<value> (*this);
+      ctxt.m_formatted_token_list->push_back<pp_token_custom_data>
+	(std::move (val_ptr));
+    }
+
+    const char *m_name;
+    int m_num_living_values;
+  };
+
+  class custom_token_printer : public token_printer
+  {
+    void print_tokens (pretty_printer *pp,
+		       const pp_token_list &tokens) final override
+    {
+      /* Verify that TOKENS has:
+	 [TEXT("before "), CUSTOM("foo"), TEXT(" middle "), CUSTOM("bar"),
+	  TEXT(" after")]  */
+      pp_token *tok_0 = tokens.m_first;
+      ASSERT_NE (tok_0, nullptr);
+      ASSERT_EQ (tok_0->m_kind, pp_token::kind::text);
+      ASSERT_STREQ (as_a<pp_token_text *> (tok_0)->m_value.get (),
+		    "before ");
+
+      pp_token *tok_1 = tok_0->m_next;
+      ASSERT_NE (tok_1, nullptr);
+      ASSERT_EQ (tok_1->m_prev, tok_0);
+      ASSERT_EQ (tok_1->m_kind, pp_token::kind::custom_data);
+
+      custom_token_adder::value *v1
+	= static_cast <custom_token_adder::value *>
+	(as_a<pp_token_custom_data *> (tok_1)->m_value.get ());
+      ASSERT_STREQ (v1->m_adder.m_name, "foo");
+      ASSERT_TRUE (v1->m_adder.m_num_living_values > 0);
+
+      pp_token *tok_2 = tok_1->m_next;
+      ASSERT_NE (tok_2, nullptr);
+      ASSERT_EQ (tok_2->m_prev, tok_1);
+      ASSERT_EQ (tok_2->m_kind, pp_token::kind::text);
+      ASSERT_STREQ (as_a<pp_token_text *> (tok_2)->m_value.get (),
+		    " middle ");
+
+      pp_token *tok_3 = tok_2->m_next;
+      ASSERT_NE (tok_3, nullptr);
+      ASSERT_EQ (tok_3->m_prev, tok_2);
+      ASSERT_EQ (tok_3->m_kind, pp_token::kind::custom_data);
+      custom_token_adder::value *v3
+	= static_cast <custom_token_adder::value *>
+	(as_a<pp_token_custom_data *> (tok_3)->m_value.get ());
+      ASSERT_STREQ (v3->m_adder.m_name, "bar");
+      ASSERT_TRUE (v3->m_adder.m_num_living_values > 0);
+
+      pp_token *tok_4 = tok_3->m_next;
+      ASSERT_NE (tok_4, nullptr);
+      ASSERT_EQ (tok_4->m_prev, tok_3);
+      ASSERT_EQ (tok_4->m_kind, pp_token::kind::text);
+      ASSERT_STREQ (as_a<pp_token_text *> (tok_4)->m_value.get (),
+		    " after");
+      ASSERT_EQ (tok_4->m_next, nullptr);
+
+      /* Normally we'd loop over the tokens, printing them to PP
+	 and handling the custom tokens.
+	 Instead, print a message to PP to verify that we were called.  */
+      pp_string (pp, "print_tokens was called");
+    }
+  };
+
+  custom_token_adder e1 ("foo");
+  custom_token_adder e2 ("bar");
+  ASSERT_EQ (e1.m_num_living_values, 0);
+  ASSERT_EQ (e2.m_num_living_values, 0);
+
+  custom_token_printer tp;
+  pretty_printer pp;
+  pp.set_token_printer (&tp);
+  pp_printf (&pp, "before %e middle %e after", &e1, &e2);
+
+  /* Verify that instances were cleaned up.  */
+  ASSERT_EQ (e1.m_num_living_values, 0);
+  ASSERT_EQ (e2.m_num_living_values, 0);
+
+  ASSERT_STREQ (pp_formatted_text (&pp),
+		"print_tokens was called");
+}
+
 /* A subclass of pretty_printer for use by test_prefixes_and_wrapping.  */
 
 class test_pretty_printer : public pretty_printer
@@ -3248,7 +3717,7 @@ pp_printf_with_urlifier (pretty_printer *pp,
 
   va_start (ap, msg);
   text_info text (msg, &ap, errno);
-  pp_format (pp, &text, urlifier);
+  pp_format (pp, &text);
   pp_output_formatted_text (pp, urlifier);
   va_end (ap);
 }
@@ -3404,6 +3873,18 @@ test_urlification ()
       ("foo `\33]8;;http://example.com\33\\-foption\33]8;;\33\\' bar",
        pp_formatted_text (&pp));
   }
+
+  /* Test the example from pretty-print-format-impl.h.  */
+  {
+    pretty_printer pp;
+    pp.set_url_format (URL_FORMAT_ST);
+    pp_printf_with_urlifier (&pp, &urlifier,
+	       "foo: %i, bar: %s, option: %qs",
+	       42, "baz", "-foption");
+    ASSERT_STREQ (pp_formatted_text (&pp),
+		  "foo: 42, bar: baz, option:"
+		  " `\33]8;;http://example.com\33\\-foption\33]8;;\33\\'");
+  }
 }
 
 /* Test multibyte awareness.  */
@@ -3453,6 +3934,9 @@ pretty_print_cc_tests ()
 {
   test_basic_printing ();
   test_pp_format ();
+  test_merge_consecutive_text_tokens ();
+  test_custom_tokens_1 ();
+  test_custom_tokens_2 ();
   test_prefixes_and_wrapping ();
   test_urls ();
   test_urls_from_braces ();
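
Reviewer note (not part of the patch): the phase 3 steps in
pp_output_formatted_text above (merge runs of text tokens, then wrap
BEGIN_QUOTE, TEXT, END_QUOTE triples in a URL pair) can be sketched in
standalone C++ as below.  This is only a simplified model under stated
assumptions: it uses std::vector and std::string rather than the
obstack-allocated linked list of pp_token, and the names "token",
"token_list" and the "get_url" callback are hypothetical stand-ins, not
GCC's API.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for pp_token (not GCC's class).
enum class kind { text, begin_quote, end_quote, begin_url, end_url };

struct token
{
  kind m_kind;
  std::string m_value; // text for kind::text, URL for kind::begin_url
};

using token_list = std::vector<token>;

// Merge each run of adjacent text tokens into a single text token.
static token_list
merge_consecutive_text_tokens (const token_list &in)
{
  token_list out;
  for (const token &tok : in)
    {
      if (tok.m_kind == kind::text
          && !out.empty ()
          && out.back ().m_kind == kind::text)
        out.back ().m_value += tok.m_value;
      else
        out.push_back (tok);
    }
  return out;
}

// For each BEGIN_QUOTE, TEXT, END_QUOTE triple, ask GET_URL (an assumed
// callback returning "" for "no URL") about TEXT, and if it has a URL,
// wrap TEXT in a begin_url/end_url pair.
static token_list
apply_urlifier (const token_list &in,
                const std::function<std::string (const std::string &)> &get_url)
{
  token_list out;
  for (std::size_t i = 0; i < in.size (); )
    {
      if (i + 2 < in.size ()
          && in[i].m_kind == kind::begin_quote
          && in[i + 1].m_kind == kind::text
          && in[i + 2].m_kind == kind::end_quote)
        {
          const std::string url = get_url (in[i + 1].m_value);
          out.push_back (in[i]);
          if (!url.empty ())
            out.push_back ({kind::begin_url, url});
          out.push_back (in[i + 1]);
          if (!url.empty ())
            out.push_back ({kind::end_url, ""});
          out.push_back (in[i + 2]);
          i += 3;
        }
      else
        out.push_back (in[i++]);
    }
  return out;
}
```

The ordering matters in this model for the same reason it does in the
patch: a quoted fragment such as "%<-%s%>" yields two text tokens ("-"
and the formatted argument) between the quotes, so the triple is only
recognizable once they have been merged into one text token.
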
diff --git a/gcc/pretty-print.h b/gcc/pretty-print.h
index ea81706b5d8a..e0505b2683c2 100644
--- a/gcc/pretty-print.h
+++ b/gcc/pretty-print.h
@@ -70,8 +70,8 @@ enum diagnostic_prefixing_rule_t
 };
 
 class chunk_info;
-class quoting_info;
 class output_buffer;
+class pp_token_list;
 class urlifier;
 
 namespace pp_markup {
@@ -177,7 +177,7 @@ struct pp_wrapping_mode_t
    A client-supplied formatter returns true if everything goes well,
    otherwise it returns false.  */
 typedef bool (*printer_fn) (pretty_printer *, text_info *, const char *,
-			    int, bool, bool, bool, bool *, const char **);
+			    int, bool, bool, bool, bool *, pp_token_list &);
 
 /* Base class for an optional client-supplied object for doing additional
    processing between stages 2 and 3 of formatted printing.  */
@@ -189,6 +189,18 @@ class format_postprocessor
   virtual void handle (pretty_printer *) = 0;
 };
 
+/* Abstract base class for writing formatted tokens to the pretty_printer's
+   text buffer, allowing for output formats and dumpfiles to override
+   how different kinds of tokens are handled.  */
+
+class token_printer
+{
+public:
+  virtual ~token_printer () {}
+  virtual void print_tokens (pretty_printer *pp,
+			     const pp_token_list &tokens) = 0;
+};
+
 inline bool & pp_needs_newline (pretty_printer *pp);
 
 /* True if PRETTY-PRINTER is in line-wrapping mode.  */
@@ -236,6 +248,9 @@ public:
   friend format_postprocessor *& pp_format_postprocessor (pretty_printer *pp);
   friend bool & pp_show_highlight_colors (pretty_printer *pp);
 
+  friend void pp_output_formatted_text (pretty_printer *,
+					const urlifier *);
+
   /* Default construct a pretty printer with specified
      maximum line length cut off limit.  */
   explicit pretty_printer (int = 0);
@@ -250,12 +265,16 @@ public:
     m_buffer->stream = outfile;
   }
 
+  void set_token_printer (token_printer* tp)
+  {
+    m_token_printer = tp; // borrowed
+  }
+
   void set_prefix (char *prefix);
 
   void emit_prefix ();
 
-  void format (text_info *text,
-	       const urlifier *urlifier);
+  void format (text_info *text);
 
   void maybe_space ();
 
@@ -314,8 +333,9 @@ private:
      If the BUFFER needs additional characters from the format string, it
      should advance the TEXT->format_spec as it goes.  When FORMAT_DECODER
      returns, TEXT->format_spec should point to the last character processed.
-     The QUOTE and BUFFER_PTR are passed in, to allow for deferring-handling
-     of format codes (e.g. %H and %I in the C++ frontend).  */
+     The QUOTE and FORMATTED_TOKEN_LIST are passed in, to allow for
+     deferring-handling of format codes (e.g. %H and %I in
+     the C++ frontend).  */
   printer_fn m_format_decoder;
 
   /* If non-NULL, this is called by pp_format once after all format codes
@@ -324,6 +344,12 @@ private:
      format codes (which interract with each other).  */
   format_postprocessor *m_format_postprocessor;
 
+  /* This is used by pp_output_formatted_text after it has converted all
+     formatted chunks into a single list of tokens.
+     Can be nullptr.
+     Borrowed from the output format or from dump_pretty_printer.  */
+  token_printer *m_token_printer;
+
   /* Nonzero if current PREFIX was emitted at least once.  */
   bool m_emitted_prefix;
 
@@ -543,10 +569,9 @@ extern void pp_verbatim (pretty_printer *, const char *, ...)
      ATTRIBUTE_GCC_PPDIAG(2,3);
 extern void pp_flush (pretty_printer *);
 extern void pp_really_flush (pretty_printer *);
-inline void pp_format (pretty_printer *pp, text_info *text,
-		       const urlifier *urlifier = nullptr)
+inline void pp_format (pretty_printer *pp, text_info *text)
 {
-  pp->format (text, urlifier);
+  pp->format (text);
 }
 extern void pp_output_formatted_text (pretty_printer *,
 				      const urlifier * = nullptr);
diff --git a/gcc/tree-diagnostic.cc b/gcc/tree-diagnostic.cc
index fc78231dfa44..466725fdd637 100644
--- a/gcc/tree-diagnostic.cc
+++ b/gcc/tree-diagnostic.cc
@@ -55,7 +55,7 @@ default_tree_diagnostic_starter (diagnostic_context *context,
 bool
 default_tree_printer (pretty_printer *pp, text_info *text, const char *spec,
 		      int precision, bool wide, bool set_locus, bool hash,
-		      bool *, const char **)
+		      bool *, pp_token_list &)
 {
   tree t;
 
diff --git a/gcc/tree-diagnostic.h b/gcc/tree-diagnostic.h
index 6ebac381ace8..98ca654c946e 100644
--- a/gcc/tree-diagnostic.h
+++ b/gcc/tree-diagnostic.h
@@ -53,6 +53,6 @@ void diagnostic_report_current_function (diagnostic_context *,
 
 void tree_diagnostics_defaults (diagnostic_context *context);
 bool default_tree_printer (pretty_printer *, text_info *, const char *,
-			   int, bool, bool, bool, bool *, const char **);
+			   int, bool, bool, bool, bool *, pp_token_list &);
 
 #endif /* ! GCC_TREE_DIAGNOSTIC_H */

From patchwork Thu Aug 29 22:58:13 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: David Malcolm <dmalcolm@redhat.com>
X-Patchwork-Id: 1978641
Return-Path: <gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org>
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@legolas.ozlabs.org
Authentication-Results: legolas.ozlabs.org;
	dkim=fail reason="signature verification failed" (1024-bit key;
 unprotected) header.d=redhat.com header.i=@redhat.com header.a=rsa-sha256
 header.s=mimecast20190719 header.b=c5n0TglP;
	dkim-atps=neutral
Authentication-Results: legolas.ozlabs.org;
 spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org
 (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org;
 envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org;
 receiver=patchwork.ozlabs.org)
Received: from server2.sourceware.org (server2.sourceware.org
 [IPv6:2620:52:3:1:0:246e:9693:128c])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384)
	(No client certificate requested)
	by legolas.ozlabs.org (Postfix) with ESMTPS id 4WvxYY11dlz1yfn
	for <incoming@patchwork.ozlabs.org>; Fri, 30 Aug 2024 08:59:29 +1000 (AEST)
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id EB6023861823
	for <incoming@patchwork.ozlabs.org>; Thu, 29 Aug 2024 22:59:26 +0000 (GMT)
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
Received: from us-smtp-delivery-124.mimecast.com
 (us-smtp-delivery-124.mimecast.com [170.10.133.124])
 by sourceware.org (Postfix) with ESMTP id 47ECF3861827
 for <gcc-patches@gcc.gnu.org>; Thu, 29 Aug 2024 22:58:32 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 47ECF3861827
Authentication-Results: sourceware.org;
 dmarc=pass (p=none dis=none) header.from=redhat.com
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=redhat.com
ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 47ECF3861827
Authentication-Results: server2.sourceware.org;
 arc=none smtp.remote-ip=170.10.133.124
ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1724972320; cv=none;
 b=YPJGiV3uNu/YH/vrwsJJmp4oZPcCcEXgtUfx9o0x6gfZX4s+/OdSLx+tmd6IUDb7J+I+FqhV7babZ8Mh/d/zb7HNWn++tkBrIpvpfSNrDA7ZUQg5bkf/zef5CYbssBxCJYUtOo+cWXMHNV3Reswo784/hywI/ZUSrSjnZykpH50=
ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key;
 t=1724972320; c=relaxed/simple;
 bh=F4B2qq3eAfhJiqZdFLayl1Se9sjv+lhvELtoSPkoZC8=;
 h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version;
 b=HdJu7PQHZZwJ7T/WYQ+0CRPRfY/6uTpynZl07SS/HfPJL8aThEk55CXkBirF7PKE/xl92G1xWWISofy0cRjBpOJfjQyIFIlqZ3zlNlm+EcTKRMjSqEoiCKZo6Z+IfCnrJQMM9sWyLDGrXVlaDd/n1I2K760GvABqLVUXmgK/mgc=
ARC-Authentication-Results: i=1; server2.sourceware.org
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;
 s=mimecast20190719; t=1724972312;
 h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
 to:to:cc:cc:mime-version:mime-version:content-type:content-type:
 content-transfer-encoding:content-transfer-encoding:
 in-reply-to:in-reply-to:references:references;
 bh=C59gyEuB0uGvON1McHqbvopBbWbI4Ejkahj3wgM19hw=;
 b=c5n0TglPhKs2xYzk8iHIEIh+oEzz0fWhZsZeP3+E/5GZK4tolLd0rTlWqxcomltsr9Oebv
 3w42ttXpWt5vVeL72qDDtcDz3vh6K3HuCnKsFMswzoUp11FzXusC4DqQ/6w7OUSBsmYmUh
 ZwuoYdG+nt1I4th09CVOhMWXdnkLPr8=
Received: from mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com
 (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by
 relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3,
 cipher=TLS_AES_256_GCM_SHA384) id us-mta-202-t3aII5s8Mqe00JzLHbw5pg-1; Thu,
 29 Aug 2024 18:58:28 -0400
X-MC-Unique: t3aII5s8Mqe00JzLHbw5pg-1
Received: from mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com
 (mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.17])
 (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
 key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest
 SHA256)
 (No client certificate requested)
 by mx-prod-mc-01.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS
 id 224A61955BF1
 for <gcc-patches@gcc.gnu.org>; Thu, 29 Aug 2024 22:58:26 +0000 (UTC)
Received: from t14s.localdomain.com (unknown [10.22.16.43])
 by mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP
 id 4106319560A3; Thu, 29 Aug 2024 22:58:24 +0000 (UTC)
From: David Malcolm <dmalcolm@redhat.com>
To: gcc-patches@gcc.gnu.org
Cc: David Malcolm <dmalcolm@redhat.com>
Subject: [pushed 4/4] =?utf-8?q?SARIF_output=3A_implement_embedded_URLs_in_m?=
	=?utf-8?q?essages_=28=C2=A73=2E11=2E6=3B__PR_other/116419=29?=
Date: Thu, 29 Aug 2024 18:58:13 -0400
Message-Id: <20240829225813.2567570-4-dmalcolm@redhat.com>
In-Reply-To: <20240829225813.2567570-1-dmalcolm@redhat.com>
References: <20240829225813.2567570-1-dmalcolm@redhat.com>
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 3.0 on 10.30.177.17
X-Mimecast-Spam-Score: 0
X-Mimecast-Originator: redhat.com
X-Spam-Status: No, score=-11.6 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH,
 DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0,
 KAM_SHORT,
 RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H4, RCVD_IN_MSPIKE_WL, SPF_HELO_NONE,
 SPF_NONE, TXREP, T_FILL_THIS_FORM_SHORT,
 T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.30
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org

GCC diagnostic messages can contain URLs, such as to our documentation
when we suggest an option name to correct a misspelling.

SARIF message strings can contain embedded URLs in the plain text
messages (see SARIF v2.1.0 §3.11.6), but previously we were
simply dropping any URLs from the diagnostic messages.

This patch adds support for encoding URLs into messages in our SARIF
output, using the pp_token machinery added in the previous patch.
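
As a minimal sketch of the idea (not the code in this patch), the
following walks a list of tokens and writes each URL span as an
embedded link of the form "[link text](url)".  The token model here
("token_kind", "token", "render_sarif_message") is hypothetical and
much simpler than the real pp_token classes, and it makes no attempt
to escape special characters in the link text, which a real
implementation would need to consider.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified token model; GCC's real pp_token classes differ.
enum class token_kind { text, begin_url, end_url };

struct token
{
  token_kind kind;
  std::string value; // text for `text`, URL for `begin_url`, else unused
};

// Render TOKENS as a plain-text message, writing each URL span as an
// embedded link "[link text](url)".  Assumes begin/end tokens are balanced
// and not nested.
std::string
render_sarif_message (const std::vector<token> &tokens)
{
  std::string out;
  std::string pending_url; // URL of the span currently being written
  for (const token &tok : tokens)
    switch (tok.kind)
      {
      case token_kind::text:
        out += tok.value;
        break;
      case token_kind::begin_url:
        pending_url = tok.value;
        out += "[";
        break;
      case token_kind::end_url:
        out += "](" + pending_url + ")";
        pending_url.clear ();
        break;
      }
  return out;
}
```

The URL is only known at the begin token but has to be written after
the link text, so the sketch holds it in "pending_url" until the end
token arrives.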

As well as supporting URLs, the patch also adjusts how we report
event IDs in SARIF messages, so that rather than e.g.
  "text": "second 'free' here; first 'free' was at (1)"
we now report:
  "text": "second 'free' here; first 'free' was at [(1)](sarif:/runs/0/results/0/codeFlows/0/threadFlows/0/locations/0)"

i.e. the text "(1)" now has an embedded link referring within the SARIF
log to the threadFlowLocation object for the other event, via a JSON
pointer (see §3.10.3 "URIs that use the sarif scheme").  Doing so
requires the various objects to know their index within their containing
array, requiring some reworking of how they are constructed.
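
The way the indices combine into such a link can be sketched as
follows.  This is an illustration only: "make_event_uri" and
"make_event_link" are hypothetical names using std::string, whereas
the patch itself uses make_sarif_url_for_event with xasprintf and
label_text.

```cpp
#include <string>

// Hypothetical helper: build a "sarif:" URI pointing at one
// threadFlowLocation, given the index of each object within its parent.
std::string
make_event_uri (unsigned result_idx, unsigned code_flow_idx,
                unsigned thread_flow_idx, unsigned location_idx)
{
  const unsigned run_idx = 0; // assumes a single run object in the log
  return ("sarif:/runs/" + std::to_string (run_idx)
          + "/results/" + std::to_string (result_idx)
          + "/codeFlows/" + std::to_string (code_flow_idx)
          + "/threadFlows/" + std::to_string (thread_flow_idx)
          + "/locations/" + std::to_string (location_idx));
}

// Hypothetical helper: wrap the human-readable "(N)" text for a
// zero-based event index in an embedded link to URI.
std::string
make_event_link (unsigned zero_based_event_idx, const std::string &uri)
{
  return "[(" + std::to_string (zero_based_event_idx + 1) + ")](" + uri + ")";
}
```

For the first event of the first result, make_event_link (0,
make_event_uri (0, 0, 0, 0)) gives the text shown in the example
above.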

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to trunk as r15-3312-gaff7f677120ec3.

gcc/ChangeLog:
	PR other/116419
	* diagnostic-event-id.h (diagnostic_event_id_t::zero_based): New.
	* diagnostic-format-sarif.cc: Include "pretty-print-format-impl.h"
	and "pretty-print-urlifier.h".
	(sarif_result::sarif_result): Add param "idx_within_parent".
	(sarif_result::get_index_within_parent): New accessor.
	(sarif_result::m_idx_within_parent): New field.
	(sarif_code_flow::sarif_code_flow): New ctor.
	(sarif_code_flow::get_parent): New accessor.
	(sarif_code_flow::get_index_within_parent): New accessor.
	(sarif_code_flow::m_parent): New field.
	(sarif_code_flow::m_thread_id_map): New field.
	(sarif_code_flow::m_thread_flows_arr): New field.
	(sarif_code_flow::m_all_tfl_objs): New field.
	(sarif_thread_flow::sarif_thread_flow): Add "parent" and
	"idx_within_parent" params.
	(sarif_thread_flow::get_parent): New accessor.
	(sarif_thread_flow::get_index_within_parent): New accessor.
	(sarif_thread_flow::m_parent): New field.
	(sarif_thread_flow::m_idx_within_parent): New field.
	(sarif_thread_flow_location::sarif_thread_flow_location): New
	ctor.
	(sarif_thread_flow_location::get_parent): New accessor.
	(sarif_thread_flow_location::get_index_within_parent): New
	accessor.
	(sarif_thread_flow_location::m_parent): New field.
	(sarif_thread_flow_location::m_idx_within_parent): New field.
	(sarif_builder::get_code_flow_for_event_ids): New accessor.
	(class sarif_builder::sarif_token_printer): New.
	(sarif_builder::m_token_printer): New member.
	(sarif_builder::m_next_result_idx): New field.
	(sarif_builder::m_current_code_flow): New field.
	(sarif_code_flow::get_or_append_thread_flow): New.
	(sarif_code_flow::get_thread_flow): New.
	(sarif_code_flow::add_location): New.
	(sarif_code_flow::get_thread_flow_loc_obj): New.
	(sarif_thread_flow::add_location): Create the new
	sarif_thread_flow_location internally, rather than passing
	it in as a parm so that we can keep track of its index in
	the array.  Return a reference to it.
	(sarif_builder::sarif_builder): Initialize m_token_printer,
	m_next_result_idx, and m_current_code_flow.
	(sarif_builder::on_report_diagnostic): Pass index to
	make_result_object.
	(sarif_builder::make_result_object): Add "idx_within_parent" param
	and pass to sarif_result ctor.  Pass code flow index to call to
	make_code_flow_object.
	(make_sarif_url_for_event): New.
	(sarif_builder::make_code_flow_object): Add "idx_within_parent"
	param and pass it to sarif_code_flow ctor.  Reimplement walking
	of events so that we first create threadFlow objects for each
	thread, then populate them with threadFlowLocation objects, so
	that the IDs work.  Set m_current_code_flow whilst creating the
	latter, so that we can create correct URIs for "%@".
	(sarif_builder::make_thread_flow_location_object): Replace with...
	(sarif_builder::populate_thread_flow_location_object): ...this.
	(sarif_output_format::get_builder): New accessor.
	(sarif_begin_embedded_link): New.
	(sarif_end_embedded_link): New.
	(sarif_builder::sarif_token_printer::print_tokens): New.
	(diagnostic_output_format_init_sarif): Add "fmt" param; use it to
	set the token printer and output format for the context.
	(diagnostic_output_format_init_sarif_stderr): Move responsibility
	for setting the context's output format to within
	diagnostic_output_format_init_sarif.
	(diagnostic_output_format_init_sarif_file): Likewise.
	(diagnostic_output_format_init_sarif_stream): Likewise.
	(test_sarif_diagnostic_context::test_sarif_diagnostic_context):
	Likewise.
	(selftest::test_make_location_object): Provide an idx for the
	result.
	(selftest::get_result_from_log): New.
	(selftest::get_message_from_log): New.
	(selftest::test_message_with_embedded_link): New test.
	(selftest::diagnostic_format_sarif_cc_tests): Call it.
	* pretty-print-format-impl.h: Include "diagnostic-event-id.h".
	(pp_token::kind): Add "event_id".
	(struct pp_token_event_id): New.
	(is_a_helper <pp_token_event_id *>::test): New.
	(is_a_helper <const pp_token_event_id *>::test): New.
	* pretty-print.cc (pp_token::dump): Handle kind::event_id.
	(pretty_printer::format): Update handling of "%@" in phase 2
	so that we add a pp_token_event_id, rather than the text "(N)".
	(default_token_printer): Handle pp_token::kind::event_id by
	printing the text "(N)".

gcc/testsuite/ChangeLog:
	PR other/116419
	* gcc.dg/sarif-output/bad-pragma.c: New test.
	* gcc.dg/sarif-output/test-bad-pragma.py: New test.
	* gcc.dg/sarif-output/test-include-chain-2.py
	(test_location_relationships): Update expected text of event to
	include an intra-sarif URI to the other event.

Signed-off-by: David Malcolm <dmalcolm@redhat.com>
---
 gcc/diagnostic-event-id.h                     |   6 +
 gcc/diagnostic-format-sarif.cc                | 605 +++++++++++++++---
 gcc/pretty-print-format-impl.h                |  32 +
 gcc/pretty-print.cc                           |  29 +-
 .../gcc.dg/sarif-output/bad-pragma.c          |  16 +
 .../gcc.dg/sarif-output/test-bad-pragma.py    |  38 ++
 .../sarif-output/test-include-chain-2.py      |   6 +-
 7 files changed, 636 insertions(+), 96 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/sarif-output/bad-pragma.c
 create mode 100644 gcc/testsuite/gcc.dg/sarif-output/test-bad-pragma.py

diff --git a/gcc/diagnostic-event-id.h b/gcc/diagnostic-event-id.h
index 78c2ccbbc99d..8237ba34df33 100644
--- a/gcc/diagnostic-event-id.h
+++ b/gcc/diagnostic-event-id.h
@@ -41,6 +41,12 @@ class diagnostic_event_id_t
 
   bool known_p () const { return m_index != UNKNOWN_EVENT_IDX; }
 
+  int zero_based () const
+  {
+    gcc_assert (known_p ());
+    return m_index;
+  }
+
   int one_based () const
   {
     gcc_assert (known_p ());
diff --git a/gcc/diagnostic-format-sarif.cc b/gcc/diagnostic-format-sarif.cc
index 59d9cd721839..9d9e7ae60734 100644
--- a/gcc/diagnostic-format-sarif.cc
+++ b/gcc/diagnostic-format-sarif.cc
@@ -45,6 +45,8 @@ along with GCC; see the file COPYING3.  If not see
 #include "selftest-diagnostic-show-locus.h"
 #include "selftest-json.h"
 #include "text-range-label.h"
+#include "pretty-print-format-impl.h"
+#include "pretty-print-urlifier.h"
 
 /* Forward decls.  */
 class sarif_builder;
@@ -385,7 +387,12 @@ private:
 class sarif_result : public sarif_location_manager
 {
 public:
-  sarif_result () : m_related_locations_arr (nullptr) {}
+  sarif_result (unsigned idx_within_parent)
+    : m_related_locations_arr (nullptr),
+      m_idx_within_parent (idx_within_parent)
+  {}
+
+  unsigned get_index_within_parent () const { return m_idx_within_parent; }
 
   void
   on_nested_diagnostic (diagnostic_context &context,
@@ -402,6 +409,7 @@ public:
 
 private:
   json::array *m_related_locations_arr; // borrowed
+  const unsigned m_idx_within_parent;
 };
 
 /* Subclass of sarif_object for SARIF "location" objects
@@ -460,7 +468,39 @@ private:
 /* Subclass of sarif_object for SARIF "codeFlow" objects
    (SARIF v2.1.0 section 3.36).  */
 
-class sarif_code_flow : public sarif_object {};
+class sarif_code_flow : public sarif_object
+{
+public:
+  sarif_code_flow (sarif_result &parent,
+		   unsigned idx_within_parent);
+
+  sarif_result &get_parent () const { return m_parent; }
+  unsigned get_index_within_parent () const { return m_idx_within_parent; }
+
+  sarif_thread_flow &
+  get_or_append_thread_flow (const diagnostic_thread &thread,
+			     diagnostic_thread_id_t thread_id);
+
+  sarif_thread_flow &
+  get_thread_flow (diagnostic_thread_id_t thread_id);
+
+  void add_location (sarif_thread_flow_location &);
+
+  sarif_thread_flow_location &
+  get_thread_flow_loc_obj (diagnostic_event_id_t event_id) const;
+
+private:
+  sarif_result &m_parent;
+  const unsigned m_idx_within_parent;
+
+  hash_map<int_hash<diagnostic_thread_id_t, -1, -2>,
+	   sarif_thread_flow *> m_thread_id_map; // borrowed ptr
+  json::array *m_thread_flows_arr; // borrowed
+
+  /* Vec of borrowed ptr, allowing for going easily from
+     an event_id to the corresponding threadFlowLocation object.  */
+  std::vector<sarif_thread_flow_location *> m_all_tfl_objs;
+};
 
 /* Subclass of sarif_object for SARIF "threadFlow" objects
    (SARIF v2.1.0 section 3.37).  */
@@ -468,20 +508,41 @@ class sarif_code_flow : public sarif_object {};
 class sarif_thread_flow : public sarif_object
 {
 public:
-  sarif_thread_flow (const diagnostic_thread &thread);
+  sarif_thread_flow (sarif_code_flow &parent,
+		     const diagnostic_thread &thread,
+		     unsigned idx_within_parent);
 
-  void
-  add_location
-    (std::unique_ptr<sarif_thread_flow_location> thread_flow_loc_obj);
+  sarif_code_flow &get_parent () const { return m_parent; }
+  unsigned get_index_within_parent () const { return m_idx_within_parent; }
+
+  sarif_thread_flow_location &add_location ();
 
 private:
+  sarif_code_flow &m_parent;
   json::array *m_locations_arr; // borrowed
+  const unsigned m_idx_within_parent;
 };
 
 /* Subclass of sarif_object for SARIF "threadFlowLocation" objects
    (SARIF v2.1.0 section 3.38).  */
 
-class sarif_thread_flow_location : public sarif_object {};
+class sarif_thread_flow_location : public sarif_object
+{
+public:
+  sarif_thread_flow_location (sarif_thread_flow &parent,
+			      unsigned idx_within_parent)
+  : m_parent (parent),
+    m_idx_within_parent (idx_within_parent)
+  {
+  }
+
+  sarif_thread_flow &get_parent () const { return m_parent; }
+  unsigned get_index_within_parent () const { return m_idx_within_parent; }
+
+private:
+  sarif_thread_flow &m_parent;
+  const unsigned m_idx_within_parent;
+};
 
 /* Subclass of sarif_object for SARIF "reportingDescriptor" objects
    (SARIF v2.1.0 section 3.49).  */
@@ -632,11 +693,33 @@ public:
   std::unique_ptr<sarif_artifact_location>
   make_artifact_location_object (const char *filename);
 
+  const sarif_code_flow *
+  get_code_flow_for_event_ids () const
+  {
+    return m_current_code_flow;
+  }
+
+  token_printer &get_token_printer () { return m_token_printer; }
+
 private:
+  class sarif_token_printer : public token_printer
+  {
+  public:
+    sarif_token_printer (sarif_builder &builder)
+      : m_builder (builder)
+    {
+    }
+    void print_tokens (pretty_printer *pp,
+		       const pp_token_list &tokens) final override;
+  private:
+    sarif_builder &m_builder;
+  };
+
   std::unique_ptr<sarif_result>
   make_result_object (diagnostic_context &context,
 		      const diagnostic_info &diagnostic,
-		      diagnostic_t orig_diag_kind);
+		      diagnostic_t orig_diag_kind,
+		      unsigned idx_within_parent);
   void
   add_any_include_chain (sarif_location_manager &loc_mgr,
 			 sarif_location &location_obj,
@@ -650,11 +733,13 @@ private:
 			enum diagnostic_artifact_role role);
   std::unique_ptr<sarif_code_flow>
   make_code_flow_object (sarif_result &result,
+			 unsigned idx_within_parent,
 			 const diagnostic_path &path);
-  std::unique_ptr<sarif_thread_flow_location>
-  make_thread_flow_location_object (sarif_result &result,
-				    const diagnostic_event &event,
-				    int path_event_idx);
+  void
+  populate_thread_flow_location_object (sarif_result &result,
+					sarif_thread_flow_location &thread_flow_loc_obj,
+					const diagnostic_event &event,
+					int event_execution_idx);
   std::unique_ptr<json::array>
   maybe_make_kinds_array (diagnostic_event::meaning m) const;
   std::unique_ptr<sarif_physical_location>
@@ -725,6 +810,7 @@ private:
 
   diagnostic_context &m_context;
   const line_maps *m_line_maps;
+  sarif_token_printer m_token_printer;
 
   /* The JSON object for the invocation object.  */
   std::unique_ptr<sarif_invocation> m_invocation_obj;
@@ -751,6 +837,9 @@ private:
   int m_tabstop;
 
   bool m_formatted;
+
+  unsigned m_next_result_idx;
+  sarif_code_flow *m_current_code_flow;
 };
 
 /* class sarif_object : public json::object.  */
@@ -1294,9 +1383,67 @@ lazily_add_kind (enum location_relationship_kind kind)
   kinds_arr->append_string (kind_str);
 }
 
+/* class sarif_code_flow : public sarif_object.  */
+
+sarif_code_flow::sarif_code_flow (sarif_result &parent,
+				  unsigned idx_within_parent)
+: m_parent (parent),
+  m_idx_within_parent (idx_within_parent)
+{
+  /* "threadFlows" property (SARIF v2.1.0 section 3.36.3).  */
+  auto thread_flows_arr = ::make_unique<json::array> ();
+  m_thread_flows_arr = thread_flows_arr.get (); // borrowed
+  set<json::array> ("threadFlows", std::move (thread_flows_arr));
+}
+
+sarif_thread_flow &
+sarif_code_flow::get_or_append_thread_flow (const diagnostic_thread &thread,
+					    diagnostic_thread_id_t thread_id)
+{
+  sarif_thread_flow **slot = m_thread_id_map.get (thread_id);
+  if (slot)
+    return **slot;
+
+  unsigned next_thread_flow_idx = m_thread_flows_arr->size ();
+  auto thread_flow_obj
+    = ::make_unique<sarif_thread_flow> (*this, thread, next_thread_flow_idx);
+  m_thread_id_map.put (thread_id, thread_flow_obj.get ()); // borrowed
+  sarif_thread_flow *result = thread_flow_obj.get ();
+  m_thread_flows_arr->append<sarif_thread_flow> (std::move (thread_flow_obj));
+  return *result;
+}
+
+sarif_thread_flow &
+sarif_code_flow::get_thread_flow (diagnostic_thread_id_t thread_id)
+{
+  sarif_thread_flow **slot = m_thread_id_map.get (thread_id);
+  gcc_assert (slot); // it must already have one
+  return **slot;
+}
+
+void
+sarif_code_flow::add_location (sarif_thread_flow_location &tfl_obj)
+{
+  m_all_tfl_objs.push_back (&tfl_obj);
+}
+
+sarif_thread_flow_location &
+sarif_code_flow::get_thread_flow_loc_obj (diagnostic_event_id_t event_id) const
+{
+  gcc_assert (event_id.known_p ());
+  gcc_assert ((size_t)event_id.zero_based () < m_all_tfl_objs.size ());
+  sarif_thread_flow_location *tfl_obj = m_all_tfl_objs[event_id.zero_based ()];
+  gcc_assert (tfl_obj);
+  return *tfl_obj;
+}
+
 /* class sarif_thread_flow : public sarif_object.  */
 
-sarif_thread_flow::sarif_thread_flow (const diagnostic_thread &thread)
+sarif_thread_flow::sarif_thread_flow (sarif_code_flow &parent,
+				      const diagnostic_thread &thread,
+				      unsigned idx_within_parent)
+: m_parent (parent),
+  m_idx_within_parent (idx_within_parent)
 {
   /* "id" property (SARIF v2.1.0 section 3.37.2).  */
   label_text name (thread.get_name (false));
@@ -1310,11 +1457,18 @@ sarif_thread_flow::sarif_thread_flow (const diagnostic_thread &thread)
   set ("locations", m_locations_arr);
 }
 
-void
-sarif_thread_flow::
-add_location (std::unique_ptr<sarif_thread_flow_location> thread_flow_loc_obj)
+/* Add a sarif_thread_flow_location to this threadFlow object, but
+   don't populate it yet.  */
+
+sarif_thread_flow_location &
+sarif_thread_flow::add_location ()
 {
-  m_locations_arr->append (std::move (thread_flow_loc_obj));
+  const unsigned thread_flow_location_idx = m_locations_arr->size ();
+  sarif_thread_flow_location *thread_flow_loc_obj
+    = new sarif_thread_flow_location (*this, thread_flow_location_idx);
+  m_locations_arr->append (thread_flow_loc_obj);
+  m_parent.add_location (*thread_flow_loc_obj);
+  return *thread_flow_loc_obj;
 }
 
 /* class sarif_builder.  */
@@ -1327,6 +1481,7 @@ sarif_builder::sarif_builder (diagnostic_context &context,
 			      bool formatted)
 : m_context (context),
   m_line_maps (line_maps),
+  m_token_printer (*this),
   m_invocation_obj
     (::make_unique<sarif_invocation> (*this,
 				      context.get_original_argv ())),
@@ -1336,7 +1491,9 @@ sarif_builder::sarif_builder (diagnostic_context &context,
   m_rule_id_set (),
   m_rules_arr (new json::array ()),
   m_tabstop (context.m_tabstop),
-  m_formatted (formatted)
+  m_formatted (formatted),
+  m_next_result_idx (0),
+  m_current_code_flow (nullptr)
 {
   gcc_assert (m_line_maps);
 
@@ -1376,7 +1533,8 @@ sarif_builder::on_report_diagnostic (diagnostic_context &context,
     {
       /* Top-level diagnostic.  */
       m_cur_group_result
-	= make_result_object (context, diagnostic, orig_diag_kind);
+	= make_result_object (context, diagnostic, orig_diag_kind,
+			      m_next_result_idx++);
     }
 }
 
@@ -1476,9 +1634,10 @@ make_rule_id_for_diagnostic_kind (diagnostic_t diag_kind)
 std::unique_ptr<sarif_result>
 sarif_builder::make_result_object (diagnostic_context &context,
 				   const diagnostic_info &diagnostic,
-				   diagnostic_t orig_diag_kind)
+				   diagnostic_t orig_diag_kind,
+				   unsigned idx_within_parent)
 {
-  auto result_obj = ::make_unique<sarif_result> ();
+  auto result_obj = ::make_unique<sarif_result> (idx_within_parent);
 
   /* "ruleId" property (SARIF v2.1.0 section 3.27.5).  */
   /* Ideally we'd have an option_name for these.  */
@@ -1552,8 +1711,10 @@ sarif_builder::make_result_object (diagnostic_context &context,
   if (const diagnostic_path *path = diagnostic.richloc->get_path ())
     {
       auto code_flows_arr = ::make_unique<json::array> ();
+      const unsigned code_flow_index = 0;
       code_flows_arr->append<sarif_code_flow>
 	(make_code_flow_object (*result_obj.get (),
+				code_flow_index,
 				*path));
       result_obj->set<json::array> ("codeFlows", std::move (code_flows_arr));
     }
@@ -2272,83 +2433,115 @@ make_sarif_logical_location_object (const logical_location &logical_loc)
   return logical_loc_obj;
 }
 
+label_text
+make_sarif_url_for_event (const sarif_code_flow *code_flow,
+			  diagnostic_event_id_t event_id)
+{
+  gcc_assert (event_id.known_p ());
+
+  if (!code_flow)
+    return label_text ();
+
+  const sarif_thread_flow_location &tfl_obj
+    = code_flow->get_thread_flow_loc_obj (event_id);
+  const int location_idx = tfl_obj.get_index_within_parent ();
+
+  const sarif_thread_flow &thread_flow_obj = tfl_obj.get_parent ();
+  const int thread_flow_idx = thread_flow_obj.get_index_within_parent ();
+
+  const sarif_code_flow &code_flow_obj = thread_flow_obj.get_parent ();
+  const int code_flow_idx = code_flow_obj.get_index_within_parent ();
+
+  const sarif_result &result_obj = code_flow_obj.get_parent ();
+  const int result_idx = result_obj.get_index_within_parent ();
+
+  /* We only support a single run object in the log.  */
+  const int run_idx = 0;
+
+  char *buf = xasprintf
+    ("sarif:/runs/%i/results/%i/codeFlows/%i/threadFlows/%i/locations/%i",
+     run_idx, result_idx, code_flow_idx, thread_flow_idx, location_idx);
+  return label_text::take (buf);
+}
+
 /* Make a "codeFlow" object (SARIF v2.1.0 section 3.36) for PATH.  */
 
 std::unique_ptr<sarif_code_flow>
 sarif_builder::make_code_flow_object (sarif_result &result,
+				      unsigned idx_within_parent,
 				      const diagnostic_path &path)
 {
-  auto code_flow_obj = ::make_unique <sarif_code_flow> ();
+  auto code_flow_obj
+    = ::make_unique <sarif_code_flow> (result, idx_within_parent);
 
-  /* "threadFlows" property (SARIF v2.1.0 section 3.36.3).  */
-  auto thread_flows_arr = ::make_unique<json::array> ();
-
-  /* Walk the events, consolidating into per-thread threadFlow objects,
-     using the index with PATH as the overall executionOrder.  */
-  hash_map<int_hash<diagnostic_thread_id_t, -1, -2>,
-	   sarif_thread_flow *> thread_id_map; // borrowed
+  /* First pass:
+     Create threadFlows and threadFlowLocation objects within them,
+     effectively recording a mapping from event_id to threadFlowLocation
+     so that we can later go from an event_id to a URI within the
+     SARIF file.  */
   for (unsigned i = 0; i < path.num_events (); i++)
     {
       const diagnostic_event &event = path.get_event (i);
       const diagnostic_thread_id_t thread_id = event.get_thread_id ();
-      sarif_thread_flow *thread_flow_obj;
 
-      if (sarif_thread_flow **slot = thread_id_map.get (thread_id))
-	thread_flow_obj = *slot;
-      else
-	{
-	  const diagnostic_thread &thread = path.get_thread (thread_id);
-	  thread_flow_obj = new sarif_thread_flow (thread);
-	  thread_id_map.put (thread_id, thread_flow_obj); // borrowed
-	  thread_flows_arr->append (thread_flow_obj);
-	}
+      sarif_thread_flow &thread_flow_obj
+	= code_flow_obj->get_or_append_thread_flow (path.get_thread (thread_id),
+						    thread_id);
+      thread_flow_obj.add_location ();
+    }
 
-      /* Add event to thread's threadFlow object.  */
-      std::unique_ptr<sarif_thread_flow_location> thread_flow_loc_obj
-	= make_thread_flow_location_object (result, event, i);
-      thread_flow_obj->add_location (std::move (thread_flow_loc_obj));
+  /* Second pass: walk the events, populating the tfl objs.  */
+  m_current_code_flow = code_flow_obj.get ();
+  for (unsigned i = 0; i < path.num_events (); i++)
+    {
+      const diagnostic_event &event = path.get_event (i);
+      sarif_thread_flow_location &thread_flow_loc_obj
+	= code_flow_obj->get_thread_flow_loc_obj (i);
+      populate_thread_flow_location_object (result,
+					    thread_flow_loc_obj,
+					    event,
+					    i);
     }
-  code_flow_obj->set<json::array> ("threadFlows", std::move (thread_flows_arr));
+  m_current_code_flow = nullptr;
 
   return code_flow_obj;
 }
 
-/* Make a "threadFlowLocation" object (SARIF v2.1.0 section 3.38) for EVENT.  */
+/* Populate TFL_OBJ, a "threadFlowLocation" object (SARIF v2.1.0 section 3.38)
+   based on EVENT.  */
 
-std::unique_ptr<sarif_thread_flow_location>
-sarif_builder::make_thread_flow_location_object (sarif_result &result,
-						 const diagnostic_event &ev,
-						 int path_event_idx)
+void
+sarif_builder::
+populate_thread_flow_location_object (sarif_result &result,
+				      sarif_thread_flow_location &tfl_obj,
+				      const diagnostic_event &ev,
+				      int event_execution_idx)
 {
-  auto thread_flow_loc_obj = ::make_unique<sarif_thread_flow_location> ();
-
   /* Give diagnostic_event subclasses a chance to add custom properties
      via a property bag.  */
-  ev.maybe_add_sarif_properties (*thread_flow_loc_obj);
+  ev.maybe_add_sarif_properties (tfl_obj);
 
   /* "location" property (SARIF v2.1.0 section 3.38.3).  */
-  thread_flow_loc_obj->set<sarif_location>
+  tfl_obj.set<sarif_location>
     ("location",
      make_location_object (result, ev, diagnostic_artifact_role::traced_file));
 
   /* "kinds" property (SARIF v2.1.0 section 3.38.8).  */
   diagnostic_event::meaning m = ev.get_meaning ();
   if (auto kinds_arr = maybe_make_kinds_array (m))
-    thread_flow_loc_obj->set<json::array> ("kinds", std::move (kinds_arr));
+    tfl_obj.set<json::array> ("kinds", std::move (kinds_arr));
 
   /* "nestingLevel" property (SARIF v2.1.0 section 3.38.10).  */
-  thread_flow_loc_obj->set_integer ("nestingLevel", ev.get_stack_depth ());
+  tfl_obj.set_integer ("nestingLevel", ev.get_stack_depth ());
 
   /* "executionOrder" property (SARIF v2.1.0 3.38.11).
      Offset by 1 to match the human-readable values emitted by %@.  */
-  thread_flow_loc_obj->set_integer ("executionOrder", path_event_idx + 1);
+  tfl_obj.set_integer ("executionOrder", event_execution_idx + 1);
 
   /* It might be nice to eventually implement the following for -fanalyzer:
      - the "stack" property (SARIF v2.1.0 section 3.38.5)
      - the "state" property (SARIF v2.1.0 section 3.38.9)
      - the "importance" property (SARIF v2.1.0 section 3.38.13).  */
-
-  return thread_flow_loc_obj;
 }
 
 /* If M has any known meaning, make a json array suitable for the "kinds"
@@ -2933,6 +3126,8 @@ public:
     m_builder.emit_diagram (m_context, diagram);
   }
 
+  sarif_builder &get_builder () { return m_builder; }
+
 protected:
   sarif_output_format (diagnostic_context &context,
 		       const line_maps *line_maps,
@@ -3008,11 +3203,128 @@ private:
   char *m_base_file_name;
 };
 
+/* Print the start of an embedded link to PP, as per 3.11.6.  */
+
+static void
+sarif_begin_embedded_link (pretty_printer *pp)
+{
+  pp_character (pp, '[');
+}
+
+/* Print the end of an embedded link to PP, as per 3.11.6.  */
+
+static void
+sarif_end_embedded_link (pretty_printer *pp,
+			 const char *url)
+{
+  pp_string (pp, "](");
+  /* TODO: does the URI need escaping?
+     See https://github.com/oasis-tcs/sarif-spec/issues/657 */
+  pp_string (pp, url);
+  pp_character (pp, ')');
+}
+
+/* class sarif_token_printer : public token_printer.  */
+
+/* Implementation of pretty_printer::token_printer for SARIF output.
+   Emit URLs as per 3.11.6 ("Messages with embedded links").  */
+
+void
+sarif_builder::sarif_token_printer::print_tokens (pretty_printer *pp,
+						  const pp_token_list &tokens)
+{
+  /* Convert to text, possibly with colorization, URLs, etc.  */
+  label_text current_url;
+  for (auto iter = tokens.m_first; iter; iter = iter->m_next)
+    switch (iter->m_kind)
+      {
+      default:
+	gcc_unreachable ();
+
+      case pp_token::kind::text:
+	{
+	  const pp_token_text *sub = as_a <const pp_token_text *> (iter);
+	  const char * const str = sub->m_value.get ();
+	  if (current_url.get ())
+	    {
+	      /* Write sub->m_value, escaping any characters that are
+		 special within link text, as per 3.11.6.  */
+	      for (const char *ptr = str; *ptr; ptr++)
+		{
+		  const char ch = *ptr;
+		  switch (ch)
+		    {
+		    default:
+		      pp_character (pp, ch);
+		      break;
+		    case '\\':
+		    case '[':
+		    case ']':
+		      pp_character (pp, '\\');
+		      pp_character (pp, ch);
+		      break;
+		    }
+		}
+	    }
+	  else
+	    /* TODO: is other escaping needed? (e.g. of '[')
+	       See https://github.com/oasis-tcs/sarif-spec/issues/658 */
+	    pp_string (pp, str);
+	}
+	break;
+
+      case pp_token::kind::begin_color:
+      case pp_token::kind::end_color:
+	/* These are no-ops.  */
+	break;
+
+      case pp_token::kind::begin_quote:
+	pp_begin_quote (pp, pp_show_color (pp));
+	break;
+      case pp_token::kind::end_quote:
+	pp_end_quote (pp, pp_show_color (pp));
+	break;
+
+      /* Emit URLs as per 3.11.6 ("Messages with embedded links").  */
+      case pp_token::kind::begin_url:
+	{
+	  pp_token_begin_url *sub = as_a <pp_token_begin_url *> (iter);
+	  sarif_begin_embedded_link (pp);
+	  current_url = std::move (sub->m_value);
+	}
+	break;
+      case pp_token::kind::end_url:
+	gcc_assert (current_url.get ());
+	sarif_end_embedded_link (pp, current_url.get ());
+	current_url = label_text::borrow (nullptr);
+	break;
+
+      case pp_token::kind::event_id:
+	{
+	  pp_token_event_id *sub = as_a <pp_token_event_id *> (iter);
+	  gcc_assert (sub->m_event_id.known_p ());
+	  const sarif_code_flow *code_flow
+	    = m_builder.get_code_flow_for_event_ids ();
+	  label_text url = make_sarif_url_for_event (code_flow,
+						     sub->m_event_id);
+	  if (url.get ())
+	    sarif_begin_embedded_link (pp);
+	  pp_character (pp, '(');
+	  pp_decimal_int (pp, sub->m_event_id.one_based ());
+	  pp_character (pp, ')');
+	  if (url.get ())
+	    sarif_end_embedded_link (pp, url.get ());
+	}
+	break;
+      }
+}
+
 /* Populate CONTEXT in preparation for SARIF output (either to stderr, or
    to a file).  */
 
 static void
-diagnostic_output_format_init_sarif (diagnostic_context &context)
+diagnostic_output_format_init_sarif (diagnostic_context &context,
+				     std::unique_ptr<sarif_output_format> fmt)
 {
   /* Suppress normal textual path output.  */
   context.set_path_format (DPF_NONE);
@@ -3023,6 +3335,10 @@ diagnostic_output_format_init_sarif (diagnostic_context &context)
   /* Don't colorize the text.  */
   pp_show_color (context.printer) = false;
   context.set_show_highlight_colors (false);
+
+  context.printer->set_token_printer
+    (&fmt->get_builder ().get_token_printer ());
+  context.set_output_format (fmt.release ());
 }
 
 /* Populate CONTEXT in preparation for SARIF output to stderr.  */
@@ -3034,13 +3350,13 @@ diagnostic_output_format_init_sarif_stderr (diagnostic_context &context,
 					    bool formatted)
 {
   gcc_assert (line_maps);
-  diagnostic_output_format_init_sarif (context);
-  context.set_output_format
-    (new sarif_stream_output_format (context,
-				     line_maps,
-				     main_input_filename_,
-				     formatted,
-				     stderr));
+  diagnostic_output_format_init_sarif
+    (context,
+     ::make_unique<sarif_stream_output_format> (context,
+						line_maps,
+						main_input_filename_,
+						formatted,
+						stderr));
 }
 
 /* Populate CONTEXT in preparation for SARIF output to a file named
@@ -3054,13 +3370,13 @@ diagnostic_output_format_init_sarif_file (diagnostic_context &context,
 					  const char *base_file_name)
 {
   gcc_assert (line_maps);
-  diagnostic_output_format_init_sarif (context);
-  context.set_output_format
-    (new sarif_file_output_format (context,
-				   line_maps,
-				   main_input_filename_,
-				   formatted,
-				   base_file_name));
+  diagnostic_output_format_init_sarif
+    (context,
+     ::make_unique<sarif_file_output_format> (context,
+					      line_maps,
+					      main_input_filename_,
+					      formatted,
+					      base_file_name));
 }
 
 /* Populate CONTEXT in preparation for SARIF output to STREAM.  */
@@ -3073,13 +3389,13 @@ diagnostic_output_format_init_sarif_stream (diagnostic_context &context,
 					    FILE *stream)
 {
   gcc_assert (line_maps);
-  diagnostic_output_format_init_sarif (context);
-  context.set_output_format
-    (new sarif_stream_output_format (context,
-				     line_maps,
-				     main_input_filename_,
-				     formatted,
-				     stream));
+  diagnostic_output_format_init_sarif
+    (context,
+     ::make_unique<sarif_stream_output_format> (context,
+						line_maps,
+						main_input_filename_,
+						formatted,
+						stream));
 }
 
 #if CHECKING_P
@@ -3095,13 +3411,12 @@ class test_sarif_diagnostic_context : public test_diagnostic_context
 public:
   test_sarif_diagnostic_context (const char *main_input_filename)
   {
-    diagnostic_output_format_init_sarif (*this);
-
-    m_format = new buffered_output_format (*this,
-					   line_table,
-					   main_input_filename,
-					   true);
-    set_output_format (m_format); // give ownership;
+    auto format = ::make_unique<buffered_output_format> (*this,
+							 line_table,
+							 main_input_filename,
+							 true);
+    m_format = format.get (); // borrowed
+    diagnostic_output_format_init_sarif (*this, std::move (format));
   }
 
   std::unique_ptr<sarif_log> flush_to_object ()
@@ -3175,7 +3490,7 @@ test_make_location_object (const line_table_case &case_)
   richloc.add_range (field, SHOW_RANGE_WITHOUT_CARET, &label2);
   richloc.set_escape_on_output (true);
 
-  sarif_result result;
+  sarif_result result (0);
 
   std::unique_ptr<sarif_location> location_obj
     = builder.make_location_object
@@ -3471,6 +3786,119 @@ test_simple_log_2 (const line_table_case &case_)
   }
 }
 
+/* Assuming that a single diagnostic has been emitted within
+   LOG, get a json::object for the result object.  */
+
+static const json::object *
+get_result_from_log (const sarif_log *log)
+{
+  auto runs = EXPECT_JSON_OBJECT_WITH_ARRAY_PROPERTY (log, "runs"); // 3.13.4
+  ASSERT_EQ (runs->size (), 1);
+
+  // 3.14 "run" object:
+  auto run = (*runs)[0];
+
+  // 3.14.23:
+  auto results = EXPECT_JSON_OBJECT_WITH_ARRAY_PROPERTY (run, "results");
+  ASSERT_EQ (results->size (), 1);
+
+  // 3.27 "result" object:
+  auto result = (*results)[0];
+  return expect_json_object (SELFTEST_LOCATION, result);
+}
+
+/* Assuming that a single diagnostic has been emitted within
+   LOG, get a json::object for the message object within
+   the result.  */
+
+static const json::object *
+get_message_from_log (const sarif_log *log)
+{
+  auto result_obj = get_result_from_log (log);
+
+  // 3.27.11:
+  auto message_obj
+    = EXPECT_JSON_OBJECT_WITH_OBJECT_PROPERTY (result_obj, "message");
+  return message_obj;
+}
+
+/* Tests of messages with embedded links; see SARIF v2.1.0 3.11.6.  */
+
+static void
+test_message_with_embedded_link ()
+{
+  auto_fix_quotes fix_quotes;
+  {
+    test_sarif_diagnostic_context dc ("test.c");
+    rich_location richloc (line_table, UNKNOWN_LOCATION);
+    dc.report (DK_ERROR, richloc, nullptr, 0,
+	       "before %{text%} after",
+	       "http://example.com");
+    std::unique_ptr<sarif_log> log = dc.flush_to_object ();
+
+    auto message_obj = get_message_from_log (log.get ());
+    ASSERT_JSON_STRING_PROPERTY_EQ
+      (message_obj, "text",
+       "before [text](http://example.com) after");
+  }
+
+  /* Escaping in message text.
+     This is "EXAMPLE 1" from 3.11.6.  */
+  {
+    test_sarif_diagnostic_context dc ("test.c");
+    rich_location richloc (line_table, UNKNOWN_LOCATION);
+
+    /* Disable "unquoted sequence of 2 consecutive punctuation
+       characters `]\' in format" warning.  */
+#if __GNUC__ >= 10
+#  pragma GCC diagnostic push
+#  pragma GCC diagnostic ignored "-Wformat-diag"
+#endif
+    dc.report (DK_ERROR, richloc, nullptr, 0,
+	       "Prohibited term used in %{para[0]\\spans[2]%}.",
+	       "1");
+#if __GNUC__ >= 10
+#  pragma GCC diagnostic pop
+#endif
+
+    std::unique_ptr<sarif_log> log = dc.flush_to_object ();
+
+    auto message_obj = get_message_from_log (log.get ());
+    ASSERT_JSON_STRING_PROPERTY_EQ
+      (message_obj, "text",
+       "Prohibited term used in [para\\[0\\]\\\\spans\\[2\\]](1).");
+    /* This isn't exactly what EXAMPLE 1 of the spec has; reported as
+       https://github.com/oasis-tcs/sarif-spec/issues/656  */
+  }
+
+  /* Urlifier.  */
+  {
+    class test_urlifier : public urlifier
+    {
+    public:
+      char *
+      get_url_for_quoted_text (const char *p, size_t sz) const final override
+      {
+	if (!strncmp (p, "-foption", sz))
+	  return xstrdup ("http://example.com");
+	return nullptr;
+      }
+    };
+
+    test_sarif_diagnostic_context dc ("test.c");
+    dc.set_urlifier (new test_urlifier ());
+    rich_location richloc (line_table, UNKNOWN_LOCATION);
+    dc.report (DK_ERROR, richloc, nullptr, 0,
+	       "foo %<-foption%> %<unrecognized%> bar");
+    std::unique_ptr<sarif_log> log = dc.flush_to_object ();
+
+    auto message_obj = get_message_from_log (log.get ());
+    ASSERT_JSON_STRING_PROPERTY_EQ
+      (message_obj, "text",
+       "foo `[-foption](http://example.com)' `unrecognized' bar");
+  }
+}
+
 /* Run all of the selftests within this file.  */
 
 void
@@ -3479,6 +3907,7 @@ diagnostic_format_sarif_cc_tests ()
   for_each_line_table_case (test_make_location_object);
   test_simple_log ();
   for_each_line_table_case (test_simple_log_2);
+  test_message_with_embedded_link ();
 }
 
 } // namespace selftest
diff --git a/gcc/pretty-print-format-impl.h b/gcc/pretty-print-format-impl.h
index cffdd461a33d..f1996284f4a1 100644
--- a/gcc/pretty-print-format-impl.h
+++ b/gcc/pretty-print-format-impl.h
@@ -22,6 +22,7 @@ along with GCC; see the file COPYING3.  If not see
 #define GCC_PRETTY_PRINT_FORMAT_IMPL_H
 
 #include "pretty-print.h"
+#include "diagnostic-event-id.h"
 
 /* A struct representing a pending item to be printed within
    pp_format.
@@ -31,6 +32,7 @@ along with GCC; see the file COPYING3.  If not see
    - begin/end named color
    - open/close quote
    - begin/end URL
+   - event IDs
    - custom data (for the formatter, for the pretty_printer,
      or the output format)
 
@@ -72,6 +74,8 @@ public:
     begin_url,
     end_url,
 
+    event_id,
+
     custom_data,
 
     NUM_KINDS
@@ -218,6 +222,34 @@ struct pp_token_end_url : public pp_token
   }
 };
 
+struct pp_token_event_id : public pp_token
+{
+  pp_token_event_id (diagnostic_event_id_t event_id)
+  : pp_token (kind::event_id),
+    m_event_id (event_id)
+  {
+    gcc_assert (event_id.known_p ());
+  }
+
+  diagnostic_event_id_t m_event_id;
+};
+
+template <>
+template <>
+inline bool
+is_a_helper <pp_token_event_id *>::test (pp_token *tok)
+{
+  return tok->m_kind == pp_token::kind::event_id;
+}
+
+template <>
+template <>
+inline bool
+is_a_helper <const pp_token_event_id *>::test (const pp_token *tok)
+{
+  return tok->m_kind == pp_token::kind::event_id;
+}
+
 struct pp_token_custom_data : public pp_token
 {
   class value
diff --git a/gcc/pretty-print.cc b/gcc/pretty-print.cc
index d2c0a197680c..fe6b6090f323 100644
--- a/gcc/pretty-print.cc
+++ b/gcc/pretty-print.cc
@@ -1120,6 +1120,16 @@ pp_token::dump (FILE *out) const
     case kind::end_url:
       fprintf (out, "END_URL");
       break;
+
+    case kind::event_id:
+      {
+	const pp_token_event_id *sub
+	  = as_a <const pp_token_event_id *> (this);
+	gcc_assert (sub->m_event_id.known_p ());
+	fprintf (out, "EVENT((%i))", sub->m_event_id.one_based ());
+      }
+      break;
+
     case kind::custom_data:
       {
 	const pp_token_custom_data *sub
@@ -1984,12 +1994,7 @@ pretty_printer::format (text_info *text)
 	    diagnostic_event_id_ptr event_id
 	      = va_arg (*text->m_args_ptr, diagnostic_event_id_ptr);
 	    gcc_assert (event_id->known_p ());
-
-	    pp_string (this, colorize_start (m_show_color, "path"));
-	    pp_character (this, '(');
-	    pp_decimal_int (this, event_id->one_based ());
-	    pp_character (this, ')');
-	    pp_string (this, colorize_stop (m_show_color));
+	    formatted_tok_list->push_back<pp_token_event_id> (*event_id);
 	  }
 	  break;
 
@@ -2183,6 +2188,18 @@ default_token_printer (pretty_printer *pp,
 	pp_end_url (pp);
 	break;
 
+      case pp_token::kind::event_id:
+	{
+	  pp_token_event_id *sub = as_a <pp_token_event_id *> (iter);
+	  gcc_assert (sub->m_event_id.known_p ());
+	  pp_string (pp, colorize_start (pp_show_color (pp), "path"));
+	  pp_character (pp, '(');
+	  pp_decimal_int (pp, sub->m_event_id.one_based ());
+	  pp_character (pp, ')');
+	  pp_string (pp, colorize_stop (pp_show_color (pp)));
+	}
+	break;
+
       case pp_token::kind::custom_data:
 	/* These should have been eliminated by replace_custom_tokens.  */
 	gcc_unreachable ();
diff --git a/gcc/testsuite/gcc.dg/sarif-output/bad-pragma.c b/gcc/testsuite/gcc.dg/sarif-output/bad-pragma.c
new file mode 100644
index 000000000000..db274de4b97a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/sarif-output/bad-pragma.c
@@ -0,0 +1,16 @@
+/* Verify that SARIF output can capture URLs in diagnostics
+   related to a bad pragma.  */
+
+/* { dg-do compile } */
+/* { dg-options "-fdiagnostics-format=sarif-file -Wpragmas" } */
+
+#pragma GCC diagnostic ignored "-Wmisleading-indenttion"
+
+int nonempty;
+
+/* Verify that some JSON was written to a file with the expected name:
+   { dg-final { verify-sarif-file } } */
+
+/* Use a Python script to verify various properties about the generated
+   .sarif file:
+   { dg-final { run-sarif-pytest bad-pragma.c "test-bad-pragma.py" } } */
diff --git a/gcc/testsuite/gcc.dg/sarif-output/test-bad-pragma.py b/gcc/testsuite/gcc.dg/sarif-output/test-bad-pragma.py
new file mode 100644
index 000000000000..140bb3381984
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/sarif-output/test-bad-pragma.py
@@ -0,0 +1,38 @@
+from sarif import *
+
+import pytest
+
+@pytest.fixture(scope='function', autouse=True)
+def sarif():
+    return sarif_from_env()
+
+def test_messages_have_embedded_urls(sarif):
+    runs = sarif['runs']
+    run = runs[0]
+    results = run['results']
+
+    # We expect a single warning with one related location (for the note).
+    #
+    # The textual form of the diagnostic would look like this:
+    #  . PATH/bad-pragma.c:7:32: warning: unknown option after '#pragma GCC diagnostic' kind [-Wpragmas]
+    #  .     7 | #pragma GCC diagnostic ignored "-Wmisleading-indenttion"
+    #  .       |                                ^~~~~~~~~~~~~~~~~~~~~~~~~
+    #  . PATH/bad-pragma.c:7:32: note: did you mean '-Wmisleading-indentation'?
+    assert len(results) == 1
+    
+    result = results[0]
+    assert result['ruleId'] == '-Wpragmas'
+    assert result['level'] == 'warning'
+    assert result['message']['text'] \
+        == "unknown option after '[#pragma GCC diagnostic](https://gcc.gnu.org/onlinedocs/gcc/Diagnostic-Pragmas.html)' kind"
+    # Note how we expect an embedded link in the above for the docs for #pragma GCC diagnostic
+    
+    # We expect one related location, for the note.
+    relatedLocations = result['relatedLocations']
+    assert len(relatedLocations) == 1
+
+    rel_loc = relatedLocations[0]
+    assert rel_loc['message']['text'] \
+        == "did you mean '[-Wmisleading-indentation](https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#index-Wmisleading-indentation)'?"
+    # Again, we expect an embedded link in the above, this time to the
+    # docs for the suggested option
diff --git a/gcc/testsuite/gcc.dg/sarif-output/test-include-chain-2.py b/gcc/testsuite/gcc.dg/sarif-output/test-include-chain-2.py
index 761fe1b59a9c..843f89a94853 100644
--- a/gcc/testsuite/gcc.dg/sarif-output/test-include-chain-2.py
+++ b/gcc/testsuite/gcc.dg/sarif-output/test-include-chain-2.py
@@ -96,9 +96,11 @@ def test_location_relationships(sarif):
         == "  __builtin_free (ptr); // 1st\n"
     assert threadFlow['locations'][0]['kinds'] == ['release', 'memory']
     assert threadFlow['locations'][0]['executionOrder'] == 1
-    
+
+    # We should have an embedded link in this event's message to the
+    # other event's location within the SARIF file:
     assert threadFlow['locations'][1]['location']['message']['text'] \
-        == "second 'free' here; first 'free' was at (1)"
+        == "second 'free' here; first 'free' was at [(1)](sarif:/runs/0/results/0/codeFlows/0/threadFlows/0/locations/0)"
     assert threadFlow['locations'][1]['location']['physicalLocation']['contextRegion']['snippet']['text'] \
         == "  __builtin_free (ptr); // 2nd\n"
     assert threadFlow['locations'][1]['kinds'] == ['danger']