From patchwork Thu Aug  8 07:25:02 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: =?utf-8?q?J=C3=B8rgen_Kvalsvik?= <j@lambda.is>
X-Patchwork-Id: 1970421
Return-Path: <gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org>
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@legolas.ozlabs.org
Authentication-Results: legolas.ozlabs.org;
	dkim=pass (2048-bit key;
 secure) header.d=kolabnow.com header.i=@kolabnow.com header.a=rsa-sha256
 header.s=dkim20240523 header.b=ELQvKSa2;
	dkim-atps=neutral
Authentication-Results: legolas.ozlabs.org;
 spf=pass (sender SPF authorized) smtp.mailfrom=gcc.gnu.org
 (client-ip=2620:52:3:1:0:246e:9693:128c; helo=server2.sourceware.org;
 envelope-from=gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org;
 receiver=patchwork.ozlabs.org)
Received: from server2.sourceware.org (server2.sourceware.org
 [IPv6:2620:52:3:1:0:246e:9693:128c])
	(using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits)
	 key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384)
	(No client certificate requested)
	by legolas.ozlabs.org (Postfix) with ESMTPS id 4Wfdrj733cz1ybS
	for <incoming@patchwork.ozlabs.org>; Thu,  8 Aug 2024 17:26:29 +1000 (AEST)
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id BA5043857C4F
	for <incoming@patchwork.ozlabs.org>; Thu,  8 Aug 2024 07:26:27 +0000 (GMT)
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
Received: from mx.kolabnow.com (mx.kolabnow.com [212.103.80.154])
 by sourceware.org (Postfix) with ESMTPS id 1ED0C3858C3A
 for <gcc-patches@gcc.gnu.org>; Thu,  8 Aug 2024 07:25:24 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 1ED0C3858C3A
Authentication-Results: sourceware.org;
 dmarc=none (p=none dis=none) header.from=lambda.is
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=lambda.is
ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 1ED0C3858C3A
Authentication-Results: server2.sourceware.org;
 arc=none smtp.remote-ip=212.103.80.154
ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1723101931; cv=none;
 b=IdvTOMTpdZofnRHhRXY9DKUS/3uyko0bIzQgHWq7jWLvjIPu1MKSd0FgBpV1ogK0Dr90aoEReXfUG3/jp0eGf8qiqj5+0b7rCblQNzEaB5q8jizWxM+E07Fr0lDise69eFbP6wlYZXHFKWwc9yiVHA5/4nO3xccosdYdT5Y4JbU=
ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key;
 t=1723101931; c=relaxed/simple;
 bh=HxQni76gloHQaieVqIqACOviVKAs0GoG5QFur1wcH4M=;
 h=DKIM-Signature:From:To:Subject:Date:Message-Id:MIME-Version;
 b=txBfdzCzf0qoKivK37QwmMvFK15mSIKr0YkHMqmYIlYYrQKwfer7BiCp/3yKm512aCspbqH+tF6RhOnk48LQbxSlRlIaCRKIZx5qadF4Ma1+PpfUaUMLZd74NOBDDgXz0yhX6RVXjhGsdTifYlgGcoaRYzAH5A6kaAs18+fvQrE=
ARC-Authentication-Results: i=1; server2.sourceware.org
Received: from localhost (unknown [127.0.0.1])
 by mx.kolabnow.com (Postfix) with ESMTP id 456563385684;
 Thu,  8 Aug 2024 09:25:23 +0200 (CEST)
Authentication-Results: ext-mx-out013.mykolab.com (amavis);
 dkim=pass (2048-bit key) reason="pass (just generated, assumed good)"
 header.d=kolabnow.com
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kolabnow.com; h=
 content-transfer-encoding:mime-version:references:in-reply-to
 :message-id:date:date:subject:subject:from:from:received
 :received:received; s=dkim20240523; t=1723101914; x=1724916315;
 bh=Vle455E0E7pCRHA+NBvlHy5ENnU/kYCLNrlugiJwy3E=; b=ELQvKSa2y5Bs
 i2NBIXWZM8jxFWGF5zyVxvlSwHuZeuqfRVejncD0mZkeDQznGwMZUYHvCxGuoYQS
 jfQG1/kvUkIQPlFcHjhDnCiXQfmlKOc/GMvwN8SSHezFYfvt2QbY/WfxmSbRfZ0h
 ZeRMWiPSLdqnYv0NUhBovgBicgZ9RNUOj2gKjzDXbTUI0xGIoflE+Hbk9qFG2sit
 ZDzlMWszrlcCJg3aH6t8gCTkpki891wfnkA/HlhxLeYtDS1dGR6gImbB8zsDKrM6
 exVyj3uYHjXQAdwksSGLhZyqcizKohJH3MhgJpT8iM3t1WwifT/Ux80wn661bCnN
 G+tdtdnbtw==
X-Virus-Scanned: amavis at mykolab.com
X-Spam-Score: -0.999
X-Spam-Level: 
X-Spam-Status: No, score=-12.5 required=5.0 tests=BAYES_00, DKIM_SIGNED,
 DKIM_VALID, GIT_PATCH_0, SPF_HELO_NONE, SPF_PASS,
 TXREP autolearn=ham autolearn_force=no version=3.4.6
Received: from mx.kolabnow.com ([127.0.0.1])
 by localhost (ext-mx-out013.mykolab.com [127.0.0.1]) (amavis, port 10024)
 with ESMTP id QMKfLX80PJhT; Thu,  8 Aug 2024 09:25:14 +0200 (CEST)
Received: from int-mx011.mykolab.com (unknown [10.9.13.11])
 by mx.kolabnow.com (Postfix) with ESMTPS id BA36C3385685;
 Thu,  8 Aug 2024 09:25:14 +0200 (CEST)
Received: from ext-subm010.mykolab.com (unknown [10.9.6.10])
 by int-mx011.mykolab.com (Postfix) with ESMTPS id 8AA3B311BB29;
 Thu,  8 Aug 2024 09:25:14 +0200 (CEST)
From: =?utf-8?q?J=C3=B8rgen_Kvalsvik?= <j@lambda.is>
To: gcc-patches@gcc.gnu.org
Cc: hubicka@ucw.cz, =?utf-8?q?J=C3=B8rgen_Kvalsvik?= <j@lambda.is>
Subject: [PATCH 3/3] Add prime path coverage to gcc/gcov
Date: Thu,  8 Aug 2024 09:25:02 +0200
Message-Id: <20240808072502.3251665-3-j@lambda.is>
In-Reply-To: <20240808072502.3251665-1-j@lambda.is>
References: <20240808072502.3251665-1-j@lambda.is>
MIME-Version: 1.0
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.30
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org

This patch adds prime path coverage to gcc/gcov. First, a quick
introduction to path coverage, before I explain a bit on the pieces of
the patch.

PRIME PATHS

Path coverage is recording the paths taken through the program. Here is a
simple example:

if (cond1)  BB 1
  then1 ()  BB 2
else
  else1 ()  BB 3

if (cond2)  BB 4
  then2 ()  BB 5
else
  else2 ()  BB 6

_           BB 7

To cover all paths you must run {then1 then2}, {then1 else2}, {else1 then1},
{else1 else2}. This is in contrast with line/statement coverage where it is
sufficient to execute then2, and it does not matter if it was reached through
then1 or else1.

1 2 4 5 7
1 2 4 6 7
1 3 4 5 7
1 3 4 6 7

This gets more complicated with loops, because 0, 1, 2, ..., N iterations are
all different paths. There are different ways of addressing this, a promising
one being prime paths. A prime path is a simple path (a path with no repeated
vertices except for the first/last in a cycle) that does not appear as a subpath
of any other simple path. Prime paths seem to strike a decent balance between
number of tests, path growth, and loop coverage, requiring loops to be
both taken and skipped. Of course, the number of paths still grows
very fast with program complexity - for example, this program has 14
prime paths:

  while (a)
    {
      if (b)
        return;
      while (c--)
        a++;
    }
---

ALGORITHM

Since the numbers of paths grows so fast, we need a good
algorithm. The naive approach of generating all paths and discarding
redundancies (see reference_prime_paths in the diff) simply doesn't
complete for even pretty simple functions with a few ten thousand
paths (granted, the implementation is also poor, but only serves as a
reference). Fazli & Afsharchi in their paper "Time and Space-Efficient
Compositional Method for Prime and Test Paths Generation" describe a
neat algorithm which drastically improves on for most programs, and
brings complexity down to something managable. This patch implements
that algorithm with a few minor tweaks.

The algorithm first finds the strongly connected components (SCC) of the graph
and creates a new graph where the vertices are the SCCs of the CFG. Within
these vertices different paths are found - regular prime paths, paths that
start in the SCCs entries, and paths that end in the SCCs exits. These per-SCC
paths are combined with paths through the CFG which greatly reduces of paths
needed to be evaluated just to be thrown away.

Using this algorithm we can find the prime paths for somewhat
complicated functions in a reasonable time. Please note that some
programs don't benefit from this at all. We need to find the prime
paths within a SCC, so if a single SCC is very large the function
degenerates to the naive implementation. This can probably be much
improved on, but is an exercise for later.

--

OVERALL ARCHITECTURE

Like the other coverages in gcc, this operates on the CFG in the profiling
phase, just after branch and condition coverage, in phases:

1. All prime paths are generated, counted, and enumerated from the CFG
2. The paths are evaluted and counter instructions and accumulators are
   emitted
3. gcov reads the CFG and computes the prime paths (same as step 1)
4. gcov prints a report

Simply writing out all the paths in the .gcno file is not really viable,
the files would be too big. Additionally, there are limits to the
practicality of measuring (and reporting) on millions of paths, so for
most programs where coverage is feasible, computing paths should be
plenty fast. As a result, path coverage really only adds 1 bit to the
counter, rounded up to nearest 64 ("bucket"), so 64 paths takes up 8
bytes, 65 paths take up 16 bytes.

Recording paths is really just massaging large bitsets. Per function,
ceil(paths/64 or 32) buckets (gcov_type) are allocated. Paths are
sorted, so the first path maps to the lowest bit, the second path to the
second lowest bit, and so on. On taking an edge and entering a basic
block, a few bitmasks are applied to unset the bits corresponding to the
paths outside the block and set the bits of the paths that start in that
block. Finally, the right buckets are masked and written to the global
accumulators for the paths that end in the block. Full coverage is
achieved when all bits are set.

gcc does not really inform gcov of abnormal paths, so paths with
abnormal paths are ignored. This probably possible, but requires some
changes to the graph gcc writes to the .gcno file.

--

IMPLEMENTATION

In order to remove non-prime paths (subpaths) I use a non-clever
suffix tree, by inserting all subpaths into a trie. Fazli & Afsharchi
do not discuss how duplicates or subpaths are removed, and using the
trie turned out to work really well. The same prime_paths function is
used both in gcc and in gcov.

As for speed, I would say that it is acceptable. Path coverage is a
problem that is exponential its very nature, so if you enable this
feature you can reasonably expect it taking a while.  To combat the
effects of path explosion there is a limit at which point gcc will
give up and refuse to instrument a function, set with
-fpath-coverage-limit.  Since estimating the number of prime paths is
pretty much is counting them, gcc maintains a pessimistic running
count which slightly overestimates the number of paths found so
far. When that count exceeds the limit, the function aborts and gcc
prints a warning.  This is a blunt instrument meant to not get stuck
on the occasional large function, not fine-grained control over when
to skip instrumentation.

My main benchmark has been tree-2.1.3/tree.c which generates approx 2M
paths across the 20 functions or so in it. Most functions have less
than 1500 paths, and 2 around a million each. Finding the paths takes
3.5-4s, but the instrumentation phase takes approx. 2.5 minutes and
generates a 32M binary. Not bad for a 1429 line source file.

There are some selftests which deconstruct the algorithm, so it can be
easily referenced with the Fazli & Afsharchi. I hope that including
them both help to catch regression, clarify the assumptions, and help
understanding the algorithm by breaking up the phases.

DEMO

This is the denser line-aware (grep-friendlier) output. Every missing
path is summarized as the lines you need to run in what order, annotated
with the true/false/throw decision.

$ gcc -fpath-coverage --coverage bs.c -c -o bs
$ gcov -e --prime-paths-lines bs.o
bs.gcda:cannot open data file, assuming not executed
        -:    0:Source:bs.c
        -:    0:Graph:bs.gcno
        -:    0:Data:-
        -:    0:Runs:0
paths covered 0 of 17
path  0 not covered: lines 6 6(true) 11(true) 12
path  1 not covered: lines 6 6(true) 11(false) 13(true) 14
path  2 not covered: lines 6 6(true) 11(false) 13(false) 16
path  3 not covered: lines 6 6(false) 18
path  4 not covered: lines 11(true) 12 6(true) 11
path  5 not covered: lines 11(true) 12 6(false) 18
path  6 not covered: lines 11(false) 13(true) 14 6(true) 11
path  7 not covered: lines 11(false) 13(true) 14 6(false) 18
path  8 not covered: lines 12 6(true) 11(true) 12
path  9 not covered: lines 12 6(true) 11(false) 13(true) 14
path 10 not covered: lines 12 6(true) 11(false) 13(false) 16
path 11 not covered: lines 13(true) 14 6(true) 11(true) 12
path 12 not covered: lines 13(true) 14 6(true) 11(false) 13
path 13 not covered: lines 14 6(true) 11(false) 13(true) 14
path 14 not covered: lines 14 6(true) 11(false) 13(false) 16
path 15 not covered: lines 6(true) 11(true) 12 6
path 16 not covered: lines 6(true) 11(false) 13(true) 14 6
    #####:    1:int binary_search(int a[], int len, int from, int to, int key)
        -:    2:{
    #####:    3:    int low = from;
    #####:    4:    int high = to - 1;
        -:    5:
    #####:    6:    while (low <= high)
        -:    7:    {
    #####:    8:        int mid = (low + high) >> 1;
    #####:    9:        long midVal = a[mid];
        -:   10:
    #####:   11:        if (midVal < key)
    #####:   12:            low = mid + 1;
    #####:   13:        else if (midVal > key)
    #####:   14:            high = mid - 1;
        -:   15:        else
    #####:   16:            return mid; // key found
        -:   17:    }
    #####:   18:    return -1;
        -:   19:}

Then there's the human-oriented source mode.  Because it is so verbose
I have limited the demo to 2 paths. In this mode gcov will print the
sequence of *lines* through the program and in what order to cover the
path, including what basic block the line is a part of. Like its
denser sibling, this also prints the true/false/throw decision, if
there is one.

$ gcov -t --prime-paths-source bs.o
bs.gcda:cannot open data file, assuming not executed
        -:    0:Source:bs.c
        -:    0:Graph:bs.gcno
        -:    0:Data:-
        -:    0:Runs:0
paths covered 0 of 17
path 0:
BB  2:           1:int binary_search(int a[], int len, int from, int to, int key)
BB  2:           3:    int low = from;
BB  2:           4:    int high = to - 1;
BB  2:           6:    while (low <= high)
BB  8: (true)    6:    while (low <= high)
BB  3:           8:     int mid = (low + high) >> 1;
BB  3:           9:     long midVal = a[mid];
BB  3: (true)   11:     if (midVal < key)
BB  4:          12:         low = mid + 1;

path 1:
BB  2:           1:int binary_search(int a[], int len, int from, int to, int key)
BB  2:           3:    int low = from;
BB  2:           4:    int high = to - 1;
BB  2:           6:    while (low <= high)
BB  8: (true)    6:    while (low <= high)
BB  3:           8:     int mid = (low + high) >> 1;
BB  3:           9:     long midVal = a[mid];
BB  3: (false)  11:     if (midVal < key)
BB  5: (true)   13:     else if (midVal > key)
BB  6:          14:         high = mid - 1;

The listing is also aware of inlining:

hello.c:

    #include <stdio.h>
    #include "hello.h"

    int notmain(const char *entity)
    {
      return hello (entity);
    }

    #include <stdio.h>

    inline __attribute__((always_inline))
    int hello (const char *s)
    {
      if (s)
        printf ("hello, %s!\n", s);
      else
        printf ("hello, world!\n");
      return 0;
    }

$ gcov -t --prime-paths-source hello

paths covered 0 of 2
path 0:
BB  2: (true)    4:int notmain(const char *entity)
 == inlined from hello.h ==
BB  2: (true)    6:  if (s)
BB  3:           7:    printf ("hello, %s!\n", s);
BB  5:          10:  return 0;
-------------------------
BB  7:           6:  return hello (entity);
BB  8:           6:  return hello (entity);

path 1:
BB  2: (false)   4:int notmain(const char *entity)
 == inlined from hello.h ==
BB  2: (false)   6:  if (s)
BB  4:           9:    printf ("hello, world!\n");
BB  5:          10:  return 0;
-------------------------
BB  7:           6:  return hello (entity);
BB  8:           6:  return hello (entity);

And finally, JSON (abbreviated). It is quite sparse and very nested, but
is mostly a JSON version of the source listing. It has to be this nested
in order to consistently capture multiple locations. It is always
includes the file name per location for consistency, even though this is
very much redundant in almost all cases. This format is in no way set in
stone, and without targeting it with other tooling I am not sure if it
does the job well.

  "gcc_version": "15.0.0 20240704 (experimental)",
  "current_working_directory": "dir",
  "data_file": "hello.o",
  "files": [
    {
      "file": "hello.c",
      "functions": [
        {
          "name": "notmain",
          "demangled_name": "notmain",
          "start_line": 4,
          "start_column": 5,
          "end_line": 7,
          "end_column": 1,
          "blocks": 7,
          "blocks_executed": 0,
          "execution_count": 0,
          "total_prime_paths": 2,
          "covered_prime_paths": 0,
          "prime_path_coverage": [
            {
              "id": 0,
              "sequence": [
                {
                  "block_id": 2,
                  "locations": [
                    {
                      "file": "hello.c",
                      "line_numbers": [
                        4
                      ]
                    },
                    {
                      "file": "hello.h",
                      "line_numbers": [
                        6
                      ]
                    }
                  ],
                  "edge_kind": "fallthru"
                },
                ...

gcc/ChangeLog:

	* Makefile.in (OBJS): Add prime-paths.o, path-coverage.o.
	(GTFILES): Add prime-paths.cc, path-coverage.cc
	(GCOV_OBJS): Add graphds.o, prime-paths.o, bitmap.o, sbitmap.o
	* builtins.cc (expand_builtin_fork_or_exec): Check
	path_coverage_flag.
	* collect2.cc (main): Add -fno-path-coverage to OBSTACK.
	* common.opt: Add new options -fpath-coverage,
	-fpath-coverage-limit, -Wcoverage-too-many-paths
	* doc/gcov.texi: Add --prime-paths, --prime-paths-lines,
	--prime-paths-source documentation.
	* doc/invoke.texi: Add -fpath-coverage, -fpath-coverage-limit,
	-Wcoverage-too-many-paths documentation.
	* gcc.cc: Link gcov on -fpath-coverage.
	* gcov-counter.def (GCOV_COUNTER_PATHS): New.
	* gcov-io.h (GCOV_TAG_PATHS): New.
	(GCOV_TAG_PATHS_LENGTH): New.
	(GCOV_TAG_PATHS_NUM): New.
	* gcov.cc (class path_info): New.
	(struct coverage_info): Add paths, paths_covered.
	(find_prime_paths): New.
	(add_path_counts): New.
	(find_arc): New.
	(print_usage): Add -e, --prime-paths, --prime-paths-lines,
	--prime-paths-source.
	(process_args): Likewise.
	(json_set_prime_path_coverage): New.
	(output_json_intermediate_file): Call
	json_set_prime_path_coverage.
	(process_all_functions): Call find_prime_paths.
	(generate_results): Call add_path_counts.
	(read_graph_file): Read path counters.
	(read_count_file): Likewise.
	(function_summary): Print path counts.
	(file_summary): Likewise.
	(print_source_line): New.
	(print_prime_path_lines): New.
	(print_inlined_separator): New.
	(print_prime_path_source): New.
	(output_path_coverage): New.
	(output_lines): Print path coverage.
	* ipa-inline.cc (can_early_inline_edge_p): Check
	path_coverage_flag.
	* passes.cc (finish_optimization_passes): Likewise.
	* profile.cc (branch_prob): Likewise.
	* selftest-run-tests.cc (selftest::run_tests): Run path coverage
	tests.
	* selftest.h (path_coverage_cc_tests): New declaration.
	* tree-profile.cc (tree_profiling): Check path_coverage_flag.
	(pass_ipa_tree_profile::gate): Likewise.
	* path-coverage.cc: New file.
	* prime-paths.cc: New file.

gcc/testsuite/ChangeLog:

	* lib/gcov.exp: Add prime paths test function.
	* g++.dg/gcov/gcov-22.C: New test.
	* gcc.misc-tests/gcov-29.c: New test.
	* gcc.misc-tests/gcov-30.c: New test.
---
 gcc/Makefile.in                        |    6 +-
 gcc/builtins.cc                        |    2 +-
 gcc/collect2.cc                        |    5 +-
 gcc/common.opt                         |   14 +
 gcc/doc/gcov.texi                      |  155 ++
 gcc/doc/invoke.texi                    |   35 +
 gcc/gcc.cc                             |    4 +-
 gcc/gcov-counter.def                   |    3 +
 gcc/gcov-io.h                          |    3 +
 gcc/gcov.cc                            |  419 ++++-
 gcc/ipa-inline.cc                      |    2 +-
 gcc/passes.cc                          |    4 +-
 gcc/path-coverage.cc                   |  778 +++++++++
 gcc/prime-paths.cc                     | 2006 ++++++++++++++++++++++++
 gcc/profile.cc                         |    6 +-
 gcc/selftest-run-tests.cc              |    1 +
 gcc/selftest.h                         |    1 +
 gcc/testsuite/g++.dg/gcov/gcov-22.C    |  170 ++
 gcc/testsuite/gcc.misc-tests/gcov-29.c |  869 ++++++++++
 gcc/testsuite/gcc.misc-tests/gcov-30.c |  869 ++++++++++
 gcc/testsuite/lib/gcov.exp             |   92 +-
 gcc/tree-profile.cc                    |   11 +-
 22 files changed, 5438 insertions(+), 17 deletions(-)
 create mode 100644 gcc/path-coverage.cc
 create mode 100644 gcc/prime-paths.cc
 create mode 100644 gcc/testsuite/g++.dg/gcov/gcov-22.C
 create mode 100644 gcc/testsuite/gcc.misc-tests/gcov-29.c
 create mode 100644 gcc/testsuite/gcc.misc-tests/gcov-30.c

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index f4bb4a88cf3..5797b2d097f 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1623,6 +1623,8 @@ OBJS = \
 	opts-global.o \
 	ordered-hash-map-tests.o \
 	passes.o \
+	prime-paths.o \
+	path-coverage.o \
 	plugin.o \
 	pointer-query.o \
 	postreload-gcse.o \
@@ -2879,6 +2881,8 @@ GTFILES = $(CPPLIB_H) $(srcdir)/input.h $(srcdir)/coretypes.h \
   $(srcdir)/tree-scalar-evolution.cc \
   $(srcdir)/tree-ssa-operands.h \
   $(srcdir)/tree-profile.cc $(srcdir)/tree-nested.cc \
+  $(srcdir)/path-coverage.cc \
+  $(srcdir)/prime-paths.cc \
   $(srcdir)/omp-offload.h \
   $(srcdir)/omp-general.cc \
   $(srcdir)/omp-low.cc \
@@ -3255,7 +3259,7 @@ s-version: build/genversion$(build_exeext)
 # gcov.o needs $(ZLIBINC) added to the include flags.
 CFLAGS-gcov.o += $(ZLIBINC)
 
-GCOV_OBJS = gcov.o json.o
+GCOV_OBJS = gcov.o json.o graphds.o prime-paths.o bitmap.o sbitmap.o
 gcov$(exeext): $(GCOV_OBJS) $(LIBDEPS)
 	+$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) $(GCOV_OBJS) \
 		hash-table.o ggc-none.o $(LIBS) $(ZLIB) -o $@
diff --git a/gcc/builtins.cc b/gcc/builtins.cc
index 0b902896ddd..cccaad469da 100644
--- a/gcc/builtins.cc
+++ b/gcc/builtins.cc
@@ -6323,7 +6323,7 @@ expand_builtin_fork_or_exec (tree fn, tree exp, rtx target, int ignore)
   tree call;
 
   /* If we are not profiling, just call the function.  */
-  if (!profile_arc_flag && !condition_coverage_flag)
+  if (!profile_arc_flag && !condition_coverage_flag && !path_coverage_flag)
     return NULL_RTX;
 
   /* Otherwise call the wrapper.  This should be equivalent for the rest of
diff --git a/gcc/collect2.cc b/gcc/collect2.cc
index 902014a9cc1..8be949d117a 100644
--- a/gcc/collect2.cc
+++ b/gcc/collect2.cc
@@ -1035,9 +1035,9 @@ main (int argc, char **argv)
       lto_mode = LTO_MODE_LTO;
   }
 
-  /* -fno-profile-arcs -fno-condition-coverage -fno-test-coverage
+  /* -fno-profile-arcs -fno-condition-coverage -fno-path-coverage -fno-test-coverage
      -fno-branch-probabilities -fno-exceptions -w -fno-whole-program */
-  num_c_args += 7;
+  num_c_args += 8;
 
   c_argv = XCNEWVEC (char *, num_c_args);
   c_ptr = CONST_CAST2 (const char **, char **, c_argv);
@@ -1234,6 +1234,7 @@ main (int argc, char **argv)
   obstack_free (&temporary_obstack, temporary_firstobj);
   *c_ptr++ = "-fno-profile-arcs";
   *c_ptr++ = "-fno-condition-coverage";
+  *c_ptr++ = "-fno-path-coverage";
   *c_ptr++ = "-fno-test-coverage";
   *c_ptr++ = "-fno-branch-probabilities";
   *c_ptr++ = "-fno-exceptions";
diff --git a/gcc/common.opt b/gcc/common.opt
index c4bda3b7da8..713f89aa5c7 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -881,6 +881,16 @@ Common Var(warn_too_many_conditions) Init(1) Warning
 Warn when a conditional has too many terms and condition coverage profiling
 gives up instrumenting the expression.
 
+fpath-coverage-limit=
+Common RejectNegative Joined UInteger Var(path_coverage_limit) Init(250000)
+-fpath-coverage-limit=<number>	Don't instrument functions path count exceeding <number>.
+
+Wcoverage-too-many-paths
+Common Var(warn_too_many_paths) Init(1) Warning
+Warn when a function exceeds the number of paths (controlled by
+-fpath-coverage-limit) and path coverage give up instrumenting the
+function.
+
 Wmissing-profile
 Common Var(warn_missing_profile) Init(1) Warning
 Warn in case profiles in -fprofile-use do not exist.
@@ -2403,6 +2413,10 @@ foptimize-sibling-calls
 Common Var(flag_optimize_sibling_calls) Optimization
 Optimize sibling and tail recursive calls.
 
+fpath-coverage
+Common Var(path_coverage_flag)
+Insert path profiling code.
+
 fpartial-inlining
 Common Var(flag_partial_inlining) Optimization
 Perform partial inlining.
diff --git a/gcc/doc/gcov.texi b/gcc/doc/gcov.texi
index a9221738cce..9aa02dffff7 100644
--- a/gcc/doc/gcov.texi
+++ b/gcc/doc/gcov.texi
@@ -125,6 +125,9 @@ gcov [@option{-v}|@option{--version}] [@option{-h}|@option{--help}]
      [@option{-b}|@option{--branch-probabilities}]
      [@option{-c}|@option{--branch-counts}]
      [@option{-g}|@option{--conditions}]
+     [@option{-e}|@option{--prime-paths}]
+     [@option{--prime-paths-lines}]
+     [@option{--prime-paths-source}]
      [@option{-d}|@option{--display-progress}]
      [@option{-f}|@option{--function-summaries}]
      [@option{--include} @var{regex}]
@@ -181,6 +184,59 @@ your program at least once had an independent effect on the outcome of the
 boolean expression (modified condition/decision coverage).  This requires you
 to compile the source with @option{-fcondition-coverage}.
 
+@item -e
+@itemx --prime-paths
+Write path coverage to the output file, and write path summary info to
+the standard output.  This option allows you to see how many prime paths
+were taken at least once.  For the regular output this option only
+includes the number of paths covered.  For more fine grained information
+on uncovered paths you can use @option{--prime-paths-lines} or
+@option{--prime-paths-source}.  With @option{--json-format} all path details
+are included in the output.  This requires you to compile the source
+with @option{-fpath-coverage}.
+
+@itemx --prime-paths-lines
+Write path coverage to the output file, and write path summary info to
+the standard output.  This option allows you to see how many prime paths
+were taken at least once, and dense report on the uncovered paths an how
+to cover them.  This mode is useful for automated reporting and progress
+tracking.
+
+@smallexample
+paths covered 12 of 15
+path  2 not covered: lines 8 8(false) 11(true) 11 13(true) 13(true) 14 17
+path  3 not covered: lines 8 8(false) 11(true) 11 13(true) 13(false) 16 17
+path  4 not covered: lines 8 8(false) 11(true) 11 13(false) 16 17
+@end smallexample
+
+This means to cover path 2 you must run lines 8, 11, 13, 14, and 17,
+evaluting the decision at 8 false and the decisions at 11 and 13 to
+@code{false}.
+
+@itemx --prime-paths-source
+Write path coverage to the output file, and write path summary info to
+the standard output.  This option allows you to see how many prime paths
+were taken at least once, and detailed report on the uncovered paths an
+how to cover them.  This mode is useful for understanding paths and
+interactions between sections of your program.
+
+@smallexample
+path 10:
+BB  3:           8:  for (i = 0; i < 10; i++)
+BB  3:           9:    total += i;
+BB  4: (false)   8:  for (i = 0; i < 10; i++)
+BB  5: (true)   11:  int v = total > 100 ? 1 : 2;
+BB  6:          11:  int v = total > 100 ? 1 : 2;
+BB  8: (false)  13:  if (total != 45 && v == 1)
+BB 11:          16:    printf ("Success\n");
+BB 12:          17:  return 0;
+@end smallexample
+
+The first (BB) column is the sequence of basic blocks (see @option{-w}).
+The middle column (true/false) is the decision for that line.  The third
+column is the line number.  The fourth column is the line itself.  These
+lines must be run in this order to cover path 10.
+
 @item -d
 @itemx --display-progress
 Display the progress on the standard output.
@@ -989,6 +1045,105 @@ condition  1 not covered (true)
         -:   12:@}
 @end smallexample
 
+When you compile with @option{--coverage -fpath-coverage} and use the
+option @option{-e} your output looks like this:
+
+@smallexample
+$ gcov -t -e tmp
+        -:    0:Source:tmp.cpp
+        -:    0:Graph:tmp.gcno
+        -:    0:Data:tmp.gcda
+        -:    0:Runs:1
+        -:    1:#include <stdio.h>
+        -:    2:
+paths covered 4 of 15
+        1:    3:int main ()
+        -:    4:@{
+        -:    5:  int i, total;
+        1:    6:  total = 0;
+        -:    7:
+       11:    8:  for (i = 0; i < 10; i++)
+       10:    9:    total += i;
+        -:   10:
+       1*:   11:  int v = total > 100 ? 1 : 2;
+        -:   12:
+       1*:   13:  if (total != 45 && v == 1)
+    #####:   14:    printf ("Failure\n");
+        -:   15:  else
+        1:   16:    printf ("Success\n");
+        1:   17:  return 0;
+        -:   18:@}
+@end smallexample
+
+This output is useful to figure out roughly where coverage is missing
+and testing how different inputs change the coverage.  The
+@option{--prime-paths-source} is a useful tool for understanding paths.
+
+@smallexample
+$ gcov -t --prime-paths-source tmp
+        -:    0:Source:tmp.cpp
+        -:    0:Graph:tmp.gcno
+        -:    0:Data:tmp.gcda
+        -:    0:Runs:1
+        -:    1:#include <stdio.h>
+        -:    2:
+paths covered 4 of 15
+path 1:
+BB  2:           3:int main ()
+BB  2:           6:  total = 0;
+BB  2:           8:  for (i = 0; i < 10; i++)
+BB  4: (false)   8:  for (i = 0; i < 10; i++)
+BB  5: (true)   11:  int v = total > 100 ? 1 : 2;
+BB  6:          11:  int v = total > 100 ? 1 : 2;
+BB  8: (true)   13:  if (total != 45 && v == 1)
+BB  9: (true)   13:  if (total != 45 && v == 1)
+BB 10:          14:    printf ("Failure\n");
+BB 12:          17:  return 0;
+@end smallexample
+
+In this mode, gcov will print details on the missing paths.  The first
+column lists the sequence of basic blocks (BB).  The second column is
+the decision to take at that line if there is one.  The final columns
+are the line number and the line itself.  This is useful for
+understanding the paths, in particular those that are hard to cover or
+even unreachable.  Lines may be repeated, for example the @code{for}
+loop, if the same line is a part of multiple basic blocks.  This mode is
+intended for humans and good at understanding what code is exercised
+under testing or for given inputs.  This output is quite verbose, and
+for focusing on specific functions it can be combined with the filters
+@option{--include} and @option{--exclude}.
+
+A denser output is available with @option{--prime-paths-lines}, which
+looks like this:
+
+@smallexample
+        -:    0:Source:tmp.cpp
+        -:    0:Graph:tmp.gcno
+        -:    0:Data:tmp.gcda
+        -:    0:Runs:1
+        -:    1:#include <stdio.h>
+        -:    2:
+paths covered 4 of 15
+path  1 not covered: lines 8 8(false) 11(true) 11 13(true) 13(true) 14 17
+path  2 not covered: lines 8 8(false) 11(true) 11 13(true) 13(false) 16 17
+path  3 not covered: lines 8 8(false) 11(true) 11 13(false) 16 17
+path  4 not covered: lines 8 8(false) 11(false) 11 13(true) 13(true) 14 17
+path  5 not covered: lines 8 8(false) 11(false) 11 13(true) 13(false) 16 17
+path  6 not covered: lines 8 8(false) 11(false) 11 13(false) 16 17
+path  8 not covered: lines 9 8(false) 11(true) 11 13(true) 13(true) 14 17
+path  9 not covered: lines 9 8(false) 11(true) 11 13(true) 13(false) 16 17
+path 10 not covered: lines 9 8(false) 11(true) 11 13(false) 16 17
+path 11 not covered: lines 9 8(false) 11(false) 11 13(true) 13(true) 14 17
+path 12 not covered: lines 9 8(false) 11(false) 11 13(true) 13(false) 16 17
+        1:    3:int main ()
+        -:    4:@{
+@end smallexample
+
+In this mode, every missing path is expanded using the lines and
+decisions like @option{--prime-paths-source} but printed on a single
+line.  This mode provides a good overview over the paths and for
+tracking how different tests and inputs exercises the code.
+
 The execution counts are cumulative.  If the example program were
 executed again without removing the @file{.gcda} file, the count for the
 number of times each line in the source was executed would be added to
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 4850c7379bf..676e2da3b60 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -644,6 +644,7 @@ Objective-C and Objective-C++ Dialects}.
 @xref{Instrumentation Options,,Program Instrumentation Options}.
 @gccoptlist{-p  -pg  -fprofile-arcs  --coverage  -ftest-coverage
 -fcondition-coverage
+-fpath-coverage
 -fprofile-abs-path
 -fprofile-dir=@var{path}  -fprofile-generate  -fprofile-generate=@var{path}
 -fprofile-info-section  -fprofile-info-section=@var{name}
@@ -6663,6 +6664,13 @@ terms and GCC gives up coverage.  Coverage is given up when there are more
 terms in the conditional than there are bits in a @code{gcov_type_unsigned}.
 This warning is enabled by default.
 
+@opindex Wno-coverage-too-many-paths
+@opindex Wcoverage-too-many-paths
+@item -Wno-coverage-too-many-paths
+Warn if @option{-fpath-coverage} is used and a function has too many
+paths and GCC gives up coverage.  Giving up is controlled by
+@option{-fpath-coverage-limit}.  This warning is enabled by default.
+
 @opindex Wno-coverage-invalid-line-number
 @opindex Wcoverage-invalid-line-number
 @item -Wno-coverage-invalid-line-number
@@ -17256,6 +17264,26 @@ can be used to verify that all terms in a Boolean function are tested and have
 an independent effect on the outcome of a decision.  The result can be read
 with @code{gcov --conditions}.
 
+@item -fpath-coverage
+@opindex fpath-coverage
+Add code so that the paths taken are tracked.  During execution the
+program records the prime paths taken.  The number of paths grows very
+fast with complexity, and to avoid exploding compile times GCC will give
+up instrumentation if the approximate number of paths exceeds the limit
+controlled by @option{-fpath-coverage-limit}.  The result can be read
+with @code{gcov --prime-paths --prime-paths-lines --prime-paths-source}.
+
+@item -fpath-coverage-limit=@var{limit}
+@opindex fpath-coverage-limit
+The threshold at which point @option{-fpath-coverage} gives up on
+instrumenting a function.  This limit is approximate as GCC uses a
+pessimistic heuristic which slightly overcounts the running number of
+paths, and gives up if the threshold is reached before finding all the
+paths.  This option is not for fine grained control over which functions
+to instrument - rather it is intended to limit the effect of path
+explosion and keep compile times reasonable.  The default is
+@var{250000}.
+
 @xref{Cross-profiling}.
 
 @cindex @command{gcov}
@@ -17322,6 +17350,13 @@ With @option{-fcondition-coverage}, for each conditional in your program GCC
 creates a bitset and records the exercised boolean values that have an
 independent effect on the outcome of that expression.
 
+With @option{-fpath-coverage}, GCC finds and enumerates the prime paths
+each function, unless the number of paths would exceed the limit
+controlled by @option{-fpath-coverage-limit}, and records the taken
+prime paths.  A prime path is the longest sequence of unique blocks,
+except possibly the first and last, which is not a subpath of any other
+path.
+
 @need 2000
 @opindex ftest-coverage
 @item -ftest-coverage
diff --git a/gcc/gcc.cc b/gcc/gcc.cc
index d80b604a48d..4c361276aef 100644
--- a/gcc/gcc.cc
+++ b/gcc/gcc.cc
@@ -1165,7 +1165,7 @@ proper position among the other output files.  */
 	%:include(libgomp.spec)%(link_gomp)}\
     %{fgnu-tm:%:include(libitm.spec)%(link_itm)}\
     " STACK_SPLIT_SPEC "\
-    %{fprofile-arcs|fcondition-coverage|fprofile-generate*|coverage:-lgcov} " SANITIZER_SPEC " \
+    %{fprofile-arcs|fcondition-coverage|fpath-coverage|fprofile-generate*|coverage:-lgcov} " SANITIZER_SPEC " \
     %{!nostdlib:%{!r:%{!nodefaultlibs:%(link_ssp) %(link_gcc_c_sequence)}}}\
     %{!nostdlib:%{!r:%{!nostartfiles:%E}}} %{T*}  \n%(post_link) }}}}}}"
 #endif
@@ -1288,7 +1288,7 @@ static const char *cc1_options =
  %{!fsyntax-only:%{S:%W{o*}%{!o*:-o %w%b.s}}}\
  %{fsyntax-only:-o %j} %{-param*}\
  %{coverage:-fprofile-arcs -ftest-coverage}\
- %{fprofile-arcs|fcondition-coverage|fprofile-generate*|coverage:\
+ %{fprofile-arcs|fcondition-coverage|fpath-coverage|fprofile-generate*|coverage:\
    %{!fprofile-update=single:\
      %{pthread:-fprofile-update=prefer-atomic}}}";
 
diff --git a/gcc/gcov-counter.def b/gcc/gcov-counter.def
index 45d4b3eb0c8..65404135c81 100644
--- a/gcc/gcov-counter.def
+++ b/gcc/gcov-counter.def
@@ -52,3 +52,6 @@ DEF_GCOV_COUNTER(GCOV_TIME_PROFILER, "time_profiler", _time_profile)
 
 /* Conditions.  The counter is interpreted as a bit-set.  */
 DEF_GCOV_COUNTER(GCOV_COUNTER_CONDS, "conditions", _ior)
+
+/* Paths.  The counter is interpreted as a bit-set.  */
+DEF_GCOV_COUNTER(GCOV_COUNTER_PATHS, "ior", _ior)
diff --git a/gcc/gcov-io.h b/gcc/gcov-io.h
index 5dc467c92b1..623ca56899a 100644
--- a/gcc/gcov-io.h
+++ b/gcc/gcov-io.h
@@ -264,6 +264,9 @@ typedef uint64_t gcov_type_unsigned;
 #define GCOV_TAG_CONDS		   ((gcov_unsigned_t)0x01470000)
 #define GCOV_TAG_CONDS_LENGTH(NUM) ((NUM) * 2 * GCOV_WORD_SIZE)
 #define GCOV_TAG_CONDS_NUM(LENGTH) (((LENGTH) / GCOV_WORD_SIZE) / 2)
+#define GCOV_TAG_PATHS		   ((gcov_unsigned_t)0x01490000)
+#define GCOV_TAG_PATHS_LENGTH(NUM) ((NUM) * GCOV_WORD_SIZE)
+#define GCOV_TAG_PATHS_NUM(LENGTH) (((LENGTH) / GCOV_WORD_SIZE))
 #define GCOV_TAG_LINES		 ((gcov_unsigned_t)0x01450000)
 #define GCOV_TAG_COUNTER_BASE 	 ((gcov_unsigned_t)0x01a10000)
 #define GCOV_TAG_COUNTER_LENGTH(NUM) ((NUM) * 2 * GCOV_WORD_SIZE)
diff --git a/gcc/gcov.cc b/gcc/gcov.cc
index 74ebcf10e4b..9c2520efd36 100644
--- a/gcc/gcov.cc
+++ b/gcc/gcov.cc
@@ -48,6 +48,7 @@ along with Gcov; see the file COPYING3.  If not see
 #include "json.h"
 #include "hwint.h"
 #include "xregex.h"
+#include "graphds.h"
 
 #include <zlib.h>
 #include <getopt.h>
@@ -84,6 +85,7 @@ class function_info;
 class block_info;
 class source_info;
 class condition_info;
+class path_info;
 
 /* Describes an arc between two basic blocks.  */
 
@@ -129,6 +131,45 @@ struct arc_info
   struct arc_info *pred_next;
 };
 
+/* Describes (prime) path coverage.  */
+class path_info
+{
+public:
+  path_info () : paths (), covered () {}
+
+  /* The prime paths of a function.  The paths will be
+     lexicographically ordered and identified by their index.  */
+  vector<vector<unsigned>> paths;
+
+  /* The covered paths.  This is really a large bitset partitioned
+     into buckets of gcov_type_unsigned, and bit n is set if the nth
+     path is covered.  */
+  vector<gcov_type_unsigned> covered;
+
+  /* The size (in bits) of each bucket.  */
+  static const size_t
+  bucketsize = sizeof (gcov_type_unsigned) * BITS_PER_UNIT;
+
+  /* Count the covered paths.  */
+  unsigned covered_paths () const
+  {
+    unsigned cnt = 0;
+    for (gcov_type_unsigned v : covered)
+      cnt += popcount_hwi (v);
+    return cnt;
+  }
+
+  /* Check if the nth path is covered.  */
+  bool covered_p (size_t n) const
+  {
+    if (covered.empty ())
+      return false;
+    const size_t bucket = n / bucketsize;
+    const uint64_t bit = n % bucketsize;
+    return covered[bucket] & (gcov_type_unsigned (1) << bit);
+  }
+};
+
 /* Describes which locations (lines and files) are associated with
    a basic block.  */
 
@@ -317,6 +358,9 @@ public:
 
   vector<condition_info*> conditions;
 
+  /* Path coverage information.  */
+  path_info paths;
+
   /* Raw arc coverage counts.  */
   vector<gcov_type> counts;
 
@@ -400,6 +444,9 @@ struct coverage_info
   int calls_executed;
 
   char *name;
+
+  unsigned paths;
+  unsigned paths_covered;
 };
 
 /* Describes a file mentioned in the block graph.  Contains an array
@@ -607,6 +654,15 @@ static bool flag_conditions = 0;
 /* Show unconditional branches too.  */
 static int flag_unconditional = 0;
 
+/* Output path coverage.  */
+static bool flag_prime_paths = false;
+
+/* Output path coverage - lines mode.  */
+static bool flag_prime_paths_lines = false;
+
+/* Output path coverage - source mode.  */
+static bool flag_prime_paths_source = false;
+
 /* Output a gcov file if this is true.  This is on by default, and can
    be turned off by the -n option.  */
 
@@ -750,9 +806,11 @@ static unsigned find_source (const char *);
 static void read_graph_file (void);
 static int read_count_file (void);
 static void solve_flow_graph (function_info *);
+static void find_prime_paths (function_info *fn);
 static void find_exception_blocks (function_info *);
 static void add_branch_counts (coverage_info *, const arc_info *);
 static void add_condition_counts (coverage_info *, const block_info *);
+static void add_path_counts (coverage_info &, const function_info &);
 static void add_line_counts (coverage_info *, function_info *);
 static void executed_summary (unsigned, unsigned);
 static void function_summary (const coverage_info *);
@@ -801,6 +859,17 @@ bool function_info::group_line_p (unsigned n, unsigned src_idx)
   return is_group && src == src_idx && start_line <= n && n <= end_line;
 }
 
+/* Find the arc that connects BLOCK to the block with id DEST.  This
+   edge must exist.  */
+static const arc_info&
+find_arc (const block_info &block, unsigned dest)
+{
+  for (const arc_info *arc = block.succ; arc; arc = arc->succ_next)
+    if (arc->dst->id == dest)
+      return *arc;
+  gcc_assert (false);
+}
+
 /* Cycle detection!
    There are a bajillion algorithms that do this.  Boost's function is named
    hawick_cycles, so I used the algorithm by K. A. Hawick and H. A. James in
@@ -1030,6 +1099,11 @@ print_usage (int error_p)
                                     rather than percentages\n");
   fnotice (file, "  -g, --conditions                Include modified condition/decision\n\
                                     coverage (masking MC/DC) in output\n");
+  fnotice (file, "  -e, --prime-paths               Show prime path coverage\n");
+  fnotice (file, "      --prime-paths-lines         Include uncovered paths in output\n\
+                                    line trace mode - does not affect json\n");
+  fnotice (file, "      --prime-paths-source        Include uncovered paths in output\n\
+                                    source trace mode - does not affect json\n");
   fnotice (file, "  -d, --display-progress          Display progress information\n");
   fnotice (file, "  -D, --debug			    Display debugging dumps\n");
   fnotice (file, "  -f, --function-summaries        Output summaries for each function\n");
@@ -1088,6 +1162,9 @@ static const struct option options[] =
   { "branch-probabilities", no_argument,       NULL, 'b' },
   { "branch-counts",        no_argument,       NULL, 'c' },
   { "conditions",	    no_argument,       NULL, 'g' },
+  { "prime-paths",	    no_argument,       NULL, 'e' },
+  { "prime-paths-lines",    no_argument,       NULL, 900 },
+  { "prime-paths-source",   no_argument,       NULL, 901 },
   { "json-format",	    no_argument,       NULL, 'j' },
   { "include",              required_argument, NULL, 'I' },
   { "exclude",              required_argument, NULL, 'E' },
@@ -1119,7 +1196,7 @@ process_args (int argc, char **argv)
 {
   int opt;
 
-  const char *opts = "abcdDfghHijklmMno:pqrs:tuvwx";
+  const char *opts = "abcdDefghHijklmMno:pqrs:tuvwx";
   while ((opt = getopt_long (argc, argv, opts, options, NULL)) != -1)
     {
       switch (opt)
@@ -1133,6 +1210,17 @@ process_args (int argc, char **argv)
 	case 'c':
 	  flag_counts = 1;
 	  break;
+	case 'e':
+	  flag_prime_paths = true;
+	  break;
+	case 900:
+	  flag_prime_paths = true;
+	  flag_prime_paths_lines = true;
+	  break;
+	case 901:
+	  flag_prime_paths = true;
+	  flag_prime_paths_source = true;
+	  break;
 	case 'f':
 	  flag_function_summary = 1;
 	  break;
@@ -1385,6 +1473,68 @@ get_gcov_intermediate_filename (const char *input_file_name)
   return str.c_str ();
 }
 
+/* Add prime path coverage from INFO to FUNCTION.  */
+static void
+json_set_prime_path_coverage (json::object &function, function_info &info)
+{
+  json::array *jpaths = new json::array ();
+  function.set_integer ("total_prime_paths", info.paths.paths.size ());
+  function.set_integer ("covered_prime_paths", info.paths.covered_paths ());
+  function.set ("prime_path_coverage", jpaths);
+
+  size_t pathno = 0;
+  for (const vector<unsigned> &path : info.paths.paths)
+    {
+      if (info.paths.covered_p (pathno++))
+	continue;
+
+      gcc_assert (!path.empty ());
+
+      json::object *jpath = new json::object ();
+      jpaths->append (jpath);
+      jpath->set_integer ("id", pathno - 1);
+
+      json::array *jlist = new json::array ();
+      jpath->set ("sequence", jlist);
+
+      for (size_t i = 0; i != path.size (); ++i)
+	{
+	  const unsigned bb = path[i];
+	  const block_info &block = info.blocks[bb];
+	  const char *edge_kind = "";
+	  if (i + 1 != path.size ())
+	    {
+	      const arc_info &arc = find_arc (block, path[i+1]);
+	      if (arc.false_value)
+		edge_kind = "true";
+	      else if (arc.false_value)
+		edge_kind = "false";
+	      else if (arc.fall_through)
+		edge_kind = "fallthru";
+	      else if (arc.is_throw)
+		edge_kind = "throw";
+	    }
+
+	  json::object *jblock = new json::object ();
+	  json::array *jlocs = new json::array ();
+	  jblock->set_integer ("block_id", block.id);
+	  jblock->set ("locations", jlocs);
+	  jblock->set_string ("edge_kind", edge_kind);
+	  jlist->append (jblock);
+	  for (const block_location_info &loc : block.locations)
+	    {
+	      json::object *jloc = new json::object ();
+	      json::array *jline_numbers = new json::array ();
+	      jlocs->append (jloc);
+	      jloc->set_string ("file", sources[loc.source_file_idx].name);
+	      jloc->set ("line_numbers", jline_numbers);
+	      for (unsigned line : loc.lines)
+		jline_numbers->append (new json::integer_number (line));
+	    }
+	}
+    }
+}
+
 /* Output the result in JSON intermediate format.
    Source info SRC is dumped into JSON_FILES which is JSON array.  */
 
@@ -1415,6 +1565,7 @@ output_json_intermediate_file (json::array *json_files, source_info *src)
       function->set_integer ("blocks_executed", (*it)->blocks_executed);
       function->set_integer ("execution_count", (*it)->blocks[0].count);
 
+      json_set_prime_path_coverage (*function, **it);
       functions->append (function);
     }
 
@@ -1629,6 +1780,9 @@ process_all_functions (void)
 	  solve_flow_graph (fn);
 	  if (fn->has_catch)
 	    find_exception_blocks (fn);
+
+	  /* For path coverage.  */
+	  find_prime_paths (fn);
 	}
       else
 	{
@@ -1698,6 +1852,8 @@ generate_results (const char *file_name)
       for (const block_info& block : fn->blocks)
 	add_condition_counts (&coverage, &block);
 
+      add_path_counts (coverage, *fn);
+
       function_summary (&coverage);
       fnotice (stdout, "\n");
     }
@@ -1742,6 +1898,9 @@ generate_results (const char *file_name)
 	    continue;
 	}
 
+      for (function_info *fn : src->functions)
+	add_path_counts (src->coverage, *fn);
+
       accumulate_line_counts (src);
       if (flag_debug)
 	src->debug ();
@@ -2201,6 +2360,13 @@ read_graph_file (void)
 	      info->n_terms = gcov_read_unsigned ();
 	      fn->conditions[i] = info;
 	    }
+    }
+      else if (fn && tag == GCOV_TAG_PATHS)
+	{
+	  const unsigned npaths = gcov_read_unsigned ();
+	  const size_t nbits = path_info::bucketsize;
+	  const size_t nbuckets = (npaths + (nbits - 1)) / nbits;
+	  fn->paths.covered.assign (nbuckets, 0);
 	}
       else if (fn && tag == GCOV_TAG_LINES)
 	{
@@ -2357,6 +2523,17 @@ read_count_file (void)
 	    for (ix = 0; ix != fn->counts.size (); ix++)
 	      fn->counts[ix] += gcov_read_counter ();
 	}
+      else if (tag == GCOV_TAG_FOR_COUNTER (GCOV_COUNTER_PATHS) && fn)
+	{
+	  vector<gcov_type_unsigned> &covered = fn->paths.covered;
+	  length = abs (read_length);
+	  if (length != GCOV_TAG_COUNTER_LENGTH (covered.size ()))
+	    goto mismatch;
+
+	  if (read_length > 0)
+	    for (ix = 0; ix != covered.size (); ix++)
+	      covered[ix] = gcov_read_counter ();
+	}
       if (read_length < 0)
 	read_length = 0;
       gcov_sync (base, read_length);
@@ -2640,6 +2817,67 @@ solve_flow_graph (function_info *fn)
       }
 }
 
+/* Find the prime paths of the function from the CFG and add to FN
+   using the same function as gcc.  It relies on gcc recording the CFG
+   faithfully.  Storing the paths explicitly takes up way too much
+   space to be practical, but this means we need to recompute the
+   (exact) same paths in gcov.  This should give paths in
+   lexicographical order so that the nth path in gcc is the nth path
+   in gcov.  ENTRY_BLOCK and EXIT_BLOCK are both removed from all
+   paths.  */
+static void
+find_prime_paths (function_info *fn)
+{
+  if (!flag_prime_paths)
+    return;
+
+  /* If paths.covered being empty then this function was not
+     instrumented, probably because it exceeded #-of-paths limit.  In
+     this case we don't want to find the prime paths as it will take
+     too long, and covered paths are not measured.  */
+  if (fn->paths.covered.empty ())
+    return;
+
+  struct graph *cfg = new_graph (fn->blocks.size ());
+  for (block_info &block : fn->blocks)
+    {
+      cfg->vertices[block.id].data = &block;
+      for (arc_info *arc = block.succ; arc; arc = arc->succ_next)
+	if (!arc->fake)
+	  add_edge (cfg, arc->src->id, arc->dst->id)->data = arc;
+    }
+
+  vec<vec<int>> prime_paths (struct graph*, size_t);
+  /* TODO: Pass extra information in the PATH_TAG section.  In case
+     that is empty this might still need to be tunable should the
+     coverage be requested without instrumentation.  */
+  vec<vec<int>> paths = prime_paths (cfg, (size_t)-1);
+  fn->paths.paths.reserve (paths.length ());
+  for (vec<int> &path : paths)
+    {
+      const int *begin = path.begin ();
+      const int *end = path.end ();
+      if (begin != end && path.last () == EXIT_BLOCK)
+	--end;
+      if (begin != end && *begin == ENTRY_BLOCK)
+	++begin;
+
+      if (begin == end)
+	continue;
+
+      /* If this is an isolated vertex because abnormal edges and fake
+	 edges are removed, don't include it.  */
+      if (end - begin == 1 && !cfg->vertices[*begin].succ
+	  && !cfg->vertices[*begin].pred)
+	continue;
+
+      fn->paths.paths.emplace_back (begin, end);
+    }
+
+  release_vec_vec (paths);
+  free_graph (cfg);
+}
+
 /* Mark all the blocks only reachable via an incoming catch.  */
 
 static void
@@ -2700,6 +2938,16 @@ add_condition_counts (coverage_info *coverage, const block_info *block)
   coverage->conditions_covered += block->conditions.popcount ();
 }
 
+/* Increment path totals, number of paths and number of covered paths,
+   in COVERAGE according to FN.  */
+
+static void
+add_path_counts (coverage_info &coverage, const function_info &fn)
+{
+  coverage.paths += fn.paths.paths.size ();
+  coverage.paths_covered += fn.paths.covered_paths ();
+}
+
 /* Format COUNT, if flag_human_readable_numbers is set, return it human
    readable format.  */
 
@@ -2802,6 +3050,16 @@ function_summary (const coverage_info *coverage)
       else
 	fnotice (stdout, "No conditions\n");
     }
+
+  if (flag_prime_paths)
+    {
+      if (coverage->paths)
+	fnotice (stdout, "Prime paths covered:%s of %d\n",
+		 format_gcov (coverage->paths_covered, coverage->paths, 2),
+			      coverage->paths);
+      else
+	fnotice (stdout, "No path information\n");
+    }
 }
 
 /* Output summary info for a file.  */
@@ -2846,6 +3104,16 @@ file_summary (const coverage_info *coverage)
       else
 	fnotice (stdout, "No conditions\n");
     }
+
+  if (flag_prime_paths)
+    {
+      if (coverage->paths)
+	fnotice (stdout, "Prime paths covered:%s of %d\n",
+		 format_gcov (coverage->paths_covered, coverage->paths, 2),
+			      coverage->paths);
+      else
+	fnotice (stdout, "No path information\n");
+    }
 }
 
 /* Canonicalize the filename NAME by canonicalizing directory
@@ -3264,6 +3532,153 @@ output_branch_count (FILE *gcov_file, int ix, const arc_info *arc)
   return 1;
 }
 
+static void
+print_source_line (FILE *f, const vector<const char *> &source_lines,
+		   unsigned line);
+
+
+/* Print a dense coverage report for PATH of FN to GCOV_FILE.  PATH should be
+   number PATHNO in the sorted set of paths.  This function prints a dense form
+   where only the line numbers, and optionally the source file the line comes
+   from, in the order they need to be executed to achieve coverage.  This
+   produces very long lines for large functions, but is a useful and greppable
+   output.
+
+   Returns 0 if the path is covered, and 1 if not covered.  */
+static unsigned
+print_prime_path_lines (FILE *gcov_file, const function_info &fn,
+			const vector<unsigned> &path, size_t pathno)
+{
+  if (fn.paths.covered_p (pathno))
+    return 0;
+
+  fprintf (gcov_file, "path %2zu not covered: lines", pathno);
+  for (size_t k = 0; k != path.size (); ++k)
+    {
+      const block_info &block = fn.blocks[path[k]];
+      const char *edge_kind = "";
+      if (k + 1 != path.size ())
+	{
+	  gcc_checking_assert (block.id == path[k]);
+	  const arc_info &arc = find_arc (block, path[k+1]);
+	  if (arc.true_value)
+	    edge_kind = "(true)";
+	  else if (arc.false_value)
+	    edge_kind = "(false)";
+	  else if (arc.is_throw)
+	    edge_kind = "(throw)";
+	}
+
+      for (const block_location_info &loc : block.locations)
+	{
+	  if (loc.source_file_idx == fn.src)
+	    fprintf (gcov_file, " %u%s", loc.lines.back (), edge_kind);
+	  else
+	    fprintf (gcov_file, " %s:%u%s", sources[loc.source_file_idx].name,
+		     loc.lines.back (), edge_kind);
+	}
+    }
+
+  fprintf (gcov_file, "\n");
+  return 1;
+}
+
+static unsigned
+print_inlined_separator (FILE *gcov_file, unsigned current_index, const
+			 block_location_info &loc, const function_info &fn)
+{
+  if (loc.source_file_idx != current_index && loc.source_file_idx == fn.src)
+    fprintf (gcov_file, "------------------\n");
+  if (loc.source_file_idx != current_index && loc.source_file_idx != fn.src)
+    fprintf (gcov_file, "== inlined from %s ==\n",
+	     sources[loc.source_file_idx].name);
+  return loc.source_file_idx;
+}
+
+/* Print a coverage report for PATH of FN to GCOV_FILE.  PATH should be number
+   PATHNO in the sorted set of paths.  This function prints the lines that need
+   to be executed (and in what order) to cover it.
+
+   Returns 0 if the path is covered, and 1 if not covered.  */
+static unsigned
+print_prime_path_source (FILE *gcov_file, const function_info &fn,
+			 const vector<unsigned> &path, size_t pathno)
+{
+  if (fn.paths.covered_p (pathno))
+    return 0;
+  fprintf (gcov_file, "path %zu:\n", pathno);
+  unsigned current = fn.src;
+  for (size_t k = 0; k != path.size (); ++k)
+    {
+      const unsigned bb = path[k];
+      const block_info &block = fn.blocks[bb];
+      gcc_checking_assert (block.id == bb);
+
+      const char *edge_kind = "";
+      if (k + 1 != path.size ())
+	{
+	  const arc_info &arc = find_arc (block, path[k+1]);
+	  if (arc.true_value)
+	    edge_kind = "(true)";
+	  else if (arc.false_value)
+	    edge_kind = "(false)";
+	  else if (arc.is_throw)
+	    edge_kind = "(throw)";
+	}
+
+      for (const block_location_info &loc : block.locations)
+	{
+	  const source_info &src = sources[loc.source_file_idx];
+	  const vector<const char *> &lines = slurp (src, gcov_file, "");
+	  current = print_inlined_separator (gcov_file, current, loc, fn);
+	  for (unsigned i = 0; i != loc.lines.size () - 1; ++i)
+	    {
+	      const unsigned line = loc.lines[i];
+	      fprintf (gcov_file, "BB %2d: %-7s %3d", bb, "", line);
+	      print_source_line (gcov_file, lines, line);
+	    }
+
+	  const unsigned line = loc.lines.back ();
+	  fprintf (gcov_file, "BB %2d: %-7s %3d", bb, edge_kind, line);
+	  print_source_line (gcov_file, lines, line);
+	}
+    }
+
+  fputc ('\n', gcov_file);
+  return 1;
+}
+
+/* Print path coverage counts for FN to GCOV_FILE.  LINES is the vector of
+   source lines for FN.  Note that unlike statements, branch counts, and
+   conditions, this is not anchored to source lines but the function root.  */
+static int
+output_path_coverage (FILE *gcov_file, const function_info *fn)
+{
+  if (!flag_prime_paths)
+    return 0;
+
+  if (fn->paths.paths.empty ())
+    fnotice (gcov_file, "path coverage omitted\n");
+  else
+    fnotice (gcov_file, "paths covered %u of %zu\n",
+	     fn->paths.covered_paths (), fn->paths.paths.size ());
+
+  if (flag_prime_paths_lines)
+    {
+      size_t pathno = 0;
+      for (const vector<unsigned> &path : fn->paths.paths)
+	print_prime_path_lines (gcov_file, *fn, path, pathno++);
+    }
+
+  if (flag_prime_paths_source)
+    {
+      size_t pathno = 0;
+      for (const vector<unsigned> &path : fn->paths.paths)
+	print_prime_path_source (gcov_file, *fn, path, pathno++);
+    }
+  return 1;
+}
+
 static const char *
 read_line (FILE *file)
 {
@@ -3598,6 +4013,7 @@ output_lines (FILE *gcov_file, const source_info *src)
 	    {
 	      function_info *fn = (*fns)[0];
 	      output_function_details (gcov_file, fn);
+	      output_path_coverage (gcov_file, fn);
 
 	      /* If functions are filtered, only the matching functions will be in
 		 fns and there is no need for extra checking.  */
@@ -3643,6 +4059,7 @@ output_lines (FILE *gcov_file, const source_info *src)
 	      fprintf (gcov_file, "%s:\n", fn_name.c_str ());
 
 	      output_function_details (gcov_file, fn);
+	      output_path_coverage (gcov_file, fn);
 
 	      /* Print all lines covered by the function.  */
 	      for (unsigned i = 0; i < lines.size (); i++)
diff --git a/gcc/ipa-inline.cc b/gcc/ipa-inline.cc
index 9fc41b7696d..f04db904ce5 100644
--- a/gcc/ipa-inline.cc
+++ b/gcc/ipa-inline.cc
@@ -699,7 +699,7 @@ can_early_inline_edge_p (struct cgraph_edge *e)
     }
   gcc_assert (gimple_in_ssa_p (DECL_STRUCT_FUNCTION (e->caller->decl))
 	      && gimple_in_ssa_p (DECL_STRUCT_FUNCTION (callee->decl)));
-  if ((profile_arc_flag || condition_coverage_flag)
+  if ((profile_arc_flag || condition_coverage_flag || path_coverage_flag)
       && ((lookup_attribute ("no_profile_instrument_function",
 			    DECL_ATTRIBUTES (caller->decl)) == NULL_TREE)
 	  != (lookup_attribute ("no_profile_instrument_function",
diff --git a/gcc/passes.cc b/gcc/passes.cc
index d73f8ba97b6..010a98d9bca 100644
--- a/gcc/passes.cc
+++ b/gcc/passes.cc
@@ -352,8 +352,8 @@ finish_optimization_passes (void)
   gcc::dump_manager *dumps = m_ctxt->get_dumps ();
 
   timevar_push (TV_DUMP);
-  if (profile_arc_flag || condition_coverage_flag || flag_test_coverage
-      || flag_branch_probabilities)
+  if (profile_arc_flag || condition_coverage_flag || path_coverage_flag
+      || flag_test_coverage || flag_branch_probabilities)
     {
       dumps->dump_start (pass_profile_1->static_pass_number, NULL);
       end_branch_prob ();
diff --git a/gcc/path-coverage.cc b/gcc/path-coverage.cc
new file mode 100644
index 00000000000..f1bfe883b20
--- /dev/null
+++ b/gcc/path-coverage.cc
@@ -0,0 +1,778 @@
+/* Calculate prime path coverage.
+   Copyright (C) 2024-2024 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "diagnostic-core.h"
+#include "memmodel.h"
+#include "target.h"
+#include "function.h"
+#include "basic-block.h"
+#include "tree.h"
+#include "tree-cfg.h"
+#include "tree-nested.h"
+#include "cfg.h"
+#include "gimple.h"
+#include "gimple-iterator.h"
+#include "gimplify.h"
+#include "coverage.h"
+#include "ssa.h"
+#include "vec.h"
+#include "sbitmap.h"
+#include "graphds.h"
+#include "hash-map.h"
+
+bool mark_dfs_back_edges (struct function *);
+vec<vec<int>> prime_paths (struct graph*, size_t);
+
+namespace
+{
+
+/* Check if all the successors of BB are abnormal, e.g. setjmp.  */
+bool
+all_abnormal_succs_p (basic_block bb)
+{
+  for (edge e : bb->succs)
+    if (!(e->flags & EDGE_ABNORMAL))
+      return false;
+  return true;
+}
+
+/* Build a struct graph equivalent to the CFG for the function FN.  The caller
+   must free the returned graph with free_graph.  The data field of every
+   vertex and edge point to the basic blocks and edges in the CFG.
+
+   The CFG recording and gcov is not aware of abnormal edges, so they are
+   ignored here, too.  This means some paths are lost, e.g. those that involve
+   setjmp/longjmp.  They are still paths but would need more support from
+   profile.cc and gcov.cc to work.  */
+struct graph*
+cfg_as_graph (struct function* fn)
+{
+  struct graph *g = new_graph (n_basic_blocks_for_fn (fn));
+  basic_block entry = ENTRY_BLOCK_PTR_FOR_FN (fn);
+  basic_block exit = EXIT_BLOCK_PTR_FOR_FN (fn);
+
+  g->vertices[entry->index].data = entry;
+  g->vertices[exit->index].data = exit;
+  basic_block top = single_succ (entry);
+  add_edge (g, entry->index, top->index)->data = single_succ_edge (entry);
+
+  const unsigned ignore = EDGE_FAKE | EDGE_ABNORMAL | EDGE_ABNORMAL_CALL;
+  basic_block bb;
+  FOR_EACH_BB_FN (bb, fn)
+    {
+      g->vertices[bb->index].data = bb;
+      for (edge e : bb->succs)
+	if (!(e->flags & ignore))
+	  add_edge (g, e->src->index, e->dest->index)->data = e;
+    }
+  return g;
+}
+
+/* Find the prime paths for the function FN with the ENTRY and EXIT blocks
+   removed.  This can lead to empty paths when there is a fake edge to the exit
+   block, for example for abort functions or infinite loops.  Empty paths are
+   removed because the length of the returned vec is important.  */
+vec<vec<int>>
+find_prime_paths (struct function *fn)
+{
+  struct graph *cfg = cfg_as_graph (fn);
+  vec<vec<int>> paths = prime_paths (cfg, path_coverage_limit);
+
+  bool any_empty = false;
+  for (vec<int> &path : paths)
+    {
+      /* setjmp calls will be isolated basic blocks when ABNORMAL_EDGEs are
+	 removed.  If a path is made up of just such a vertex it is pointless
+	 and can be removed.  */
+      if (path.length () == 1
+	  && all_abnormal_succs_p (BASIC_BLOCK_FOR_FN (fn, path[0])))
+	path.pop ();
+      if (!path.is_empty () && path[0] == ENTRY_BLOCK)
+	path.ordered_remove (0);
+      if (!path.is_empty () && path.last () == EXIT_BLOCK)
+	path.pop ();
+
+      if (path.is_empty ())
+	{
+	  any_empty = true;
+	  path.release ();
+	}
+    }
+
+  unsigned ix1, ix2;
+  vec<int> *ptr;
+  if (any_empty)
+    VEC_ORDERED_REMOVE_IF (paths, ix1, ix2, ptr, ptr->is_empty ());
+
+  return paths;
+}
+
+/* Return the edge between SRC and DST.  */
+edge
+edge_between (struct function *fn, int src, int dst)
+{
+  basic_block bbsrc = BASIC_BLOCK_FOR_FN (fn, src);
+  basic_block bbdst = BASIC_BLOCK_FOR_FN (fn, dst);
+  for (edge e : bbsrc->succs)
+    if (e->dest == bbdst)
+      return e;
+  gcc_checking_assert (false);
+  return NULL;
+}
+
+/* Visitor for topsort.  */
+void
+topsort1 (basic_block b, vec<basic_block>& L, sbitmap visited)
+{
+  if (bitmap_bit_p (visited, b->index))
+    return;
+
+  for (edge e : b->succs)
+    if (!(e->flags & EDGE_DFS_BACK))
+      topsort1 (e->dest, L, visited);
+
+  bitmap_set_bit (visited, b->index);
+  L.quick_push (b);
+}
+
+/* Get the basic blocks of FN in topological order so that all predecessors
+   come before the vertex, while ignoring back edges.  This is a depth-first
+   approach similar to the algorithm described in Cormen et al (2001).  */
+vec<basic_block>
+topsort (struct function *fn)
+{
+  vec<basic_block> L {};
+  L.reserve (n_basic_blocks_for_fn (fn));
+
+  auto_sbitmap visited (n_basic_blocks_for_fn (fn));
+  bitmap_clear (visited);
+  bitmap_set_bit (visited, EXIT_BLOCK);
+
+  basic_block bb;
+  FOR_EACH_BB_FN (bb, fn)
+    topsort1 (bb, L, visited);
+
+  L.reverse ();
+  return L;
+}
+
+/* Hasher for maps where the key is a pair of edge/basic_block and bucket id
+   (size_t).  */
+template <typename T>
+using bucket_hash = pair_hash <nofree_ptr_hash <T>,
+			       int_hash <size_t, -1>>;
+
+/* Hasher for {edge, bucket-id} keys.  */
+using edge_hash = bucket_hash <edge_def>;
+/* Hasher for {basic_block, bucket-id} keys.  */
+using block_hash = bucket_hash <basic_block_def>;
+
+/* Find the union of possible preserved by bitwise-ANDs for the bucket BUCKET
+   of BB.  If this returns zero no paths go through BB at BUCKET.  */
+uint64_t
+union_incoming_bit_and (hash_map <edge_hash, uint64_t> &ands,
+			const basic_block bb, size_t bucket)
+{
+  uint64_t inc = 0;
+  for (edge e : bb->preds)
+    {
+      const uint64_t *val = ands.get ({e, bucket});
+      inc |= (val ? *val : 0);
+    }
+  return inc;
+}
+
+
+/* Check if the incoming edges have the power to modify the paths in flight for
+   BUCKET.  If all the incoming edges would apply the same bitmask the
+   BB+BUCKET will not actually update the path set, and we don't need to emit
+   an AND_EXPR and have a later fork distinguish the paths.  */
+bool
+can_change_path_p (hash_map <edge_hash, uint64_t> &ands,
+		   const basic_block bb, size_t bucket, uint64_t all_and)
+{
+  for (edge e : bb->preds)
+    {
+      const uint64_t *e_and = ands.get ({e, bucket});
+      if (!e_and || *e_and != all_and)
+	return true;
+    }
+  return false;
+}
+
+/* Check if all bits in BITSET are 1 for the target size TARGET_SIZE.  For a
+   32-bit target only the bits in the lower half should be set, and this should
+   return true when all bits in the lower half are set, even if the bitset type
+   have room for more bits.  */
+bool
+all_bits_set_p (uint64_t bitset, size_t target_size)
+{
+  return (size_t)popcount_hwi (bitset) == target_size;
+}
+
+/* Create a constant or SSA name of GCOV_TYPE_NODE type and zero-assign to it
+   safely on the edge E.  If the edge is abnormal it is assumed the phi is
+   abnormal and we need an SSA name, otherwise fall back to a constant.  The
+   returned value is safe to use with add_phi_arg ().
+
+   If the edge is abnormal we cannot insert on it directly, and instead
+   carefully add the assignment on the source block.  If that source block ends
+   with control flow (like those produced by _setjmp) we must insert before to
+   not break that invariant, otherwise insert after so that things like the
+   setjmp receiver is the first element of the basic block.  Doing the assign
+   is necessary as phis cannot resolve to constants.  */
+tree
+safe_insert_zero_phi_arg (edge e, tree gcov_type_node)
+{
+  tree cst = build_zero_cst (gcov_type_node);
+  if (!(e->flags & (EDGE_ABNORMAL | EDGE_ABNORMAL_CALL)))
+    return cst;
+
+  tree zero = make_ssa_name (gcov_type_node);
+  gassign *put = gimple_build_assign (zero, cst);
+  gimple_stmt_iterator gsi = gsi_start_nondebug_after_labels_bb (e->src);
+  if (gimple_call_internal_p (*gsi, IFN_ABNORMAL_DISPATCHER))
+    gsi_insert_before (&gsi, put, GSI_NEW_STMT);
+  else
+    gsi_insert_after (&gsi, put, GSI_NEW_STMT);
+
+  return zero;
+}
+
+/* Check if SSA is a constant value (created by build_int_cst) and can be
+   folded.  */
+bool
+constant_p (tree ssa)
+{
+  return tree_fits_uhwi_p (ssa);
+}
+
+/* A fixup task.  When resolving the exit SSA for a back edge arg to a phi
+   node, the exit SSA has not been created yet.  Record what needs to be done
+   when it is created, and tie the phi to the right SSA name once it is
+   guaranteed it is created.  If MASK is nullptr the predecessor's SSA should
+   be used as-is, and does not need an AND.  This should only be created with
+   the helpers fixup_noop and fixup_and.  */
+struct fixup
+{
+  gphi *phi;
+  edge e;
+  tree lhs;
+  tree mask;
+  size_t bucket;
+};
+
+/* Create a fixup with a no-op for the PHI in BUCKET. Use this when the edge E
+   does not need an AND applied and should rather propagate the predecessor's
+   SSA name.  */
+fixup
+fixup_noop (gphi *phi, edge e, size_t bucket)
+{
+  fixup todo;
+  todo.phi = phi;
+  todo.e = e;
+  todo.bucket = bucket;
+  todo.lhs = nullptr;
+  todo.mask = nullptr;
+  return todo;
+}
+
+/* Create a fixup for PHI through BUCKET with the exit SSA name E->src ANDed
+   with MASK (E->src & MASK).  GCOV_TYPE_NODE should be a tree of the gcov type
+   node for creating SSA names.  */
+fixup
+fixup_and (gphi *phi, edge e, size_t bucket, uint64_t mask,
+	   tree gcov_type_node)
+{
+  fixup todo;
+  todo.phi = phi;
+  todo.e = e;
+  todo.bucket = bucket;
+  todo.lhs = make_ssa_name (gcov_type_node);
+  todo.mask = build_int_cst (gcov_type_node, mask);
+  return todo;
+}
+
+/* Insert LOCAL |= MASK on BB safely, even when BB is a returns_twice block.
+
+   This is a helper to safely emit code for updating the path bitmask in the
+   presence of abnormal edges and returns_twice blocks, since they have special
+   restrictions on edge splits and first/last instructions on the block.  */
+tree
+safe_insert_ior (basic_block bb, tree local, tree mask, gphi *phi,
+		 tree gcov_type_node)
+{
+  gimple_stmt_iterator gsi = gsi_after_labels (bb);
+  gimple *stmt = gsi_stmt (gsi);
+
+  /* This is a returns_twice block (e.g. setjmp) which does not really expect
+     anything before or after, so we cannot insert the IOR on the block itself.
+     We move the IORs to the predecessors instead and update the phi.  The
+     abnormal edge cannot be split, so in that case we carefully put the IOR
+     after the ABNORMAL_DISPATCHER.  */
+  if (stmt && is_gimple_call (stmt)
+      && (gimple_call_flags (stmt) & ECF_RETURNS_TWICE))
+    {
+      for (edge e : bb->preds)
+	{
+	  gcc_checking_assert (phi);
+	  tree next = make_ssa_name (gcov_type_node);
+	  tree prev = gimple_phi_arg_def_from_edge (phi, e);
+	  gassign *put = gimple_build_assign (next, BIT_IOR_EXPR, prev, mask);
+	  gimple_stmt_iterator gsi = gsi_last_bb (e->src);
+	  add_phi_arg (phi, next, e, UNKNOWN_LOCATION);
+
+	  if (e->flags & (EDGE_ABNORMAL | EDGE_ABNORMAL_CALL))
+	    gsi_insert_before (&gsi, put, GSI_SAME_STMT);
+	  else
+	    gsi_insert_on_edge (e, put);
+	}
+    }
+  else
+    {
+      tree next = make_ssa_name (gcov_type_node);
+      gassign *put = gimple_build_assign (next, BIT_IOR_EXPR, local, mask);
+      gsi_insert_before (&gsi, put, GSI_SAME_STMT);
+      local = next;
+    }
+  return local;
+}
+
+/* Insert instructions update the global counter at BUCKET with the contents of
+   (LOCAL & MASK).  LOCAL is the SSA counter for this bucket, MASK the bits
+   that should be updated (that is, the paths that end in this basic block).
+   If ATOMIC_IOR is non-null the it uses atomics.  GCOV_TYPE_NODE should be a
+   tree of the gcov type node for creating SSA names.
+
+   global[BUCKET] |= (LOCAL & MASK)
+
+   If MASK is null, no mask is applied and it becomes:
+
+   global[BUCKET] |= LOCAL
+
+   This function should only be necessary for the successor of an
+   ABNORMAL_DISPATCHER e.g. the destination of a longjmp.  Paths starting at a
+   longjmp do not have anything to flush, so those edges are simply ignored.
+   Since this is a returns_twice block we cannot put anything before (or really
+   after), so we instrument on the edge itself, rather than messing with
+   splitting and changing the graph now.
+
+   This is similar to gsi_safe_insert_before, but this function does not
+   immediately commit edge inserts and does not split blocks.  */
+void
+flush_on_edges (basic_block bb, size_t bucket, tree local, tree mask,
+		tree atomic_ior, tree gcov_type_node)
+{
+  gimple *def = SSA_NAME_DEF_STMT (local);
+  gphi *phi = nullptr;
+  if (is_a <gphi *> (def))
+    phi = as_a <gphi *> (def);
+
+  tree relaxed = nullptr;
+  if (atomic_ior)
+    relaxed = build_int_cst (integer_type_node, MEMMODEL_RELAXED);
+
+  for (edge e : bb->preds)
+    {
+      if (e->flags & (EDGE_ABNORMAL | EDGE_ABNORMAL_CALL))
+	continue;
+
+      tree global = tree_coverage_counter_ref (GCOV_COUNTER_PATHS, bucket);
+      if (phi)
+	local = gimple_phi_arg_def_from_edge (phi, e);
+
+      tree global_copy = make_ssa_name (gcov_type_node);
+      gassign *ga1 = gimple_build_assign (global_copy, global);
+      gsi_insert_on_edge (e, ga1);
+
+      tree masked;
+      if (mask)
+	{
+	  masked = make_ssa_name (gcov_type_node);
+	  gassign *ga2 = gimple_build_assign (masked, BIT_AND_EXPR, local, mask);
+	  gsi_insert_on_edge (e, ga2);
+	}
+      else
+	masked = local;
+
+      if (atomic_ior)
+	{
+	  global = unshare_expr (global);
+	  gcall *call = gimple_build_call (atomic_ior, 3, build_addr (global),
+					   masked, relaxed);
+	  gsi_insert_on_edge (e, call);
+	}
+      else
+	{
+	  tree tmp = make_ssa_name (gcov_type_node);
+	  gassign *ga3 = gimple_build_assign (tmp, BIT_IOR_EXPR, global_copy, masked);
+	  gassign *ga4 = gimple_build_assign (unshare_expr (global), tmp);
+	  gsi_insert_on_edge (e, ga3);
+	  gsi_insert_on_edge (e, ga4);
+	}
+    }
+}
+
+/* Insert instructions update the global counter at BUCKET with the contents of
+   (LOCAL & MASK).  LOCAL is the SSA counter for this bucket, MASK the bits
+   that should be updated (that is, the paths that end in this basic block).
+   If ATOMIC_IOR is non-null the it uses atomics.  GCOV_TYPE_NODE should be a
+   tree of the gcov type node for creating SSA names.
+
+   global[BUCKET] |= (LOCAL & MASK)
+
+   If MASK is null, no mask is applied and it becomes:
+
+   global[BUCKET] |= LOCAL
+
+   This function should be used on every block except returns_twice blocks (see
+   flush_on_edge) as all incoming edges can use the same flushing code.  */
+void
+flush_on_gsi (gimple_stmt_iterator *gsi, size_t bucket, tree local, tree mask,
+	      tree atomic_ior, tree gcov_type_node)
+{
+  tree relaxed = nullptr;
+  if (atomic_ior)
+    relaxed = build_int_cst (integer_type_node, MEMMODEL_RELAXED);
+  tree global = tree_coverage_counter_ref (GCOV_COUNTER_PATHS, bucket);
+
+  tree global_copy = make_ssa_name (gcov_type_node);
+  gassign *ga1 = gimple_build_assign (global_copy, global);
+  gsi_insert_before (gsi, ga1, GSI_SAME_STMT);
+
+  tree masked;
+  if (mask)
+    {
+      masked = make_ssa_name (gcov_type_node);
+      gassign *ga2 = gimple_build_assign (masked, BIT_AND_EXPR, local, mask);
+      gsi_insert_before (gsi, ga2, GSI_SAME_STMT);
+    }
+  else
+    masked = local;
+
+  if (atomic_ior)
+    {
+      global = unshare_expr (global);
+      gcall *call = gimple_build_call (atomic_ior, 3, build_addr (global),
+				       masked, relaxed);
+      gsi_insert_before (gsi, call, GSI_SAME_STMT);
+    }
+  else
+    {
+      tree tmp = make_ssa_name (gcov_type_node);
+      gassign *ga3 = gimple_build_assign (tmp, BIT_IOR_EXPR, global_copy,
+					  masked);
+      gassign *ga4 = gimple_build_assign (unshare_expr (global), tmp);
+      gsi_insert_before (gsi, ga3, GSI_SAME_STMT);
+      gsi_insert_before (gsi, ga4, GSI_SAME_STMT);
+    }
+}
+
+} // namespace
+
+
+/* Instrument FN for prime path coverage, enabled by -fpath-coverage.  The
+   number of paths grows very fast with the complexity (control flow) which
+   explodes compile times and object file size.  Giving up is controlled by the
+   -fpath-coverage-limit flag.  The paths are sorted lexicographically by basic
+   block id, and each path is identified by its index in the sorted set of
+   paths, which in turn corresponds to a bit in a large bitset associated with
+   FN.  The monitoring code is a few bitwise operations added to edges and
+   basic blocks to maintain a set of live paths (note that many paths may
+   overlap and that many paths may be covered by the same input).  When
+   reaching the final vertex of a path the global counters are updated.
+
+   This is made more complicated by the gcov type having very limited size.  To
+   support functions with more than 32/64 paths the bitmap is implemented on
+   top of a sequence of gcov integers, "buckets", where path N is recorded as
+   bit N%64 in bucket N/64.  For large functions, an individual basic block
+   will only be part of a small subset of paths, and by extension buckets and
+   local counters.  Only the necessary counters are read and written.  */
+void
+find_paths (struct function *fn)
+{
+  mark_dfs_back_edges (fn);
+  vec<vec<int>> paths = find_prime_paths (fn);
+
+  if (paths.is_empty ())
+    {
+      warning_at (fn->function_start_locus, OPT_Wcoverage_too_many_paths,
+		  "paths exceeding limit, giving up path coverage");
+      release_vec_vec (paths);
+      return;
+    }
+
+  tree gcov_type_node = get_gcov_type ();
+  const size_t bucketsize = TYPE_PRECISION (gcov_type_node);
+  const size_t nbuckets = (paths.length () + (bucketsize - 1)) / bucketsize;
+  gcc_checking_assert (sizeof (uint64_t) * BITS_PER_UNIT >= bucketsize);
+
+  if (!coverage_counter_alloc (GCOV_COUNTER_PATHS, nbuckets))
+    {
+      release_vec_vec (paths);
+      return;
+    }
+
+  gcov_position_t offset {};
+  offset = gcov_write_tag (GCOV_TAG_PATHS);
+  gcov_write_unsigned (paths.length ());
+  gcov_write_length (offset);
+
+  hash_map <edge_hash, uint64_t> ands;
+  hash_map <block_hash, uint64_t> iors;
+  hash_map <block_hash, uint64_t> flushes;
+
+  /* Now that we have all paths we must figure out what bitmasks must be
+     applied on the edges.  We also record in which basic block the path starts
+     (e.g.  accumulator resets) and ends (accumulator flushes).  */
+  for (size_t pathno = 0; pathno != paths.length (); ++pathno)
+    {
+      const vec<int> &path = paths[pathno];
+      const size_t bucket = pathno / bucketsize;
+      const uint64_t bit = uint64_t (1) << (pathno % bucketsize);
+
+      basic_block first = BASIC_BLOCK_FOR_FN (fn, path[0]);
+      basic_block last = BASIC_BLOCK_FOR_FN (fn, path[path.length () - 1]);
+
+      for (unsigned i = 1; i != path.length (); ++i)
+	{
+	  edge e = edge_between (fn, path[i-1], path[i]);
+	  ands.get_or_insert ({e, bucket}) |= bit;
+	}
+
+      iors.get_or_insert ({first, bucket}) |= bit;
+      flushes.get_or_insert ({last, bucket}) |= bit;
+    }
+
+  /* The basic blocks (except entry, exit) for this function, in topological
+     order.  Processing in this order means that the predecessor(s) exit SSAs
+     will have been created by the time a block is processed, unless it is a
+     loop/back edge.  This simplifies processing a bit.  */
+  vec<basic_block> blocks = topsort (fn);
+
+  /* The exit SSA names for the BLOCK, the SSA name the BLOCK successors should
+     use as inputs.  */
+  hash_map<block_hash, tree> SSAex;
+  /* The entry SSA name for the BLOCK.  This name forms the basis for the
+     flushing to the global accumulators.  In the presence of phi nodes this is
+     the resolved phi, otherwise it is some predecessor's exit SSA name.  */
+  hash_map<block_hash, tree> SSAen;
+
+  auto_vec<fixup, 4> todos;
+
+  /* Generate the operations for each basic block.  */
+  for (basic_block bb : blocks)
+    {
+      for (size_t bucket = 0; bucket != nbuckets; ++bucket)
+	{
+	  tree ssa = nullptr;
+	  gphi *phi = nullptr;
+	  uint64_t all_and = union_incoming_bit_and (ands, bb, bucket);
+
+	  if (all_and && single_pred_p (bb))
+	    {
+	      /* There is a path on this edge through the bucket, but since
+		 there is only one predecessor there it has no decisive power.
+		 Just push the predecessor's exit and have later ANDs sort it
+		 out.  */
+	      tree *prev = SSAex.get ({single_pred (bb), bucket});
+	      gcc_checking_assert (prev);
+	      ssa = *prev;
+	    }
+	  else if (all_and)
+	    {
+	      /* There are multiple predecessors, so we need a phi.  */
+	      ssa = make_ssa_name (gcov_type_node);
+	      phi = create_phi_node (ssa, bb);
+	    }
+
+	  if (ssa)
+	    SSAen.put ({bb, bucket}, ssa);
+
+	  if (single_pred_p (bb) && single_succ_p (bb))
+	    {
+	      /* Straight line -- the AND mask will already have been applied
+		 to the first ancestor of this chain, so we don't need to apply
+		 it here.  */
+	    }
+	  else if (!can_change_path_p (ands, bb, bucket, all_and))
+	    {
+	      /* All incoming edges would apply the same mask, so applying the
+		 AND here would not actually distinguish paths.  Such an AND
+		 will be applied later anyway so we don't need to apply it
+		 here.  This is a huge improvement for large programs.  */
+	    }
+	  else for (edge e : bb->preds)
+	    {
+	      const uint64_t *bitmask = ands.get ({e, bucket});
+	      /* There is no phi, and there are no paths through this bucket.
+		 Set the SSA name to nullptr so we don't contaminate it by
+		 pushing unrelated predecessors.  */
+	      if (!bitmask && !phi)
+		ssa = nullptr;
+	      else if (!bitmask && phi)
+		{
+		  tree zero = safe_insert_zero_phi_arg (e, gcov_type_node);
+		  add_phi_arg (phi, zero, e, UNKNOWN_LOCATION);
+		}
+	      else if (all_bits_set_p (*bitmask, bucketsize) && !phi)
+		{
+		  /* This reduces to a no-op (x & ~0) and there is no phi
+		     selection, so just push the SSA.  */
+		  gcc_checking_assert (ssa);
+		}
+	      else if (all_bits_set_p (*bitmask, bucketsize) && phi)
+		{
+		  /* This reduces to a no-op (x & ~0).  Reusing the SSA and not
+		     emitting an unecessary AND is a big improvement for large
+		     programs.  */
+		  tree *prev = SSAex.get ({e->src, bucket});
+		  if (prev)
+		    add_phi_arg (phi, *prev, e, UNKNOWN_LOCATION);
+		  else
+		    todos.safe_push (fixup_noop (phi, e, bucket));
+		}
+	      else if (SSAex.get ({e->src, bucket}))
+		{
+		  /* We need to apply a mask, folding if possible.  If there is
+		     a phi it is already the latest-touched ssa.  */
+		  tree prev = *SSAex.get ({e->src, bucket});
+		  gcc_assert (prev);
+		  if (constant_p (prev))
+		    {
+		      const uint64_t x = tree_to_uhwi (prev);
+		      tree cst = build_int_cst (gcov_type_node, x & *bitmask);
+		      if (phi)
+			add_phi_arg (phi, cst, e, UNKNOWN_LOCATION);
+		      else
+			ssa = cst;
+		    }
+		  else
+		    {
+		      tree lhs = make_ssa_name (gcov_type_node);
+		      tree mask = build_int_cst (gcov_type_node, *bitmask);
+		      gassign *put = gimple_build_assign (lhs, BIT_AND_EXPR, prev, mask);
+		      gsi_insert_on_edge (e, put);
+		      if (phi)
+			add_phi_arg (phi, lhs, e, UNKNOWN_LOCATION);
+		      else
+			ssa = lhs;
+		    }
+		}
+	      else
+		{
+		  /* There is a phi and this edge is a back edge,
+		     which means the predecessor (and descendant) exit
+		     SSA has not been created yet.  */
+		  gcc_assert (phi);
+		  gcc_assert (e->flags & EDGE_DFS_BACK);
+		  fixup todo = fixup_and (phi, e, bucket, *bitmask,
+					  gcov_type_node);
+		  todos.safe_push (todo);
+		}
+	    }
+
+	  /* Bitwise IOR.  Unlike the AND this assignment can always be created
+	     right away as this should be applied to the result of the phi,
+	     AND, or single predecessor's exit SSA, and all of those have
+	     already been created.  */
+	  const uint64_t *ior = iors.get ({bb, bucket});
+	  if (ior && !ssa)
+	    {
+	      /* In case there was no predecessor, the IOR/initial state can
+		 just be a constant.  In this case, the IOR also becomes the
+		 block's entry node which means it will be considered for
+		 flushing in single-vertex paths.  */
+	      ssa = build_int_cst (gcov_type_node, *ior);
+	      SSAen.put ({bb, bucket}, ssa);
+	    }
+	  else if (ior && all_bits_set_p (*ior, bucketsize))
+	    ssa = build_all_ones_cst (gcov_type_node);
+	  else if (ior)
+	    {
+	      gcc_assert (ssa);
+	      tree mask = build_int_cst (gcov_type_node, *ior);
+	      ssa = safe_insert_ior (bb, ssa, mask, phi, gcov_type_node);
+	    }
+
+	  if (ssa)
+	    SSAex.put ({bb, bucket}, ssa);
+	}
+    }
+
+  /* Apply fixups -- now that all exit SSA names are created we can properly
+     set the phi argument if there is a phi node, and emit the (x & mask)
+     instruction if necessary.  */
+  for (fixup &todo : todos)
+    {
+      tree *exit = SSAex.get ({todo.e->src, todo.bucket});
+      gcc_assert (exit && *exit);
+      gcc_checking_assert (todo.phi);
+      if (todo.mask)
+	{
+	  gassign *put = gimple_build_assign (todo.lhs, BIT_AND_EXPR, *exit,
+					      todo.mask);
+	  gsi_insert_on_edge (todo.e, put);
+	}
+
+      add_phi_arg (todo.phi, todo.lhs ? todo.lhs : *exit, todo.e,
+		   UNKNOWN_LOCATION);
+    }
+
+  tree atomic_ior = nullptr;
+  if (flag_profile_update == PROFILE_UPDATE_ATOMIC)
+    atomic_ior = builtin_decl_explicit
+      (TYPE_PRECISION (gcov_type_node) > 32
+       ? BUILT_IN_ATOMIC_FETCH_OR_8
+       : BUILT_IN_ATOMIC_FETCH_OR_4);
+
+  /* Finally, add instructions to update the global counters.  */
+  for (basic_block bb : blocks)
+    {
+      gimple_stmt_iterator gsi = gsi_after_labels (bb);
+      for (size_t bucket = 0; bucket != nbuckets; ++bucket)
+	{
+	  const uint64_t *bitmask = flushes.get ({bb, bucket});
+	  if (!bitmask || !*bitmask)
+	    continue;
+
+	  tree *counter = SSAen.get ({bb, bucket});
+	  gcc_checking_assert (counter);
+	  if (!*counter)
+	    continue;
+
+	  tree mask = nullptr;
+	  if (!all_bits_set_p (*bitmask, bucketsize))
+	    mask = build_int_cst (gcov_type_node, *bitmask);
+
+	  if (bb_has_abnormal_pred (bb))
+	    flush_on_edges (bb, bucket, *counter, mask, atomic_ior,
+			    gcov_type_node);
+	  else
+	    flush_on_gsi (&gsi, bucket, *counter, mask, atomic_ior,
+			  gcov_type_node);
+	}
+    }
+
+  blocks.release ();
+  release_vec_vec (paths);
+}
diff --git a/gcc/prime-paths.cc b/gcc/prime-paths.cc
new file mode 100644
index 00000000000..cbd3d87e34c
--- /dev/null
+++ b/gcc/prime-paths.cc
@@ -0,0 +1,2006 @@
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "obstack.h"
+#include "sbitmap.h"
+#include "vec.h"
+#include "graphds.h"
+#include "selftest.h"
+
+namespace
+{
+
+/* Counter for the number of candidate paths to generate before giving up.  It
+   is neater to use a global because it has to be checked deep in helper
+   functions, which may also suffer under path explosion.  It is a heuristic
+   guaranteed to overshoot the number of actual paths (which is difficult to
+   estimate), and is intended to give up on (absurdly) large functions with
+   millions of paths, not be a high fidelity rejection mechanism.  */
+size_t approx_limit;
+
+/* Record approximately APPROX new paths. Returns true if the limit is
+   exceeded and coverage should give up.  */
+bool
+limit_checked_add (size_t approx)
+{
+  approx_limit -= approx < approx_limit ? approx : approx_limit;
+  return approx_limit == 0;
+};
+
+/* A silly RAII wrapper for struct graph.  The prime_paths function has multiple
+   returns, and this helps reliably clean up.  */
+struct auto_graph
+{
+  auto_graph (struct graph *graph) : ptr (graph) {}
+  auto_graph (const auto_graph &) = delete;
+  ~auto_graph () { free_graph (ptr); }
+  operator struct graph* () { return ptr; }
+  struct graph* operator -> () { return ptr; }
+  graph *ptr;
+};
+
+/* A silly RAII wrapper for an sbitmap vector.  The prime_paths function has
+   multiple returns, and this helps reliably clean up.  */
+struct auto_sbitmap_vector
+{
+  auto_sbitmap_vector (sbitmap *s) : ptr (s) {}
+  auto_sbitmap_vector (const auto_sbitmap_vector &) = delete;
+  ~auto_sbitmap_vector () { sbitmap_vector_free (ptr); }
+  operator sbitmap* () { return ptr; }
+  sbitmap* ptr;
+};
+
+/* A silly RAII wrpaper for automatically releasing a vec<vec<int>>.  */
+struct auto_vec_vec : vec<vec<int>>
+{
+  ~auto_vec_vec () { release_vec_vec (*this); }
+};
+
+/* A silly RAII wrpaper for automatically releasing a vec<vec<vec<int>>>.  */
+struct auto_vec_vec_vec : vec<vec<vec<int>>>
+{
+  ~auto_vec_vec_vec ()
+  {
+    for (vec<vec<int>> &v : *this)
+      release_vec_vec (v);
+    release ();
+  }
+};
+
+/* A trivial key/value pair for a short linear map type.  */
+struct xpair
+{
+  int key;
+  unsigned val;
+};
+
+/* A node in a trie, optimized for mid-sized alphabets possibly larger than 256
+   but not much more.  Finding the prime paths ends up creating a large amount
+   of these nodes so space and access costs matters a lot.
+
+   The node does not explicitly store its own key (CFG vertex ID/basic block
+   index), nor does it store pointers to its successors. Rather, it stores the
+   key+offset pairs for its successors the root trie object, and in a sense
+   behaves like near pointers.  This makes the trie vertices small and
+   reloctable, and removes the need for pointer chasing when releasing the trie.
+
+   The union of near/far is essentially a short-vector optimization, switching
+   to a heap-allocated vector when necessary. This happens relatively rarely
+   (usually maxes out at 1-2%), and the vertices that have more than 2 sucessors
+   also tend to have more than 4. The root vertex tends to use the dynamic
+   vector because the subpaths are recorded as the successors of the root.
+
+   Conceptually, this is a small map from vertex-id -> index and the API is
+   modelled as such.  The insert and search functions are unrolled by hand when
+   using the small vector.  This has a noticable performance impact on insert in
+   particular, and is not too complex since we know we are limited to 2
+   elements.
+
+   Vertices are tagged with endofpath and inserted.  If endofpath is set, the
+   path from the root to this vertex is a complete path.  If inserted is set
+   then the vertex is a part of proper path (one given to insert) and not
+   generated as a suffix.  For example:
+
+   insert ([2 4 6])
+   insert ([9 7 2 4 6])
+   insert ([2 4 6 8])
+
+   The inserted flags for [2 4 6] are not cleared, because otherwise [2 4 6 8]
+   would be dropped when only following inserted vertices.  The endofpath flag
+   in [2 4 6] is cleared when the suffixes of [9 7 2 4 6] are inserted.
+
+   The node will be inserted into a vec, and should be trivial.  Instances
+   should be value-initialized to zero-intialized state.  */
+struct trie_node
+{
+  unsigned length () const
+  { return !heaped ? len : far.length (); }
+
+  const xpair *begin () const
+  { return !heaped ? near : far.begin (); }
+
+  const xpair *end () const
+  { return !heaped ? (near + len) : far.end(); }
+
+  /* Get the ith successor.  This is used for traversal and not lookup, and
+     should only be used by the iterator.  */
+  const xpair &at (unsigned i) const
+  { return !heaped ? near[i] : far[i]; }
+
+  const xpair *get (int key) const;
+  void put (int key, unsigned val);
+
+  unsigned near_lower_bound (int key) const;
+
+  /* Experimentally I found that using a union with 2 elements in the near array
+     to be faster than 4 or without the union (very slightly).  A lot of trie
+     vertices will be created, and vast majority of vertices will have 1 or 2
+     successors (straight line or if-then), and the cost of size and copying
+     adds up.  */
+  union
+  {
+    xpair near[2];
+    vec<xpair> far;
+  };
+  unsigned len : 8;
+  unsigned endofpath : 1;
+  unsigned inserted : 1;
+  unsigned heaped : 1;
+};
+
+/* Compare LHS.key < RHS.key, for use with vec.lower_bound.  */
+bool
+xpair_less (const xpair& lhs, const xpair& rhs)
+{
+  return lhs.key < rhs.key;
+}
+
+/* Compare LHS.key to RHS.key, for use with vec.bsearch.  */
+int
+xpair_cmp (const void *lhs, const void *rhs)
+{
+  return ((const xpair*)lhs)->key - ((const xpair*)rhs)->key;
+}
+
+/* Get a pointer to the element at KEY if it exists, otherwise NULL.  */
+const xpair*
+trie_node::get (int key) const
+{
+  if (!heaped)
+    {
+      if (len == 0) return NULL;
+      if (len >= 1 && key == near[0].key) return near + 0;
+      if (len >= 2 && key == near[1].key) return near + 1;
+      return NULL;
+    }
+  else
+    {
+      xpair kv;
+      kv.key = key;
+      return const_cast <vec<xpair>&> (far).bsearch (&kv, xpair_cmp);
+    }
+}
+
+/* Put ("emplace") VAL at KEY, extending the paths that pass through this
+   vertex.  This function assumes that KEY is not already a successor, and does
+   not perform this check.  get () should be called and checked for NULL putting
+   with this function.  Put maintains the order of the successors.  */
+void
+trie_node::put (int key, unsigned val)
+{
+  xpair kv;
+  kv.key = key;
+  kv.val = val;
+  if (!heaped)
+    {
+      const unsigned i = near_lower_bound (key);
+      if (len < 2)
+	{
+	  near[1] = near[0];
+	  near[i] = kv;
+	  len += 1;
+	}
+      else
+	{
+	  /* This insert is the 3rd element, which does not fit in the embedded
+	     storage, so we must create a vector and convert to a far node.  */
+	  vec<xpair> xs {};
+	  xs.reserve (13);
+	  xs.quick_grow (3);
+	  gcc_checking_assert (i <= 2);
+	  if (i == 0)
+	    {
+	      xs[0] = kv;
+	      xs[1] = near[0];
+	      xs[2] = near[1];
+	    }
+	  else if (i == 1)
+	    {
+	      xs[0] = near[0];
+	      xs[1] = kv;
+	      xs[2] = near[1];
+	    }
+	  else
+	    {
+	      xs[0] = near[0];
+	      xs[1] = near[1];
+	      xs[2] = kv;
+	    }
+
+	  far = xs;
+	  heaped = 1;
+	}
+    }
+  else
+    {
+      const unsigned i = far.lower_bound (kv, xpair_less);
+      far.safe_insert (i, kv);
+    }
+}
+
+/* Get the index to the last element that compares less than KEY, similar to
+   vec.lower_bound.  This assumes the near vector is active, and is for internal
+   use.  */
+unsigned
+trie_node::near_lower_bound (int key) const
+{
+  gcc_checking_assert (!heaped);
+  if (len == 0) return 0;
+  if (len >= 1 && key < near[0].key) return 0;
+  if (len >= 2 && key < near[1].key) return 1;
+  return len;
+}
+
+/* The trie is a major workhorse for this algorithm.  It has two key properties
+   - set-like subpath elimination and sorted output.
+
+   Many evaluated paths will be non-prime, that is, a sequence of vertices that
+   is also fully embedded in a longer sequence of vertices.  For example the
+   path [3 4 5 8] is a subpath of both [2 3 4 5 8] and [3 4 5 8 10].  The
+   insert_with_suffix function maintains this property so that inserting a
+   subpath into the trie is effectively a no-op, and inserting a superpath will
+   effectively remove (unmark) the subpath.  Sometimes it can be guaranteed that
+   no redundant (subpaths) will be generated, in which case the insert function
+   can be used.  The insert function is really only set insert, only becoming a
+   no-op for identical paths, which will be a lot faster.
+
+   Paths can be extracted with an iterator, which will output paths in
+   lexicographically sorted order.  This is an important property because the
+   index of a path in the sorted set will be used by the coverage to record when
+   a path is taken and completed.  The iterator has different behavior than the
+   standard C++ iterators, and to avoid mixups the interface is deliberately
+   different.  The iterator has a (large) stack which is not cheap to copy, and
+   if the stack is shallow copied it would mean iterator copies have non-local
+   effects.  */
+struct trie
+{
+  struct iter;
+  trie ();
+  trie (const trie &o);
+  trie (trie &&o);
+  ~trie ();
+
+  bool insert (const vec<int>&);
+  bool insert (const array_slice<const int>);
+  bool hard_insert (const array_slice<const int>);
+  bool insert_with_suffix (const array_slice<const int>);
+  bool insert_suffix (const array_slice<const int>);
+
+  void merge (const trie&);
+
+  iter paths (vec<int>&) const;
+  iter paths (vec<int>&, int from) const;
+
+  vec<vec<int>> paths () const;
+
+  size_t size () const { return len; }
+
+  vec<trie_node> vertices;
+  size_t len;
+
+  /* An iterator for the paths of the trie.  The iterator yields all paths in
+     lexicographical order.  The iterator will be invalidated on any insertion
+     into the trie.  The iterator should not be constructed directly, but
+     through the paths functions on the trie.  It is essentially an explicit
+     stack depth-first traversal.
+
+     The iter fills a user-provided buffer which should only be read as a when
+     the iter is active.  Whenever next returns true the buffer is filled with
+     the current path.  Uses will generally look like this:
+
+     vec<int> path {};
+     auto iter = trie.paths (path);
+     while (iter.next ())
+       use_path (path);
+*/
+  struct iter
+  {
+    iter (vec<int>&, const vec<trie_node>&);
+    iter (int first, vec<int>& path, const vec<trie_node> &vertices);
+    ~iter ()
+    { stack.release (); }
+
+    bool next ();
+    bool next (int);
+    bool next (bool);
+
+
+    /* This is the analog of the stack frame when implementing a recursive
+       depth-first path traversal and collecting paths to the leafs:
+
+       for (auto successor : vertex[ix])
+	 {
+	   path.push (successor.value);
+	   collect (successor.ix, successor.begin, successor.end, path)
+	   path.pop ();
+	 }
+
+       Using size_t + 2x unsigned helped make the frame more compact and faster
+       than pointers.  */
+    struct frame
+    {
+      /* The index of this frame's vertex, so that vertices[ix].  */
+      size_t ix;
+      /* The index of the current active successor of vertices[ix].  */
+      unsigned itr;
+      /* The end of vertices[ix] successors.  When itr == end, vertex[ix] is
+	 exhausted.  */
+      unsigned end;
+    };
+
+    /* User provided buffer to fill with the paths.  */
+    vec<int> &path;
+    /* Direct reference to the trie vertices vector.  */
+    const vec<trie_node> &vertices;
+    /* The call stack.  */
+    vec<frame> stack;
+    /* Yield flag.  If this is true then next () is permitted to and return a
+       new value.  If this is false, a value has already been yielded and next
+       must first reset the state before building the next value.  */
+    bool yield = true;
+
+    iter (const iter& o) : path (o.path), vertices (o.vertices),
+      stack (o.stack.copy ()), yield (o.yield)
+    {
+    }
+
+    /* Delete the copy assignment as the iter stores references and would cause
+       bad bugs.  It is not necessary for the iterator to work well.  To support
+       these the references would need to be (explicit) pointers.  */
+    iter& operator = (const iter& o) = delete;
+  };
+};
+
+/* Construct an iterator filling BUFFER.  */
+trie::iter
+trie::paths (vec<int> &buffer) const
+{
+  buffer.truncate (0);
+  return iter (buffer, vertices);
+}
+
+/* Construct an iterator filling BUFFER for paths starting at FROM.  */
+trie::iter
+trie::paths (vec<int>& buffer, int from) const
+{
+  buffer.truncate (0);
+  return iter (from, buffer, vertices);
+}
+
+/* Default construct a new trie.  */
+trie::trie () : vertices (vec<trie_node> {}), len (0)
+{
+  vertices.safe_push (trie_node {});
+  vertices[0].inserted = true;
+}
+
+/* Copy construct a new trie.  */
+trie::trie (const trie &o) : vertices (o.vertices.copy ()), len (o.len)
+{
+}
+
+/* Move construct a new trie.  */
+trie::trie (trie &&o) : vertices (o.vertices), len (o.len)
+{
+  o.vertices = {};
+  o.len = 0;
+}
+
+/* Destroy a trie and release all the heaped resources.  */
+trie::~trie ()
+{
+  for (trie_node &v : vertices)
+    if (v.heaped)
+      v.far.release ();
+  vertices.release ();
+}
+
+/* Insert PATH into the trie.  */
+bool
+trie::insert (const vec<int>& path)
+{
+  return insert (array_slice <const int> (path));
+}
+
+/* Insert the PATH into the trie.  Duplicate entries will not be entered twice.
+   If PATH is a subpath of another path this will not be detected or if there is
+   a path previously inserted that is a subpath of PATH then this redundancy
+   will not be eliminated.  For that behavior, call insert_with_suffix.  */
+bool
+trie::insert (array_slice<const int> path)
+{
+  size_t index = 0;
+  size_t partition = 0;
+  for (const int v : path)
+    {
+      trie_node &current = vertices[index];
+      current.inserted = true;
+      partition++;
+
+      const auto *xp = current.get (v);
+      if (xp)
+	{
+	  index = xp->val;
+	}
+      else
+	{
+	  /* A new vertex on this path has been created, which means the rest of
+	     the path will also have to be created.  Drain the path and create
+	     the remaining vertices in a single operation.  */
+	  unsigned ix = vertices.length ();
+	  current.put (v, ix);
+	  current.endofpath = false;
+
+	  array_slice<const int> tail (path.begin () + partition,
+				       path.size () - partition);
+	  vertices.safe_grow_cleared (1 + ix + tail.size ());
+
+	  for (const int v : tail)
+	    {
+	      trie_node &last = vertices[ix];
+	      ix += 1;
+	      last.put (v, ix);
+	      last.inserted = true;
+	    }
+
+	  vertices.last ().endofpath = true;
+	  vertices.last ().inserted = true;
+	  len += 1;
+	  return true;
+	}
+    }
+
+  return false;
+}
+
+/* hard_insert is like insert, except it does not overwrite any endofpath flags,
+   and records the endofpath flag even when a superpath of PATH has been
+   inserted previously.  This effectively disables subpath elimination.  */
+bool
+trie::hard_insert (array_slice<const int> path)
+{
+  size_t index = 0;
+  size_t partition = 0;
+  for (const int v : path)
+    {
+      trie_node &current = vertices[index];
+      current.inserted = true;
+      partition++;
+
+      const auto *xp = current.get (v);
+      if (xp)
+	{
+	  index = xp->val;
+	}
+      else
+	{
+	  unsigned ix = vertices.length ();
+	  current.put (v, ix);
+
+	  array_slice<const int> tail (path.begin () + partition,
+				       path.size () - partition);
+	  vertices.safe_grow_cleared (1 + ix + tail.size ());
+
+	  for (const int v : tail)
+	    {
+	      trie_node &last = vertices[ix];
+	      ix += 1;
+	      last.put (v, ix);
+	      last.inserted = true;
+	    }
+
+	  vertices.last ().endofpath = true;
+	  vertices.last ().inserted = true;
+	  len += 1;
+	  return true;
+	}
+    }
+
+  vertices[index].endofpath = true;
+  return false;
+}
+
+/* Insert a suffixes of PATH.  This is identical to insert except that it is
+   assumed that PATH is a subpath, and that the inserted path should clear the
+   inserted and endofpath flags.  This function should only be called by
+   insert_with_suffix.  */
+bool
+trie::insert_suffix (array_slice<const int> path)
+{
+  size_t index = 0;
+  size_t partition = 0;
+  for (const int v : path)
+    {
+      trie_node &current = vertices[index];
+      current.endofpath = false;
+      partition++;
+
+      const auto *xp = current.get (v);
+      if (xp)
+	{
+	  index = xp->val;
+	}
+      else
+	{
+	  /* A new vertex on this path has been created, which means the rest of
+	     the path will also have to be created.  Drain the path and create
+	     the remaining vertices in a single operation.  */
+	  unsigned ix = vertices.length ();
+	  current.put (v, ix);
+
+	  array_slice<const int> tail (path.begin () + partition,
+				       path.size () - partition);
+	  vertices.safe_grow_cleared (1 + ix + tail.size ());
+
+	  for (const int v : tail)
+	    {
+	      vertices[ix].put (v, ix + 1);
+	      ix += 1;
+	    }
+
+	  return true;
+	}
+    }
+
+  vertices[index].endofpath = false;
+  return false;
+}
+
+/* Insert the paths from OTHER into this trie.  */
+void
+trie::merge (const trie& other)
+{
+  auto_vec<int, 32> p {};
+  iter itr = other.paths (p);
+  while (itr.next ())
+    insert_with_suffix (p);
+}
+
+/* Insert PATH and all its subpaths into the trie.  This function implements the
+   redundancy property of the trie - if an inserted path is either a subpath or
+   superpath of some other path then only the longest will keep its inserted
+   flag.  */
+bool
+trie::insert_with_suffix (array_slice<const int> path)
+{
+  bool inserted = insert (path);
+  path = array_slice<const int> (path.begin () + 1, path.size () - 1);
+  while (inserted && !path.empty ())
+    {
+      inserted = insert_suffix (path);
+      path = array_slice<const int> (path.begin () + 1, path.size () - 1);
+    }
+  return inserted;
+}
+
+/* Convert the paths of a trie to a vec-of-vec.  */
+vec<vec<int>>
+trie::paths () const
+{
+  vec<int> path {};
+  vec<vec<int>> all {};
+  auto iter = paths (path);
+  while (iter.next ())
+    all.safe_push (path.copy ());
+  return all;
+}
+
+/* Create an iterator over VERTICES with the caller-provided buffer PATH.  */
+trie::iter::iter (vec<int> &path, const vec<trie_node> &vertices) : path (path),
+  vertices (vertices), stack (vec<frame> {})
+{
+  gcc_checking_assert (!vertices.is_empty ());
+  stack.reserve (13);
+  frame f;
+  f.ix = 0;
+  f.itr = 0;
+  f.end = vertices[0].length ();
+  stack.quick_push (f);
+}
+
+/* Create an iterator over VERTICES with the caller-provided buffer PATH for all
+   paths and subpaths (suffixes) starting in FROM.  Note that FROM will not be
+   in the output buffer PATH, mainly because non-rooted paths are used when
+   splicing with paths that end in FROM.  */
+trie::iter::iter (int from, vec<int> &path, const vec<trie_node> &vertices) :
+  path (path), vertices (vertices), stack (vec<frame> {})
+{
+  gcc_checking_assert (!vertices.is_empty ());
+  stack.reserve (13);
+
+  auto *xp = vertices[0].get (from);
+  if (!xp)
+    {
+      /* No paths start with FROM, so construct an iterator where next () always
+	 returns false.  */
+      frame f;
+      f.ix = 0;
+      f.itr = 0;
+      f.end = 0;
+      stack.safe_push (f);
+      return;
+    }
+
+  frame f;
+  f.ix = xp->val;
+  f.itr = 0;
+  f.end = vertices[f.ix].length ();
+  stack.safe_push (f);
+}
+
+/* Find the next complete prime path in the trie and write it to the caller's
+   buffer.  Returns true if a path is written and false if the iterator is
+   exhausted, in which case no path is written and the contents of the buffer is
+   garbage.  */
+bool
+trie::iter::next ()
+{
+  while (true)
+    {
+      frame &top = stack.last ();
+      const trie_node &vertex = vertices[top.ix];
+
+      if (vertex.endofpath && yield
+	  && (top.itr != top.end || vertex.length () == 0))
+	{
+	  yield = false;
+	  return true;
+	}
+
+      yield = true;
+
+      if (top.itr != top.end)
+	{
+	  const xpair succ = vertex.at (top.itr);
+	  const trie_node &next = vertices[succ.val];
+	  top.itr++;
+
+	  if (!next.inserted)
+	    continue;
+
+	  frame f {};
+	  f.ix = succ.val;
+	  f.itr = 0;
+	  f.end = next.length ();
+	  path.safe_push (succ.key);
+	  stack.safe_push (f);
+	}
+      else
+	{
+	  stack.pop ();
+	  if (stack.is_empty ())
+	    return false;
+	  path.pop ();
+	}
+    }
+}
+
+/* Find the next path in the trie that would continue (but does not include)
+   LIMIT.  If the trie contains the paths [2 4 6 8 9] [2 4 6 8 10] and [2 4 5
+   8], iter.next (8) would yield [2 4 6] and [2 4 5].  Returns true if a path is
+   written and false if the iterator is exhausted, in which case no path is
+   written and the contents of the buffer is garbage.  */
+bool
+trie::iter::next (int limit)
+{
+  while (true)
+    {
+      frame &top = stack.last ();
+      const trie_node &vertex = vertices[top.ix];
+
+      if (yield && top.itr != top.end)
+	{
+	  const xpair succ = vertex.at (top.itr);
+	  const trie_node &next = vertices[succ.val];
+	  const int key = succ.key;
+	  const int val = succ.val;
+	  top.itr++;
+
+	  if (!next.inserted)
+	    continue;
+
+	  if (key == limit)
+	    {
+	      if (path.is_empty ())
+		continue;
+	      yield = false;
+	      return true;
+	    }
+
+	  frame f {};
+	  f.ix = val;
+	  f.itr = 0;
+	  f.end = next.length ();
+	  path.safe_push (key);
+	  stack.safe_push (f);
+	}
+      else
+	{
+	  yield = true;
+	  stack.pop ();
+	  if (stack.is_empty ())
+	    return false;
+	  path.pop ();
+	}
+    }
+}
+
+/* Find the next path in among all paths including subpaths/suffixes.  This is
+   mainly useful in combination with trie.paths (from) for finding the paths
+   that go through some vertex.  */
+bool
+trie::iter::next (bool)
+{
+  while (true)
+    {
+      frame &top = stack.last ();
+      const trie_node &vertex = vertices[top.ix];
+
+      if (yield && vertex.length () == 0)
+	{
+	  yield = false;
+	  return true;
+	}
+
+      yield = true;
+
+      if (top.itr != top.end)
+	{
+	  const xpair succ = vertex.at (top.itr);
+	  const trie_node &next = vertices[succ.val];
+	  top.itr++;
+
+	  frame f {};
+	  f.ix = succ.val;
+	  f.itr = 0;
+	  f.end = next.length ();
+	  path.safe_push (succ.key);
+	  stack.safe_push (f);
+	}
+      else
+	{
+	  stack.pop ();
+	  if (stack.is_empty ())
+	    return false;
+	  path.pop ();
+	}
+    }
+}
+
+/* Return the index of NEEDLE in HAYSTACK, or (size_t)-1 if not found.  */
+template <typename T>
+size_t
+index_of (T needle, const vec <T> &haystack)
+{
+  size_t len = haystack.length ();
+  for (size_t i = 0; i != len; ++i)
+    if (haystack[i] == needle)
+      return i;
+  return (size_t)-1;
+}
+
+/* Check if there is an edge in GRAPH from SRC to DST.  */
+bool
+edge_p (const struct graph *graph, int src, int dest)
+{
+  for (struct graph_edge *e = graph->vertices[src].succ; e; e = e->succ_next)
+    if (e->dest == dest)
+      return true;
+  return false;
+}
+
+/* Check if PATH is a cycle starting (and ending) with V.  */
+bool
+cycle_p (const vec<int>& path, int v)
+{
+  return path[0] == v && path[path.length ()-1] == v;
+}
+
+/* Find the SCC entry-exit paths, the simple paths from ENTRY to EXIT, and add
+   them to OUT.  PRIME_PATHS is the prime paths of the SCC.  Paths are hard
+   inserted into OUT, which disables subpath eliminiation and essentially makes
+   OUT a compact set.  This is important to not eliminate paths from ENTRY to
+   EXIT which are traversed by other ENTRY/EXIT pairs.  Duplicated entries are
+   removed.  */
+void
+scc_entry_exit_paths (const vec<vec<int>> &internal_pp, int entry, int exit,
+		      trie &out)
+{
+  if (entry == exit)
+    {
+      out.hard_insert (array_slice <const int> (&entry, 1));
+      return;
+    }
+
+  for (const vec<int> &path : internal_pp)
+    {
+      const size_t Ven = index_of (entry, path);
+      const size_t Vex = index_of (exit, path);
+
+      if (Ven == (size_t)-1 || Vex == (size_t)-1 || Vex <= Ven)
+	continue;
+
+      const size_t len = (Vex + 1) - Ven;
+      array_slice <const int> p (path.begin () + Ven, len);
+      out.hard_insert (p);
+    }
+}
+
+/* Find the SCC exit paths, the simple paths that starts in a non-entry vertex
+   in the SCC and ends in EXIT and are not a cycles.  INTERNAL_PP are the
+   internal prime paths for the SCC with EXIT as an exit vertex.
+
+   Fazli claims the path must not be a subpath of another exit path in the SCC,
+   but this is only half true: see gcov-29.c/pathcov005a.  Subpaths must survive
+   if they end in a different exit vertex than the superpath, so the hard_insert
+   is important.  */
+void
+scc_exit_paths (const vec<vec<int>> &prime_paths, int exit, trie &out)
+{
+  trie trie;
+  for (const vec<int> &path : prime_paths)
+    {
+      const size_t Vex = index_of (exit, path);
+      if (Vex == (size_t)-1 || cycle_p (path, exit))
+	continue;
+      array_slice <const int> p (path.begin (), Vex + 1);
+      trie.insert_with_suffix (p);
+    }
+
+  auto_vec<int> path {};
+  auto iter = trie.paths (path);
+  while (iter.next ())
+    out.hard_insert (path);
+}
+
+/* Find the SCC entry paths, the simple paths that start in the entry vertex
+   ENTRY and are not cycles.  INTERNAL_PP are the internal prime paths for the
+   SCC with ENTRY as an entry vertex.  The paths are inserted into OUT.  */
+void
+scc_entry_paths (const vec<vec<int>> &internal_pp, int entry, trie &trie)
+{
+  for (const vec<int> &path : internal_pp)
+    {
+      const size_t Ven = index_of (entry, path);
+      if (Ven == (size_t)-1 || cycle_p (path, entry))
+	continue;
+      array_slice <const int> p (path.begin () + Ven, path.length () - Ven);
+      trie.insert (p);
+    }
+}
+
+/* Worker for cfg_complete_prime_paths.  ITR is the current id for the current
+   path.  END is end of the path so that when ITR == END, the walk is completed.
+   EDGES is the matrix of edges where EDGES[src][dst] is set if there is an edge
+   from src to dest.  PATH is the vertices that make up this walk so far.  TRIE
+   is the output trie where paths are inserted.  SCC_ENEX_PATHS are the
+   entry-exit paths found by the scc_entry_exit_paths function.  */
+void
+cfg_complete_prime_paths1 (const int *itr, const int *end,
+			   const sbitmap *edges,
+			   const vec<vec<vec<int>>> &scc_enex_paths,
+			   vec<int> &path, trie &trie)
+{
+  if (itr == end)
+    {
+      trie.insert_with_suffix (path);
+      return;
+    }
+
+  const unsigned pathlen = path.length ();
+  const sbitmap succs = edges[path.last ()];
+  for (const vec<int> &enex : scc_enex_paths[*itr])
+    {
+      if (!bitmap_bit_p (succs, enex[0]))
+	continue;
+
+      path.safe_splice (enex);
+      cfg_complete_prime_paths1 (itr + 1, end, edges, scc_enex_paths,
+				 path, trie);
+      path.truncate (pathlen);
+    }
+}
+
+/* Find the complete prime paths of the CFG, the prime paths that start in the
+   entry vertex and end in the exit vertex.  */
+trie
+cfg_complete_prime_paths (const sbitmap *edges,
+			  const vec<trie> &scc_entry_exit_paths,
+			  const trie &ccfg_prime_paths)
+{
+  trie trie;
+  auto_vec<int, 16> path {};
+  auto_vec<int, 16> cfgpp {};
+  auto_vec_vec_vec scc_enex {};
+  scc_enex.reserve (scc_entry_exit_paths.length ());
+
+  for (size_t i = 0; i != scc_entry_exit_paths.length (); ++i)
+    {
+      scc_enex.quick_push (vec<vec<int>> {});
+      auto iter = scc_entry_exit_paths[i].paths (path);
+      while (iter.next ())
+	scc_enex[i].safe_push (path.copy ());
+    }
+
+  auto iter = ccfg_prime_paths.paths (cfgpp);
+  while (trie.size () < approx_limit && iter.next ())
+    for (const vec<int> &enex : scc_enex[cfgpp[0]])
+      {
+	path.truncate (0);
+	path.safe_splice (enex);
+	cfg_complete_prime_paths1 (cfgpp.begin () + 1, cfgpp.end (), edges,
+				   scc_enex, path, trie);
+      }
+
+  return trie;
+}
+
+/* Find the SCC exit prime paths, the prime paths that start from a strongly
+   connected component and end in the end vertex. SCCS is a vector where SCCS[i]
+   = SCC (vertex_i) so that if vertex[2].component == 5 then SCCS[2] == 5.
+   SCC_EXIT_PATHS is the output of scc_exit_paths ().  COMPLETE_PRIME_PATHS is
+   the output of cfg_complete_prime_paths ().
+
+   This function can suffer under path explosion and will terminate early if
+   the number of inserts in COMPLETE_PRIME_PATHS exceeds approx_limit.  */
+trie
+scc_exit_prime_paths (const struct graph *cfg, const trie &scc_exit_paths,
+		      const trie &complete_prime_paths)
+{
+  trie trie;
+  auto_vec<int, 8> path {};
+  auto_vec<int, 8> r {};
+  auto_vec<int, 8> q {};
+
+  auto exiter = scc_exit_paths.paths (q);
+  while (exiter.next ())
+    {
+      const int Vex = q.last ();
+      auto iter = complete_prime_paths.paths (r, Vex);
+      while (iter.next (true))
+	{
+	  /* There could be multiple Vex in the SCC. Even if scc_exit_paths did
+	     not kill the subpaths, this trie probably would. It can be assumed
+	     that all vertices in q are in the same SCC.
+
+	     This is an important note, as the Fazli and Afsharchi paper does
+	     not properly capture this subtlety.  */
+	  const int p0 = Vex;
+	  const int p1 = r[0];
+
+	  if (cfg->vertices[p0].component == cfg->vertices[p1].component)
+	    continue;
+
+	  path.truncate (0);
+	  path.reserve (q.length () + r.length ());
+	  path.splice (q);
+	  path.splice (r);
+	  /* This can probably insert without subpath elimination because:
+	     1. Conflicts are *really* rare (see patmatch in tree.c), but they
+		do happen.
+	     2. The output of this function is "filtered" through another trie
+		anyway so the redundant paths generated here will be eliminated
+		in the consumers at a very low extra cost.  */
+	  trie.insert (path);
+	  if (trie.size () > approx_limit)
+	    return trie;
+	}
+    }
+
+  return trie;
+}
+
+/* Check if PATH in CFG enters the VERTEX's SCC through VERTEX.  */
+bool
+enters_through_p (const struct graph *cfg, const vec<int> &path, int vertex)
+{
+  gcc_checking_assert (!path.is_empty ());
+  const int last = path.address()[path.length ()-1];
+  if (cfg->vertices[last].component == cfg->vertices[vertex].component)
+    return false;
+  return edge_p (cfg, last, vertex);
+};
+
+/* Worker for scc_entry_prime_paths.  CFG is the CFG for the function,
+   SCC_ENTRY_PATHS the accumulated scc_entry_paths for all the SCCs, PRIME_PATHS
+   is either the result of cfg_complete_prime_paths or exit_prime_paths, and OUT
+   the output trie.
+
+   This function can suffer under path explosion and will terminate early if
+   the number of inserts in OUT exceeds approx_limit.  */
+void
+scc_entry_prime_paths1 (const struct graph *cfg, const trie &scc_entry_paths,
+			const trie &prime_paths, trie &out)
+{
+  auto_vec<int, 8> p {};
+  auto_vec<int, 8> q {};
+  auto_vec<int, 8> path {};
+  auto itr = scc_entry_paths.paths (q);
+  while (itr.next ())
+    {
+      const int Ven = q[0];
+      /* TODO: This might benefit from a reversed trie lookup.  */
+      auto xitr = prime_paths.paths (p);
+      while (xitr.next (Ven))
+	{
+	  if (!enters_through_p (cfg, p, Ven))
+	    continue;
+
+	  path.truncate (0);
+	  path.reserve (p.length () + q.length ());
+	  path.splice (p);
+	  path.splice (q);
+	  out.insert_with_suffix (path);
+	  if (out.size () > approx_limit)
+	    return;
+	}
+    }
+}
+
+/* Find the entry prime paths - the prime paths that start in the root and end
+   in a strongly connected component.  CFG is the CFG for this function,
+   SCC_ENTRY_PATHS the accumulated scc_entry_paths for all the SCCs,
+   COMPLETE_PRIME_PATHS the result of cfg_complete_prime_paths, and
+   EXIT_PRIME_PATHS result of exit_prime_paths.
+
+   This function can suffer under path explosion and will terminate early if
+   the return value grows beyond approx_limit.  */
+trie
+scc_entry_prime_paths (const struct graph *cfg,
+		       const trie &scc_entry_paths,
+		       const trie &complete_prime_paths,
+		       const trie &exit_prime_paths)
+{
+  trie trie;
+  scc_entry_prime_paths1 (cfg, scc_entry_paths, complete_prime_paths, trie);
+  scc_entry_prime_paths1 (cfg, scc_entry_paths, exit_prime_paths, trie);
+  return trie;
+}
+
+/* Build a new control flow graph from the strongly connected components, so
+   that every node in the CCFG is a strongly connected component in the original
+   CFG.  NSSC is the number of vertices in the new graph, and the return value
+   of graphds_ssc.  */
+struct graph*
+build_ccfg (struct graph *cfg, int nscc)
+{
+  struct graph *ccfg = new_graph (nscc);
+  for (int i = 0; i != cfg->n_vertices; ++i)
+    {
+      struct vertex v = cfg->vertices[i];
+      for (struct graph_edge *e = v.succ; e; e = e->succ_next)
+	{
+	  int src = v.component;
+	  int dest = cfg->vertices[e->dest].component;
+	  if (src != dest && !edge_p (ccfg, src, dest))
+	    add_edge (ccfg, src, dest);
+	}
+    }
+
+  return ccfg;
+}
+
+/* Create a new graph from CFG where the edges between strongly connected
+   components removed.  */
+struct graph*
+disconnect_sccs (struct graph *cfg)
+{
+  struct graph *ccfg = new_graph (cfg->n_vertices);
+  const struct vertex *vertices = cfg->vertices;
+  for (int i = 0; i != cfg->n_vertices; ++i)
+    {
+      ccfg->vertices[i].data = &cfg->vertices[i];
+      for (struct graph_edge *e = vertices[i].succ; e; e = e->succ_next)
+	if (vertices[e->src].component == vertices[e->dest].component)
+	  add_edge (ccfg, e->src, e->dest)->data = e;
+    }
+  return ccfg;
+}
+
+/* Check if vertex I in CFG is the entry vertex of a strongly connected
+   component.  A vertex is an entry vertex if 1) there are no predecessors
+   (i.e. the root vertex is always an entry vertex) or 2) a predecessor belongs
+   to a different SCC.  */
+bool
+scc_entry_vertex_p (struct graph *cfg, size_t i)
+{
+  if (!cfg->vertices[i].pred)
+    return true;
+  const int scc = cfg->vertices[i].component;
+  for (struct graph_edge *e = cfg->vertices[i].pred; e; e = e->pred_next)
+    if (cfg->vertices[e->src].component != scc)
+      return true;
+  return false;
+}
+
+/* Check if vertex I in CFG is an exit vertex of a strongly connected component.
+   A vertex is an exit vertex if 1) there are no successors (i.e. the sink is
+   always an exit vertex) or 2) if a successor belongs to a different SCC.  */
+bool
+scc_exit_vertex_p (struct graph *cfg, size_t i)
+{
+  if (!cfg->vertices[i].succ)
+    return true;
+  const int scc = cfg->vertices[i].component;
+  for (struct graph_edge *e = cfg->vertices[i].succ; e; e = e->succ_next)
+    if (cfg->vertices[e->dest].component != scc)
+      return true;
+  return false;
+}
+
+/* Worker for simple_paths.  Find all the simple paths in CFG starting at NODE
+   and insert into OUT.  This is a DFS where the search stops when entering a
+   vertex already in SEEN.  PATH is the sequence of ids for the vertices taken
+   from the from the root to NODE.  When the number of inserts reaches LIMIT
+   the function aborts and returns so the caller can report that it is giving
+   up because the function is too complex.
+
+   This function can suffer under path explosion and will terminate early if
+   the number of inserts in OUT exceeds approx_limit.  */
+void
+simple_paths1 (const struct graph *cfg, int node, sbitmap seen, vec<int> &path,
+	       trie &out)
+{
+  if (out.size () > approx_limit)
+    return;
+
+  if (!bitmap_set_bit (seen, node))
+    {
+      if (path[0] == node)
+	path.quick_push (node);
+      out.insert (path);
+      if (path[0] == node)
+	path.pop ();
+      return;
+    }
+  path.quick_push (node);
+
+  struct graph_edge *succs = cfg->vertices[node].succ;
+  if (!succs)
+    {
+      out.insert (path);
+      bitmap_clear_bit (seen, node);
+      path.pop ();
+      return;
+    }
+
+  for (struct graph_edge *e = succs; e; e = e->succ_next)
+    simple_paths1 (cfg, e->dest, seen, path, out);
+
+  bitmap_clear_bit (seen, node);
+  path.pop ();
+}
+
+/* Find all the simple paths in CFG starting at ROOT and insert into OUT.  A
+   simple path is a sequence of vertices without any duplicated vertices (i.e.
+   no loops).  SEEN should be an sbitmap of CFG->n_vertices size.  PATH and
+   SEEN will be cleared entry and is for buffer reuse between calls.  When the
+   number of inserts reaches LIMIT the function aborts and returns so the
+   caller can report that it is giving up because the function is too complex.
+   Note that there might be fewer prime paths than inserts, but if the number
+   of inserts alone is larger than LIMIT the function is very complex and would
+   take too long to compile in later stages.
+
+   This function can suffer under path explosion and will terminate early if
+   the number of inserts in OUT exceeds approx_limit.  Since OUT is often
+   shared between calls it is ok to use in a loop, and only check the size of
+   OUT after the loop terminates.  */
+void
+simple_paths (struct graph *cfg, int root, sbitmap seen, vec<int> &path,
+	      trie &out)
+{
+  bitmap_clear (seen);
+  path.reserve (cfg->n_vertices);
+  path.truncate (0);
+  simple_paths1 (cfg, root, seen, path, out);
+}
+
+/* Merge the tries T1, T2, T3, and set of paths VECS into the larges trie.
+   Returns a reference to the trie merged into.  Merging tries and resolving
+   redundant paths is the slowest step (at least in the sense it works on the
+   largest input), and merging into a partial result reduces the work
+   accordingly.  For large problems this is a massive improvement, which in the
+   worst cases (where all tries but one are empty or almost empty) speed up
+   30-40%.  */
+trie&
+merge (trie &t1, trie &t2, trie &t3, vec<vec<vec<int>>> &vecs)
+{
+  trie *dst = nullptr;
+  const size_t s1 = t1.size ();
+  const size_t s2 = t2.size ();
+  const size_t s3 = t3.size ();
+
+  if (s1 >= s2 && s1 >= s3)
+    {
+      dst = &t1;
+      t1.merge (t2);
+      t1.merge (t3);
+    }
+  else if (s2 >= s1 && s2 >= s3)
+    {
+      dst = &t2;
+      t2.merge (t1);
+      t2.merge (t3);
+    }
+  else
+    {
+      dst = &t3;
+      t3.merge (t1);
+      t3.merge (t2);
+    }
+
+  gcc_checking_assert (dst);
+  for (const vec<vec<int>> &v2 : vecs)
+    for (const vec<int> &v1 : v2)
+      dst->insert_with_suffix (v1);
+  return *dst;
+}
+
+/* Store the edges of CFG in a matrix of bitmaps so that bit_p (edges[src],
+   dest) is true if there is an edge from src to dest.  This is faster and more
+   convenient than walking the linked list of successors in hot loops.  The
+   vector will have N bitmaps of N bits where N is the number of vertices in
+   CFG.  */
+sbitmap*
+edge_matrix (const struct graph *cfg)
+{
+  sbitmap *edges = sbitmap_vector_alloc (cfg->n_vertices, cfg->n_vertices);
+  bitmap_vector_clear (edges, cfg->n_vertices);
+  for (int i = 0; i != cfg->n_vertices; ++i)
+    for (graph_edge *e = cfg->vertices[i].succ; e; e = e->succ_next)
+      bitmap_set_bit (edges[e->src], e->dest);
+  return edges;
+}
+
+} // namespace
+
+/* Find the prime paths for CFG.  The search gives up after approximate
+   PATHLIMIT probable paths have been generated to address path explosions.
+   The PATHLIMIT flag is typically controlled by -fpath-coverage-limit.  This
+   function is a part of -fpath-coverage and will also be called from gcov.
+   The paths are returned in lexicographical order based on node (basic block)
+   ID.  If the path limit was exceeded, an empty vector is returned.
+
+   A simple path is a path where all vertices are unique, except possibly the
+   first and last.  A prime path is a maximal-length simple path which is not a
+   part of any other simple path.  Prime paths strike a good balance between
+   coverage thoroughness, loops (requiring them to be taken and skipped), and
+   number of paths.
+
+   The algorithm is based on Fazli & Afsarchi's "A Time and Space-Efficient
+   Compositional Method for Prime and Test Paths Generation" (2019), combined
+   with a suffix trie for removing duplicate or redundant paths.  An auxillary
+   graph of the strongly connected components (SCCs) is built.  Then, the prime
+   paths of the SCCs composes the prime paths of each SCC with the prime paths
+   of this auxillary graph.  This can drastically cut the number of redundant
+   paths generated compared to a naive algorithm.
+
+   This does not work for all graphs.  Some structures, e.g. when most of the
+   graph is inside a single SCC, cause the algorithm to degenerate to a naive
+   one.  The same happens for functions with many SCCs that are either
+   singletons or very small.  Those cases will be slower with respect to the
+   number of paths, but still fast enough if the path limit is kept reasonably
+   low (a few hundred thousand).  */
+vec<vec<int>>
+prime_paths (struct graph *cfg, size_t pathlimit)
+{
+  const int nscc = graphds_scc (cfg, NULL);
+  auto_graph disconnected (disconnect_sccs (cfg));
+  auto_graph ccfg (build_ccfg (cfg, nscc));
+  auto_sbitmap_vector edges (edge_matrix (cfg));
+
+  auto_sbitmap seen (cfg->n_vertices);
+  auto_vec<int, 8> pathbuf {};
+
+  approx_limit = pathlimit;
+
+  /* Store an SCC-ID -> vertices mapping to quickly find the vertices that
+     make up a strongly connected component.  */
+  auto_vec_vec sccs {};
+  sccs.safe_grow_cleared (ccfg->n_vertices);
+  for (int i = 0; i != cfg->n_vertices; ++i)
+    sccs[cfg->vertices[i].component].safe_push (i);
+
+  auto_vec_vec_vec scc_internal_pp {};
+  scc_internal_pp.safe_grow_cleared (nscc);
+  for (int i = 0; i != nscc; ++i)
+    {
+      trie internal_pp;
+      for (int V : sccs[i])
+	simple_paths (disconnected, V, seen, pathbuf, internal_pp);
+      scc_internal_pp[i] = internal_pp.paths ();
+      if (limit_checked_add (scc_internal_pp[i].length ()))
+	return {};
+    }
+
+  auto_vec<trie, 8> scc_enex_paths (nscc);
+  scc_enex_paths.safe_grow_cleared (nscc);
+  trie scc_en_paths;
+  trie scc_ex_paths;
+
+  for (int i = 0; i != ccfg->n_vertices; ++i)
+    {
+      for (int Ven : sccs[i])
+	{
+	  if (!scc_entry_vertex_p (cfg, Ven))
+	    continue;
+
+	  for (int Vex : sccs[i])
+	    {
+	      if (!scc_exit_vertex_p (cfg, Vex))
+		continue;
+	      scc_entry_exit_paths (scc_internal_pp[i], Ven, Vex,
+				    scc_enex_paths[i]);
+	    }
+	}
+    }
+
+  for (int i = 0; i != cfg->n_vertices; ++i)
+    {
+      const int scc = cfg->vertices[i].component;
+      if (scc_entry_vertex_p (cfg, i))
+	scc_entry_paths (scc_internal_pp[scc], i, scc_en_paths);
+
+      if (scc_exit_vertex_p (cfg, i))
+	scc_exit_paths (scc_internal_pp[scc], i, scc_ex_paths);
+    }
+
+  /* In the presence of abnormal edges (like longjmp) it is possible to have
+     multiple "entry points" in function -- build ccfg prime paths starting at
+     any vertex without predecessor.  For most graphs this will only be the
+     ENTRY_BLOCK.  */
+  trie ccfg_prime_paths;
+  for (int i = 0; i != ccfg->n_vertices; ++i)
+    if (!ccfg->vertices[i].pred)
+      simple_paths (ccfg, i, seen, pathbuf, ccfg_prime_paths);
+
+  if (ccfg_prime_paths.size () > approx_limit)
+    return {};
+  trie complete_prime_paths = cfg_complete_prime_paths (edges, scc_enex_paths,
+							ccfg_prime_paths);
+  if (limit_checked_add (complete_prime_paths.size ()))
+    return {};
+  trie exit_prime_paths = scc_exit_prime_paths (cfg, scc_ex_paths,
+						complete_prime_paths);
+  if (limit_checked_add (exit_prime_paths.size ()))
+    return {};
+  trie entry_prime_paths = scc_entry_prime_paths (cfg, scc_en_paths,
+						  complete_prime_paths,
+						  exit_prime_paths);
+  if (limit_checked_add (entry_prime_paths.size ()))
+    return {};
+
+  trie &merged = merge (complete_prime_paths, entry_prime_paths,
+			exit_prime_paths, scc_internal_pp);
+  if (merged.size () > pathlimit)
+    return {};
+
+  return merged.paths ();
+}
+
+#if CHECKING_P
+
+namespace selftest
+{
+
+/* Check if the trie contains PATH.  */
+static bool
+contains (const trie &trie, array_slice<const int> path)
+{
+  size_t index = 0;
+  for (int id : path)
+    {
+      const trie_node &current = trie.vertices[index];
+      if (!current.inserted)
+	return false;
+      const auto *xp = current.get (id);
+      if (!xp)
+	return false;
+      index = xp->val;
+    }
+  return trie.vertices[index].inserted && trie.vertices[index].endofpath;
+}
+
+static bool
+equal_p (array_slice<const int> lhs, array_slice<const int> rhs)
+{
+  if (lhs.size () != rhs.size ())
+    return false;
+
+  size_t length = lhs.size();
+  for (size_t i = 0; i != length; ++i)
+    if (lhs[i] != rhs[i])
+      return false;
+  return true;
+}
+
+static bool
+any_equal_p (const array_slice<const int> &needle,
+	     const vec<vec<int>> &haystack)
+{
+  for (const vec<int> &x : haystack)
+    if (equal_p (needle, array_slice <const int> (x)))
+      return true;
+  return false;
+}
+
+static size_t
+count (const trie &trie)
+{
+  size_t n = 0;
+  auto_vec<int> path {};
+  auto iter = trie.paths (path);
+  while (iter.next ())
+    n += 1;
+  return n;
+}
+
+static vec<vec<int>>
+simple_paths (struct graph *cfg, trie &trie, int root = 0)
+{
+  auto_sbitmap seen (cfg->n_vertices);
+  auto_vec<int> path;
+  simple_paths (cfg, root, seen, path, trie);
+  return trie.paths ();
+}
+
+/* Create a CFG that roughly corresponds to this program:
+
+int binary_search(int a[], int len, int from, int to, int key)
+{
+    int low = from;
+    int high = to - 1;
+
+    while (low <= high)
+    {
+	int mid = (low + high) >> 1;
+	long midVal = a[mid];
+
+	if (midVal < key)
+	    low = mid + 1;
+	else if (midVal > key)
+	    high = mid - 1;
+	else
+	    return mid; // key found
+    }
+    return -1;
+}
+
+*/
+static struct graph*
+binary_search_cfg ()
+{
+    struct graph *g = new_graph (11);
+    add_edge (g, 0, 1);
+    add_edge (g, 1, 2);
+    add_edge (g, 2, 3);
+    add_edge (g, 2, 4);
+    add_edge (g, 3, 10);
+    add_edge (g, 4, 5);
+    add_edge (g, 4, 6);
+    add_edge (g, 5, 7);
+    add_edge (g, 6, 8);
+    add_edge (g, 6, 9);
+    add_edge (g, 7, 2);
+    add_edge (g, 8, 10);
+    add_edge (g, 9, 7);
+    graphds_scc (g, NULL);
+    return g;
+}
+
+/* Test a full run of the algorithm against a known graph (binary-search).  */
+static void
+test_prime_paths ()
+{
+  auto_graph g (binary_search_cfg ());
+  vec<vec<int>> paths = prime_paths (g, 100);
+  const int p01[] = { 0, 1, 2, 3, 10 };
+  const int p02[] = { 0, 1, 2, 4, 6, 8, 10 };
+  const int p03[] = { 5, 7, 2, 4, 6, 9 };
+  const int p04[] = { 4, 6, 9, 7, 2, 4 };
+  const int p05[] = { 2, 4, 6, 9, 7, 2 };
+  const int p06[] = { 6, 9, 7, 2, 4, 6 };
+  const int p07[] = { 9, 7, 2, 4, 6, 9 };
+  const int p08[] = { 7, 2, 4, 6, 9, 7 };
+  const int p09[] = { 6, 9, 7, 2, 4, 5 };
+  const int p10[] = { 4, 5, 7, 2, 4 };
+  const int p11[] = { 2, 4, 5, 7, 2 };
+  const int p12[] = { 5, 7, 2, 4, 5 };
+  const int p13[] = { 7, 2, 4, 5, 7 };
+  const int p14[] = { 4, 6, 9, 7, 2, 3, 10 };
+  const int p15[] = { 5, 7, 2, 4, 6, 8, 10 };
+  const int p16[] = { 9, 7, 2, 4, 6, 8, 10 };
+  const int p17[] = { 4, 5, 7, 2, 3, 10 };
+  const int p18[] = { 0, 1, 2, 4, 6, 9, 7 };
+  const int p19[] = { 0, 1, 2, 4, 5, 7 };
+
+  ASSERT_EQ (paths.length (), 19);
+  ASSERT_TRUE (any_equal_p (p01, paths));
+  ASSERT_TRUE (any_equal_p (p02, paths));
+  ASSERT_TRUE (any_equal_p (p03, paths));
+  ASSERT_TRUE (any_equal_p (p04, paths));
+  ASSERT_TRUE (any_equal_p (p05, paths));
+  ASSERT_TRUE (any_equal_p (p06, paths));
+  ASSERT_TRUE (any_equal_p (p07, paths));
+  ASSERT_TRUE (any_equal_p (p08, paths));
+  ASSERT_TRUE (any_equal_p (p09, paths));
+  ASSERT_TRUE (any_equal_p (p10, paths));
+  ASSERT_TRUE (any_equal_p (p11, paths));
+  ASSERT_TRUE (any_equal_p (p12, paths));
+  ASSERT_TRUE (any_equal_p (p13, paths));
+  ASSERT_TRUE (any_equal_p (p14, paths));
+  ASSERT_TRUE (any_equal_p (p15, paths));
+  ASSERT_TRUE (any_equal_p (p16, paths));
+  ASSERT_TRUE (any_equal_p (p17, paths));
+  ASSERT_TRUE (any_equal_p (p18, paths));
+  ASSERT_TRUE (any_equal_p (p19, paths));
+  release_vec_vec (paths);
+}
+
+/* The strongly connected component graph for binary_search looks like
+    this, using the vertex numbers from the original graph:
+
+    START
+      |
+      1
+      |
+      2 (SCC)
+     / \
+    3   8
+     \ /
+     END
+
+  The components gets renumbered by graphds_scc, so the ccfg looks like
+  this.  The actual numbers don't matter as long as the structure of the
+  graph is preserved, and this test is now sensitive to the numbering set
+  by graphds_scc.  It does not have to be - if that function should reverse
+  the numbering this test must be updated.
+
+      5
+      |
+      4
+      |
+      3 (SCC)
+     / \
+    2   1
+     \ /
+      0
+*/
+static void
+test_build_ccfg ()
+{
+  auto_graph cfg (binary_search_cfg ());
+  const int nscc = graphds_scc (cfg, NULL);
+  auto_graph ccfg (build_ccfg (cfg, nscc));
+  ASSERT_EQ (6, nscc);
+
+  ASSERT_TRUE (edge_p (ccfg, 5, 4));
+  ASSERT_TRUE (edge_p (ccfg, 4, 3));
+  ASSERT_TRUE (edge_p (ccfg, 3, 2));
+  ASSERT_TRUE (edge_p (ccfg, 3, 1));
+  ASSERT_TRUE (edge_p (ccfg, 2, 0));
+  ASSERT_TRUE (edge_p (ccfg, 1, 0));
+}
+
+/* This test checks some basic assumptions on finding the strongly connected
+   components and disconnecting the graph by removing all edges between SCCs.
+   Creating a single auxillary graph simplifies the bookkeeping.  */
+static void
+test_split_components ()
+{
+  auto_graph cfg (binary_search_cfg ());
+  int nscc = graphds_scc (cfg, NULL);
+  auto_graph ccfg (disconnect_sccs (cfg));
+
+  vec<vec<int>> entries {};
+  vec<vec<int>> exits {};
+  entries.safe_grow_cleared (nscc);
+  exits.safe_grow_cleared (nscc);
+
+  for (int i = 0; i != cfg->n_vertices; ++i)
+    {
+      if (scc_entry_vertex_p (cfg, i))
+	entries[cfg->vertices[i].component].safe_push (i);
+      if (scc_exit_vertex_p (cfg, i))
+	exits[cfg->vertices[i].component].safe_push (i);
+    }
+
+  const int p10[] = { 10 };
+  const int p08[] = { 8 };
+  const int p03[] = { 3 };
+  const int p02[] = { 2 };
+  const int p01[] = { 1 };
+  const int p00[] = { 0 };
+  const int p26[] = { 2, 6 };
+
+  ASSERT_EQ (entries.length (), 6);
+  ASSERT_TRUE (any_equal_p (p10, entries));
+  ASSERT_TRUE (any_equal_p (p08, entries));
+  ASSERT_TRUE (any_equal_p (p03, entries));
+  ASSERT_TRUE (any_equal_p (p02, entries));
+  ASSERT_TRUE (any_equal_p (p01, entries));
+  ASSERT_TRUE (any_equal_p (p00, entries));
+
+  ASSERT_EQ (exits.length (), 6);
+  ASSERT_TRUE (any_equal_p (p10, exits));
+  ASSERT_TRUE (any_equal_p (p08, exits));
+  ASSERT_TRUE (any_equal_p (p03, exits));
+  ASSERT_TRUE (any_equal_p (p26, exits));
+  ASSERT_TRUE (any_equal_p (p01, exits));
+  ASSERT_TRUE (any_equal_p (p00, exits));
+
+  /* The result of disconnect_sccs () disconnects the graph into its strongly
+     connected components. The subgraphs are either singletons (a single
+     vertex with no edges) or graphs with cycles.  The SCC internal prime
+     paths can be found by running a DFS from every SCC vertex, terminating
+     on a duplicated vertex.  This may create some redundant paths still,
+     which must be filtered out.
+
+     Singletons can either be detected and skipped (requires counting the
+     components) or filtered after.  For this test case they are skipped
+     because other graph inconsistencies are easier to detect.  */
+
+  /* Count and check singleton components.  */
+  vec<int> scc_size {};
+  scc_size.safe_grow_cleared (nscc);
+  for (int i = 0; i != cfg->n_vertices; ++i)
+    scc_size[cfg->vertices[i].component]++;
+  ASSERT_EQ (nscc, 6);
+  ASSERT_EQ (scc_size[0], 1);
+  ASSERT_EQ (scc_size[1], 1);
+  ASSERT_EQ (scc_size[2], 1);
+  ASSERT_EQ (scc_size[3], 6);
+  ASSERT_EQ (scc_size[4], 1);
+  ASSERT_EQ (scc_size[5], 1);
+
+  /* Manually unroll the loop finding the simple paths starting at the
+     vertices in the SCCs.  In this case there is only the one SCC.  */
+  trie ccfg_paths;
+  simple_paths (ccfg, ccfg_paths, 2);
+  simple_paths (ccfg, ccfg_paths, 4);
+  simple_paths (ccfg, ccfg_paths, 5);
+  simple_paths (ccfg, ccfg_paths, 6);
+  simple_paths (ccfg, ccfg_paths, 7);
+  simple_paths (ccfg, ccfg_paths, 9);
+  /* then in+out of trie */
+  vec<vec<int>> xscc_internal_pp = ccfg_paths.paths ();
+  trie scc_internal_pp;
+  for (auto &p : xscc_internal_pp)
+    scc_internal_pp.insert_with_suffix (p);
+
+  const int pp01[] = { 5, 7, 2, 4, 6, 9 };
+  const int pp02[] = { 4, 5, 7, 2, 4 };
+  const int pp03[] = { 4, 6, 9, 7, 2, 4 };
+  const int pp04[] = { 2, 4, 5, 7, 2 };
+  const int pp05[] = { 2, 4, 6, 9, 7, 2 };
+  const int pp06[] = { 5, 7, 2, 4, 5 };
+  const int pp07[] = { 6, 9, 7, 2, 4, 6 };
+  const int pp08[] = { 7, 2, 4, 5, 7 };
+  const int pp09[] = { 9, 7, 2, 4, 6, 9 };
+  const int pp10[] = { 7, 2, 4, 6, 9, 7 };
+  const int pp11[] = { 6, 9, 7, 2, 4, 5 };
+
+  ASSERT_EQ (count (scc_internal_pp), 11);
+  ASSERT_TRUE (contains (scc_internal_pp, pp01));
+  ASSERT_TRUE (contains (scc_internal_pp, pp02));
+  ASSERT_TRUE (contains (scc_internal_pp, pp03));
+  ASSERT_TRUE (contains (scc_internal_pp, pp04));
+  ASSERT_TRUE (contains (scc_internal_pp, pp05));
+  ASSERT_TRUE (contains (scc_internal_pp, pp06));
+  ASSERT_TRUE (contains (scc_internal_pp, pp07));
+  ASSERT_TRUE (contains (scc_internal_pp, pp08));
+  ASSERT_TRUE (contains (scc_internal_pp, pp09));
+  ASSERT_TRUE (contains (scc_internal_pp, pp10));
+  ASSERT_TRUE (contains (scc_internal_pp, pp11));
+}
+
+/* The remaining tests deconstructs the algorithm and runs only a single phase
+   with good inputs at that point.  This makes it easier to pinpoint where
+   things go wrong, and helps show in steps how the algorithm works and the
+   phases relate.
+
+   The phases and their inputs and outputs are bazed on Fazli & Afsarchi.  */
+
+static void
+test_scc_internal_prime_paths ()
+{
+  /* This graph has only the SCC subgraph.  The result of running prime-paths
+     on it should be the SCC internal prime paths of the full graph.  */
+  auto_graph scc (new_graph (11));
+  add_edge (scc, 0, 2);
+  add_edge (scc, 2, 4);
+  add_edge (scc, 4, 5);
+  add_edge (scc, 4, 6);
+  add_edge (scc, 5, 7);
+  add_edge (scc, 6, 9);
+  add_edge (scc, 9, 7);
+  add_edge (scc, 7, 2);
+
+  vec<vec<int>> paths = prime_paths (scc, 100);
+  const int p01[] = { 5, 7, 2, 4, 6, 9 };
+  const int p02[] = { 4, 6, 9, 7, 2, 4 };
+  const int p03[] = { 2, 4, 6, 9, 7, 2 };
+  const int p04[] = { 6, 9, 7, 2, 4, 6 };
+  const int p05[] = { 9, 7, 2, 4, 6, 9 };
+  const int p06[] = { 7, 2, 4, 6, 9, 7 };
+  const int p07[] = { 6, 9, 7, 2, 4, 5 };
+  const int p08[] = { 4, 5, 7, 2, 4 };
+  const int p09[] = { 2, 4, 5, 7, 2 };
+  const int p10[] = { 5, 7, 2, 4, 5 };
+  const int p11[] = { 7, 2, 4, 5, 7 };
+
+  ASSERT_TRUE (any_equal_p (p01, paths));
+  ASSERT_TRUE (any_equal_p (p02, paths));
+  ASSERT_TRUE (any_equal_p (p03, paths));
+  ASSERT_TRUE (any_equal_p (p04, paths));
+  ASSERT_TRUE (any_equal_p (p05, paths));
+  ASSERT_TRUE (any_equal_p (p06, paths));
+  ASSERT_TRUE (any_equal_p (p07, paths));
+  ASSERT_TRUE (any_equal_p (p08, paths));
+  ASSERT_TRUE (any_equal_p (p09, paths));
+  ASSERT_TRUE (any_equal_p (p10, paths));
+  ASSERT_TRUE (any_equal_p (p11, paths));
+  release_vec_vec (paths);
+}
+
+/* Test the entry/exit path helpers for the strongly connected component in
+   binary_search.  The SCC has one entry (2, the loop header) and two exits (2,
+   6, the loop exit and return).  */
+static void
+test_scc_entry_exit_paths ()
+{
+  auto_graph scc (new_graph (11));
+  add_edge (scc, 2, 4);
+  add_edge (scc, 4, 5);
+  add_edge (scc, 4, 6);
+  add_edge (scc, 5, 7);
+  add_edge (scc, 6, 9);
+  add_edge (scc, 9, 7);
+  add_edge (scc, 7, 2);
+
+  trie scc_internal_trie;
+  simple_paths (scc, scc_internal_trie, 2);
+  simple_paths (scc, scc_internal_trie, 4);
+  simple_paths (scc, scc_internal_trie, 5);
+  simple_paths (scc, scc_internal_trie, 6);
+  simple_paths (scc, scc_internal_trie, 7);
+  simple_paths (scc, scc_internal_trie, 9);
+  vec<vec<int>> scc_prime_paths = scc_internal_trie.paths ();
+
+  trie entry_exits {};
+  scc_entry_exit_paths (scc_prime_paths, 2, 2, entry_exits);
+  scc_entry_exit_paths (scc_prime_paths, 2, 6, entry_exits);
+
+  const int p01[] = { 2 };
+  const int p02[] = { 2, 4, 6 };
+
+  ASSERT_EQ (count (entry_exits), 2);
+  ASSERT_TRUE (contains (entry_exits, p01));
+  ASSERT_TRUE (contains (entry_exits, p02));
+
+  trie exits;
+  scc_exit_paths (scc_prime_paths, 2, exits);
+  scc_exit_paths (scc_prime_paths, 6, exits);
+
+  const int p03[] = { 4, 6, 9, 7, 2 };
+  const int p04[] = { 5, 7, 2, 4, 6 };
+  const int p05[] = { 9, 7, 2, 4, 6 };
+  const int p06[] = { 4, 5, 7, 2 };
+
+  ASSERT_EQ (count (exits), 4);
+  ASSERT_TRUE (contains (exits, p03));
+  ASSERT_TRUE (contains (exits, p04));
+  ASSERT_TRUE (contains (exits, p05));
+  ASSERT_TRUE (contains (exits, p06));
+
+  trie entries;
+  scc_entry_paths (scc_prime_paths, 2, entries);
+
+  const int p07[] = { 2, 4, 6, 9, 7 };
+  const int p08[] = { 2, 4, 5, 7 };
+  ASSERT_EQ (count (entries), 2);
+  ASSERT_TRUE (contains (entries, p07));
+  ASSERT_TRUE (contains (entries, p08));
+
+  release_vec_vec (scc_prime_paths);
+}
+
+static void
+test_complete_prime_paths ()
+{
+  const int ccfgpp0[] = { 0, 1, 2, 3, 5 };
+  const int ccfgpp1[] = { 0, 1, 2, 4, 5 };
+  trie ccfg_prime_paths {};
+  ccfg_prime_paths.insert (ccfgpp0);
+  ccfg_prime_paths.insert (ccfgpp1);
+
+  const int scc0[] = { 2 };
+  const int scc1[] = { 2, 4, 6 };
+
+  const int ccfg_single[] = { 0, 1, 3, 8, 10 };
+
+  auto_graph cfg (binary_search_cfg ());
+  auto_sbitmap_vector edges (sbitmap_vector_alloc (cfg->n_vertices,
+						   cfg->n_vertices));
+  bitmap_vector_clear (edges, cfg->n_vertices);
+  for (int i = 0; i != cfg->n_vertices; ++i)
+    for (graph_edge *e = cfg->vertices[i].succ; e; e = e->succ_next)
+      bitmap_set_bit (edges[e->src], e->dest);
+
+  vec<trie> ccfg_paths {};
+  ccfg_paths.safe_grow_cleared (6);
+  ccfg_paths[0].insert (array_slice <const int> (ccfg_single + 0, 1));
+  ccfg_paths[1].insert (array_slice <const int> (ccfg_single + 1, 1));
+  ccfg_paths[3].insert (array_slice <const int> (ccfg_single + 2, 1));
+  ccfg_paths[4].insert (array_slice <const int> (ccfg_single + 3, 1));
+  ccfg_paths[5].insert (array_slice <const int> (ccfg_single + 4, 1));
+
+  /* The paths for the SCC would need to be updated in ccfg pre pass.  This
+     feels like a clumsy interface and should maybe be disconnected from the
+     struct graph.  */
+  ccfg_paths[2].hard_insert (scc0);
+  ccfg_paths[2].hard_insert (scc1);
+
+  trie cpp = cfg_complete_prime_paths (edges, ccfg_paths, ccfg_prime_paths);
+
+  const int p01[] = { 0, 1, 2, 3, 10 };
+  const int p02[] = { 0, 1, 2, 4, 6, 8, 10 };
+
+  ASSERT_EQ (count (cpp), 2);
+  ASSERT_TRUE (contains (cpp, p01));
+  ASSERT_TRUE (contains (cpp, p02));
+}
+
+static vec<int>
+binary_search_scc_map ()
+{
+  vec<int> sccs {};
+  sccs.safe_grow (11);
+  sccs[0] = 0;
+  sccs[1] = 1;
+  sccs[2] = 2;
+  sccs[3] = 3;
+  sccs[4] = 2;
+  sccs[5] = 2;
+  sccs[6] = 2;
+  sccs[7] = 2;
+  sccs[8] = 4;
+  sccs[9] = 2;
+  sccs[10] = 5;
+  return sccs;
+}
+
+static void
+test_exit_prime_paths ()
+{
+  const int cpp01[] = { 0, 1, 2, 3, 10 };
+  const int cpp02[] = { 0, 1, 2, 4, 6, 8, 10 };
+  trie complete_prime_paths {};
+  complete_prime_paths.insert_with_suffix (cpp01);
+  complete_prime_paths.insert_with_suffix (cpp02);
+
+  const int ep01[] = { 4, 6, 9, 7, 2 };
+  const int ep02[] = { 5, 7, 2, 4, 6 };
+  const int ep03[] = { 9, 7, 2, 4, 6 };
+  const int ep04[] = { 4, 5, 7, 2 };
+  trie exit_paths;
+  exit_paths.insert (ep01);
+  exit_paths.insert (ep02);
+  exit_paths.insert (ep03);
+  exit_paths.insert (ep04);
+
+  auto_graph cfg (binary_search_cfg ());
+  trie epp = scc_exit_prime_paths (cfg, exit_paths, complete_prime_paths);
+
+  const int pp01[] = { 4, 6, 9, 7, 2, 3, 10 };
+  const int pp02[] = { 5, 7, 2, 4, 6, 8, 10 };
+  const int pp03[] = { 9, 7, 2, 4, 6, 8, 10 };
+  const int pp04[] = { 4, 5, 7, 2, 3, 10 };
+
+  ASSERT_EQ (count (epp), 4);
+  ASSERT_TRUE (contains (epp, pp01));
+  ASSERT_TRUE (contains (epp, pp02));
+  ASSERT_TRUE (contains (epp, pp03));
+  ASSERT_TRUE (contains (epp, pp04));
+}
+
+static void
+test_entry_prime_paths ()
+{
+  auto_graph cfg (binary_search_cfg ());
+
+  static int sccep01[] = { 2, 4, 6, 9, 7 };
+  static int sccep02[] = { 2, 4, 5, 7 };
+  trie scc_entry_paths;
+  scc_entry_paths.insert (sccep01);
+  scc_entry_paths.insert (sccep02);
+
+  const int cpp01[] = { 0, 1, 2, 3, 10 };
+  const int cpp02[] = { 0, 1, 2, 4, 6, 8, 10 };
+  trie complete_prime_paths {};
+  complete_prime_paths.insert (cpp01);
+  complete_prime_paths.insert (cpp02);
+
+  const int ep01[] = { 4, 6, 9, 7, 2, 3, 10 };
+  const int ep02[] = { 5, 7, 2, 4, 6, 8, 10 };
+  const int ep03[] = { 9, 7, 2, 4, 6, 8, 10 };
+  const int ep04[] = { 4, 5, 7, 2, 3, 10 };
+  trie exit_prime_paths {};
+  exit_prime_paths.insert (ep01);
+  exit_prime_paths.insert (ep02);
+  exit_prime_paths.insert (ep03);
+  exit_prime_paths.insert (ep04);
+
+  vec<int> sccs = binary_search_scc_map ();
+
+  trie epp = scc_entry_prime_paths (cfg, scc_entry_paths,
+				    complete_prime_paths,
+				    exit_prime_paths);
+
+  /* The 0 (start node) does not show up in the Fazli & Afschardi paper and
+     kinda, but has no real impact on the result.  The prime-paths functions
+     do not care about these vertices, but the path-coverage instrumentation
+     filters out the ENTRY/EXIT blocks from all the paths.  */
+  const int pp01[] = { 0, 1, 2, 4, 6, 9, 7 };
+  const int pp02[] = { 0, 1, 2, 4, 5, 7 };
+  ASSERT_EQ (count (epp), 2);
+  ASSERT_TRUE (contains (epp, pp01));
+  ASSERT_TRUE (contains (epp, pp02));
+}
+
+/* A straight-line graph with one vertex should yield a single path of length 1
+   with just the non-exit non-entry node in it.  */
+void
+test_singleton_path ()
+{
+  auto_graph cfg (new_graph (3));
+  add_edge (cfg, 0, 2);
+  add_edge (cfg, 2, 1);
+  vec<vec<int>> paths = prime_paths (cfg, 100);
+
+  ASSERT_EQ (paths.length (), 1);
+  ASSERT_EQ (paths[0].length (), 3);
+  ASSERT_EQ (paths[0][0], 0);
+  ASSERT_EQ (paths[0][1], 2);
+  ASSERT_EQ (paths[0][2], 1);
+  release_vec_vec (paths);
+}
+
+void
+path_coverage_cc_tests ()
+{
+  approx_limit = 250000;
+  test_prime_paths ();
+  test_build_ccfg ();
+  test_split_components ();
+  test_scc_internal_prime_paths ();
+  test_scc_entry_exit_paths ();
+  test_complete_prime_paths ();
+  test_exit_prime_paths ();
+  test_entry_prime_paths ();
+  test_singleton_path ();
+}
+
+} // namespace selftest
+
+#endif /* #if CHECKING_P */
diff --git a/gcc/profile.cc b/gcc/profile.cc
index 25d4f4a4b86..c3d06c4d387 100644
--- a/gcc/profile.cc
+++ b/gcc/profile.cc
@@ -1545,7 +1545,7 @@ branch_prob (bool thunk)
 
   remove_fake_edges ();
 
-  if (condition_coverage_flag || profile_arc_flag)
+  if (condition_coverage_flag || path_coverage_flag || profile_arc_flag)
       gimple_init_gcov_profiler ();
 
   if (condition_coverage_flag)
@@ -1597,6 +1597,10 @@ branch_prob (bool thunk)
 	instrument_values (values);
     }
 
+  void find_paths(struct function*);
+  if (path_coverage_flag)
+    find_paths(cfun);
+
   free_aux_for_edges ();
 
   values.release ();
diff --git a/gcc/selftest-run-tests.cc b/gcc/selftest-run-tests.cc
index e6779206c47..9361e43ccdf 100644
--- a/gcc/selftest-run-tests.cc
+++ b/gcc/selftest-run-tests.cc
@@ -105,6 +105,7 @@ selftest::run_tests ()
   diagnostic_path_cc_tests ();
   simple_diagnostic_path_cc_tests ();
   attribs_cc_tests ();
+  path_coverage_cc_tests ();
 
   /* This one relies on most of the above.  */
   function_tests_cc_tests ();
diff --git a/gcc/selftest.h b/gcc/selftest.h
index dcb1463ed90..4df9c0dd836 100644
--- a/gcc/selftest.h
+++ b/gcc/selftest.h
@@ -241,6 +241,7 @@ extern void json_cc_tests ();
 extern void optinfo_emit_json_cc_tests ();
 extern void opts_cc_tests ();
 extern void ordered_hash_map_tests_cc_tests ();
+extern void path_coverage_cc_tests ();
 extern void predict_cc_tests ();
 extern void pretty_print_cc_tests ();
 extern void range_tests ();
diff --git a/gcc/testsuite/g++.dg/gcov/gcov-22.C b/gcc/testsuite/g++.dg/gcov/gcov-22.C
new file mode 100644
index 00000000000..69e0728e3e4
--- /dev/null
+++ b/gcc/testsuite/g++.dg/gcov/gcov-22.C
@@ -0,0 +1,170 @@
+/* { dg-options "--coverage -fpath-coverage" } */
+/* { dg-do compile } */
+
+#include <stdexcept>
+
+/* This test is a collection of odd CFG shapes which has been shown to
+   trigger ICEs.  */
+
+/* A hard abort-like exit could lead to paths with only the exit node,
+   which would then be an empty path after entry/exit prunng.
+
+   BEGIN paths
+   summary: 0/1  */
+void xabort () { __builtin_exit (0); }
+/* END */
+
+/* Unconditional exceptions have the same effect as aborts
+   w.r.t. empty paths.  */
+int throws (int) { throw std::runtime_error("exception"); }
+
+int _setjmp (void **);
+/* Bad instrumentation would give
+   'error: returns_twice call is not first in basic block 3'.
+
+   BEGIN paths
+   summary: 0/1  */
+void set_jmp ()
+/* END */
+{
+  _setjmp (0);
+}
+
+/* From g++.dg/torture/pr83482.C  */
+void a();
+struct b {
+  virtual long c() { return 0L; }
+  void m_fn2() { c(); }
+} d;
+
+/* BEGIN paths
+   summary: 0/3  */
+void e() {
+/* END */
+  d.m_fn2();
+  try {
+    a();
+    _setjmp(0);
+  } catch (...) {
+  }
+}
+
+/* BEGIN paths
+   summary: 0/1  */
+void multiple_setjmp ()
+/* END */
+{
+  _setjmp (0);
+  _setjmp (0);
+}
+
+/* loop_setjmp and loop_setjmp_continue are based on patterns found in
+   expat-2.5.0 tests/minicheck.c srunner_run_all.  */
+void loop_setjmp (int n)
+{
+  for (int i = 0; i < n; ++i)
+    _setjmp (0);
+}
+
+void loop_setjmp_continue (int n)
+{
+  for (int i = 0; i < n; ++i)
+    {
+      if (i < n / 4)
+	if (_setjmp (0))
+	  continue;
+
+      if (_setjmp (0))
+	continue;
+
+      if (i < n / 2)
+	if (_setjmp (0))
+	  continue;
+    }
+}
+
+void loop_setjmp_infinite (int n)
+{
+  for (;;)
+    _setjmp (0);
+}
+
+/* BEGIN paths
+   summary: 0/2  */
+void multiple_conditional_setjmp (int a)
+/* END */
+{
+  if (a)
+    _setjmp (0);
+  else
+    _setjmp (0);
+
+  _setjmp (0);
+}
+
+/* Infinite loops can create CFGs that are a bit different, e.g. no edge for
+   skipping or brekaing out of the loop.  */
+int id (int x) { return x; }
+int infinite1 ()
+{
+  for (int c = 0; ; c++) id (c);
+}
+
+void infinite2 ()
+{
+  for (;;) {}
+}
+
+/* This function has multiple SCCs (loops), but one has too many internal paths
+   (2^32).  If the scc-internal-paths is not limited this function will not
+   compile in reasonable time.  */
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wcoverage-too-many-paths"
+void too_many_paths_single_scc (int x)
+{
+  int z = 0;
+  int c = 0;
+
+  for (int i = 0; i != x; ++i)
+    for (int k = i; k != x; ++k)
+      c++;
+
+  for (int i = 0; i != 100; ++i)
+    {
+      if (z + i + 0 < x) z++;
+      if (z + i + 1 < x) z++;
+      if (z + i + 2 < x) z++;
+      if (z + i + 3 < x) z++;
+      if (z + i + 4 < x) z++;
+      if (z + i + 5 < x) z++;
+      if (z + i + 6 < x) z++;
+      if (z + i + 7 < x) z++;
+      if (z + i + 8 < x) z++;
+      if (z + i + 9 < x) z++;
+      if (z + i + 10 < x) z++;
+      if (z + i + 11 < x) z++;
+      if (z + i + 12 < x) z++;
+      if (z + i + 13 < x) z++;
+      if (z + i + 14 < x) z++;
+      if (z + i + 15 < x) z++;
+      if (z + i + 16 < x) z++;
+      if (z + i + 17 < x) z++;
+      if (z + i + 18 < x) z++;
+      if (z + i + 19 < x) z++;
+      if (z + i + 20 < x) z++;
+      if (z + i + 21 < x) z++;
+      if (z + i + 22 < x) z++;
+      if (z + i + 23 < x) z++;
+      if (z + i + 24 < x) z++;
+      if (z + i + 25 < x) z++;
+      if (z + i + 26 < x) z++;
+      if (z + i + 27 < x) z++;
+      if (z + i + 28 < x) z++;
+      if (z + i + 29 < x) z++;
+      if (z + i + 30 < x) z++;
+      if (z + i + 31 < x) z++;
+    }
+}
+#pragma GCC diagnostic pop
+
+/* { dg-final { run-gcov prime-paths { --prime-paths gcov-22.C } } } */
diff --git a/gcc/testsuite/gcc.misc-tests/gcov-29.c b/gcc/testsuite/gcc.misc-tests/gcov-29.c
new file mode 100644
index 00000000000..320570ec300
--- /dev/null
+++ b/gcc/testsuite/gcc.misc-tests/gcov-29.c
@@ -0,0 +1,869 @@
+/* { dg-options "--coverage -fpath-coverage" } */
+/* { dg-do run { target native } } */
+
+void
+pathcov001a ()
+{
+  /* Empty functions should not be problematic.  */
+}
+
+/* Straight line, which should have only one path.  */
+/* BEGIN paths
+   summary: 1/1
+*/
+void
+pathcov001b ()
+/* END */
+{
+  int a = 0;
+  ++a;
+}
+
+/* Same as b, but not executed.  */
+/* BEGIN paths
+   summary: 0/1
+   expect: 33
+*/
+void
+pathcov001c ()
+/* END */
+{
+  int a = 0;
+  ++a;
+}
+
+/* 002 is a simple baseline test, with no complicated control flow and no
+   loops, run with different combinations of inputs that tests the paths in
+   isolation.  */
+/* BEGIN paths
+   summary: 0/2
+   expect: 48(true) 49 52
+   expect: 48(false) 51 52
+*/
+void
+pathcov002a (int a)
+/* END */
+{
+  int v = 0;
+  if (a)
+    v++;
+  else
+    v--;
+}
+
+/* BEGIN paths
+    summary: 1/2
+    expect: 63(false) 66 67
+*/
+void
+pathcov002c (int a)
+/* END */
+{
+  int v = 0;
+  if (a)
+    v++;
+  else
+    v--;
+}
+
+/* BEGIN paths
+   summary: 1/2
+   expect: 78(true) 79 82
+*/
+void
+pathcov002b (int a)
+/* END */
+{
+  int v = 0;
+  if (a)
+    v++;
+  else
+    v--;
+}
+
+/* Identical to 002*, but run for both inputs.  This should achieve full
+   coverage.
+
+   BEGIN paths
+   summary: 2/2
+*/
+void
+pathcov002d (int a)
+/* END */
+{
+  int v = 0;
+  if (a)
+    v++;
+  else
+    v--;
+}
+
+/* Test individual control flow structures in isolation.  */
+
+/* BEGIN paths
+   summary: 0/2
+   expect: 112(true) 113 114
+   expect: 112(false) 114
+*/
+void
+pathcov003a (int a)
+/* END */
+{
+  if (a)
+    a++;
+}
+
+/* BEGIN paths
+   summary: 0/2
+   expect: 125(true) 126 129
+   expect: 125(false) 128 129
+*/
+void
+pathcov003b (int a)
+/* END */
+{
+  if (a)
+    a++;
+  else
+    a--;
+}
+
+/* BEGIN paths
+   summary: 0/3
+   expect: 141(true) 142 147
+   expect: 141(false) 143(true) 144 147
+   expect: 141(false) 143(false) 146 147
+*/
+void
+pathcov003c (int a)
+/* END */
+{
+  if (a > 10)
+    a++;
+  else if (a > 20)
+    a--;
+  else
+    a += 2;
+}
+
+/* BEGIN paths
+   summary: 0/5
+   expect: 162 162(true) 163
+   expect: 162 162(false) 164
+   expect: 163 162(true) 163
+   expect: 163 162(false) 164
+   expect: 162(true) 163 162
+*/
+void
+pathcov003d (int a)
+/* END */
+{
+  int x = 0;
+  for (int i = 0; i < a; ++i)
+    ++x;
+}
+
+/* BEGIN paths
+   summary: 0/5
+   expect: 180 180(true) 181
+   expect: 180 180(false) 182
+   expect: 181 180(true) 181
+   expect: 181 180(false) 182
+   expect: 180(true) 181 180
+*/
+void
+pathcov003e (int a)
+/* END */
+{
+  int x = 0;
+  int i = 0;
+  while (i++ < a)
+    x++;
+}
+
+/* BEGIN paths
+   summary: 0/2
+   expect: 194 197(false) 198
+   expect: 197(true) 197
+*/
+void
+pathcov003f (int a)
+/* END */
+{
+  int x = 0;
+  int i = 0;
+  do {
+      x++;
+  } while (i++ < a);
+}
+
+/* BEGIN paths
+   summary: 0/5
+   expect: 213 216(true) 220
+   expect: 213 216(false) 222
+   expect: 216(true) 220 216
+   expect: 220 216(true) 220
+   expect: 220 216(false) 222
+*/
+void
+pathcov003g (int a)
+/* END */
+{
+  int i = 0;
+  int x = 0;
+
+top:
+  if (i < a)
+    {
+      x++;
+      i++;
+      goto top;
+    }
+}
+
+/* This example has a good mix of control flow structures which makes it nice
+   at identifying problems with the bookeeping/instrumentation.  */
+
+/* BEGIN paths
+   summary: 0/9
+   expect: 243(false) 247 247(true) 247(true) 249(true) 250 253
+   expect: 243(false) 247 247(true) 247(false) 253
+   expect: 243(false) 247 247(false) 253
+   expect: 243(true) 253
+   expect: 249(false) 247(true) 247(true) 249
+   expect: 249(false) 247(true) 247(false) 253
+   expect: 247(true) 247(true) 249(false) 247
+   expect: 247(true) 249(false) 247(true) 247
+   expect: 247(true) 249(false) 247(false) 253
+*/
+void
+pathcov004a (int a, int b, int c, int d)
+/* END */
+{
+  if (a)
+    {}
+  else
+    {
+      while (b-- > 0 && c-- > 0)
+	{
+	  if (d)
+	    break;
+	}
+    }
+}
+
+/* BEGIN paths
+   args: (1, 0, 0, 0)
+   summary: 1/9 */
+void
+pathcov004b (int a, int b, int c, int d)
+/* END */
+{
+  if (a)
+    {}
+  else
+    {
+      while (b-- > 0 && c-- > 0)
+	{
+	  if (d)
+	    break;
+	}
+    }
+}
+
+/* BEGIN paths
+   args: (0, 0, 0, 0)
+   summary: 1/9 */
+void
+pathcov004c (int a, int b, int c, int d)
+/* END */
+{
+  if (a)
+    {}
+  else
+    {
+      while (b-- > 0 && c-- > 0)
+	{
+	  if (d)
+	    break;
+	}
+    }
+}
+
+/* BEGIN paths
+   args: (0, 1, 0, 0)
+   summary: 1/9 */
+void
+pathcov004d (int a, int b, int c, int d)
+/* END */
+{
+  if (a)
+    {}
+  else
+    {
+      while (b-- > 0 && c-- > 0)
+	{
+	  if (d)
+	    break;
+	}
+    }
+}
+
+/* BEGIN paths
+   args: (0, 1, 1, 0)
+   summary: 2/9 */
+void
+pathcov004e (int a, int b, int c, int d)
+/* END */
+{
+  if (a)
+    {}
+  else
+    {
+      while (b-- > 0 && c-- > 0)
+	{
+	  if (d)
+	    break;
+	}
+    }
+}
+
+/* BEGIN paths
+   args: (0, 2, 1, 0)
+   summary: 3/9 */
+void
+pathcov004f (int a, int b, int c, int d)
+/* END */
+{
+  if (a)
+    {}
+  else
+    {
+      while (b-- > 0 && c-- > 0)
+	{
+	  if (d)
+	    break;
+	}
+    }
+}
+
+/* Funny loop exits.  */
+
+/* BEGIN paths
+   summary: 0/14
+*/
+void
+pathcov005a (int a, int b, int c)
+/* END */
+{
+  while (a)
+    {
+      if (b)
+	return;
+      while (c--)
+	a++;
+    }
+}
+
+void
+pathcov005b (int a, int b, int c)
+/* END */
+{
+  if (a)
+    goto loop;
+
+  while (b > 0)
+    {
+      b--;
+loop:
+      while (c--)
+	a++;
+    }
+}
+
+void
+pathcov005c (int a, int b, int c)
+/* END */
+{
+  while (a-- > 0) c++;
+  while (b-- > 0) c++;
+}
+
+/* BEGIN paths
+   summary: 0/67
+
+   This is a sanity check and baseline and should not be executed.
+
+   With >64 cases we should have >64 paths which guarantees we cannot fit the
+   full bitset within a in a single gcov type.  We want to only update the
+   relevant counters because extra instructions are expensive in compile time
+   and binary bloat, and verify that only taken paths are recorded.  */
+void
+pathcov006a (int a)
+/* END */
+{
+  int x = 0;
+  switch (a)
+  {
+    case 0: x++; break;
+    case 1: x++; break;
+    case 2: x++; break;
+    case 3: x++; break;
+    case 4: x++; break;
+    case 5: x++; break;
+    case 6: x++; break;
+    case 7: x++; break;
+    case 8: x++; break;
+    case 9: x++; break;
+    case 10: x++; break;
+    case 11: x++; break;
+    case 12: x++; break;
+    case 13: x++; break;
+    case 14: x++; break;
+    case 15: x++; break;
+    case 16: x++; break;
+    case 17: x++; break;
+    case 18: x++; break;
+    case 19: x++; break;
+    case 20: x++; break;
+    case 21: x++; break;
+    case 22: x++; break;
+    case 23: x++; break;
+    case 24: x++; break;
+    case 25: x++; break;
+    case 26: x++; break;
+    case 27: x++; break;
+    case 28: x++; break;
+    case 29: x++; break;
+    case 30: x++; break;
+    case 31: x++; break;
+    case 32: x++; break;
+    case 33: x++; break;
+    case 34: x++; break;
+    case 35: x++; break;
+    case 36: x++; break;
+    case 37: x++; break;
+    case 38: x++; break;
+    case 39: x++; break;
+    case 40: x++; break;
+    case 41: x++; break;
+    case 42: x++; break;
+    case 43: x++; break;
+    case 44: x++; break;
+    case 45: x++; break;
+    case 46: x++; break;
+    case 47: x++; break;
+    case 48: x++; break;
+    case 49: x++; break;
+    case 50: x++; break;
+    case 51: x++; break;
+    case 52: x++; break;
+    case 53: x++; break;
+    case 54: x++; break;
+    case 55: x++; break;
+    case 56: x++; break;
+    case 57: x++; break;
+    case 58: x++; break;
+    case 59: x++; break;
+    case 60: x++; break;
+    case 61: x++; break;
+    case 62: x++; break;
+    case 63: x++; break;
+    case 64: x++; break;
+    case 65: x++; break;
+  }
+}
+
+/* BEGIN paths
+   args: (0)
+   summary: 1/67
+*/
+void
+pathcov006b (int a)
+/* END */
+{
+  int x = 0;
+  switch (a)
+  {
+    case 0: x++; break;
+    case 1: x++; break;
+    case 2: x++; break;
+    case 3: x++; break;
+    case 4: x++; break;
+    case 5: x++; break;
+    case 6: x++; break;
+    case 7: x++; break;
+    case 8: x++; break;
+    case 9: x++; break;
+    case 10: x++; break;
+    case 11: x++; break;
+    case 12: x++; break;
+    case 13: x++; break;
+    case 14: x++; break;
+    case 15: x++; break;
+    case 16: x++; break;
+    case 17: x++; break;
+    case 18: x++; break;
+    case 19: x++; break;
+    case 20: x++; break;
+    case 21: x++; break;
+    case 22: x++; break;
+    case 23: x++; break;
+    case 24: x++; break;
+    case 25: x++; break;
+    case 26: x++; break;
+    case 27: x++; break;
+    case 28: x++; break;
+    case 29: x++; break;
+    case 30: x++; break;
+    case 31: x++; break;
+    case 32: x++; break;
+    case 33: x++; break;
+    case 34: x++; break;
+    case 35: x++; break;
+    case 36: x++; break;
+    case 37: x++; break;
+    case 38: x++; break;
+    case 39: x++; break;
+    case 40: x++; break;
+    case 41: x++; break;
+    case 42: x++; break;
+    case 43: x++; break;
+    case 44: x++; break;
+    case 45: x++; break;
+    case 46: x++; break;
+    case 47: x++; break;
+    case 48: x++; break;
+    case 49: x++; break;
+    case 50: x++; break;
+    case 51: x++; break;
+    case 52: x++; break;
+    case 53: x++; break;
+    case 54: x++; break;
+    case 55: x++; break;
+    case 56: x++; break;
+    case 57: x++; break;
+    case 58: x++; break;
+    case 59: x++; break;
+    case 60: x++; break;
+    case 61: x++; break;
+    case 62: x++; break;
+    case 63: x++; break;
+    case 64: x++; break;
+    case 65: x++; break;
+  }
+}
+
+/* BEGIN paths
+   args: (64)
+   summary: 1/67 */
+void
+pathcov006c (int a)
+/* END */
+{
+  int x = 0;
+  switch (a)
+  {
+    case 0: x++; break;
+    case 1: x++; break;
+    case 2: x++; break;
+    case 3: x++; break;
+    case 4: x++; break;
+    case 5: x++; break;
+    case 6: x++; break;
+    case 7: x++; break;
+    case 8: x++; break;
+    case 9: x++; break;
+    case 10: x++; break;
+    case 11: x++; break;
+    case 12: x++; break;
+    case 13: x++; break;
+    case 14: x++; break;
+    case 15: x++; break;
+    case 16: x++; break;
+    case 17: x++; break;
+    case 18: x++; break;
+    case 19: x++; break;
+    case 20: x++; break;
+    case 21: x++; break;
+    case 22: x++; break;
+    case 23: x++; break;
+    case 24: x++; break;
+    case 25: x++; break;
+    case 26: x++; break;
+    case 27: x++; break;
+    case 28: x++; break;
+    case 29: x++; break;
+    case 30: x++; break;
+    case 31: x++; break;
+    case 32: x++; break;
+    case 33: x++; break;
+    case 34: x++; break;
+    case 35: x++; break;
+    case 36: x++; break;
+    case 37: x++; break;
+    case 38: x++; break;
+    case 39: x++; break;
+    case 40: x++; break;
+    case 41: x++; break;
+    case 42: x++; break;
+    case 43: x++; break;
+    case 44: x++; break;
+    case 45: x++; break;
+    case 46: x++; break;
+    case 47: x++; break;
+    case 48: x++; break;
+    case 49: x++; break;
+    case 50: x++; break;
+    case 51: x++; break;
+    case 52: x++; break;
+    case 53: x++; break;
+    case 54: x++; break;
+    case 55: x++; break;
+    case 56: x++; break;
+    case 57: x++; break;
+    case 58: x++; break;
+    case 59: x++; break;
+    case 60: x++; break;
+    case 61: x++; break;
+    case 62: x++; break;
+    case 63: x++; break;
+    case 64: x++; break;
+    case 65: x++; break;
+  }
+}
+
+/* BEGIN paths
+   args: (2, 65)
+   summary: 2/67
+
+   In this case we should record a path in both halves of the accumulator
+   bitsets.  Note that the paths don't overlap with the single-half examples in
+   006b and 006c to reduce the chance of accidental passes.  */
+void
+pathcov006d (int a)
+/* END */
+{
+  int x = 0;
+  switch (a)
+  {
+    case 0: x++; break;
+    case 1: x++; break;
+    case 2: x++; break;
+    case 3: x++; break;
+    case 4: x++; break;
+    case 5: x++; break;
+    case 6: x++; break;
+    case 7: x++; break;
+    case 8: x++; break;
+    case 9: x++; break;
+    case 10: x++; break;
+    case 11: x++; break;
+    case 12: x++; break;
+    case 13: x++; break;
+    case 14: x++; break;
+    case 15: x++; break;
+    case 16: x++; break;
+    case 17: x++; break;
+    case 18: x++; break;
+    case 19: x++; break;
+    case 20: x++; break;
+    case 21: x++; break;
+    case 22: x++; break;
+    case 23: x++; break;
+    case 24: x++; break;
+    case 25: x++; break;
+    case 26: x++; break;
+    case 27: x++; break;
+    case 28: x++; break;
+    case 29: x++; break;
+    case 30: x++; break;
+    case 31: x++; break;
+    case 32: x++; break;
+    case 33: x++; break;
+    case 34: x++; break;
+    case 35: x++; break;
+    case 36: x++; break;
+    case 37: x++; break;
+    case 38: x++; break;
+    case 39: x++; break;
+    case 40: x++; break;
+    case 41: x++; break;
+    case 42: x++; break;
+    case 43: x++; break;
+    case 44: x++; break;
+    case 45: x++; break;
+    case 46: x++; break;
+    case 47: x++; break;
+    case 48: x++; break;
+    case 49: x++; break;
+    case 50: x++; break;
+    case 51: x++; break;
+    case 52: x++; break;
+    case 53: x++; break;
+    case 54: x++; break;
+    case 55: x++; break;
+    case 56: x++; break;
+    case 57: x++; break;
+    case 58: x++; break;
+    case 59: x++; break;
+    case 60: x++; break;
+    case 61: x++; break;
+    case 62: x++; break;
+    case 63: x++; break;
+    case 64: x++; break;
+    case 65: x++; break;
+  }
+}
+
+/* BEGIN paths
+   args: (66)
+   summary: 1/67
+
+   This case should hit the empty default.  */
+void
+pathcov006e (int a)
+/* END */
+{
+  int x = 0;
+  switch (a)
+  {
+    case 0: x++; break;
+    case 1: x++; break;
+    case 2: x++; break;
+    case 3: x++; break;
+    case 4: x++; break;
+    case 5: x++; break;
+    case 6: x++; break;
+    case 7: x++; break;
+    case 8: x++; break;
+    case 9: x++; break;
+    case 10: x++; break;
+    case 11: x++; break;
+    case 12: x++; break;
+    case 13: x++; break;
+    case 14: x++; break;
+    case 15: x++; break;
+    case 16: x++; break;
+    case 17: x++; break;
+    case 18: x++; break;
+    case 19: x++; break;
+    case 20: x++; break;
+    case 21: x++; break;
+    case 22: x++; break;
+    case 23: x++; break;
+    case 24: x++; break;
+    case 25: x++; break;
+    case 26: x++; break;
+    case 27: x++; break;
+    case 28: x++; break;
+    case 29: x++; break;
+    case 30: x++; break;
+    case 31: x++; break;
+    case 32: x++; break;
+    case 33: x++; break;
+    case 34: x++; break;
+    case 35: x++; break;
+    case 36: x++; break;
+    case 37: x++; break;
+    case 38: x++; break;
+    case 39: x++; break;
+    case 40: x++; break;
+    case 41: x++; break;
+    case 42: x++; break;
+    case 43: x++; break;
+    case 44: x++; break;
+    case 45: x++; break;
+    case 46: x++; break;
+    case 47: x++; break;
+    case 48: x++; break;
+    case 49: x++; break;
+    case 50: x++; break;
+    case 51: x++; break;
+    case 52: x++; break;
+    case 53: x++; break;
+    case 54: x++; break;
+    case 55: x++; break;
+    case 56: x++; break;
+    case 57: x++; break;
+    case 58: x++; break;
+    case 59: x++; break;
+    case 60: x++; break;
+    case 61: x++; break;
+    case 62: x++; break;
+    case 63: x++; break;
+    case 64: x++; break;
+    case 65: x++; break;
+    default:
+  }
+}
+
+void *alloc (int) { return 0; }
+void *getcwd (void *, int) { return 0; }
+void release (void *) {}
+
+/* Based on gnu_getcwd from tree.c.  This function crashed in early
+   development, and is mostly useful as a regression test.
+
+   BEGIN paths
+   summary: 0/8  */
+void *gnu_getcwd()
+/* END */
+{
+  int size = 100;
+  void *buffer = alloc (size);
+
+  while (1) {
+    void *value = getcwd (buffer, size);
+    if (value != 0) return buffer;
+    size *= 2;
+    release (buffer);
+    buffer = (void *) alloc (size);
+  }
+}
+
+/* BEGIN paths
+   summary: 0/5  */
+void pathcov007a (int a)
+/* END */
+{
+  goto mid;
+  while (1)
+    {
+      a--;
+    mid:
+      a--;
+      if (a < -5)
+	break;
+      a--;
+    }
+}
+
+int
+main ()
+{
+  pathcov001a ();
+  pathcov001b ();
+  /* not called: pathcov001c (); */
+
+  /* not called: pathcov002a (); */
+  pathcov002b (0);
+  pathcov002c (1);
+  pathcov002d (0);
+  pathcov002d (1);
+
+  pathcov004b (1, 0, 0, 0);
+  pathcov004c (0, 0, 0, 0);
+  pathcov004d (0, 1, 0, 0);
+  pathcov004e (0, 1, 1, 0);
+  pathcov004f (0, 2, 1, 0);
+
+  /* not called: pathcov006a (); */
+  pathcov006b (0);
+  pathcov006c (64);
+  pathcov006d (2);
+  pathcov006d (65);
+  pathcov006e (66);
+}
+
+/* { dg-final { run-gcov prime-paths { --prime-paths-lines gcov-29.c } } } */
diff --git a/gcc/testsuite/gcc.misc-tests/gcov-30.c b/gcc/testsuite/gcc.misc-tests/gcov-30.c
new file mode 100644
index 00000000000..dbc168186b7
--- /dev/null
+++ b/gcc/testsuite/gcc.misc-tests/gcov-30.c
@@ -0,0 +1,869 @@
+/* { dg-options "--coverage -fpath-coverage -fprofile-update=atomic" } */
+/* { dg-do run { target native } } */
+
+void
+pathcov001a ()
+{
+  /* Empty functions should not be problematic.  */
+}
+
+/* Straight line, which should have only one path.  */
+/* BEGIN paths
+   summary: 1/1
+*/
+void
+pathcov001b ()
+/* END */
+{
+  int a = 0;
+  ++a;
+}
+
+/* Same as b, but not executed.  */
+/* BEGIN paths
+   summary: 0/1
+   expect: 33
+*/
+void
+pathcov001c ()
+/* END */
+{
+  int a = 0;
+  ++a;
+}
+
+/* 002 is a simple baseline test, with no complicated control flow and no
+   loops, run with different combinations of inputs that tests the paths in
+   isolation.  */
+/* BEGIN paths
+   summary: 0/2
+   expect: 48(true) 49 52
+   expect: 48(false) 51 52
+*/
+void
+pathcov002a (int a)
+/* END */
+{
+  int v = 0;
+  if (a)
+    v++;
+  else
+    v--;
+}
+
+/* BEGIN paths
+    summary: 1/2
+    expect: 63(false) 66 67
+*/
+void
+pathcov002c (int a)
+/* END */
+{
+  int v = 0;
+  if (a)
+    v++;
+  else
+    v--;
+}
+
+/* BEGIN paths
+   summary: 1/2
+   expect: 78(true) 79 82
+*/
+void
+pathcov002b (int a)
+/* END */
+{
+  int v = 0;
+  if (a)
+    v++;
+  else
+    v--;
+}
+
+/* Identical to 002*, but run for both inputs.  This should achieve full
+   coverage.
+
+   BEGIN paths
+   summary: 2/2
+*/
+void
+pathcov002d (int a)
+/* END */
+{
+  int v = 0;
+  if (a)
+    v++;
+  else
+    v--;
+}
+
+/* Test individual control flow structures in isolation.  */
+
+/* BEGIN paths
+   summary: 0/2
+   expect: 112(true) 113 114
+   expect: 112(false) 114
+*/
+void
+pathcov003a (int a)
+/* END */
+{
+  if (a)
+    a++;
+}
+
+/* BEGIN paths
+   summary: 0/2
+   expect: 125(true) 126 129
+   expect: 125(false) 128 129
+*/
+void
+pathcov003b (int a)
+/* END */
+{
+  if (a)
+    a++;
+  else
+    a--;
+}
+
+/* BEGIN paths
+   summary: 0/3
+   expect: 141(true) 142 147
+   expect: 141(false) 143(true) 144 147
+   expect: 141(false) 143(false) 146 147
+*/
+void
+pathcov003c (int a)
+/* END */
+{
+  if (a > 10)
+    a++;
+  else if (a > 20)
+    a--;
+  else
+    a += 2;
+}
+
+/* BEGIN paths
+   summary: 0/5
+   expect: 162 162(true) 163
+   expect: 162 162(false) 164
+   expect: 163 162(true) 163
+   expect: 163 162(false) 164
+   expect: 162(true) 163 162
+*/
+void
+pathcov003d (int a)
+/* END */
+{
+  int x = 0;
+  for (int i = 0; i < a; ++i)
+    ++x;
+}
+
+/* BEGIN paths
+   summary: 0/5
+   expect: 180 180(true) 181
+   expect: 180 180(false) 182
+   expect: 181 180(true) 181
+   expect: 181 180(false) 182
+   expect: 180(true) 181 180
+*/
+void
+pathcov003e (int a)
+/* END */
+{
+  int x = 0;
+  int i = 0;
+  while (i++ < a)
+    x++;
+}
+
+/* BEGIN paths
+   summary: 0/2
+   expect: 194 197(false) 198
+   expect: 197(true) 197
+*/
+void
+pathcov003f (int a)
+/* END */
+{
+  int x = 0;
+  int i = 0;
+  do {
+      x++;
+  } while (i++ < a);
+}
+
+/* BEGIN paths
+   summary: 0/5
+   expect: 213 216(true) 220
+   expect: 213 216(false) 222
+   expect: 216(true) 220 216
+   expect: 220 216(true) 220
+   expect: 220 216(false) 222
+*/
+void
+pathcov003g (int a)
+/* END */
+{
+  int i = 0;
+  int x = 0;
+
+top:
+  if (i < a)
+    {
+      x++;
+      i++;
+      goto top;
+    }
+}
+
+/* This example has a good mix of control flow structures which makes it nice
+   at identifying problems with the bookeeping/instrumentation.  */
+
+/* BEGIN paths
+   summary: 0/9
+   expect: 243(false) 247 247(true) 247(true) 249(true) 250 253
+   expect: 243(false) 247 247(true) 247(false) 253
+   expect: 243(false) 247 247(false) 253
+   expect: 243(true) 253
+   expect: 249(false) 247(true) 247(true) 249
+   expect: 249(false) 247(true) 247(false) 253
+   expect: 247(true) 247(true) 249(false) 247
+   expect: 247(true) 249(false) 247(true) 247
+   expect: 247(true) 249(false) 247(false) 253
+*/
+void
+pathcov004a (int a, int b, int c, int d)
+/* END */
+{
+  if (a)
+    {}
+  else
+    {
+      while (b-- > 0 && c-- > 0)
+	{
+	  if (d)
+	    break;
+	}
+    }
+}
+
+/* BEGIN paths
+   args: (1, 0, 0, 0)
+   summary: 1/9 */
+void
+pathcov004b (int a, int b, int c, int d)
+/* END */
+{
+  if (a)
+    {}
+  else
+    {
+      while (b-- > 0 && c-- > 0)
+	{
+	  if (d)
+	    break;
+	}
+    }
+}
+
+/* BEGIN paths
+   args: (0, 0, 0, 0)
+   summary: 1/9 */
+void
+pathcov004c (int a, int b, int c, int d)
+/* END */
+{
+  if (a)
+    {}
+  else
+    {
+      while (b-- > 0 && c-- > 0)
+	{
+	  if (d)
+	    break;
+	}
+    }
+}
+
+/* BEGIN paths
+   args: (0, 1, 0, 0)
+   summary: 1/9 */
+void
+pathcov004d (int a, int b, int c, int d)
+/* END */
+{
+  if (a)
+    {}
+  else
+    {
+      while (b-- > 0 && c-- > 0)
+	{
+	  if (d)
+	    break;
+	}
+    }
+}
+
+/* BEGIN paths
+   args: (0, 1, 1, 0)
+   summary: 2/9 */
+void
+pathcov004e (int a, int b, int c, int d)
+/* END */
+{
+  if (a)
+    {}
+  else
+    {
+      while (b-- > 0 && c-- > 0)
+	{
+	  if (d)
+	    break;
+	}
+    }
+}
+
+/* BEGIN paths
+   args: (0, 2, 1, 0)
+   summary: 3/9 */
+void
+pathcov004f (int a, int b, int c, int d)
+/* END */
+{
+  if (a)
+    {}
+  else
+    {
+      while (b-- > 0 && c-- > 0)
+	{
+	  if (d)
+	    break;
+	}
+    }
+}
+
+/* Funny loop exits.  */
+
+/* BEGIN paths
+   summary: 0/14
+*/
+void
+pathcov005a (int a, int b, int c)
+/* END */
+{
+  while (a)
+    {
+      if (b)
+	return;
+      while (c--)
+	a++;
+    }
+}
+
+void
+pathcov005b (int a, int b, int c)
+/* END */
+{
+  if (a)
+    goto loop;
+
+  while (b > 0)
+    {
+      b--;
+loop:
+      while (c--)
+	a++;
+    }
+}
+
+void
+pathcov005c (int a, int b, int c)
+/* END */
+{
+  while (a-- > 0) c++;
+  while (b-- > 0) c++;
+}
+
+/* BEGIN paths
+   summary: 0/67
+
+   This is a sanity check and baseline and should not be executed.
+
+   With >64 cases we should have >64 paths which guarantees we cannot fit the
+   full bitset within a in a single gcov type.  We want to only update the
+   relevant counters because extra instructions are expensive in compile time
+   and binary bloat, and verify that only taken paths are recorded.  */
+void
+pathcov006a (int a)
+/* END */
+{
+  int x = 0;
+  switch (a)
+  {
+    case 0: x++; break;
+    case 1: x++; break;
+    case 2: x++; break;
+    case 3: x++; break;
+    case 4: x++; break;
+    case 5: x++; break;
+    case 6: x++; break;
+    case 7: x++; break;
+    case 8: x++; break;
+    case 9: x++; break;
+    case 10: x++; break;
+    case 11: x++; break;
+    case 12: x++; break;
+    case 13: x++; break;
+    case 14: x++; break;
+    case 15: x++; break;
+    case 16: x++; break;
+    case 17: x++; break;
+    case 18: x++; break;
+    case 19: x++; break;
+    case 20: x++; break;
+    case 21: x++; break;
+    case 22: x++; break;
+    case 23: x++; break;
+    case 24: x++; break;
+    case 25: x++; break;
+    case 26: x++; break;
+    case 27: x++; break;
+    case 28: x++; break;
+    case 29: x++; break;
+    case 30: x++; break;
+    case 31: x++; break;
+    case 32: x++; break;
+    case 33: x++; break;
+    case 34: x++; break;
+    case 35: x++; break;
+    case 36: x++; break;
+    case 37: x++; break;
+    case 38: x++; break;
+    case 39: x++; break;
+    case 40: x++; break;
+    case 41: x++; break;
+    case 42: x++; break;
+    case 43: x++; break;
+    case 44: x++; break;
+    case 45: x++; break;
+    case 46: x++; break;
+    case 47: x++; break;
+    case 48: x++; break;
+    case 49: x++; break;
+    case 50: x++; break;
+    case 51: x++; break;
+    case 52: x++; break;
+    case 53: x++; break;
+    case 54: x++; break;
+    case 55: x++; break;
+    case 56: x++; break;
+    case 57: x++; break;
+    case 58: x++; break;
+    case 59: x++; break;
+    case 60: x++; break;
+    case 61: x++; break;
+    case 62: x++; break;
+    case 63: x++; break;
+    case 64: x++; break;
+    case 65: x++; break;
+  }
+}
+
+/* BEGIN paths
+   args: (0)
+   summary: 1/67
+*/
+void
+pathcov006b (int a)
+/* END */
+{
+  int x = 0;
+  switch (a)
+  {
+    case 0: x++; break;
+    case 1: x++; break;
+    case 2: x++; break;
+    case 3: x++; break;
+    case 4: x++; break;
+    case 5: x++; break;
+    case 6: x++; break;
+    case 7: x++; break;
+    case 8: x++; break;
+    case 9: x++; break;
+    case 10: x++; break;
+    case 11: x++; break;
+    case 12: x++; break;
+    case 13: x++; break;
+    case 14: x++; break;
+    case 15: x++; break;
+    case 16: x++; break;
+    case 17: x++; break;
+    case 18: x++; break;
+    case 19: x++; break;
+    case 20: x++; break;
+    case 21: x++; break;
+    case 22: x++; break;
+    case 23: x++; break;
+    case 24: x++; break;
+    case 25: x++; break;
+    case 26: x++; break;
+    case 27: x++; break;
+    case 28: x++; break;
+    case 29: x++; break;
+    case 30: x++; break;
+    case 31: x++; break;
+    case 32: x++; break;
+    case 33: x++; break;
+    case 34: x++; break;
+    case 35: x++; break;
+    case 36: x++; break;
+    case 37: x++; break;
+    case 38: x++; break;
+    case 39: x++; break;
+    case 40: x++; break;
+    case 41: x++; break;
+    case 42: x++; break;
+    case 43: x++; break;
+    case 44: x++; break;
+    case 45: x++; break;
+    case 46: x++; break;
+    case 47: x++; break;
+    case 48: x++; break;
+    case 49: x++; break;
+    case 50: x++; break;
+    case 51: x++; break;
+    case 52: x++; break;
+    case 53: x++; break;
+    case 54: x++; break;
+    case 55: x++; break;
+    case 56: x++; break;
+    case 57: x++; break;
+    case 58: x++; break;
+    case 59: x++; break;
+    case 60: x++; break;
+    case 61: x++; break;
+    case 62: x++; break;
+    case 63: x++; break;
+    case 64: x++; break;
+    case 65: x++; break;
+  }
+}
+
+/* BEGIN paths
+   args: (64)
+   summary: 1/67 */
+void
+pathcov006c (int a)
+/* END */
+{
+  int x = 0;
+  switch (a)
+  {
+    case 0: x++; break;
+    case 1: x++; break;
+    case 2: x++; break;
+    case 3: x++; break;
+    case 4: x++; break;
+    case 5: x++; break;
+    case 6: x++; break;
+    case 7: x++; break;
+    case 8: x++; break;
+    case 9: x++; break;
+    case 10: x++; break;
+    case 11: x++; break;
+    case 12: x++; break;
+    case 13: x++; break;
+    case 14: x++; break;
+    case 15: x++; break;
+    case 16: x++; break;
+    case 17: x++; break;
+    case 18: x++; break;
+    case 19: x++; break;
+    case 20: x++; break;
+    case 21: x++; break;
+    case 22: x++; break;
+    case 23: x++; break;
+    case 24: x++; break;
+    case 25: x++; break;
+    case 26: x++; break;
+    case 27: x++; break;
+    case 28: x++; break;
+    case 29: x++; break;
+    case 30: x++; break;
+    case 31: x++; break;
+    case 32: x++; break;
+    case 33: x++; break;
+    case 34: x++; break;
+    case 35: x++; break;
+    case 36: x++; break;
+    case 37: x++; break;
+    case 38: x++; break;
+    case 39: x++; break;
+    case 40: x++; break;
+    case 41: x++; break;
+    case 42: x++; break;
+    case 43: x++; break;
+    case 44: x++; break;
+    case 45: x++; break;
+    case 46: x++; break;
+    case 47: x++; break;
+    case 48: x++; break;
+    case 49: x++; break;
+    case 50: x++; break;
+    case 51: x++; break;
+    case 52: x++; break;
+    case 53: x++; break;
+    case 54: x++; break;
+    case 55: x++; break;
+    case 56: x++; break;
+    case 57: x++; break;
+    case 58: x++; break;
+    case 59: x++; break;
+    case 60: x++; break;
+    case 61: x++; break;
+    case 62: x++; break;
+    case 63: x++; break;
+    case 64: x++; break;
+    case 65: x++; break;
+  }
+}
+
+/* BEGIN paths
+   args: (2, 65)
+   summary: 2/67
+
+   In this case we should record a path in both halves of the accumulator
+   bitsets.  Note that the paths don't overlap with the single-half examples in
+   006b and 006c to reduce the chance of accidental passes.  */
+void
+pathcov006d (int a)
+/* END */
+{
+  int x = 0;
+  switch (a)
+  {
+    case 0: x++; break;
+    case 1: x++; break;
+    case 2: x++; break;
+    case 3: x++; break;
+    case 4: x++; break;
+    case 5: x++; break;
+    case 6: x++; break;
+    case 7: x++; break;
+    case 8: x++; break;
+    case 9: x++; break;
+    case 10: x++; break;
+    case 11: x++; break;
+    case 12: x++; break;
+    case 13: x++; break;
+    case 14: x++; break;
+    case 15: x++; break;
+    case 16: x++; break;
+    case 17: x++; break;
+    case 18: x++; break;
+    case 19: x++; break;
+    case 20: x++; break;
+    case 21: x++; break;
+    case 22: x++; break;
+    case 23: x++; break;
+    case 24: x++; break;
+    case 25: x++; break;
+    case 26: x++; break;
+    case 27: x++; break;
+    case 28: x++; break;
+    case 29: x++; break;
+    case 30: x++; break;
+    case 31: x++; break;
+    case 32: x++; break;
+    case 33: x++; break;
+    case 34: x++; break;
+    case 35: x++; break;
+    case 36: x++; break;
+    case 37: x++; break;
+    case 38: x++; break;
+    case 39: x++; break;
+    case 40: x++; break;
+    case 41: x++; break;
+    case 42: x++; break;
+    case 43: x++; break;
+    case 44: x++; break;
+    case 45: x++; break;
+    case 46: x++; break;
+    case 47: x++; break;
+    case 48: x++; break;
+    case 49: x++; break;
+    case 50: x++; break;
+    case 51: x++; break;
+    case 52: x++; break;
+    case 53: x++; break;
+    case 54: x++; break;
+    case 55: x++; break;
+    case 56: x++; break;
+    case 57: x++; break;
+    case 58: x++; break;
+    case 59: x++; break;
+    case 60: x++; break;
+    case 61: x++; break;
+    case 62: x++; break;
+    case 63: x++; break;
+    case 64: x++; break;
+    case 65: x++; break;
+  }
+}
+
+/* BEGIN paths
+   args: (66)
+   summary: 1/67
+
+   This case should hit the empty default.  */
+void
+pathcov006e (int a)
+/* END */
+{
+  int x = 0;
+  switch (a)
+  {
+    case 0: x++; break;
+    case 1: x++; break;
+    case 2: x++; break;
+    case 3: x++; break;
+    case 4: x++; break;
+    case 5: x++; break;
+    case 6: x++; break;
+    case 7: x++; break;
+    case 8: x++; break;
+    case 9: x++; break;
+    case 10: x++; break;
+    case 11: x++; break;
+    case 12: x++; break;
+    case 13: x++; break;
+    case 14: x++; break;
+    case 15: x++; break;
+    case 16: x++; break;
+    case 17: x++; break;
+    case 18: x++; break;
+    case 19: x++; break;
+    case 20: x++; break;
+    case 21: x++; break;
+    case 22: x++; break;
+    case 23: x++; break;
+    case 24: x++; break;
+    case 25: x++; break;
+    case 26: x++; break;
+    case 27: x++; break;
+    case 28: x++; break;
+    case 29: x++; break;
+    case 30: x++; break;
+    case 31: x++; break;
+    case 32: x++; break;
+    case 33: x++; break;
+    case 34: x++; break;
+    case 35: x++; break;
+    case 36: x++; break;
+    case 37: x++; break;
+    case 38: x++; break;
+    case 39: x++; break;
+    case 40: x++; break;
+    case 41: x++; break;
+    case 42: x++; break;
+    case 43: x++; break;
+    case 44: x++; break;
+    case 45: x++; break;
+    case 46: x++; break;
+    case 47: x++; break;
+    case 48: x++; break;
+    case 49: x++; break;
+    case 50: x++; break;
+    case 51: x++; break;
+    case 52: x++; break;
+    case 53: x++; break;
+    case 54: x++; break;
+    case 55: x++; break;
+    case 56: x++; break;
+    case 57: x++; break;
+    case 58: x++; break;
+    case 59: x++; break;
+    case 60: x++; break;
+    case 61: x++; break;
+    case 62: x++; break;
+    case 63: x++; break;
+    case 64: x++; break;
+    case 65: x++; break;
+    default:
+  }
+}
+
+void *alloc (int) { return 0; }
+void *getcwd (void *, int) { return 0; }
+void release (void *) {}
+
+/* Based on gnu_getcwd from tree.c.  This function crashed in early
+   development, and is mostly useful as a regression test.
+
+   BEGIN paths
+   summary: 0/8  */
+void *gnu_getcwd()
+/* END */
+{
+  int size = 100;
+  void *buffer = alloc (size);
+
+  while (1) {
+    void *value = getcwd (buffer, size);
+    if (value != 0) return buffer;
+    size *= 2;
+    release (buffer);
+    buffer = (void *) alloc (size);
+  }
+}
+
+/* BEGIN paths
+   summary: 0/5  */
+void pathcov007a (int a)
+/* END */
+{
+  goto mid;
+  while (1)
+    {
+      a--;
+    mid:
+      a--;
+      if (a < -5)
+	break;
+      a--;
+    }
+}
+
+int
+main ()
+{
+  pathcov001a ();
+  pathcov001b ();
+  /* not called: pathcov001c (); */
+
+  /* not called: pathcov002a (); */
+  pathcov002b (0);
+  pathcov002c (1);
+  pathcov002d (0);
+  pathcov002d (1);
+
+  pathcov004b (1, 0, 0, 0);
+  pathcov004c (0, 0, 0, 0);
+  pathcov004d (0, 1, 0, 0);
+  pathcov004e (0, 1, 1, 0);
+  pathcov004f (0, 2, 1, 0);
+
+  /* not called: pathcov006a (); */
+  pathcov006b (0);
+  pathcov006c (64);
+  pathcov006d (2);
+  pathcov006d (65);
+  pathcov006e (66);
+}
+
+/* { dg-final { run-gcov prime-paths { --prime-paths-lines gcov-30.c } } } */
diff --git a/gcc/testsuite/lib/gcov.exp b/gcc/testsuite/lib/gcov.exp
index 91d9e85f8cf..7f09e2a954f 100644
--- a/gcc/testsuite/lib/gcov.exp
+++ b/gcc/testsuite/lib/gcov.exp
@@ -537,6 +537,86 @@ proc verify-filters { testname testcase file expected unexpected } {
     return [expr [llength $ex] + [llength $unex]]
 }
 
+proc verify-prime-paths { testname testcase file } {
+    set failed 0
+    set fd [open $file r]
+
+    set expected_n -1
+    set expected_m -1
+    set recording 0
+    set expected ""
+
+    while { [gets $fd line] >= 0 } {
+	regexp "^\[^:\]+: *(\[0-9\]+):" "$line" all lineno
+	set prefix "$testname line $lineno"
+
+	if {[regexp "BEGIN *paths" $line]} {
+	    set recording 1
+	    set expected ""
+	    set expected_n -1
+	    set expected_m -1
+	    set seen ""
+	    continue
+	}
+
+	if { $recording != 1 } {
+	    continue
+	}
+
+	if [regexp {summary: *(\d+)/(\d+)} $line _ n m] {
+	    set expected_n $n
+	    set expected_m $m
+	}
+
+	if [regexp "expect: *(.*)" $line all ln] {
+	    set cases ""
+	    set ln [regsub -all {\s+} $ln " "]
+	    foreach case [split  $ln " "] {
+		lappend cases $case
+	    }
+	    lappend expected $cases
+	}
+
+	if [regexp "END" $line] {
+	    if {$recording != 1} {
+		incr failed
+		fail "unexpected END at line $lineno, missing BEGIN"
+
+		# Abort the test if there is a mismatch, to avoid creating
+		# unecessary errors.  At this point the test itself is broken.
+		break
+	    }
+	    set recording 0
+
+	    if {[llength $expected] > 0} {
+		incr failed
+		fail "expected: '$expected'"
+	    }
+	}
+
+	if [regexp {paths covered (\d+) of (\d+)} $line _ n m] {
+	    if { $n ne $expected_n || $m ne $expected_m } {
+		incr failed
+		fail "$prefix: expected $expected_n/$expected_m covered paths, was $n/$m"
+	    }
+	}
+
+	if [regexp {path *\d+ not covered: lines (.*)} $line _ path] {
+	    set pathl ""
+	    foreach ln [split $path " "] {
+		if [regexp {\s*(.*)\s*} $ln _ key] {
+		    lappend pathl $key
+		}
+	    }
+	    set i [lsearch $expected $pathl]
+	    set expected [lreplace $expected $i $i]
+	}
+    }
+
+    close $fd
+    return $failed
+}
+
 proc gcov-pytest-format-line { args } {
     global subdir
 
@@ -610,6 +690,7 @@ proc run-gcov { args } {
     set gcov_verify_calls 0
     set gcov_verify_branches 0
     set gcov_verify_conditions 0
+    set gcov_verify_prime_paths 0
     set gcov_verify_lines 1
     set gcov_verify_intermediate 0
     set gcov_verify_filters 0
@@ -627,6 +708,8 @@ proc run-gcov { args } {
 	  set gcov_verify_filters 1
 	  set verify_filters_expected [lindex $a 1]
 	  set verify_filters_unexpected [lindex $a 2]
+	} elseif { $a == "prime-paths" } {
+	  set gcov_verify_prime_paths 1
 	} elseif { $a == "intermediate" } {
 	  set gcov_verify_intermediate 1
 	  set gcov_verify_calls 0
@@ -707,6 +790,11 @@ proc run-gcov { args } {
     } else {
 	set cdfailed 0
     }
+    if { $gcov_verify_prime_paths } {
+	set ppfailed [verify-prime-paths $testname $testcase $testcase.gcov]
+    } else {
+	set ppfailed 0
+    }
     if { $gcov_verify_calls } {
 	set cfailed [verify-calls $testname $testcase $testcase.gcov]
     } else {
@@ -726,12 +814,12 @@ proc run-gcov { args } {
 
     # Report whether the gcov test passed or failed.  If there were
     # multiple failures then the message is a summary.
-    set tfailed [expr $lfailed + $bfailed + $cdfailed + $cfailed + $ifailed + $ffailed]
+    set tfailed [expr $lfailed + $bfailed + $cdfailed + $ppfailed + $cfailed + $ifailed + $ffailed]
     if { $xfailed } {
 	setup_xfail "*-*-*"
     }
     if { $tfailed > 0 } {
-	fail "$testname gcov: $lfailed failures in line counts, $bfailed in branch percentages, $cdfailed in condition/decision, $cfailed in return percentages, $ifailed in intermediate format, $ffailed failed in filters"
+	fail "$testname gcov: $lfailed failures in line counts, $bfailed in branch percentages, $cdfailed in condition/decision, $ppfailed in prime-paths, $cfailed in return percentages, $ifailed in intermediate format, $ffailed failed in filters"
 	if { $xfailed } {
 	    clean-gcov $testcase
 	}
diff --git a/gcc/tree-profile.cc b/gcc/tree-profile.cc
index 153c9323040..80520eecd3e 100644
--- a/gcc/tree-profile.cc
+++ b/gcc/tree-profile.cc
@@ -1908,7 +1908,7 @@ tree_profiling (void)
 	  thunk = true;
 	  /* When generate profile, expand thunk to gimple so it can be
 	     instrumented same way as other functions.  */
-	  if (profile_arc_flag || condition_coverage_flag)
+	  if (profile_arc_flag || condition_coverage_flag || path_coverage_flag)
 	    expand_thunk (node, false, true);
 	  /* Read cgraph profile but keep function as thunk at profile-use
 	     time.  */
@@ -1953,7 +1953,8 @@ tree_profiling (void)
   release_profile_file_filtering ();
 
   /* Drop pure/const flags from instrumented functions.  */
-  if (profile_arc_flag || condition_coverage_flag || flag_test_coverage)
+  if (profile_arc_flag || condition_coverage_flag || path_coverage_flag
+      || flag_test_coverage)
     FOR_EACH_DEFINED_FUNCTION (node)
       {
 	if (!gimple_has_body_p (node->decl)
@@ -1985,7 +1986,8 @@ tree_profiling (void)
 
       push_cfun (DECL_STRUCT_FUNCTION (node->decl));
 
-      if (profile_arc_flag || condition_coverage_flag || flag_test_coverage)
+      if (profile_arc_flag || condition_coverage_flag || path_coverage_flag
+	  || flag_test_coverage)
 	FOR_EACH_BB_FN (bb, cfun)
 	  {
 	    gimple_stmt_iterator gsi;
@@ -2070,7 +2072,8 @@ pass_ipa_tree_profile::gate (function *)
      disabled.  */
   return (!in_lto_p && !flag_auto_profile
 	  && (flag_branch_probabilities || flag_test_coverage
-	      || profile_arc_flag || condition_coverage_flag)
+	      || profile_arc_flag || condition_coverage_flag
+	      || path_coverage_flag)
 	  && !seen_error ());
 }