diff mbox series

[4/7,v2] lto: Implement ltrans cache

Message ID wjmjyztoxlfj4boe3anvum7fg23ksjoozgtgthdusoujqisqtp@pken7ze2yb5x
State New
Headers show
Series None | expand

Commit Message

Michal Jireš June 20, 2024, 11:31 a.m. UTC
Outside of suggested changes, this version:
- uses #define INCLUDE_*
- is rebased onto current trunk - 'fprintf (mstream,' lines
- --verbose 'recompiling++' count is moved into correct if branch

Comments

Andi Kleen June 20, 2024, 5:45 p.m. UTC | #1
Michal Jires <mjires@suse.cz> writes:

No performance data?

> +
> +static const md5_checksum_t INVALID_CHECKSUM = {
> +  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> +};

There are much faster/optimized modern hashes for good collision detection over
MD5 especially when it's not needed to be cryptographically secure. Pick
something from smhasher.

Also perhaps the check sum should be cached in the file? I assume it's
cheap to compute while writing. It could be written at the tail of the
file. Then it can be read by seeking to the end and you save that
step.

The lockfiles scare me a bit. What happens when they get lost, e.g.
due to a compiler crash? You may need some recovery for that.
Perhaps it would be better to make the files self checking, so that
partial files can be detected when reading, and get rid of the locks.

-Andi
Jan Hubicka June 21, 2024, 11:12 a.m. UTC | #2
> Michal Jires <mjires@suse.cz> writes:
> 
> No performance data?

Michal has bachelor thesis on the topic	which has some statistics
https://dspace.cuni.cz/handle/20.500.11956/183051?locale-attribute=en
> 
> > +
> > +static const md5_checksum_t INVALID_CHECKSUM = {
> > +  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
> > +};
> 
> There are much faster/optimized modern hashes for good collision detection over
> MD5 especially when it's not needed to be cryptographically secure. Pick
> something from smhasher.
> 
> Also perhaps the check sum should be cached in the file? I assume it's
> cheap to compute while writing. It could be written at the tail of the
> file. Then it can be read by seeking to the end and you save that
> step.
> 
> The lockfiles scare me a bit. What happens when they get lost, e.g.
> due to a compiler crash? You may need some recovery for that.
> Perhaps it would be better to make the files self checking, so that
> partial files can be detected when reading, and get rid of the locks.

Those seem good ideas. One problem is that the cached files are object
files and thus we will likely need simple-object extension to embed the
hash.

Overall incremental LTO is quite complex problem hitting several areas
where current implementation of LTO is lacking. Most important problem
seems to be the divergence of partition files which happens due to
various global counters and have no good purpose. Fixing those is useful
per se, since it improves reproducibility of builds. (Some of this was
merged to last release, but not all).

Ohter important issue is the debug info which is slow and diverges
often.  We will also need to work on speeding up WPA.

Since it is relatively early stage1, I think it makes sense to merge the
code without adding extra complexity and optimizing it incrementally.
(Since it works relatively well already)

There are many things to do and I think it is better to do that in trunk
rahter than cumulating relatively complex changes on branch.
md5 is already supported by libiberty so it is kind of easy choice for
first cut implementation.

Honza
> 
> -Andi
Michal Jireš June 21, 2024, 4:59 p.m. UTC | #3
On 6/20/24 7:45 PM, Andi Kleen wrote:
> 
> There are much faster/optimized modern hashes for good collision detection over
> MD5 especially when it's not needed to be cryptographically secure. Pick
> something from smhasher.
> 
> Also perhaps the check sum should be cached in the file? I assume it's
> cheap to compute while writing. It could be written at the tail of the
> file. Then it can be read by seeking to the end and you save that
> step.

Good ideas, but with only minor benefits relative to time spent in
LTRANS phase. Current focus was to create simple mostly self-contained
implementation that reduces LTRANS recompilations. We can do these more
pervasive improvements incrementally.

Just to clarify, the hashes are computed only once, then stored in the
cache.

> The lockfiles scare me a bit. What happens when they get lost, e.g.
> due to a compiler crash? You may need some recovery for that.
> Perhaps it would be better to make the files self checking, so that
> partial files can be detected when reading, and get rid of the locks.

It uses process-associated locks via fcntl, so if the compiler crashes,
the locks will be released. If the compiler process crashes and leaves
partially written file, the lto-wrapper deletes it in tool_cleanup.
If a file is missing, the cache entry will be deleted.

Michal
Andi Kleen June 21, 2024, 7:07 p.m. UTC | #4
FWIW I suspect not handling lockfile errors could be a show stopper
even for an initial implementation.  It's not that uncommon that people
press Ctrl-C. flock on systems that have it would be a safer
alternative.

> There are many things to do and I think it is better to do that in trunk
> rahter than cumulating relatively complex changes on branch.
> md5 is already supported by libiberty so it is kind of easy choice for
> first cut implementation.

At least use sha1 then. This is also in libiberty and it has hardware
acceleration on modern x86.

-Andi
Andi Kleen June 21, 2024, 7:09 p.m. UTC | #5
On Fri, Jun 21, 2024 at 06:59:05PM +0200, Michal Jireš wrote:
> > The lockfiles scare me a bit. What happens when they get lost, e.g.
> > due to a compiler crash? You may need some recovery for that.
> > Perhaps it would be better to make the files self checking, so that
> > partial files can be detected when reading, and get rid of the locks.
> 
> It uses process-associated locks via fcntl, so if the compiler crashes,
> the locks will be released. If the compiler process crashes and leaves
> partially written file, the lto-wrapper deletes it in tool_cleanup.
> If a file is missing, the cache entry will be deleted.

Sounds good to me.

-Andi
Sam James June 21, 2024, 7:16 p.m. UTC | #6
Andi Kleen <ak@linux.intel.com> writes:

> FWIW I suspect not handling lockfile errors could be a show stopper
> even for an initial implementation.  It's not that uncommon that people
> press Ctrl-C. flock on systems that have it would be a safer
> alternative.
>
>> There are many things to do and I think it is better to do that in trunk
>> rahter than cumulating relatively complex changes on branch.
>> md5 is already supported by libiberty so it is kind of easy choice for
>> first cut implementation.
>
> At least use sha1 then. This is also in libiberty and it has hardware
> acceleration on modern x86.

I have to say I agree here -- although I'd still want to switch to a
proper non-cryptographic hash later.

Note that it wasn't in libiberty when this work was started, IIRC, but
it is now.

>
> -Andi
diff mbox series

Patch

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 90ec59dca75..ab7335ed882 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1855,7 +1855,7 @@  ALL_HOST_BACKEND_OBJS = $(GCC_OBJS) $(OBJS) $(OBJS-libcommon) \
   $(OBJS-libcommon-target) main.o c-family/cppspec.o \
   $(COLLECT2_OBJS) $(EXTRA_GCC_OBJS) $(GCOV_OBJS) $(GCOV_DUMP_OBJS) \
   $(GCOV_TOOL_OBJS) $(GENGTYPE_OBJS) gcc-ar.o gcc-nm.o gcc-ranlib.o \
-  lto-wrapper.o collect-utils.o lockfile.o
+  lto-wrapper.o collect-utils.o lockfile.o lto-ltrans-cache.o
 
 # for anything that is shared use the cc1plus profile data, as that
 # is likely the most exercised during the build
@@ -2384,7 +2384,8 @@  collect2$(exeext): $(COLLECT2_OBJS) $(LIBDEPS)
 CFLAGS-collect2.o += -DTARGET_MACHINE=\"$(target_noncanonical)\" \
 	@TARGET_SYSTEM_ROOT_DEFINE@
 
-LTO_WRAPPER_OBJS = lto-wrapper.o collect-utils.o ggc-none.o lockfile.o
+LTO_WRAPPER_OBJS = lto-wrapper.o collect-utils.o ggc-none.o lockfile.o \
+  lto-ltrans-cache.o
 
 lto-wrapper$(exeext): $(LTO_WRAPPER_OBJS) libcommon-target.a $(LIBDEPS)
 	+$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) -o T$@ \
diff --git a/gcc/lto-ltrans-cache.cc b/gcc/lto-ltrans-cache.cc
new file mode 100644
index 00000000000..a6ff02afb58
--- /dev/null
+++ b/gcc/lto-ltrans-cache.cc
@@ -0,0 +1,409 @@ 
+/* File caching.
+   Copyright (C) 2023-2024 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#define INCLUDE_ALGORITHM
+#define INCLUDE_STRING
+#define INCLUDE_ARRAY
+#define INCLUDE_MAP
+#define INCLUDE_VECTOR
+#include "config.h"
+#include "system.h"
+#include "md5.h"
+#include "lto-ltrans-cache.h"
+
+static const md5_checksum_t INVALID_CHECKSUM = {
+  0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+};
+
+/* Computes checksum for given file, returns INVALID_CHECKSUM if not
+   possible.
+ */
+static md5_checksum_t
+file_checksum (char const *filename)
+{
+  FILE *file = fopen (filename, "rb");
+
+  if (!file)
+    return INVALID_CHECKSUM;
+
+  md5_checksum_t result;
+
+  int ret = md5_stream (file, &result);
+
+  if (ret)
+    result = INVALID_CHECKSUM;
+
+  fclose (file);
+
+  return result;
+}
+
+/* Checks identity of two files byte by byte.  */
+static bool
+files_identical (char const *first_filename, char const *second_filename)
+{
+  FILE *f_first = fopen (first_filename, "rb");
+  if (!f_first)
+    return false;
+
+  FILE *f_second = fopen (second_filename, "rb");
+  if (!f_second)
+    {
+      fclose (f_first);
+      return false;
+    }
+
+  bool ret = true;
+
+  for (;;)
+    {
+      int c1, c2;
+      c1 = fgetc (f_first);
+      c2 = fgetc (f_second);
+
+      if (c1 != c2)
+	{
+	  ret = false;
+	  break;
+	}
+
+      if (c1 == EOF)
+	break;
+    }
+
+  fclose (f_first);
+  fclose (f_second);
+  return ret;
+}
+
+/* Contructor of cache item.  */
+ltrans_file_cache::item::item (std::string input, std::string output,
+			       md5_checksum_t input_checksum,
+			       uint32_t last_used):
+  input (std::move (input)), output (std::move (output)),
+  input_checksum (input_checksum), last_used (last_used)
+{
+  lock = lockfile (this->input + ".lock");
+}
+/* Destructor of cache item.  */
+ltrans_file_cache::item::~item ()
+{
+  lock.unlock ();
+}
+
+/* Reads next cache item from cachedata file.
+   Adds `dir/` prefix to filenames.  */
+static ltrans_file_cache::item*
+read_cache_item (FILE* f, const char* dir)
+{
+  md5_checksum_t checksum;
+  uint32_t last_used;
+
+  if (fread (&checksum, 1, checksum.size (), f) != checksum.size ())
+    return NULL;
+  if (fread (&last_used, sizeof (last_used), 1, f) != 1)
+    return NULL;
+
+  std::string input (dir);
+  input.push_back ('/');
+  std::string output = input; /* Copy.  */
+
+  int c;
+  while ((c = getc (f)))
+    {
+      if (c == EOF)
+	return NULL;
+      input.push_back (c);
+    }
+  input.push_back (0);
+  while ((c = getc (f)))
+    {
+      if (c == EOF)
+	return NULL;
+      output.push_back (c);
+    }
+  output.push_back (0);
+
+  return new ltrans_file_cache::item (&input[0], &output[0], checksum,
+				      last_used);
+}
+
+/* Writes cache item to cachedata file.
+   Removes `dir/` prefix from filenames.  */
+static void
+write_cache_item (FILE* f, ltrans_file_cache::item *item, const char* dir)
+{
+  fwrite (&item->input_checksum, 1, item->input_checksum.size (), f);
+  fwrite (&item->last_used, sizeof (item->last_used), 1, f);
+
+  gcc_assert (item->input.size () > strlen (dir));
+  fputs (item->input.c_str () + strlen (dir) + 1, f);
+  fputc (0, f);
+
+  gcc_assert (item->output.size () > strlen (dir));
+  fputs (item->output.c_str () + strlen (dir) + 1, f);
+  fputc (0, f);
+}
+
+/* Constructor.  Resulting cache item filenames will be
+   in format `prefix%d[.ltrans]suffix`.  */
+ltrans_file_cache::ltrans_file_cache (const char* dir, const char* prefix,
+				      const char* suffix)
+{
+  this->dir = dir;
+  if (!dir) return;
+
+  creation_lock = lockfile (std::string (dir) + "/lockfile_creation");
+  deletion_lock = lockfile (std::string (dir) + "/lockfile_deletion");
+
+  soft_cache_size = 2048;
+
+  cache_prefix = std::string (dir) + "/" + prefix;
+  cache_free_idx = 0;
+
+  this->prefix = prefix;
+  this->suffix = suffix;
+
+  str_buffer = (char *)xmalloc (cache_prefix.size ()
+				+ sizeof ("4294967295.ltrans")
+				+ strlen (suffix));
+}
+
+/* Destructor.  */
+ltrans_file_cache::~ltrans_file_cache ()
+{
+  if (!*this)
+    return;
+
+  cleanup ();
+  free (str_buffer);
+}
+
+/* Adds given cache item to all relevant datastructures.  */
+void
+ltrans_file_cache::add_item (ltrans_file_cache::item* item)
+{
+  items.push_back (item);
+  map_checksum[item->input_checksum] = item;
+  map_input[item->input] = item;
+
+  usage_counter = std::max (usage_counter, item->last_used);
+}
+
+/* Creates cachedata filename for save/load.  */
+std::string
+ltrans_file_cache::filename_cachedata ()
+{
+  return std::string (dir) + "/cachedata";
+}
+
+/* Loads data about previously cached items from cachedata file.
+   Sorts items by last_used and remaps last_used to small integers.
+
+   Must be called with creation_lock or deletion_lock held to
+   prevent data race.  */
+void
+ltrans_file_cache::load_cache ()
+{
+  cleanup ();
+
+  std::string filename = filename_cachedata ();
+  FILE *file = fopen (filename.c_str (), "rb");
+
+  if (!file)
+    return;
+
+  ltrans_file_cache::item* _item;
+  while ((_item = read_cache_item (file, dir)))
+    add_item (_item);
+
+  fclose (file);
+
+  std::sort (items.begin (), items.end (),
+	     [](item* a, item* b)
+	       {return a->last_used < b->last_used;});
+
+  for (size_t i = 0; i < items.size (); ++i)
+    items[i]->last_used = (uint32_t) i;
+  usage_counter = (uint32_t) items.size ();
+}
+
+/* Rewrites data about cache items into cachedata file.
+
+   Must be only called when creation_lock or deletion_lock was held since last
+   call to load_cache.  */
+void
+ltrans_file_cache::save_cache ()
+{
+  std::string filename = filename_cachedata ();
+  FILE *file = fopen (filename.c_str (), "wb");
+
+  if (!file)
+    return;
+
+  for (item* _item: items)
+    write_cache_item (file, _item, dir);
+
+  fclose (file);
+}
+
+/* Creates new cache item with given checksum.
+   New input/output files are chosen to not collide with other items.
+
+   Must be called with creation_lock held to prevent data race.  */
+ltrans_file_cache::item*
+ltrans_file_cache::create_item (md5_checksum_t checksum)
+{
+  size_t prefix_len = cache_prefix.size ();
+
+  strcpy (str_buffer, cache_prefix.c_str ());
+
+  for (;;)
+    {
+      sprintf (str_buffer + prefix_len, "%04u%s", cache_free_idx, suffix);
+
+      if (map_input.find (str_buffer) == map_input.end ())
+	break;
+      cache_free_idx++;
+    }
+
+  std::string input = str_buffer;
+
+  sprintf (str_buffer + prefix_len, "%04u.ltrans%s", cache_free_idx, suffix);
+
+  return new item (std::move (input), str_buffer, checksum, usage_counter++);
+}
+
+/* Adds input file into cache.  Cache item with input file identical to
+   added input file will be returned as _item.
+   If the file was already cached, `true` is returned, `false` otherwise.
+   The added input file is deleted (or moved).
+
+   Must be called with creation_lock held to prevent data race.  */
+bool
+ltrans_file_cache::add_to_cache (const char* filename, item*& _item)
+{
+  md5_checksum_t checksum = file_checksum (filename);
+
+  auto it = map_checksum.find (checksum);
+
+  if (it != map_checksum.end ()
+      && files_identical (filename, it->second->input.c_str ()))
+    {
+      unlink (filename);
+      _item = it->second;
+      _item->last_used = usage_counter++;
+      return true;
+    }
+  else
+    {
+      /* Cache miss.  Move into cache dir.  */
+      _item = create_item (checksum);
+      add_item (_item);
+
+      rename (filename, _item->input.c_str ());
+      return false;
+    }
+}
+
+/* If exists, returns cache item corresponding to cached input file.  */
+ltrans_file_cache::item*
+ltrans_file_cache::get_item (const char* input)
+{
+  auto it = map_input.find (input);
+  if (it == map_input.end ())
+    return NULL;
+  return it->second;
+}
+
+/* If no other process holds the deletion_lock, prunes oldest unused cache
+   items over limit.  */
+void
+ltrans_file_cache::try_prune ()
+{
+  if (deletion_lock.try_lock_write () == 0)
+    {
+      prune ();
+      deletion_lock.unlock ();
+    }
+}
+
+/* Returns true if file exists.  */
+static int
+file_exists (char const *name)
+{
+  return access (name, R_OK) == 0;
+}
+
+/* Prunes oldest unused cache items over limit.
+   Must be called with deletion_lock held to prevent data race.  */
+void
+ltrans_file_cache::prune ()
+{
+  load_cache ();
+  if (items.size () > soft_cache_size)
+    {
+      std::vector<item*> sorted_items = std::move (items);
+
+      cleanup ();
+
+      for (size_t i = 0; i < sorted_items.size (); ++i)
+	{
+	  ltrans_file_cache::item* item = sorted_items[i];
+	  if ((i < sorted_items.size () - soft_cache_size)
+	      || !file_exists (item->input.c_str ())
+	      || !file_exists (item->output.c_str ()))
+	    {
+	      unlink (item->input.c_str ());
+	      unlink (item->output.c_str ());
+	      delete item;
+	    }
+	  else
+	    add_item (item);
+	}
+    }
+  save_cache ();
+}
+
+/* Clears cache class, as if only constructor was called.  */
+void
+ltrans_file_cache::cleanup ()
+{
+  map_checksum.clear ();
+  map_input.clear ();
+
+  for (ltrans_file_cache::item* item: items)
+    delete item;
+  items.clear ();
+
+  usage_counter = 0;
+}
+
+
+/* Returns ltrans cache dir.
+   Returns NULL if ltrans cache is disabled.  */
+const char*
+get_ltrans_cache_dir ()
+{
+  const char *ltrans_cache_dir = getenv ("GCC_LTRANS_CACHE");
+  if (!ltrans_cache_dir || ltrans_cache_dir[0] == '\0'
+      || !file_exists (ltrans_cache_dir))
+    ltrans_cache_dir = NULL;
+  return ltrans_cache_dir;
+}
diff --git a/gcc/lto-ltrans-cache.h b/gcc/lto-ltrans-cache.h
new file mode 100644
index 00000000000..3cfdfc66630
--- /dev/null
+++ b/gcc/lto-ltrans-cache.h
@@ -0,0 +1,147 @@ 
+/* File caching.
+   Copyright (C) 2023-2024 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef GCC_LTO_LTRANS_CACHE_H
+#define GCC_LTO_LTRANS_CACHE_H
+
+#include "lockfile.h"
+
+using md5_checksum_t = std::array<uint8_t, 16>;
+
+class ltrans_file_cache
+{
+public:
+  /* Cache item representing input/output filename pair.  */
+  struct item
+  {
+    item (std::string input, std::string output,
+	  md5_checksum_t input_checksum, uint32_t last_used);
+    ~item ();
+
+    /* Full path to input filename.  */
+    const std::string input;
+    /* Full path to output filename.  */
+    const std::string output;
+    /* Checksum of input file.  */
+    const md5_checksum_t input_checksum;
+
+    /* Larger last_used corresponds to later usage.  */
+    uint32_t last_used;
+
+    /* Lockfile so that output file can be created later than input file.  */
+    lockfile lock;
+  };
+
+  /* Constructor.  Resulting cache item filenames will be
+     in format `prefix%d[.ltrans]suffix`.  */
+  ltrans_file_cache (const char* dir, const char* prefix, const char* suffix);
+  /* Destructor.  */
+  ~ltrans_file_cache ();
+
+  /* Loads data about previously cached items from cachedata file.
+
+     Must be called with creation_lock or deletion_lock held to
+     prevent data race.  */
+  void load_cache ();
+
+  /* Rewrites data about cache items into cachedata file.
+
+     Must be only called when creation_lock or deletion_lock was held since last
+     call to load_cache.  */
+  void save_cache ();
+
+
+  /* Adds input file into cache.  Cache item with input file identical to
+     added input file will be returned as _item.
+     If the file was already cached, `true` is returned, `false` otherwise.
+     The added input file is deleted (or moved).
+
+     Must be called with creation_lock held to prevent data race.  */
+  bool add_to_cache (const char* filename, item*& _item);
+
+  /* If exists, returns cache item corresponding to cached input file.  */
+  item* get_item (const char* input);
+
+  /* If no other process holds the deletion_lock, prunes oldest unused cache
+     items over limit.  */
+  void try_prune ();
+
+  /* Clears cache class, as if only constructor was called.  */
+  void cleanup ();
+
+  /* Cache is enabled if true.  */
+  operator bool ()
+  {
+    return dir;
+  }
+
+  /* Access to already created items can be concurrent with item creation.  */
+  lockfile creation_lock;
+  /* Access to already created items cannot be concurrent
+     with item deletion.  */
+  lockfile deletion_lock;
+
+  /* Directory of cache.  NULL if cache is disabled.  */
+  const char* dir;
+private:
+  /* Adds given cache item to all relevant datastructures.  */
+  void add_item (item* item);
+
+  /* Creates new cache item with given checksum.
+     New input/output files are chosen to not collide with other items.
+
+     Must be called with creation_lock held to prevent data race.  */
+  item* create_item (md5_checksum_t checksum);
+
+  /* Prunes oldest unused cache items over limit.
+     Must be called with deletion_lock held to prevent data race.  */
+  void prune ();
+
+  /* Creates cachedata filename for save/load.  */
+  std::string filename_cachedata ();
+
+  /* All cache items in current cache.  */
+  std::vector<item*> items;
+  std::map<md5_checksum_t, item*> map_checksum;
+  std::map<std::string, item*> map_input;
+
+  /* Cached filenames are in format "prefix%d[.ltrans]suffix".  */
+  const char* prefix;
+  const char* suffix;
+
+  /* If cache items count is larger, prune deletes old items.  */
+  size_t soft_cache_size;
+
+  /* Counter used to populate last_used of items.  */
+  uint32_t usage_counter;
+
+  /* String in format "dir/prefix".  */
+  std::string cache_prefix;
+  /* Lower indices are occupied.  */
+  uint32_t cache_free_idx;
+
+  /* Buffer for sprintf.  */
+  char* str_buffer;
+};
+
+/* Returns ltrans cache dir.
+   Returns NULL if ltrans cache is disabled.  */
+const char* get_ltrans_cache_dir ();
+
+#endif
diff --git a/gcc/lto-wrapper.cc b/gcc/lto-wrapper.cc
index 84835a888ef..cf64ad85daf 100644
--- a/gcc/lto-wrapper.cc
+++ b/gcc/lto-wrapper.cc
@@ -38,6 +38,9 @@  along with GCC; see the file COPYING3.  If not see
 */
 
 #define INCLUDE_STRING
+#define INCLUDE_ARRAY
+#define INCLUDE_MAP
+#define INCLUDE_VECTOR
 #include "config.h"
 #include "system.h"
 #include "coretypes.h"
@@ -52,6 +55,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "opts-diagnostic.h"
 #include "opt-suggestions.h"
 #include "opts-jobserver.h"
+#include "lto-ltrans-cache.h"
 
 /* Environment variable, used for passing the names of offload targets from GCC
    driver to lto-wrapper.  */
@@ -79,7 +83,7 @@  static char *flto_out;
 static unsigned int nr;
 static int *ltrans_priorities;
 static char **input_names;
-static char **output_names;
+static char const**output_names;
 static char **offload_names;
 static char *offload_objects_file_name;
 static char *makefile;
@@ -1809,6 +1813,8 @@  cont1:
 	}
     }
 
+  const char *ltrans_cache_dir = get_ltrans_cache_dir ();
+
   /* If object files contain offload sections, but do not contain LTO sections,
      then there is no need to perform a link-time recompilation, i.e.
      lto-wrapper is used only for a compilation of offload images.  */
@@ -1834,7 +1840,17 @@  cont1:
       char *dumpbase = concat (dumppfx, "wpa", NULL);
       obstack_ptr_grow (&argv_obstack, dumpbase);
 
-      if (save_temps)
+      if (ltrans_cache_dir)
+	{
+	  /* Results of wpa phase must be on the same disk partition as
+	     cache.  */
+	  char* file = concat (ltrans_cache_dir, "/ccXXXXXX.ltrans.out", NULL);
+	  int fd = mkstemps (file, strlen (".ltrans.out"));
+	  gcc_assert (fd != -1 && !close (fd));
+
+	  ltrans_output_file = file;
+	}
+      else if (save_temps)
 	ltrans_output_file = concat (dumppfx, "ltrans.out", NULL);
       else
 	ltrans_output_file = make_temp_file (".ltrans.out");
@@ -1960,7 +1976,8 @@  cont:
 	  ltrans_priorities
 	     = (int *)xrealloc (ltrans_priorities, nr * sizeof (int) * 2);
 	  input_names = (char **)xrealloc (input_names, nr * sizeof (char *));
-	  output_names = (char **)xrealloc (output_names, nr * sizeof (char *));
+	  output_names = (char const**)
+	    xrealloc (output_names, nr * sizeof (char const*));
 	  ltrans_priorities[(nr-1)*2] = priority;
 	  ltrans_priorities[(nr-1)*2+1] = nr-1;
 	  input_names[nr-1] = input_name;
@@ -1992,21 +2009,79 @@  cont:
 	  qsort (ltrans_priorities, nr, sizeof (int) * 2, cmp_priority);
 	}
 
+      ltrans_file_cache ltrans_cache (ltrans_cache_dir, "ltrans", ".o");
+
+      if (ltrans_cache)
+	{
+	  if (!lockfile::lockfile_supported ())
+	    {
+	      warning (0, "using ltrans cache without file locking support,"
+		       " do not use in parallel");
+	    }
+	  ltrans_cache.deletion_lock.lock_read ();
+	  ltrans_cache.creation_lock.lock_write ();
+
+	  ltrans_cache.load_cache ();
+
+	  int recompiling = 0;
+
+	  for (i = 0; i < nr; ++i)
+	    {
+	      /* If it's a pass-through file do nothing.  */
+	      if (output_names[i])
+		continue;
+
+	      ltrans_file_cache::item* item;
+	      bool existed = ltrans_cache.add_to_cache (input_names[i], item);
+	      free (input_names[i]);
+	      input_names[i] = xstrdup (item->input.c_str ());
+
+	      if (existed)
+		{
+		  /* Fill the output_name to skip compilation.  */
+		  output_names[i] = item->output.c_str ();
+		}
+	      else
+		{
+		  /* Lock so no other process can access until the file is
+		     compiled.  */
+		  item->lock.lock_write ();
+		  recompiling++;
+		}
+	    }
+	  if (verbose)
+	    fprintf (stderr, "LTRANS: recompiling %d/%d\n", recompiling, nr);
+
+	  ltrans_cache.save_cache ();
+	  ltrans_cache.creation_lock.unlock ();
+	}
+
       /* Execute the LTRANS stage for each input file (or prepare a
 	 makefile to invoke this in parallel).  */
       for (i = 0; i < nr; ++i)
 	{
-	  char *output_name;
+	  char const* output_name;
 	  char *input_name = input_names[i];
-	  /* If it's a pass-through file do nothing.  */
+	  /* If it's a pass-through or cached file do nothing.  */
 	  if (output_names[i])
 	    continue;
 
-	  /* Replace the .o suffix with a .ltrans.o suffix and write
-	     the resulting name to the LTRANS output list.  */
-	  obstack_grow (&env_obstack, input_name, strlen (input_name) - 2);
-	  obstack_grow (&env_obstack, ".ltrans.o", sizeof (".ltrans.o"));
-	  output_name = XOBFINISH (&env_obstack, char *);
+	  if (ltrans_cache)
+	    {
+	      ltrans_file_cache::item* item;
+	      item = ltrans_cache.get_item (input_name);
+	      gcc_assert (item);
+
+	      output_name = item->output.c_str ();
+	    }
+	  else
+	    {
+	      /* Replace the .o suffix with a .ltrans.o suffix and write
+		 the resulting name to the LTRANS output list.  */
+	      obstack_grow (&env_obstack, input_name, strlen (input_name) - 2);
+	      obstack_grow (&env_obstack, ".ltrans.o", sizeof (".ltrans.o"));
+	      output_name = XOBFINISH (&env_obstack, char const*);
+	    }
 
 	  /* Adjust the dumpbase if the linker output file was seen.  */
 	  int dumpbase_len = (strlen (dumppfx)
@@ -2029,7 +2104,7 @@  cont:
 	      /* If we are not preserving the ltrans input files then
 	         truncate them as soon as we have processed it.  This
 		 reduces temporary disk-space usage.  */
-	      if (! save_temps)
+	      if (!ltrans_cache && !save_temps)
 		fprintf (mstream, " -truncate '%s'", input_name);
 	      fprintf (mstream, "\n");
 	    }
@@ -2043,7 +2118,8 @@  cont:
 			  "ltrans%u.ltrans_args", i);
 	      fork_execute (new_argv[0], CONST_CAST (char **, new_argv),
 			    true, save_temps ? argsuffix : NULL);
-	      maybe_unlink (input_name);
+	      if (!ltrans_cache)
+		maybe_unlink (input_names[i]);
 	    }
 
 	  output_names[i] = output_name;
@@ -2098,15 +2174,66 @@  cont:
 	  freeargv (make_argv);
 	  maybe_unlink (makefile);
 	  makefile = NULL;
+
+	  if (!ltrans_cache)
+	    for (i = 0; i < nr; ++i)
+	      maybe_unlink (input_names[i]);
+	}
+
+      if (ltrans_cache)
+	{
 	  for (i = 0; i < nr; ++i)
-	    maybe_unlink (input_names[i]);
+	    {
+	      char *input_name = input_names[i];
+	      char const *output_name = output_names[i];
+
+	      ltrans_file_cache::item* item;
+	      item = ltrans_cache.get_item (input_name);
+
+	      if (item && !save_temps)
+		{
+		  item->lock.lock_read ();
+		  /* Ensure that cached compiled file is not deleted.
+		     Create copy.  */
+
+		  obstack_grow (&env_obstack, output_name,
+				strlen (output_name) - 2);
+		  obstack_grow (&env_obstack, ".cache_copy.XXX.o",
+				sizeof (".cache_copy.XXX.o"));
+
+		  char* output_name_link = XOBFINISH (&env_obstack, char *);
+		  char* name_idx = output_name_link + strlen (output_name_link)
+				   - strlen ("XXX.o");
+
+		  /* lto-wrapper can run in parallel and access
+		     the same partition.  */
+		  for (int j = 0; ; j++)
+		    {
+		      gcc_assert (j < 1000);
+		      sprintf (name_idx, "%03d.o", j);
+
+		      if (link (output_name, output_name_link) != EEXIST)
+			break;
+		    }
+
+		  output_names[i] = output_name_link;
+		  item->lock.unlock ();
+		}
+	    }
+
+	  ltrans_cache.deletion_lock.unlock ();
 	}
+
       for (i = 0; i < nr; ++i)
 	{
 	  fputs (output_names[i], stdout);
 	  putc ('\n', stdout);
 	  free (input_names[i]);
 	}
+
+      if (ltrans_cache && !save_temps)
+	ltrans_cache.try_prune ();
+
       if (!skip_debug)
 	{
 	  for (i = 0; i < ltoobj_argc; ++i)