diff mbox

Allow embedded timestamps by C/C++ macros to be set externally (3)

Message ID 20160418122636.GR3248@panther
State New
Headers show

Commit Message

Dhole April 18, 2016, 12:26 p.m. UTC
Hi!

A few months ago I submited a patch to allow the embedded timestamps by
C/C++ macros to be set externally [2], which was already an improvement
over [1].  I was told to wait until the GCC 7 stage 1 started to send
this patch again.

I'm quoting from the original emails:

> As a reminder for the motivation behind this patch, we are working on
> the reproducible builds project which aims to provide users with a way
> to reproduce bit-for-bit identical binary packages from the source and
> build environment. The project involves Debian as well as several other
> free software projects. See <https://reproducible-builds.org/> for more
> information.

> In order to do this, we need to make the build processes
> deterministic.  As you can imagine, gcc is quite involved in producing
> Debian packages.  One issue we encounter in many packages that fail to
> build reproducibly is the use of the __DATE__, __TIME__ C macros [3],
> right now we have 442 affected packages that would need patching
> (either removing the macros, or passing a known date externally).

> A solution for toolchain packages that embed timestamps during the
> build process has been proposed for anyone interested and it consists
> of the following: The build environment can export an environment
> variable called SOURCE_DATE_EPOCH with a known timestamp in Unix epoch
> format (In our case, we use the last date of the package's debian
> changelog). The toolchain package running during the build can check
> if the exported variable is set and if so, instead of embedding the
> local date/time, embed the date/time from SOURCE_DATE_EPOCH.

> The proposal to use SOURCE_DATE_EPOCH has now been gathered in a more
> formal specification [4], so that any project can adhere to it to
> achieve reproducible builds when dealing with timestamps.

> It would be very beneficial to our project (and other free software
> projects working on reproducible builds) if gcc supported this
> feature.

I'm attaching a patch for the svn/trunk GCC repository (now GCC 7) that
enables this feature: it modifies the behavior of the macros __DATE__
and __TIME__ when the environment variable SOURCE_DATE_EPOCH is
exported.

Documentation of the environment variable is also provided.

Note: I have already gone through the copyright assignment process :)


[1] https://gcc.gnu.org/ml/gcc-patches/2015-06/msg02210.html
[2] https://gcc.gnu.org/ml/gcc-patches/2015-11/msg01890.html
[3] https://wiki.debian.org/ReproducibleBuilds/TimestampsFromCPPMacros
[4] https://reproducible-builds.org/specs/source-date-epoch/

Best regards,

Comments

Markus Trippelsdorf April 18, 2016, 1:04 p.m. UTC | #1
On 2016.04.18 at 14:26 +0200, Dhole wrote:
> Hi!
> 
> A few months ago I submited a patch to allow the embedded timestamps by
> C/C++ macros to be set externally [2], which was already an improvement
> over [1].  I was told to wait until the GCC 7 stage 1 started to send
> this patch again.

A nice follow-up patch would be to set SOURCE_DATE_EPOCH to the current
time during -fcompare-debug. This would avoid false positives like:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70679
Bernd Schmidt April 25, 2016, 10:15 a.m. UTC | #2
On 04/18/2016 02:26 PM, Dhole wrote:
> A few months ago I submited a patch to allow the embedded timestamps by
> C/C++ macros to be set externally [2], which was already an improvement
> over [1].  I was told to wait until the GCC 7 stage 1 started to send
> this patch again.

> +/* Read SOURCE_DATE_EPOCH from environment to have a deterministic
> +   timestamp to replace embedded current dates to get reproducible
> +   results. Returns -1 if SOURCE_DATE_EPOCH is not defined.  */
> +long long
> +get_source_date_epoch()

Always have a space before open-paren. Maybe this should return time_t. 
See below.

> +/* Read SOURCE_DATE_EPOCH from environment to have a deterministic
> +   timestamp to replace embedded current dates to get reproducible
> +   results. Returns -1 if SOURCE_DATE_EPOCH is not defined.  */
> +extern long long get_source_date_epoch();

Double space after the end of a sentence. Space before open paren.

> +  source_date_epoch = get_source_date_epoch();
> +  cpp_init_source_date_epoch(parse_in, source_date_epoch);

Spaces.

> +/* Initialize the source_date_epoch value.  */
> +extern void cpp_init_source_date_epoch (cpp_reader *, long long);

Also thinking we should be using time_t here.

>   /* Sanity-checks are dependent on command-line options, so it is
>      called as a subroutine of cpp_read_main_file ().  */

We don't write () to mark function names.

> +	     tb = gmtime ((time_t*) &pfile->source_date_epoch);

Space before the "*". But this cast looks ugly and unreliable (think 
big-endian). This is why I would prefer to move to a time_t 
representation sooner.

> 2016-04-18  Eduard Sanou<dhole@openmailbox.org>
> 	    Matthias Klose<doko@debian.org>
> 	* c-common.c (get_source_date_epoch): New function, gets the environment
> 	variable SOURCE_DATE_EPOCH and parses it as long long with error
> 	handling.
> 	* c-common.h (get_source_date_epoch): Prototype.
> 	* c-lex.c (c_lex_with_flags): set parse_in->source_date_epoch.

Add blank lines after the end of the names in ChangeLogs.


Bernd
Dhole May 3, 2016, 2:42 p.m. UTC | #3
On 16-04-18 15:04:58, Markus Trippelsdorf wrote:
> A nice follow-up patch would be to set SOURCE_DATE_EPOCH to the current
> time during -fcompare-debug. This would avoid false positives like:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70679

I've been working on a patch to implement that, but I can't manage to
reproduce the false positive from the link.  Maybe the test code I'm
using compiles too fast.  I'm not familiar with -fcompare-debug either.

Could you provide me some code with instructions to reproduce this false
positive, to see if my patch is working as expected?
Markus Trippelsdorf May 3, 2016, 2:52 p.m. UTC | #4
On 2016.05.03 at 16:42 +0200, Dhole wrote:
> On 16-04-18 15:04:58, Markus Trippelsdorf wrote:
> > A nice follow-up patch would be to set SOURCE_DATE_EPOCH to the current
> > time during -fcompare-debug. This would avoid false positives like:
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70679
> 
> I've been working on a patch to implement that, but I can't manage to
> reproduce the false positive from the link.  Maybe the test code I'm
> using compiles too fast.  I'm not familiar with -fcompare-debug either.
> 
> Could you provide me some code with instructions to reproduce this false
> positive, to see if my patch is working as expected?

Just building LLVM is enough:

markus@x4 llvm_build % time g++ -fcompare-debug -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -fPIC -fvisibility-inlines-hidden -Wall -W -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wno-missing-field-initializers -pedantic -Wno-long-long -Wno-maybe-uninitialized -Wdelete-non-virtual-dtor -Wno-comment -std=c++11 -O2 -pipe -Ilib/Support -I/home/trippels/llvm/lib/Support -Iinclude -I/home/trippels/llvm/include -UNDEBUG -fno-exceptions -fno-rtti -MMD -MT lib/Support/CMakeFiles/LLVMSupport.dir/CommandLine.cpp.o -MF lib/Support/CMakeFiles/LLVMSupport.dir/CommandLine.cpp.o.d -o lib/Support/CMakeFiles/LLVMSupport.dir/CommandLine.cpp.o -c /home/markus/llvm/lib/Support/CommandLine.cpp 
g++: error: /home/markus/llvm/lib/Support/CommandLine.cpp: -fcompare-debug failure
g++ -fcompare-debug -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GNU_SOURCE    -fPIC  -Wall  5.71s user 0.26s system 101% cpu 5.904 total
diff mbox

Patch

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index f2846bb..3a83673 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -12741,4 +12741,38 @@  valid_array_size_p (location_t loc, tree type, tree name)
   return true;
 }
 
+/* Read SOURCE_DATE_EPOCH from environment to have a deterministic
+   timestamp to replace embedded current dates to get reproducible
+   results. Returns -1 if SOURCE_DATE_EPOCH is not defined.  */
+long long
+get_source_date_epoch()
+{
+  char *source_date_epoch;
+  unsigned long long epoch;
+  char *endptr;
+
+  source_date_epoch = getenv ("SOURCE_DATE_EPOCH");
+  if (!source_date_epoch)
+    return -1;
+
+  errno = 0;
+  epoch = strtoull (source_date_epoch, &endptr, 10);
+  if ((errno == ERANGE && (epoch == ULLONG_MAX || epoch == 0))
+      || (errno != 0 && epoch == 0))
+    fatal_error (UNKNOWN_LOCATION, "environment variable $SOURCE_DATE_EPOCH: "
+		 "strtoull: %s\n", xstrerror(errno));
+  if (endptr == source_date_epoch)
+    fatal_error (UNKNOWN_LOCATION, "environment variable $SOURCE_DATE_EPOCH: "
+		 "No digits were found: %s\n", endptr);
+  if (*endptr != '\0')
+    fatal_error (UNKNOWN_LOCATION, "environment variable $SOURCE_DATE_EPOCH: "
+		 "Trailing garbage: %s\n", endptr);
+  if (epoch > ULONG_MAX)
+    fatal_error (UNKNOWN_LOCATION, "environment variable $SOURCE_DATE_EPOCH: "
+		 "value must be smaller than or equal to: %lu but was found to "
+		 "be: %llu \n", ULONG_MAX, epoch);
+
+  return (long long) epoch;
+}
+
 #include "gt-c-family-c-common.h"
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index fa3746c..b4d6afc 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1467,4 +1467,9 @@  extern bool reject_gcc_builtin (const_tree, location_t = UNKNOWN_LOCATION);
 extern void warn_duplicated_cond_add_or_warn (location_t, tree, vec<tree> **);
 extern bool valid_array_size_p (location_t, tree, tree);
 
+/* Read SOURCE_DATE_EPOCH from environment to have a deterministic
+   timestamp to replace embedded current dates to get reproducible
+   results. Returns -1 if SOURCE_DATE_EPOCH is not defined.  */
+extern long long get_source_date_epoch();
+
 #endif /* ! GCC_C_COMMON_H */
diff --git a/gcc/c-family/c-lex.c b/gcc/c-family/c-lex.c
index 96da4fc..2454c6f 100644
--- a/gcc/c-family/c-lex.c
+++ b/gcc/c-family/c-lex.c
@@ -385,6 +385,10 @@  c_lex_with_flags (tree *value, location_t *loc, unsigned char *cpp_flags,
   enum cpp_ttype type;
   unsigned char add_flags = 0;
   enum overflow_type overflow = OT_NONE;
+  long long source_date_epoch = -1;
+
+  source_date_epoch = get_source_date_epoch();
+  cpp_init_source_date_epoch(parse_in, source_date_epoch);
 
   timevar_push (TV_CPP);
  retry:
diff --git a/gcc/doc/cppenv.texi b/gcc/doc/cppenv.texi
index 22c8cb3..e958e93 100644
--- a/gcc/doc/cppenv.texi
+++ b/gcc/doc/cppenv.texi
@@ -79,4 +79,21 @@  main input file is omitted.
 @ifclear cppmanual
 @xref{Preprocessor Options}.
 @end ifclear
+
+@item SOURCE_DATE_EPOCH
+
+If this variable is set, its value specifies a UNIX timestamp to be
+used in replacement of the current date and time in the @code{__DATE__}
+and @code{__TIME__} macros, so that the embedded timestamps become
+reproducible.
+
+The value of @env{SOURCE_DATE_EPOCH} must be a UNIX timestamp,
+defined as the number of seconds (excluding leap seconds) since
+01 Jan 1970 00:00:00 represented in ASCII, identical to the output of
+@samp{@command{date +%s}}.
+
+The value should be a known timestamp such as the last modification
+time of the source or package and it should be set by the build
+process.
+
 @end vtable
diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
index 35b0375..4cd5170 100644
--- a/libcpp/include/cpplib.h
+++ b/libcpp/include/cpplib.h
@@ -784,6 +784,9 @@  extern void cpp_init_special_builtins (cpp_reader *);
 /* Set up built-ins like __FILE__.  */
 extern void cpp_init_builtins (cpp_reader *, int);
 
+/* Initialize the source_date_epoch value.  */
+extern void cpp_init_source_date_epoch (cpp_reader *, long long);
+
 /* This is called after options have been parsed, and partially
    processed.  */
 extern void cpp_post_options (cpp_reader *);
diff --git a/libcpp/init.c b/libcpp/init.c
index 4343075..514282e 100644
--- a/libcpp/init.c
+++ b/libcpp/init.c
@@ -533,6 +533,13 @@  cpp_init_builtins (cpp_reader *pfile, int hosted)
     _cpp_define_builtin (pfile, "__OBJC__ 1");
 }
 
+/* Initialize the source_date_epoch value.  */
+void
+cpp_init_source_date_epoch (cpp_reader *pfile, long long source_date_epoch)
+{
+  pfile->source_date_epoch = source_date_epoch; 
+}
+
 /* Sanity-checks are dependent on command-line options, so it is
    called as a subroutine of cpp_read_main_file ().  */
 #if CHECKING_P
diff --git a/libcpp/internal.h b/libcpp/internal.h
index 9ce8707..5fc41da 100644
--- a/libcpp/internal.h
+++ b/libcpp/internal.h
@@ -502,6 +502,10 @@  struct cpp_reader
   const unsigned char *date;
   const unsigned char *time;
 
+  /* Externally set timestamp to replace current date and time useful for
+     reproducibility.  */
+  long long source_date_epoch;
+
   /* EOF token, and a token forcing paste avoidance.  */
   cpp_token avoid_paste;
   cpp_token eof;
diff --git a/libcpp/macro.c b/libcpp/macro.c
index c251553..833f36b 100644
--- a/libcpp/macro.c
+++ b/libcpp/macro.c
@@ -357,13 +357,20 @@  _cpp_builtin_macro_text (cpp_reader *pfile, cpp_hashnode *node,
 	  time_t tt;
 	  struct tm *tb = NULL;
 
-	  /* (time_t) -1 is a legitimate value for "number of seconds
-	     since the Epoch", so we have to do a little dance to
-	     distinguish that from a genuine error.  */
-	  errno = 0;
-	  tt = time(NULL);
-	  if (tt != (time_t)-1 || errno == 0)
-	    tb = localtime (&tt);
+	  /* Set a reproducible timestamp for __DATE__ and __TIME__ macro
+	     usage if SOURCE_DATE_EPOCH is defined.  */
+	  if (pfile->source_date_epoch != -1)
+	     tb = gmtime ((time_t*) &pfile->source_date_epoch);
+	  else
+	    {
+	      /* (time_t) -1 is a legitimate value for "number of seconds
+		 since the Epoch", so we have to do a little dance to
+		 distinguish that from a genuine error.  */
+	      errno = 0;
+	      tt = time (NULL);
+	      if (tt != (time_t)-1 || errno == 0)
+		tb = localtime (&tt);
+	    }
 
 	  if (tb)
 	    {