Message ID | 20181116035704.14820-3-andi@firstfloor.org |
---|---|
State | New |
Headers | show |
Series | [v2,1/3] Allow memory operands for PTWRITE | expand |
On Fri, Nov 16, 2018 at 4:57 AM Andi Kleen <andi@firstfloor.org> wrote: > > From: Andi Kleen <ak@linux.intel.com> > > Add a new pass to automatically instrument changes to variables > with the new PTWRITE instruction on x86. PTWRITE writes a 4 or 8 byte > field into an Processor Trace log, which allows low over head > logging of information. Essentially it's a hardware accelerated > printf. > > This allows to reconstruct how values later from the log, > which can be useful for debugging or other analysis of the program > behavior. With the compiler support this can be done with without > having to manually add instrumentation to the code. > > Using dwarf information this can be later mapped back to the variables. > The decoder decodes the PTWRITE instructions using IP information > in the log, and then looks up the argument in the debug information. > Then this can be used to reconstruct the original variable > name to display a value history for the variable. > > There are new options to enable instrumentation for different types, > and also a new attribute to control analysis fine grained per > function or variable level. The attributes can be set on both > the variable and the type level, and also on structure fields. > This allows to enable tracing only for specific code in large > programs in a flexible matter. > > The pass is generic, but only the x86 backend enables the necessary > hooks. When the backend enables the necessary hooks (with -mptwrite) > there is an additional pass that looks through the code for > attribute vartrace enabled functions or variables. > > The -fvartrace=locals option is experimental: it works, but it > generates redundant ptwrites because the pass doesn't use > the SSA information to minimize instrumentation. > This could be optimized later. > > Currently the code can be tested with SDE, or on a Intel > Gemini Lake system with a new enough Linux kernel (v4.10+) > that supports PTWRITE for PT. Gemini Lake is used in low > end laptops ("Intel Pentium Silver J5...... / Celeron N4... / > Celeron J4...") > > Linux perf can be used to record the values > > perf record -e intel_pt/ptw=1,branch=0/ program > perf script --itrace=crw -F +synth ... > > I have an experimential version of perf that can also use > dwarf information to symbolize many[1] values back to their variable > names. So far it is not in standard perf, but available at > > https://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git/log/?h=perf/var-resolve-4 > > It is currently not able to decode all variable locations to names, > but a large subset. > > Longer term hopefully gdb will support this information too. > > The CPU can potentially generate very data high bandwidths when > code doing a lot of computation is heavily instrumented. > This can cause some data loss in both the CPU and also in perf > logging the data when the disk cannot keep up. > > Running some larger workloads most workloads do not cause > CPU level overflows, but I've seen it with -fvartrace > with crafty, and with more workloads with -fvartrace-locals. > > Recommendation is to not fully instrument programs, > but only areas of interest either at the file level or using > the attributes. > > The other thing is that perf and the disk often cannot keep up > with the data bandwidth for longer computations. In this case > it's possible to use perf snapshot mode (add --snapshot > to the command line above). The data will be only logged to > a memory ring buffer then, and only dump the buffers on events > of interest by sending SIGUSR2 to the perf binrary. > > In the future this will be hopefully better supported with > core files and gdb. > > Passes bootstrap and test suite on x86_64-linux, also > bootstrapped and tested gcc itself with full -fvartrace > and -fvartrace=locals instrumentation. In the cover mail you mentioned you didn't get rid of SSA update. That is because your instrumentation does not set the calls virtual operands. Since your builtin clobbers memory and you instrument non-memory ops that's only possible if you'd track the active virtual operand during the walk over the function. I suppose using SSA update is OK for now. More comments inline > gcc/: > > 2018-11-15 Andi Kleen <ak@linux.intel.com> > > * Makefile.in: Add tree-vartrace.o. > * common.opt: Add -fvartrace > * opts.c (parse_vartrace_options): Add. > (common_handle_option): Call parse_vartrace_options. > * config/i386/i386.c (ix86_vartrace_func): Add. > (TARGET_VARTRACE_FUNC): Add. > * doc/extend.texi: Document vartrace/no_vartrace > attributes. > * doc/invoke.texi: Document -fvartrace. > * doc/tm.texi (TARGET_VARTRACE_FUNC): Add. > * passes.def: Add vartrace pass. > * target.def (vartrace_func): Add. > * tree-pass.h (make_pass_vartrace): Add. > * tree-vartrace.c: New file to implement vartrace pass. > > gcc/c-family/: > > 2018-11-15 Andi Kleen <ak@linux.intel.com> > > * c-attribs.c (handle_vartrace_attribute, > handle_no_vartrace_attribute): New functions. > (attr_vartrace_exclusions): Add. > > config/: > > 2018-11-03 Andi Kleen <ak@linux.intel.com> > > * bootstrap-vartrace.mk: New. > * bootstrap-vartrace-locals.mk: New. > --- > config/bootstrap-vartrace-locals.mk | 3 + > config/bootstrap-vartrace.mk | 3 + > gcc/Makefile.in | 1 + > gcc/c-family/c-attribs.c | 77 +++++ > gcc/common.opt | 8 + > gcc/config/i386/i386.c | 32 ++ > gcc/doc/extend.texi | 32 ++ > gcc/doc/invoke.texi | 27 ++ > gcc/doc/tm.texi | 6 + > gcc/doc/tm.texi.in | 2 + > gcc/flag-types.h | 9 + > gcc/opts.c | 63 ++++ > gcc/passes.def | 1 + > gcc/target.def | 9 + > gcc/tree-pass.h | 1 + > gcc/tree-vartrace.c | 491 ++++++++++++++++++++++++++++ > 16 files changed, 765 insertions(+) > create mode 100644 config/bootstrap-vartrace-locals.mk > create mode 100644 config/bootstrap-vartrace.mk > create mode 100644 gcc/tree-vartrace.c > > diff --git a/config/bootstrap-vartrace-locals.mk b/config/bootstrap-vartrace-locals.mk > new file mode 100644 > index 00000000000..dd16640df74 > --- /dev/null > +++ b/config/bootstrap-vartrace-locals.mk > @@ -0,0 +1,3 @@ > +STAGE2_CFLAGS += -mptwrite -fvartrace=all > +STAGE3_CFLAGS += -mptwrite -fvartrace=all > +STAGE4_CFLAGS += -mptwrite -fvartrace=all > diff --git a/config/bootstrap-vartrace.mk b/config/bootstrap-vartrace.mk > new file mode 100644 > index 00000000000..e29824d799b > --- /dev/null > +++ b/config/bootstrap-vartrace.mk > @@ -0,0 +1,3 @@ > +STAGE2_CFLAGS += -mptwrite -fvartrace > +STAGE3_CFLAGS += -mptwrite -fvartrace > +STAGE4_CFLAGS += -mptwrite -fvartrace > diff --git a/gcc/Makefile.in b/gcc/Makefile.in > index ec793175c3b..64a99a1ec8a 100644 > --- a/gcc/Makefile.in > +++ b/gcc/Makefile.in > @@ -1594,6 +1594,7 @@ OBJS = \ > tree-vectorizer.o \ > tree-vector-builder.o \ > tree-vrp.o \ > + tree-vartrace.o \ > tree.o \ > typed-splay-tree.o \ > unique-ptr-tests.o \ > diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c > index 1657df7f9df..d50e78de830 100644 > --- a/gcc/c-family/c-attribs.c > +++ b/gcc/c-family/c-attribs.c > @@ -104,6 +104,10 @@ static tree handle_tls_model_attribute (tree *, tree, tree, int, > bool *); > static tree handle_no_instrument_function_attribute (tree *, tree, > tree, int, bool *); > +static tree handle_vartrace_attribute (tree *, tree, > + tree, int, bool *); > +static tree handle_no_vartrace_attribute (tree *, tree, > + tree, int, bool *); > static tree handle_no_profile_instrument_function_attribute (tree *, tree, > tree, int, bool *); > static tree handle_malloc_attribute (tree *, tree, tree, int, bool *); > @@ -235,6 +239,13 @@ static const struct attribute_spec::exclusions attr_const_pure_exclusions[] = > ATTR_EXCL (NULL, false, false, false) > }; > > +static const struct attribute_spec::exclusions attr_vartrace_exclusions[] = > +{ > + ATTR_EXCL ("vartrace", true, true, true), > + ATTR_EXCL ("no_vartrace", true, true, true), > + ATTR_EXCL (NULL, false, false, false) > +}; > + > /* Table of machine-independent attributes common to all C-like languages. > > Current list of processed common attributes: nonnull. */ > @@ -326,6 +337,12 @@ const struct attribute_spec c_common_attribute_table[] = > { "no_instrument_function", 0, 0, true, false, false, false, > handle_no_instrument_function_attribute, > NULL }, > + { "vartrace", 0, 0, false, false, false, false, > + handle_vartrace_attribute, > + attr_vartrace_exclusions }, > + { "no_vartrace", 0, 0, false, false, false, false, > + handle_no_vartrace_attribute, > + attr_vartrace_exclusions }, > { "no_profile_instrument_function", 0, 0, true, false, false, false, > handle_no_profile_instrument_function_attribute, > NULL }, > @@ -770,6 +787,66 @@ handle_no_sanitize_undefined_attribute (tree *node, tree name, tree, int, > return NULL_TREE; > } > > +/* Handle "vartrace" attribute; arguments as in struct > + attribute_spec.handler. */ > + > +static tree > +handle_vartrace_attribute (tree *node, tree name, tree, int flags, > + bool *no_add_attrs) > +{ > + if (!VAR_OR_FUNCTION_DECL_P (*node) && !TYPE_P (*node) && > + TREE_CODE (*node) != FIELD_DECL) > + { > + warning (OPT_Wattributes, "%qE attribute ignored for object", name); > + *no_add_attrs = true; > + return NULL_TREE; > + } > + > + if (!targetm.vartrace_func) > + { > + warning (OPT_Wattributes, "%qE attribute not supported for target", name); > + *no_add_attrs = true; > + return NULL_TREE; > + } > + > + if (TREE_TYPE (*node) > + && TREE_CODE (*node) != FUNCTION_DECL > + && targetm.vartrace_func (TREE_TYPE (*node), true) == NULL_TREE) > + { > + warning (OPT_Wattributes, "%qE attribute not supported for type", name); > + *no_add_attrs = true; > + return NULL_TREE; > + } > + > + if (TYPE_P (*node) && !(flags & (int) ATTR_FLAG_TYPE_IN_PLACE)) > + *node = build_variant_type_copy (*node); > + > + /* We look it up later with lookup_attribute. */ > + return NULL_TREE; > +} > + > +/* Handle "no_vartrace" attribute; arguments as in struct > + attribute_spec.handler. */ > + > +static tree > +handle_no_vartrace_attribute (tree *node, tree name, tree, int flags, > + bool *no_add_attrs) > +{ > + if (!VAR_OR_FUNCTION_DECL_P (*node) && !TYPE_P (*node) > + && TREE_CODE (*node) != FIELD_DECL) > + { > + warning (OPT_Wattributes, "%qE attribute ignored", name); > + *no_add_attrs = true; > + return NULL_TREE; > + } > + > + if (TYPE_P (*node) && !(flags & (int) ATTR_FLAG_TYPE_IN_PLACE)) > + *node = build_variant_type_copy (*node); > + > + /* We look it up later with lookup_attribute. */ > + return NULL_TREE; > +} > + > /* Handle an "asan odr indicator" attribute; arguments as in > struct attribute_spec.handler. */ > > diff --git a/gcc/common.opt b/gcc/common.opt > index 72a713593c3..f27a57b1177 100644 > --- a/gcc/common.opt > +++ b/gcc/common.opt > @@ -221,6 +221,10 @@ unsigned int flag_sanitize_recover = (SANITIZE_UNDEFINED | SANITIZE_UNDEFINED_NO > Variable > unsigned int flag_sanitize_coverage > > +; What to instrument with vartrace > +Variable > +unsigned int flag_vartrace > + > ; Flag whether a prefix has been added to dump_base_name > Variable > bool dump_base_name_prefixed = false > @@ -2856,6 +2860,10 @@ ftree-scev-cprop > Common Report Var(flag_tree_scev_cprop) Init(1) Optimization > Enable copy propagation of scalar-evolution information. > > +fvartrace > +Common JoinedOrMissing Report Driver > +-fvartrace=default|all|locals|returns|args|reads|writes|off Enable variable tracing instrumentation. > + > ; -fverbose-asm causes extra commentary information to be produced in > ; the generated assembly code (to make it more readable). This option > ; is generally only of use to those who actually need to read the > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c > index c18c60a1d19..6d130bf0804 100644 > --- a/gcc/config/i386/i386.c > +++ b/gcc/config/i386/i386.c > @@ -31900,6 +31900,35 @@ ix86_mangle_function_version_assembler_name (tree decl, tree id) > return ret; > } > > +/* Hook to determine that TYPE can be traced. Ignore target flags > + if FORCE is true. Returns the tracing builtin if tracing is possible, > + or otherwise NULL. */ > + > +static tree > +ix86_vartrace_func (tree type, bool force) > +{ > + if (!(ix86_isa_flags2 & OPTION_MASK_ISA_PTWRITE)) > + { > + /* With force, as in checking for the attribute, ignore > + the current target settings. Otherwise it's not > + possible to declare vartrace variables outside > + an __attribute__((target("ptwrite"))) function > + if -mptwrite is not specified. */ > + if (!force) > + return NULL; > + /* Initialize the builtins if missing, so that we have > + something to return. */ > + if (!ix86_builtins[(int)IX86_BUILTIN_PTWRITE32]) > + ix86_add_new_builtins (0, OPTION_MASK_ISA_PTWRITE); > + } > + > + if (TYPE_PRECISION (type) == 32) > + return ix86_builtins[(int) IX86_BUILTIN_PTWRITE32]; > + else if (TYPE_PRECISION (type) == 64) > + return ix86_builtins[(int) IX86_BUILTIN_PTWRITE64]; I think it makes more sense to pass the target hook a mode instead of a type. Thus above you'd check for SImode vs. DImode. > + else > + return NULL; > +} > > static tree > ix86_mangle_decl_assembler_name (tree decl, tree id) > @@ -50894,6 +50923,9 @@ ix86_run_selftests (void) > #undef TARGET_ASAN_SHADOW_OFFSET > #define TARGET_ASAN_SHADOW_OFFSET ix86_asan_shadow_offset > > +#undef TARGET_VARTRACE_FUNC > +#define TARGET_VARTRACE_FUNC ix86_vartrace_func > + > #undef TARGET_GIMPLIFY_VA_ARG_EXPR > #define TARGET_GIMPLIFY_VA_ARG_EXPR ix86_gimplify_va_arg > > diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi > index d4b1046b6ae..f0cde5dcf0f 100644 > --- a/gcc/doc/extend.texi > +++ b/gcc/doc/extend.texi > @@ -3228,6 +3228,13 @@ the standard C library can be guaranteed not to throw an exception > with the notable exceptions of @code{qsort} and @code{bsearch} that > take function pointer arguments. > > +@item no_vartrace > +The @code{no_vartrace} attribute disables data tracing for > +the function [or variable or structure field] declared with > +the attribute. See @pxref{Common Variable Attributes} and > +@pxref{Common Type Attributes}. When specified for a function > +nothing in the function is traced. > + > @item optimize (@var{level}, @dots{}) > @item optimize (@var{string}, @dots{}) > @cindex @code{optimize} function attribute > @@ -3489,6 +3496,21 @@ When applied to a member function of a C++ class template, the > attribute also means that the function is instantiated if the > class itself is instantiated. > > +@item vartrace > +@cindex @code{vartrace} function or variable attribute > +Enable data tracing for the function or variable or structure field > +marked with this attribute. When applied to a type all instances of the type > +will be traced. When applied to a structure or union all fields will be traced. > +When applied to a structure field that field will be traced. > +For functions will not trace locals (unless enabled on the command line or > +on as an attribute the local itself) but arguments, returns, globals, pointer > +references. For variables or types or structure members any reads or writes will be traced. > +Only integer or pointer types can be tracked. > + > +Currently implemented for x86 when the @option{ptwrite} target option > +is enabled for systems that support the @code{PTWRITE} instruction, > +supporting 4 and 8 byte integers. > + > @item visibility ("@var{visibility_type}") > @cindex @code{visibility} function attribute > This attribute affects the linkage of the declaration to which it is attached. > @@ -7196,6 +7218,16 @@ A @{ /* @r{@dots{}} */ @}; > struct __attribute__ ((copy ( (struct A *)0)) B @{ /* @r{@dots{}} */ @}; > @end smallexample > > +@cindex @code{vartrace} type attribute > +@cindex @code{no_vartrace} type attribute > +@item vartrace > +@itemx no_vartrace > +Specify that all instances of type should be variable traced > +or not variable traced. Can be also also applied to function > +types to disable tracing for all instances of that function type. > +Can be also applied to structure fields. See the description in > +@pxref{Variable Attributes} for more details. > + > @item deprecated > @itemx deprecated (@var{msg}) > @cindex @code{deprecated} type attribute > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi > index 535b258d22b..8b9d1d71669 100644 > --- a/gcc/doc/invoke.texi > +++ b/gcc/doc/invoke.texi > @@ -2754,6 +2754,33 @@ Don't use the @code{__cxa_get_exception_ptr} runtime routine. This > causes @code{std::uncaught_exception} to be incorrect, but is necessary > if the runtime routine is not available. > > +@item -fvartrace > +@opindex -fvartrace=... > +Insert trace instructions to trace variables at runtime with a tracer > +such as Intel Processor Trace. > + > +Requires enabling a backend specific option, like @option{-mptwrite} to enable > +@code{PTWRITE} instruction generation on x86. > + > +Additional qualifiers can be specified after the =: @option{args} > +for arguments, @option{off} to disable, @option{returns} for tracing > +return values, @option{reads} to trace reads, @option{writes} to trace > +writes, @option{locals} to trace locals. > +Default when enabled is tracing reads, writes, arguments, returns, objects > +with a static or thread duration but no locals. > +Multiple options can be separated by comma. > + > +When -fvartrace enabled, the compiler will add trace instructions. By > +default these trace options act like nops, unless tracing is > +enabled. For example to enable tracing on x86 Linux with Linux perf > +use: > + > +@smallexample > +gcc -fvartrace -o program -g program.c > +perf record -e intel_pt/pt=1,ptw=1,branch=0/ program > +perf script > +@end smallexample > + > @item -fvisibility-inlines-hidden > @opindex fvisibility-inlines-hidden > This switch declares that the user does not attempt to compare > diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi > index 0a2ad9a745e..861914ac646 100644 > --- a/gcc/doc/tm.texi > +++ b/gcc/doc/tm.texi > @@ -11936,6 +11936,12 @@ Address Sanitizer shadow memory address. NULL if Address Sanitizer is not > supported by the target. > @end deftypefn > > +@deftypefn {Target Hook} tree TARGET_VARTRACE_FUNC (tree @var{type}, bool @var{force}) > +Return a builtin to call to trace variables of type TYPE or NULL if not supported > +by the target. Ignore target configuration if FORCE is true. The builtin gets called with a > +single argument of TYPE. So make this take a enum machine_mode @var{mode} specified to be an integer mode. The argument type of the function would then be an unsigned integer type with this mode. That should allow more flexible instrumentation, including of floats or doubles or small aggregates. Maybe even instead pass it a number of bytes so it models how atomics work. > +@end deftypefn > + > @deftypefn {Target Hook} {unsigned HOST_WIDE_INT} TARGET_MEMMODEL_CHECK (unsigned HOST_WIDE_INT @var{val}) > Validate target specific memory model mask bits. When NULL no target specific > memory model bits are allowed. > diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in > index f1ad80da467..756e84d0e07 100644 > --- a/gcc/doc/tm.texi.in > +++ b/gcc/doc/tm.texi.in > @@ -8104,6 +8104,8 @@ and the associated definitions of those functions. > > @hook TARGET_ASAN_SHADOW_OFFSET > > +@hook TARGET_VARTRACE_FUNC > + > @hook TARGET_MEMMODEL_CHECK > > @hook TARGET_ATOMIC_TEST_AND_SET_TRUEVAL > diff --git a/gcc/flag-types.h b/gcc/flag-types.h > index 500f6638f36..6e5add2da9d 100644 > --- a/gcc/flag-types.h > +++ b/gcc/flag-types.h > @@ -261,6 +261,15 @@ enum sanitize_code { > | SANITIZE_BOUNDS_STRICT > }; > > +/* Settings for flag_vartrace */ > +enum flag_vartrace { > + VARTRACE_LOCALS = 1 << 0, > + VARTRACE_ARGS = 1 << 1, > + VARTRACE_RETURNS = 1 << 2, > + VARTRACE_READS = 1 << 3, > + VARTRACE_WRITES = 1 << 4 > +}; > + > /* Settings of flag_incremental_link. */ > enum incremental_link { > INCREMENTAL_LINK_NONE, > diff --git a/gcc/opts.c b/gcc/opts.c > index 318ed442057..c6c47b55a39 100644 > --- a/gcc/opts.c > +++ b/gcc/opts.c > @@ -1872,6 +1872,65 @@ check_alignment_argument (location_t loc, const char *flag, const char *name) > parse_and_check_align_values (flag, name, align_result, true, loc); > } > > +/* Parse vartrace options in P, updating flags OPTS at LOC and return > + updated flags. */ > + > +static int > +parse_vartrace_options (const char *p, int opts, location_t loc) > +{ > + static struct { > + const char *name; > + int opt; > + } vopts[] = > + { > + { "default", > + VARTRACE_ARGS | VARTRACE_RETURNS | VARTRACE_READS > + | VARTRACE_WRITES }, /* Keep as first entry. */ > + { "all", > + VARTRACE_ARGS | VARTRACE_RETURNS | VARTRACE_READS > + | VARTRACE_WRITES | VARTRACE_LOCALS }, > + { "args", VARTRACE_ARGS }, > + { "returns", VARTRACE_RETURNS }, > + { "reads", VARTRACE_READS }, > + { "writes", VARTRACE_WRITES }, > + { "locals", VARTRACE_LOCALS }, > + { NULL, 0 } > + }; > + > + if (*p == '=') > + p++; > + if (*p == 0) > + return opts | vopts[0].opt; > + > + if (!strcmp (p, "off")) > + return 0; > + > + while (*p) > + { > + unsigned len = strcspn (p, ","); > + int i; > + > + for (i = 0; vopts[i].name; i++) > + { > + if (len == strlen (vopts[i].name) && !strncmp (p, vopts[i].name, len)) > + { > + opts |= vopts[i].opt; > + break; > + } > + } > + if (vopts[i].name == NULL) > + { > + error_at (loc, "invalid argument to %qs", "-fvartrace"); > + break; > + } > + > + p += len; > + if (*p == ',') > + p++; > + } > + return opts; > +} > + > /* Handle target- and language-independent options. Return zero to > generate an "unknown option" message. Only options that need > extra handling need to be listed here; if you simply want > @@ -2078,6 +2137,10 @@ common_handle_option (struct gcc_options *opts, > case OPT__completion_: > break; > > + case OPT_fvartrace: > + opts->x_flag_vartrace = parse_vartrace_options (arg, opts->x_flag_vartrace, loc); > + break; > + > case OPT_fsanitize_: > opts->x_flag_sanitize > = parse_sanitizer_options (arg, loc, code, > diff --git a/gcc/passes.def b/gcc/passes.def > index 24f212c8e31..36d14b0386a 100644 > --- a/gcc/passes.def > +++ b/gcc/passes.def > @@ -394,6 +394,7 @@ along with GCC; see the file COPYING3. If not see > NEXT_PASS (pass_asan_O0); > NEXT_PASS (pass_tsan_O0); > NEXT_PASS (pass_sanopt); > + NEXT_PASS (pass_vartrace); I'd move it one lower, after pass_cleanup_eh. Further enhancement would make it a RTL pass ... > NEXT_PASS (pass_cleanup_eh); > NEXT_PASS (pass_lower_resx); > NEXT_PASS (pass_nrv); > diff --git a/gcc/target.def b/gcc/target.def > index f9469d69cb0..b5d970870a4 100644 > --- a/gcc/target.def > +++ b/gcc/target.def > @@ -4300,6 +4300,15 @@ supported by the target.", > unsigned HOST_WIDE_INT, (void), > NULL) > > +/* Defines the builtin to trace variables, or NULL. */ > +DEFHOOK > +(vartrace_func, > + "Return a builtin to call to trace variables of type TYPE or NULL if not supported\n\ > +by the target. Ignore target configuration if FORCE is true. The builtin gets called with a\n\ > +single argument of TYPE.", > + tree, (tree type, bool force), > + NULL) > + > /* Functions relating to calls - argument passing, returns, etc. */ > /* Members of struct call have no special macro prefix. */ > HOOK_VECTOR (TARGET_CALLS, calls) > diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h > index af15adc8e0c..2cf31785a6f 100644 > --- a/gcc/tree-pass.h > +++ b/gcc/tree-pass.h > @@ -423,6 +423,7 @@ extern gimple_opt_pass *make_pass_strlen (gcc::context *ctxt); > extern gimple_opt_pass *make_pass_fold_builtins (gcc::context *ctxt); > extern gimple_opt_pass *make_pass_post_ipa_warn (gcc::context *ctxt); > extern gimple_opt_pass *make_pass_stdarg (gcc::context *ctxt); > +extern gimple_opt_pass *make_pass_vartrace (gcc::context *ctxt); > extern gimple_opt_pass *make_pass_early_warn_uninitialized (gcc::context *ctxt); > extern gimple_opt_pass *make_pass_late_warn_uninitialized (gcc::context *ctxt); > extern gimple_opt_pass *make_pass_cse_reciprocals (gcc::context *ctxt); > diff --git a/gcc/tree-vartrace.c b/gcc/tree-vartrace.c > new file mode 100644 > index 00000000000..1ef81b743fc > --- /dev/null > +++ b/gcc/tree-vartrace.c > @@ -0,0 +1,491 @@ > +/* Insert instructions for data value tracing. > + Copyright (C) 2017, 2018 Free Software Foundation, Inc. > + Contributed by Andi Kleen. > + > +This file is part of GCC. > + > +GCC is free software; you can redistribute it and/or modify > +it under the terms of the GNU General Public License as published by > +the Free Software Foundation; either version 3, or (at your option) > +any later version. > + > +GCC is distributed in the hope that it will be useful, > +but WITHOUT ANY WARRANTY; without even the implied warranty of > +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > +GNU General Public License for more details. > + > +You should have received a copy of the GNU General Public License > +along with GCC; see the file COPYING3. If not see > +<http://www.gnu.org/licenses/>. */ > + > +#include "config.h" > +#include "system.h" > +#include "coretypes.h" > +#include "backend.h" > +#include "target.h" > +#include "tree.h" > +#include "tree-iterator.h" > +#include "tree-pass.h" > +#include "basic-block.h" > +#include "gimple.h" > +#include "gimple-iterator.h" > +#include "gimplify.h" > +#include "gimplify-me.h" > +#include "gimple-ssa.h" > +#include "gimple-pretty-print.h" > +#include "cfghooks.h" > +#include "fold-const.h" > +#include "ssa.h" > +#include "tree-dfa.h" > +#include "attribs.h" > + > +namespace { > + > +enum attrstate { force_off, force_on, neutral }; > + > +/* Can we trace with attributes ATTR. */ > + > +attrstate > +supported_attr (tree attr) > +{ > + if (lookup_attribute ("no_vartrace", attr)) > + return force_off; > + if (lookup_attribute ("vartrace", attr)) > + return force_on; > + return neutral; > +} > + > +/* Is tracing enabled for ARG considering S. */ > + > +attrstate > +supported_op (tree arg, attrstate s) > +{ > + if (s != neutral) > + return s; > + if (DECL_P (arg)) > + { > + s = supported_attr (DECL_ATTRIBUTES (arg)); > + if (s != neutral) > + return s; > + } > + return supported_attr (TYPE_ATTRIBUTES (TREE_TYPE (arg))); > +} > + > +/* Can we trace DECL. */ > + > +bool > +supported_type (tree decl) > +{ > + tree type = TREE_TYPE (decl); > + > + return POINTER_TYPE_P (type) || INTEGRAL_TYPE_P (type); So in reality the restriction is on the size of the object, correct? Given you only use this in supported_mem I'd elide this function (also see below) The supported_ names are a bit odd since some of them are about whehter tracing is enabled and some are whether a specific value can be logged at all. I suggest to differentiate appropriately here, like enabled_op vs. supported_type. > +} > + > +/* Return true if DECL is refering to a local variable. */ > + > +bool > +is_local (tree decl) > +{ > + if (!(flag_vartrace & VARTRACE_LOCALS) && supported_op (decl, neutral) != force_on) > + return false; > + return auto_var_in_fn_p (decl, cfun->decl); > +} Likewise. > +/* Is T something we can log, FORCEing the type if needed. */ > + > +bool > +supported_mem (tree t, bool force) This looks like a bad name since 't' isn't necessarily a memory location? > +{ > + if (!supported_type (t)) You handle some nested cases below via recursion, like a.b.c (but not a[i][j]). But then this check will fire. I think it would be better to restructure the function to look at the outermost level for whether the op is of supported type, thus we can log it at all and then get all the way down to the base via sth like if (!supported_type (t)) return false; enum attrstate s = <some default>; do { s = supported_op (t, s); if (s == force_off) return false; } while (handled_component_p (t) && (t = TREE_OPERAND (t, 0))) Now t is either an SSA_NAME, a DECL (you fail to handle PARM_DECL and RESULT_DECL below) or a [TARGET_]MEM_REF. To get rid of non-pointer indirections do then t = get_base_address (t); if (DECL_P (t) && is_local (t)) .... because... > + return false; > + > + enum attrstate s = supported_op (t, neutral); > + if (s == force_off) > + return false; > + if (s == force_on) > + force = true; > + > + switch (TREE_CODE (t)) > + { > + case VAR_DECL: > + if (DECL_ARTIFICIAL (t)) > + return false; > + if (is_local (t)) > + return true; > + return s == force_on || force; > + > + case ARRAY_REF: > + t = TREE_OPERAND (t, 0); > + s = supported_op (t, s); > + if (s == force_off) > + return false; > + return supported_type (TREE_TYPE (t)); Your supported_type is said to take a DECL. And you already asked for this type - it's the type of the original t (well, the type of this type given TREE_TYPE (t) is an array type). But you'll reject a[i][j] where the type of this type is an array type as well. > + > + case COMPONENT_REF: > + s = supported_op (TREE_OPERAND (t, 1), s); > + t = TREE_OPERAND (t, 0); > + if (s == neutral && is_local (t)) > + return true; > + s = supported_op (t, s); > + if (s != neutral) > + return s == force_on ? true : false; > + return supported_mem (t, force); > + > + // support BIT_FIELD_REF? > + > + case VIEW_CONVERT_EXPR: > + case TARGET_MEM_REF: > + case MEM_REF: > + return supported_mem (TREE_OPERAND (t, 0), force); > + > + case SSA_NAME: > + if ((flag_vartrace & VARTRACE_LOCALS) > + && SSA_NAME_VAR (t) > + && !DECL_IGNORED_P (SSA_NAME_VAR (t))) > + return true; > + return force; > + > + default: > + break; > + } > + > + return false; > +} > + > +/* Print debugging for inserting CODE at ORIG_STMT with type of VAL for WHY. */ > + > +void > +log_trace_code (gimple *orig_stmt, gimple *code, tree val, const char *why) > +{ > + if (!dump_file) > + return; > + if (orig_stmt) > + fprintf (dump_file, "BB%d ", gimple_bb (orig_stmt)->index); > + fprintf (dump_file, "%s inserting ", why); > + print_gimple_stmt (dump_file, code, 0, TDF_VOPS|TDF_MEMSYMS); > + if (orig_stmt) > + { > + fprintf (dump_file, "orig "); > + print_gimple_stmt (dump_file, orig_stmt, 2, > + TDF_VOPS|TDF_MEMSYMS); TDF_MEMSYMS is dead, or rather it seems to be a leftover now equivalent to just TDF_VOPS. > + } > + fprintf (dump_file, "type "); > + print_generic_expr (dump_file, TREE_TYPE (val), TDF_SLIM); > + fputc ('\n', dump_file); > + fputc ('\n', dump_file); > +} > + > +/* Insert variable tracing code for VAL before iterator GI, originally > + for ORIG_STMT and optionally at LOC. Normally before ORIG_STMT, but > + AFTER if true. Reason is WHY. Return trace var if successfull, > + or NULL_TREE. */ > + > +tree > +insert_trace (gimple_stmt_iterator *gi, tree val, gimple *orig_stmt, > + const char *why, location_t loc = -1, bool after = false) > +{ > + if (loc == (location_t)-1) > + loc = gimple_location (orig_stmt); > + > + tree func = targetm.vartrace_func (TREE_TYPE (val), false); > + if (!func) > + return NULL_TREE; > + > + tree tvar = val; > + if (!is_gimple_reg (val)) > + { > + tvar = make_ssa_name (TREE_TYPE (val)); > + gassign *assign = gimple_build_assign (tvar, unshare_expr (val)); > + log_trace_code (orig_stmt, assign, val, "copy"); > + gimple_set_location (assign, loc); > + if (after) > + gsi_insert_after (gi, assign, GSI_CONTINUE_LINKING); > + else > + gsi_insert_before (gi, assign, GSI_SAME_STMT); > + update_stmt (assign); gsi_insert_* does update_stmt already. Btw, if you allow any SImode or DImode size value you can use a VIEW_CONVERT_EXPR to view the value as an integer type. You can of course also do simple promotion of char or short to int. I wonder why you do not get IL verification errors when passing the builtin signed ints/longs btw... ah, because we do not actually verify call arguments :/ > + } > + > + gcall *call = gimple_build_call (func, 1, tvar); > + log_trace_code (NULL, call, tvar, why); > + gimple_set_location (call, loc); > + if (after) > + gsi_insert_after (gi, call, GSI_CONTINUE_LINKING); > + else > + gsi_insert_before (gi, call, GSI_SAME_STMT); > + update_stmt (call); > + return tvar; > +} > + > +/* Insert trace at GI for T in FUN if suitable memory or variable > + reference. Always if FORCE. Originally on ORIG_STMT. Reason is > + WHY. Insert after GI if AFTER. Returns trace variable or NULL_TREE. */ > + > +tree > +instrument_mem (gimple_stmt_iterator *gi, tree t, bool force, > + gimple *orig_stmt, const char *why, bool after = false) > +{ > + if (supported_mem (t, force)) > + return insert_trace (gi, t, orig_stmt, why, -1, after); > + return NULL_TREE; > +} > + > +/* Instrument arguments for FUN. Return true if changed. */ > + > +bool > +instrument_args (function *fun) > +{ > + gimple_stmt_iterator gi; > + bool changed = false; > + > + /* Local tracing usually takes care of the argument too, when > + they are read. This avoids redundant trace instructions. */ But only when instrumenting reads? > + if (flag_vartrace & VARTRACE_LOCALS) > + return false; > + > + for (tree arg = DECL_ARGUMENTS (current_function_decl); > + arg != NULL_TREE; > + arg = DECL_CHAIN (arg)) > + { > + gi = gsi_start_bb (BASIC_BLOCK_FOR_FN (fun, NUM_FIXED_BLOCKS)); leftover? It seems to be unused (and is bougs). > + tree type = TREE_TYPE (arg); > + if (POINTER_TYPE_P (type) || INTEGRAL_TYPE_P (type)) You had supported_type for this... I think you want to bail out for DECL_BY_REFERENCE (arg) // arg will be a pointer to the actual argument || DECL_IGNORED (arg) // no dbeug info > + { > + tree func = targetm.vartrace_func (TREE_TYPE (arg), false); > + if (!func) > + continue; > + > + if (!is_gimple_reg (arg)) TREE_CODE (arg) != SSA_NAME (since you are calling ssa_default_def) > + continue; > + tree sarg = ssa_default_def (fun, arg); > + if (!sarg) > + continue; > + > + gimple_stmt_iterator egi = gsi_after_labels (single_succ (ENTRY_BLOCK_PTR_FOR_FN (fun))); > + changed |= !!insert_trace (&egi, sarg, NULL, "arg", fun->function_start_locus); > + } > + } > + return changed; > +} > + > +/* Generate trace call for store GAS at GI, force if FORCE. Return true > + if successfull. Return true if successfully inserted. */ > + > +bool > +instrument_store (gimple_stmt_iterator *gi, gassign *gas, bool force) > +{ > + tree orig = gimple_assign_lhs (gas); > + > + if (!supported_mem (orig, force)) > + return false; > + > + tree func = targetm.vartrace_func (TREE_TYPE (orig), false); > + if (!func) > + return false; > + > + /* Generate another reference to target. That can be racy, but is > + guaranteed to have the debug location of the target. Better > + would be to use the original value to avoid any races, but we > + would need to somehow force the target location of the > + builtin. */ Hmm, but then this requires the target instruction to have a memory operand? That's going to be unlikely for RISCy cases? On x86 does it work if combine later does not syntesize a ptwrite with memory operand? I also wonder how this survives RTL CSE since you are basically doing mem = val; // orig stmt val' = mem; ptwrite (val'); that probably means when CSE removes the load there ends up a debug-insn reflecting what you want? > + tree tvar = make_ssa_name (TREE_TYPE (orig)); > + gassign *assign = gimple_build_assign (tvar, unshare_expr (orig)); > + log_trace_code (gas, assign, orig, "store copy"); > + gimple_set_location (assign, gimple_location (gas)); > + gsi_insert_after (gi, assign, GSI_CONTINUE_LINKING); > + update_stmt (assign); > + > + gcall *tcall = gimple_build_call (func, 1, tvar); > + log_trace_code (gas, tcall, tvar, "store"); > + gimple_set_location (tcall, gimple_location (gas)); > + gsi_insert_after (gi, tcall, GSI_CONTINUE_LINKING); > + update_stmt (tcall); > + return true; > +} > + > +/* Instrument STMT at GI. Force if FORCE. Return true if changed. */ > + > +bool > +instrument_assign (gimple_stmt_iterator *gi, gassign *gas, bool force) > +{ > + if (gimple_clobber_p (gas)) > + return false; > + bool changed = false; > + tree tvar = instrument_mem (gi, gimple_assign_rhs1 (gas), > + (flag_vartrace & VARTRACE_READS) || force, > + gas, "assign load1"); > + if (tvar) > + { > + gimple_assign_set_rhs1 (gas, tvar); > + changed = true; > + } > + /* Handle operators in case they read locals. */ Does it make sense at all to instrument SSA "reads"? You know all SSA vars have a single definition and all locals not in SSA form are represented with memory reads/writes, so ... > + if (gimple_num_ops (gas) > 2) > + { ... this case looks suprious. And as said you probably do not want to isnstrument SSA "reads"? Also not above when instrumenting gimple_assign_rhs1 (gas), so better guard that with gimple_assing_load_p (gas)? > + tvar = instrument_mem (gi, gimple_assign_rhs2 (gas), > + (flag_vartrace & VARTRACE_READS) || force, > + gas, "assign load2"); > + if (tvar) > + { > + gimple_assign_set_rhs2 (gas, tvar); you still have this stmt adjusting here, why? > + changed = true; > + } > + } > + // handle more ops? > + > + if (gimple_store_p (gas)) > + changed |= instrument_store (gi, gas, > + (flag_vartrace & VARTRACE_WRITES) || force); > + > + if (changed) > + update_stmt (gas); > + return changed; > +} > + > +/* Instrument return at statement STMT at GI with FORCE. Return true > + if changed. */ > + > +bool > +instrument_return (gimple_stmt_iterator *gi, greturn *gret, bool force) > +{ > + tree rval = gimple_return_retval (gret); > + > + if (!rval) > + return false; > + if (DECL_P (rval) && DECL_BY_REFERENCE (rval)) > + rval = build_simple_mem_ref (ssa_default_def (cfun, rval)); This looks bogus. If DECL_BY_REFERENCE is true then rval is of pointer type and thus a register, you shouldn't ever see that returned plainly in gimple_return_retval. > + if (supported_mem (rval, force)) > + return !!insert_trace (gi, rval, gret, "return"); > + return false; > +} > + > +/* Instrument asm at GI in statement STMT with FORCE if needed. Return > + true if changed. */ > + > +bool > +instrument_asm (gimple_stmt_iterator *gi, gasm *stmt, bool force) > +{ > + bool changed = false; > + > + for (unsigned i = 0; i < gimple_asm_ninputs (stmt); i++) > + changed |= !!instrument_mem (gi, TREE_VALUE (gimple_asm_input_op (stmt, i)), > + force || (flag_vartrace & VARTRACE_READS), stmt, > + "asm input"); > + for (unsigned i = 0; i < gimple_asm_noutputs (stmt); i++) > + { > + tree o = TREE_VALUE (gimple_asm_output_op (stmt, i)); > + if (supported_mem (o, force | (flag_vartrace & VARTRACE_WRITES))) > + changed |= !!insert_trace (gi, o, stmt, "asm output", -1, true); > + } I would guess instrumenting asms() has the chance of disturbing reg-alloc quite a bit... > + return changed; > +} > + > +/* Insert vartrace calls for FUN. */ > + > +unsigned int > +vartrace_execute (function *fun) > +{ > + basic_block bb; > + gimple_stmt_iterator gi; > + bool force = 0; > + > + if (lookup_attribute ("vartrace", TYPE_ATTRIBUTES (TREE_TYPE (fun->decl))) > + || lookup_attribute ("vartrace", DECL_ATTRIBUTES (fun->decl))) btw, I wonder whether the vartrace attribute should have an argument like vartrace(locals,reads,...)? > + force = true; > + > + bool changed = false; > + > + if ((flag_vartrace & VARTRACE_ARGS) || force) > + changed |= instrument_args (fun); > + > + FOR_EACH_BB_FN (bb, fun) > + for (gi = gsi_start_bb (bb); !gsi_end_p (gi); gsi_next (&gi)) > + { > + gimple *stmt = gsi_stmt (gi); Not sure if I suggested it during the first review but there's walk_stmt_load_store_ops () which lets you walk (via callbacks) all memory loads and stores in a stmt (also loads of non-SSA registers). Then there's FOR_EACH_SSA_DEF_OPERAND () or alternatively FOR_EACH_SSA_TREE_OPERAND () in case you are also interested in uses. For SSA uses you are currently missing indexes in ARRAY_REFs and friends. But as said I think you really want to avoid instrumenting SSA uses. > + switch (gimple_code (stmt)) > + { > + case GIMPLE_ASSIGN: > + changed |= instrument_assign (&gi, as_a <gassign *> (stmt), force); > + break; > + case GIMPLE_RETURN: > + changed |= instrument_return (&gi, as_a <greturn *> (stmt), > + force || (flag_vartrace & VARTRACE_RETURNS)); > + break; > + > + // for GIMPLE_CALL we use the argument logging in the callee > + // we could optionally log in the caller too to handle all possible > + // reads of a local/global when the callee is not instrumented > + // possibly later we could also instrument copy and clear calls. > + > + case GIMPLE_SWITCH: > + changed |= !!instrument_mem (&gi, gimple_switch_index (as_a <gswitch *> (stmt)), > + force, stmt, "switch"); > + break; > + case GIMPLE_COND: > + changed |= !!instrument_mem (&gi, gimple_cond_lhs (stmt), force, stmt, "if lhs"); > + changed |= !!instrument_mem (&gi, gimple_cond_rhs (stmt), force, stmt, "if rhs"); switch and cond are always SSA uses (or constants). > + break; > + > + case GIMPLE_ASM: > + changed |= instrument_asm (&gi, as_a<gasm *> (stmt), force); > + break; > + default: > + // everything else that reads/writes variables should be lowered already > + break; > + } > + } > + > + // for now, until we fix all cases that destroy ssa > + return changed ? TODO_update_ssa : 0;; > +} > + > +const pass_data pass_data_vartrace = > +{ > + GIMPLE_PASS, /* type */ > + "vartrace", /* name */ > + OPTGROUP_NONE, /* optinfo_flags */ > + TV_NONE, /* tv_id */ > + 0, /* properties_required */ > + 0, /* properties_provided */ > + 0, /* properties_destroyed */ > + 0, /* todo_flags_start */ > + 0, /* todo_flags_finish */ > +}; > + > +class pass_vartrace : public gimple_opt_pass > +{ > +public: > + pass_vartrace (gcc::context *ctxt) > + : gimple_opt_pass (pass_data_vartrace, ctxt) > + {} > + > + virtual opt_pass * clone () > + { > + return new pass_vartrace (m_ctxt); > + } > + > + virtual bool gate (function *fun) > + { > + // check if vartrace is supported in backend > + if (!targetm.vartrace_func > + || targetm.vartrace_func (integer_type_node, false) == NULL) > + return false; > + > + if (lookup_attribute ("no_vartrace", TYPE_ATTRIBUTES (TREE_TYPE (fun->decl))) > + || lookup_attribute ("no_vartrace", DECL_ATTRIBUTES (fun->decl))) > + return false; > + > + // need to run pass always to check for variable attributes > + return true; > + } > + > + virtual unsigned int execute (function *f) { return vartrace_execute (f); } > +}; > + > +} // anon namespace > + > +gimple_opt_pass * > +make_pass_vartrace (gcc::context *ctxt) > +{ > + return new pass_vartrace (ctxt); > +} > -- > 2.19.1 >
On Tue, Nov 20, 2018 at 01:04:19PM +0100, Richard Biener wrote: > Since your builtin clobbers memory Hmm, maybe we could get rid of that, but then how to avoid the optimizer moving it around over function calls etc.? The instrumentation should still be useful when the program crashes, so we don't want to delay logging too much. > Maybe even instead pass it a number of bytes so it models how atomics work. How could that reject float? Mode seems better for now. Eventually might support float/double through memory, but not in the first version. > > NEXT_PASS (pass_tsan_O0); > > NEXT_PASS (pass_sanopt); > > + NEXT_PASS (pass_vartrace); > > I'd move it one lower, after pass_cleanup_eh. Further enhancement > would make it a > RTL pass ... It's after pass_nrv now. > So in reality the restriction is on the size of the object, correct? The instruction accepts 32 or 64bit memory or register. In principle everything could be logged through this, but i was trying to limit the cases to integers and pointers for now to simplify the problem. Right now the backend fails up when something else than 4 or 8 bytes is passed. > > > +{ > > + if (!supported_type (t)) > > You handle some nested cases below via recursion, > like a.b.c (but not a[i][j]). But then this check will > fire. I think it would be better to restructure the > function to look at the outermost level for whether > the op is of supported type, thus we can log it > at all and then get all the way down to the base via > sth like > > if (!supported_type (t)) > return false; > enum attrstate s = <some default>; > do > { > s = supported_op (t, s); > if (s == force_off) > return false; > } > while (handled_component_p (t) && (t = TREE_OPERAND (t, 0))) > > Now t is either an SSA_NAME, a DECL (you fail to handle PARM_DECL Incoming arguments and returns are handled separately. > and RESULT_DECL below) or a [TARGET_]MEM_REF. To get rid > of non-pointer indirections do then > > t = get_base_address (t); > if (DECL_P (t) && is_local (t)) > .... > > because... > > > + return false; > > + > > + enum attrstate s = supported_op (t, neutral); > > + if (s == force_off) > > + return false; > > + if (s == force_on) > > + force = true; > > + > > + switch (TREE_CODE (t)) > > + { > > + case VAR_DECL: > > + if (DECL_ARTIFICIAL (t)) > > + return false; > > + if (is_local (t)) > > + return true; > > + return s == force_on || force; > > + > > + case ARRAY_REF: > > + t = TREE_OPERAND (t, 0); > > + s = supported_op (t, s); > > + if (s == force_off) > > + return false; > > + return supported_type (TREE_TYPE (t)); > > Your supported_type is said to take a DECL. And you > already asked for this type - it's the type of the original t > (well, the type of this type given TREE_TYPE (t) is an array type). > But you'll reject a[i][j] where the type of this type is an array type as well. Just to be clear, after your changes above I only need to handle VAR_DECL and SSA_NAME here then, correct? So one of the reasons I handled ARRAY_REF this way is to trace the index as a local if needed. If I can assume it was always in a MEM with an own ASSIGN earlier if the local was a user visible that wouldn't be needed (and also some other similar code elsewhere) But when I look at a simple test case like vartrace-6 void f (void) { int i; for (i = 0; i < 10; i++) f2 (); } i appears to be a SSA name only that is referenced everywhere without MEM. And if the user wants to track the value of i I would need to explicitely handle all these cases. Do I miss something here? I'm starting to think i should perhaps drop locals support to simplify everything? But that might limit usability for debugging somewhat. > gsi_insert_* does update_stmt already. Btw, if you allow any > SImode or DImode size value you can use a VIEW_CONVERT_EXPR Just add them unconditionally? > > +bool > > +instrument_args (function *fun) > > +{ > > + gimple_stmt_iterator gi; > > + bool changed = false; > > + > > + /* Local tracing usually takes care of the argument too, when > > + they are read. This avoids redundant trace instructions. */ > > But only when instrumenting reads? Yes will add the check. > > Hmm, but then this requires the target instruction to have a memory operand? Yes that's right for now. Eventually it will be fixed and x86 would benefit too. > That's going to be unlikely for RISCy cases? On x86 does it work if > combine later does not syntesize a ptwrite with memory operand? > I also wonder how this survives RTL CSE since you are basically doing > > mem = val; // orig stmt > val' = mem; > ptwrite (val'); > > that probably means when CSE removes the load there ends up a debug-insn > reflecting what you want? I'll check. > > + /* Handle operators in case they read locals. */ > > Does it make sense at all to instrument SSA "reads"? You know > all SSA vars have a single definition and all locals not in SSA form > are represented with memory reads/writes, so ... Ok but how about the locals in SSA form? Like in the example above. I would love to simplify all this, but I fear that it would make the locals tracking unusable. > > > + if (gimple_num_ops (gas) > 2) > > + { > > ... this case looks suprious. And as said you probably do not want to > isnstrument SSA "reads"? Also not above when instrumenting > gimple_assign_rhs1 (gas), > so better guard that with gimple_assing_load_p (gas)? > > > + tvar = instrument_mem (gi, gimple_assign_rhs2 (gas), > > + (flag_vartrace & VARTRACE_READS) || force, > > + gas, "assign load2"); > > + if (tvar) > > + { > > + gimple_assign_set_rhs2 (gas, tvar); > > you still have this stmt adjusting here, why? I tried removing it, but I had problems during testing (IIRC it was definitions used after assignment), so I readded it. It might be me doing something bogus elsewhere. > > +/* Instrument return at statement STMT at GI with FORCE. Return true > > + if changed. */ > > + > > +bool > > +instrument_return (gimple_stmt_iterator *gi, greturn *gret, bool force) > > +{ > > + tree rval = gimple_return_retval (gret); > > + > > + if (!rval) > > + return false; > > + if (DECL_P (rval) && DECL_BY_REFERENCE (rval)) > > + rval = build_simple_mem_ref (ssa_default_def (cfun, rval)); > > This looks bogus. If DECL_BY_REFERENCE is true then rval > is of pointer type and thus a register, you shouldn't ever see that > returned plainly in gimple_return_retval. Ok so what to do with the DECL_BY_REFERENCE then? I think i copied this from some other pass. > > I would guess instrumenting asms() has the chance of disturbing > reg-alloc quite a bit... You want to make it optional with a new argument? > > +{ > > + basic_block bb; > > + gimple_stmt_iterator gi; > > + bool force = 0; > > + > > + if (lookup_attribute ("vartrace", TYPE_ATTRIBUTES (TREE_TYPE (fun->decl))) > > + || lookup_attribute ("vartrace", DECL_ATTRIBUTES (fun->decl))) > > btw, I wonder whether the vartrace attribute should have an argument like > vartrace(locals,reads,...)? I was hoping this could be delayed until actually needed. It'll need some changes because I don't want to do the full parsing for every decl all the time, so would need to store a bitmap of options somewhere in tree. > > + > > + FOR_EACH_BB_FN (bb, fun) > > + for (gi = gsi_start_bb (bb); !gsi_end_p (gi); gsi_next (&gi)) > > + { > > + gimple *stmt = gsi_stmt (gi); > > Not sure if I suggested it during the first review but there's > > walk_stmt_load_store_ops () > > which lets you walk (via callbacks) all memory loads and stores in a stmt > (also loads of non-SSA registers). Then there's > > FOR_EACH_SSA_DEF_OPERAND () > > or alternatively FOR_EACH_SSA_TREE_OPERAND () in case you are > also interested in uses. For SSA uses you are currently missing > indexes in ARRAY_REFs and friends. But as said I think you really > want to avoid instrumenting SSA uses. I would love too, but would the example above still trace "i" ? -Andi
On Tue, Nov 20, 2018 at 7:27 PM Andi Kleen <andi@firstfloor.org> wrote: > > On Tue, Nov 20, 2018 at 01:04:19PM +0100, Richard Biener wrote: > > Since your builtin clobbers memory > > Hmm, maybe we could get rid of that, but then how to avoid > the optimizer moving it around over function calls etc.? > The instrumentation should still be useful when the program > crashes, so we don't want to delay logging too much. You can't avoid moving it, yes. But it can be even moved now, effectively detaching it from the "interesting" $pc ranges we have debug location lists for? > > Maybe even instead pass it a number of bytes so it models how atomics work. > > How could that reject float? Why does it need to reject floats? Note you are already rejecting floats in the instrumentation pass. > Mode seems better for now. > > Eventually might support float/double through memory, but not in the > first version. Why does movq %xmm0, %rax; ptwrite %rax not work? > > > > NEXT_PASS (pass_tsan_O0); > > > NEXT_PASS (pass_sanopt); > > > + NEXT_PASS (pass_vartrace); > > > > I'd move it one lower, after pass_cleanup_eh. Further enhancement > > would make it a > > RTL pass ... > > It's after pass_nrv now. > > > So in reality the restriction is on the size of the object, correct? > > The instruction accepts 32 or 64bit memory or register. > > In principle everything could be logged through this, but i was trying > to limit the cases to integers and pointers for now to simplify > the problem. > > Right now the backend fails up when something else than 4 or 8 bytes > is passed. Fair enough, the instrumentation would need to pad out smaller values and/or split larger values. I think 1 and 2 byte values would be interesting so you can support char and shorts. Eventually 16byte values for __int128 or vectors. > > > > > +{ > > > + if (!supported_type (t)) > > > > You handle some nested cases below via recursion, > > like a.b.c (but not a[i][j]). But then this check will > > fire. I think it would be better to restructure the > > function to look at the outermost level for whether > > the op is of supported type, thus we can log it > > at all and then get all the way down to the base via > > sth like > > > > if (!supported_type (t)) > > return false; > > enum attrstate s = <some default>; > > do > > { > > s = supported_op (t, s); > > if (s == force_off) > > return false; > > } > > while (handled_component_p (t) && (t = TREE_OPERAND (t, 0))) > > > > Now t is either an SSA_NAME, a DECL (you fail to handle PARM_DECL > > Incoming arguments and returns are handled separately. > > > and RESULT_DECL below) or a [TARGET_]MEM_REF. To get rid > > of non-pointer indirections do then > > > > t = get_base_address (t); > > if (DECL_P (t) && is_local (t)) > > .... > > > > because... > > > > > + return false; > > > + > > > + enum attrstate s = supported_op (t, neutral); > > > + if (s == force_off) > > > + return false; > > > + if (s == force_on) > > > + force = true; > > > + > > > + switch (TREE_CODE (t)) > > > + { > > > + case VAR_DECL: > > > + if (DECL_ARTIFICIAL (t)) > > > + return false; > > > + if (is_local (t)) > > > + return true; > > > + return s == force_on || force; > > > + > > > + case ARRAY_REF: > > > + t = TREE_OPERAND (t, 0); > > > + s = supported_op (t, s); > > > + if (s == force_off) > > > + return false; > > > + return supported_type (TREE_TYPE (t)); > > > > Your supported_type is said to take a DECL. And you > > already asked for this type - it's the type of the original t > > (well, the type of this type given TREE_TYPE (t) is an array type). > > But you'll reject a[i][j] where the type of this type is an array type as well. > > Just to be clear, after your changes above I only need > to handle VAR_DECL and SSA_NAME here then, correct? Yes (and PARM_DECL and RESULT_DECL). > So one of the reasons I handled ARRAY_REF this way is to > trace the index as a local if needed. If I can assume > it was always in a MEM with an own ASSIGN earlier if the local > was a user visible that wouldn't be needed (and also some other similar > code elsewhere) If the index lives in memory it has a corresponding load. For SSA uses see my comment about instrumenting them at all (together with the suggestion on how to handle them in an easier way). > > But when I look at a simple test case like vartrace-6 > > void > f (void) > { > int i; > for (i = 0; i < 10; i++) > f2 (); > } > > i appears to be a SSA name only that is referenced everywhere without > MEM. And if the user wants to track the value of i I would need > to explicitely handle all these cases. Do I miss something here? You handle those in different places I think. > I'm starting to think i should perhaps drop locals support to simplify > everything? But that might limit usability for debugging somewhat. I think you confuse "locals" a bit. In GIMPLE an automatic variable of type 'int' might be assigned a memory location, then reads (if not cached in a register) happen via separate memory load statements. If it has not been assigned a memory location then it lives in a (SSA) register. > > > gsi_insert_* does update_stmt already. Btw, if you allow any > > SImode or DImode size value you can use a VIEW_CONVERT_EXPR > > Just add them unconditionally? You can use the simplification machinery to elide it automatically for example. tree tem = gimple_build (&seq, VIEW_CONVERT_EXPR, uint/ulong-type, val); gives you a register or value of desired type with a conversion stmt appended to seq if one is required. > > > +bool > > > +instrument_args (function *fun) > > > +{ > > > + gimple_stmt_iterator gi; > > > + bool changed = false; > > > + > > > + /* Local tracing usually takes care of the argument too, when > > > + they are read. This avoids redundant trace instructions. */ > > > > But only when instrumenting reads? > > Yes will add the check. > > > > > Hmm, but then this requires the target instruction to have a memory operand? > > Yes that's right for now. Eventually it will be fixed and x86 would > benefit too. > > > That's going to be unlikely for RISCy cases? On x86 does it work if > > combine later does not syntesize a ptwrite with memory operand? > > I also wonder how this survives RTL CSE since you are basically doing > > > > mem = val; // orig stmt > > val' = mem; > > ptwrite (val'); > > > > that probably means when CSE removes the load there ends up a debug-insn > > reflecting what you want? > > I'll check. > > > > + /* Handle operators in case they read locals. */ > > > > Does it make sense at all to instrument SSA "reads"? You know > > all SSA vars have a single definition and all locals not in SSA form > > are represented with memory reads/writes, so ... > > Ok but how about the locals in SSA form? Like in the example above. > > I would love to simplify all this, but I fear that it would make > the locals tracking unusable. I'd instrument SSA variables at the point of their definition. > > > > > + if (gimple_num_ops (gas) > 2) > > > + { > > > > ... this case looks suprious. And as said you probably do not want to > > isnstrument SSA "reads"? Also not above when instrumenting > > gimple_assign_rhs1 (gas), > > so better guard that with gimple_assing_load_p (gas)? > > > > > + tvar = instrument_mem (gi, gimple_assign_rhs2 (gas), > > > + (flag_vartrace & VARTRACE_READS) || force, > > > + gas, "assign load2"); > > > + if (tvar) > > > + { > > > + gimple_assign_set_rhs2 (gas, tvar); > > > > you still have this stmt adjusting here, why? > > I tried removing it, but I had problems during testing (IIRC it was > definitions used after assignment), so I readded it. It might be me > doing something bogus elsewhere. Probably... > > > +/* Instrument return at statement STMT at GI with FORCE. Return true > > > + if changed. */ > > > + > > > +bool > > > +instrument_return (gimple_stmt_iterator *gi, greturn *gret, bool force) > > > +{ > > > + tree rval = gimple_return_retval (gret); > > > + > > > + if (!rval) > > > + return false; > > > + if (DECL_P (rval) && DECL_BY_REFERENCE (rval)) > > > + rval = build_simple_mem_ref (ssa_default_def (cfun, rval)); > > > > This looks bogus. If DECL_BY_REFERENCE is true then rval > > is of pointer type and thus a register, you shouldn't ever see that > > returned plainly in gimple_return_retval. > > Ok so what to do with the DECL_BY_REFERENCE then? > > I think i copied this from some other pass. The above should be unreachable so just remove it ;) > > > > I would guess instrumenting asms() has the chance of disturbing > > reg-alloc quite a bit... > > You want to make it optional with a new argument? Not sure. > > > +{ > > > + basic_block bb; > > > + gimple_stmt_iterator gi; > > > + bool force = 0; > > > + > > > + if (lookup_attribute ("vartrace", TYPE_ATTRIBUTES (TREE_TYPE (fun->decl))) > > > + || lookup_attribute ("vartrace", DECL_ATTRIBUTES (fun->decl))) > > > > btw, I wonder whether the vartrace attribute should have an argument like > > vartrace(locals,reads,...)? > > I was hoping this could be delayed until actually needed. It'll need > some changes because I don't want to do the full parsing for every decl > all the time, so would need to store a bitmap of options somewhere > in tree. Hmm, indeed. I wonder if you need the function attribute at all then? Maybe a negative, no-vartrace is enough? Maybe even that is not needed? That is, on types and decls you'd interpret it as if locals tracing is on then locals of type are traced but otherwise locals of type are not traced? That is, if I just want to trace variable i and do int i __attribute__((vartrace)); then what options do I enable to make that work and how can I avoid tracing anything else? Similar for typedef int traced_int __attribute_((vartrace)); ? I guess we'd need -fvar-trace=locals to get the described effect for the type attribute and then -fvar-trace=locals,all to have _all_ locals traced? Or -fvar-trace=locals,only-marked? ... or forgo with the idea of marking types? > > > + > > > + FOR_EACH_BB_FN (bb, fun) > > > + for (gi = gsi_start_bb (bb); !gsi_end_p (gi); gsi_next (&gi)) > > > + { > > > + gimple *stmt = gsi_stmt (gi); > > > > Not sure if I suggested it during the first review but there's > > > > walk_stmt_load_store_ops () > > > > which lets you walk (via callbacks) all memory loads and stores in a stmt > > (also loads of non-SSA registers). Then there's > > > > FOR_EACH_SSA_DEF_OPERAND () > > > > or alternatively FOR_EACH_SSA_TREE_OPERAND () in case you are > > also interested in uses. For SSA uses you are currently missing > > indexes in ARRAY_REFs and friends. But as said I think you really > > want to avoid instrumenting SSA uses. > > I would love too, but would the example above still trace "i" ? Yes. Or, well... at -O0 you instrument sth like f () { int i; <bb 2> : i_3 = 0; goto <bb 4>; [INV] <bb 3> : f2 (); i_6 = i_1 + 1; <bb 4> : # i_1 = PHI <i_3(2), i_6(3)> if (i_1 <= 9) goto <bb 3>; [INV] else goto <bb 5>; [INV] <bb 5> : return; if you instrument at SSA definition sites this would become f () { int i; <bb 2> : i_3 = 0; trace (0); goto <bb 4>; [INV] <bb 3> : f2 (); i_6 = i_1 + 1; trace (i_6); <bb 4> : # i_1 = PHI <i_3(2), i_6(3)> trace (i_1); if (i_1 <= 9) goto <bb 3>; [INV] else goto <bb 5>; [INV] <bb 5> : return; of course when optimizing you'll instead see f () { unsigned int ivtmp_2; unsigned int ivtmp_10; <bb 2> [local count: 97603132]: <bb 3> [local count: 976138693]: # ivtmp_10 = PHI <10(2), ivtmp_2(3)> f2 (); ivtmp_2 = ivtmp_10 + 4294967295; if (ivtmp_2 != 0) goto <bb 3>; [90.91%] else goto <bb 4>; [9.09%] <bb 4> [local count: 97603132]: return; so there's no 'i' anymore. If you enable debug-info you can get it back via looking at DEBUG_INSNs: <bb 2> [local count: 97603132]: # DEBUG BEGIN_STMT # DEBUG BEGIN_STMT # DEBUG i => 0 # DEBUG i => 0 <bb 3> [local count: 976138693]: # ivtmp_10 = PHI <10(2), ivtmp_2(3)> # DEBUG i => (int) (10 - ivtmp_10) # DEBUG BEGIN_STMT f2 (); # DEBUG D#1 => (int) (11 - ivtmp_10) # DEBUG i => D#1 # DEBUG i => D#1 ivtmp_2 = ivtmp_10 + 4294967295; if (ivtmp_2 != 0) goto <bb 3>; [90.91%] else goto <bb 4>; [9.09%] <bb 4> [local count: 97603132]: return; but that's going to be a bit more tricky. That is, instrumenting memory is easy, instrumenting register "reads"/"writes" is going to be a bit tricky, at least when looking at optimized code and instrumenting late. (but ptwrite is only interesting for optimized code, no?) Richard. > > -Andi >
On Thu, Nov 22, 2018 at 02:53:11PM +0100, Richard Biener wrote: > > the optimizer moving it around over function calls etc.? > > The instrumentation should still be useful when the program > > crashes, so we don't want to delay logging too much. > > You can't avoid moving it, yes. But it can be even moved now, effectively > detaching it from the "interesting" $pc ranges we have debug location lists for? > > > > Maybe even instead pass it a number of bytes so it models how atomics work. > > > > How could that reject float? > > Why does it need to reject floats? Note you are already rejecting floats > in the instrumentation pass. The backend doesn't know how to generate the code for ptwrite %xmm0. So it would either need to learn about all these conversions, or reject them in the hook so that the middle end eventually can generate them. Or maybe hard code the knowledge in the middle end? > > Mode seems better for now. > > > > Eventually might support float/double through memory, but not in the > > first version. > > Why does movq %xmm0, %rax; ptwrite %rax not work? It works of course, the question is just who is in charge of figuring out that the movq needs to be generated (and that it's not a round, but a bit cast) > Fair enough, the instrumentation would need to pad out smaller values > and/or split larger values. I think 1 and 2 byte values would be interesting > so you can support char and shorts. Eventually 16byte values for __int128 > or vectors. Right. And for copies/clears too. But I was hoping to just get the basics solid and then do these more advanced features later. > > > btw, I wonder whether the vartrace attribute should have an argument like > > > vartrace(locals,reads,...)? > > > > I was hoping this could be delayed until actually needed. It'll need > > some changes because I don't want to do the full parsing for every decl > > all the time, so would need to store a bitmap of options somewhere > > in tree. > > Hmm, indeed. I wonder if you need the function attribute at all then? Maybe I really would like an opt-in per function. I think that's an important use case. Just instrument a few functions that contribute to the bug you're debugging to cut down the bandwidth. The idea was that __attribute__((vartrace)) for a function would log everything in that function, including locals. I could see a use case to opt-in into a function without locals (mainly because local tracking is more likely to cause trace overflows if there is a lot of execution). But I think I can do without that in v1 might add it later. > That is, on types and decls you'd interpret it as if locals tracing is on > then locals of type are traced but otherwise locals of type are not traced? When the opt-in is on the type or the variable then the variable should be tracked, even if it is an local (or maybe even a cast) The locals check is mainly for the command line. > That is, if I just want to trace variable i and do > > int i __attribute__((vartrace)); > > then what options do I enable to make that work and how can I avoid > tracing anything else? Similar for Just enable -mptwrite (or nothing if it's implied with -mcpu=...) > > typedef int traced_int __attribute_((vartrace)); Same. > > ? I guess we'd need -fvar-trace=locals to get the described effect for > the type attribute and then -fvar-trace=locals,all to have _all_ locals > traced? Or -fvar-trace=locals,only-marked? ... or forgo with the idea > of marking types? You want a command line override to not trace if the attribute is set? -mno-ptwrite would work. Could also add something separate in -fvartrace (perhaps implied in =off) but not sure if it's worth it. I suppose it could make sense if someone wants to use the _ptwrite builtin separately still. I'll add it to =off. > so there's no 'i' anymore. If you enable debug-info you Hmm I guess that's ok for now. I suppose it'll work with -Og. > and instrumenting late. (but ptwrite is only > interesting for optimized code, no?) It's very interesting for non optimized code too. In fact that would be a common use case during debugging. -Andi
diff --git a/config/bootstrap-vartrace-locals.mk b/config/bootstrap-vartrace-locals.mk new file mode 100644 index 00000000000..dd16640df74 --- /dev/null +++ b/config/bootstrap-vartrace-locals.mk @@ -0,0 +1,3 @@ +STAGE2_CFLAGS += -mptwrite -fvartrace=all +STAGE3_CFLAGS += -mptwrite -fvartrace=all +STAGE4_CFLAGS += -mptwrite -fvartrace=all diff --git a/config/bootstrap-vartrace.mk b/config/bootstrap-vartrace.mk new file mode 100644 index 00000000000..e29824d799b --- /dev/null +++ b/config/bootstrap-vartrace.mk @@ -0,0 +1,3 @@ +STAGE2_CFLAGS += -mptwrite -fvartrace +STAGE3_CFLAGS += -mptwrite -fvartrace +STAGE4_CFLAGS += -mptwrite -fvartrace diff --git a/gcc/Makefile.in b/gcc/Makefile.in index ec793175c3b..64a99a1ec8a 100644 --- a/gcc/Makefile.in +++ b/gcc/Makefile.in @@ -1594,6 +1594,7 @@ OBJS = \ tree-vectorizer.o \ tree-vector-builder.o \ tree-vrp.o \ + tree-vartrace.o \ tree.o \ typed-splay-tree.o \ unique-ptr-tests.o \ diff --git a/gcc/c-family/c-attribs.c b/gcc/c-family/c-attribs.c index 1657df7f9df..d50e78de830 100644 --- a/gcc/c-family/c-attribs.c +++ b/gcc/c-family/c-attribs.c @@ -104,6 +104,10 @@ static tree handle_tls_model_attribute (tree *, tree, tree, int, bool *); static tree handle_no_instrument_function_attribute (tree *, tree, tree, int, bool *); +static tree handle_vartrace_attribute (tree *, tree, + tree, int, bool *); +static tree handle_no_vartrace_attribute (tree *, tree, + tree, int, bool *); static tree handle_no_profile_instrument_function_attribute (tree *, tree, tree, int, bool *); static tree handle_malloc_attribute (tree *, tree, tree, int, bool *); @@ -235,6 +239,13 @@ static const struct attribute_spec::exclusions attr_const_pure_exclusions[] = ATTR_EXCL (NULL, false, false, false) }; +static const struct attribute_spec::exclusions attr_vartrace_exclusions[] = +{ + ATTR_EXCL ("vartrace", true, true, true), + ATTR_EXCL ("no_vartrace", true, true, true), + ATTR_EXCL (NULL, false, false, false) +}; + /* Table of machine-independent attributes common to all C-like languages. Current list of processed common attributes: nonnull. */ @@ -326,6 +337,12 @@ const struct attribute_spec c_common_attribute_table[] = { "no_instrument_function", 0, 0, true, false, false, false, handle_no_instrument_function_attribute, NULL }, + { "vartrace", 0, 0, false, false, false, false, + handle_vartrace_attribute, + attr_vartrace_exclusions }, + { "no_vartrace", 0, 0, false, false, false, false, + handle_no_vartrace_attribute, + attr_vartrace_exclusions }, { "no_profile_instrument_function", 0, 0, true, false, false, false, handle_no_profile_instrument_function_attribute, NULL }, @@ -770,6 +787,66 @@ handle_no_sanitize_undefined_attribute (tree *node, tree name, tree, int, return NULL_TREE; } +/* Handle "vartrace" attribute; arguments as in struct + attribute_spec.handler. */ + +static tree +handle_vartrace_attribute (tree *node, tree name, tree, int flags, + bool *no_add_attrs) +{ + if (!VAR_OR_FUNCTION_DECL_P (*node) && !TYPE_P (*node) && + TREE_CODE (*node) != FIELD_DECL) + { + warning (OPT_Wattributes, "%qE attribute ignored for object", name); + *no_add_attrs = true; + return NULL_TREE; + } + + if (!targetm.vartrace_func) + { + warning (OPT_Wattributes, "%qE attribute not supported for target", name); + *no_add_attrs = true; + return NULL_TREE; + } + + if (TREE_TYPE (*node) + && TREE_CODE (*node) != FUNCTION_DECL + && targetm.vartrace_func (TREE_TYPE (*node), true) == NULL_TREE) + { + warning (OPT_Wattributes, "%qE attribute not supported for type", name); + *no_add_attrs = true; + return NULL_TREE; + } + + if (TYPE_P (*node) && !(flags & (int) ATTR_FLAG_TYPE_IN_PLACE)) + *node = build_variant_type_copy (*node); + + /* We look it up later with lookup_attribute. */ + return NULL_TREE; +} + +/* Handle "no_vartrace" attribute; arguments as in struct + attribute_spec.handler. */ + +static tree +handle_no_vartrace_attribute (tree *node, tree name, tree, int flags, + bool *no_add_attrs) +{ + if (!VAR_OR_FUNCTION_DECL_P (*node) && !TYPE_P (*node) + && TREE_CODE (*node) != FIELD_DECL) + { + warning (OPT_Wattributes, "%qE attribute ignored", name); + *no_add_attrs = true; + return NULL_TREE; + } + + if (TYPE_P (*node) && !(flags & (int) ATTR_FLAG_TYPE_IN_PLACE)) + *node = build_variant_type_copy (*node); + + /* We look it up later with lookup_attribute. */ + return NULL_TREE; +} + /* Handle an "asan odr indicator" attribute; arguments as in struct attribute_spec.handler. */ diff --git a/gcc/common.opt b/gcc/common.opt index 72a713593c3..f27a57b1177 100644 --- a/gcc/common.opt +++ b/gcc/common.opt @@ -221,6 +221,10 @@ unsigned int flag_sanitize_recover = (SANITIZE_UNDEFINED | SANITIZE_UNDEFINED_NO Variable unsigned int flag_sanitize_coverage +; What to instrument with vartrace +Variable +unsigned int flag_vartrace + ; Flag whether a prefix has been added to dump_base_name Variable bool dump_base_name_prefixed = false @@ -2856,6 +2860,10 @@ ftree-scev-cprop Common Report Var(flag_tree_scev_cprop) Init(1) Optimization Enable copy propagation of scalar-evolution information. +fvartrace +Common JoinedOrMissing Report Driver +-fvartrace=default|all|locals|returns|args|reads|writes|off Enable variable tracing instrumentation. + ; -fverbose-asm causes extra commentary information to be produced in ; the generated assembly code (to make it more readable). This option ; is generally only of use to those who actually need to read the diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c index c18c60a1d19..6d130bf0804 100644 --- a/gcc/config/i386/i386.c +++ b/gcc/config/i386/i386.c @@ -31900,6 +31900,35 @@ ix86_mangle_function_version_assembler_name (tree decl, tree id) return ret; } +/* Hook to determine that TYPE can be traced. Ignore target flags + if FORCE is true. Returns the tracing builtin if tracing is possible, + or otherwise NULL. */ + +static tree +ix86_vartrace_func (tree type, bool force) +{ + if (!(ix86_isa_flags2 & OPTION_MASK_ISA_PTWRITE)) + { + /* With force, as in checking for the attribute, ignore + the current target settings. Otherwise it's not + possible to declare vartrace variables outside + an __attribute__((target("ptwrite"))) function + if -mptwrite is not specified. */ + if (!force) + return NULL; + /* Initialize the builtins if missing, so that we have + something to return. */ + if (!ix86_builtins[(int)IX86_BUILTIN_PTWRITE32]) + ix86_add_new_builtins (0, OPTION_MASK_ISA_PTWRITE); + } + + if (TYPE_PRECISION (type) == 32) + return ix86_builtins[(int) IX86_BUILTIN_PTWRITE32]; + else if (TYPE_PRECISION (type) == 64) + return ix86_builtins[(int) IX86_BUILTIN_PTWRITE64]; + else + return NULL; +} static tree ix86_mangle_decl_assembler_name (tree decl, tree id) @@ -50894,6 +50923,9 @@ ix86_run_selftests (void) #undef TARGET_ASAN_SHADOW_OFFSET #define TARGET_ASAN_SHADOW_OFFSET ix86_asan_shadow_offset +#undef TARGET_VARTRACE_FUNC +#define TARGET_VARTRACE_FUNC ix86_vartrace_func + #undef TARGET_GIMPLIFY_VA_ARG_EXPR #define TARGET_GIMPLIFY_VA_ARG_EXPR ix86_gimplify_va_arg diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi index d4b1046b6ae..f0cde5dcf0f 100644 --- a/gcc/doc/extend.texi +++ b/gcc/doc/extend.texi @@ -3228,6 +3228,13 @@ the standard C library can be guaranteed not to throw an exception with the notable exceptions of @code{qsort} and @code{bsearch} that take function pointer arguments. +@item no_vartrace +The @code{no_vartrace} attribute disables data tracing for +the function [or variable or structure field] declared with +the attribute. See @pxref{Common Variable Attributes} and +@pxref{Common Type Attributes}. When specified for a function +nothing in the function is traced. + @item optimize (@var{level}, @dots{}) @item optimize (@var{string}, @dots{}) @cindex @code{optimize} function attribute @@ -3489,6 +3496,21 @@ When applied to a member function of a C++ class template, the attribute also means that the function is instantiated if the class itself is instantiated. +@item vartrace +@cindex @code{vartrace} function or variable attribute +Enable data tracing for the function or variable or structure field +marked with this attribute. When applied to a type all instances of the type +will be traced. When applied to a structure or union all fields will be traced. +When applied to a structure field that field will be traced. +For functions will not trace locals (unless enabled on the command line or +on as an attribute the local itself) but arguments, returns, globals, pointer +references. For variables or types or structure members any reads or writes will be traced. +Only integer or pointer types can be tracked. + +Currently implemented for x86 when the @option{ptwrite} target option +is enabled for systems that support the @code{PTWRITE} instruction, +supporting 4 and 8 byte integers. + @item visibility ("@var{visibility_type}") @cindex @code{visibility} function attribute This attribute affects the linkage of the declaration to which it is attached. @@ -7196,6 +7218,16 @@ A @{ /* @r{@dots{}} */ @}; struct __attribute__ ((copy ( (struct A *)0)) B @{ /* @r{@dots{}} */ @}; @end smallexample +@cindex @code{vartrace} type attribute +@cindex @code{no_vartrace} type attribute +@item vartrace +@itemx no_vartrace +Specify that all instances of type should be variable traced +or not variable traced. Can be also also applied to function +types to disable tracing for all instances of that function type. +Can be also applied to structure fields. See the description in +@pxref{Variable Attributes} for more details. + @item deprecated @itemx deprecated (@var{msg}) @cindex @code{deprecated} type attribute diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index 535b258d22b..8b9d1d71669 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -2754,6 +2754,33 @@ Don't use the @code{__cxa_get_exception_ptr} runtime routine. This causes @code{std::uncaught_exception} to be incorrect, but is necessary if the runtime routine is not available. +@item -fvartrace +@opindex -fvartrace=... +Insert trace instructions to trace variables at runtime with a tracer +such as Intel Processor Trace. + +Requires enabling a backend specific option, like @option{-mptwrite} to enable +@code{PTWRITE} instruction generation on x86. + +Additional qualifiers can be specified after the =: @option{args} +for arguments, @option{off} to disable, @option{returns} for tracing +return values, @option{reads} to trace reads, @option{writes} to trace +writes, @option{locals} to trace locals. +Default when enabled is tracing reads, writes, arguments, returns, objects +with a static or thread duration but no locals. +Multiple options can be separated by comma. + +When -fvartrace enabled, the compiler will add trace instructions. By +default these trace options act like nops, unless tracing is +enabled. For example to enable tracing on x86 Linux with Linux perf +use: + +@smallexample +gcc -fvartrace -o program -g program.c +perf record -e intel_pt/pt=1,ptw=1,branch=0/ program +perf script +@end smallexample + @item -fvisibility-inlines-hidden @opindex fvisibility-inlines-hidden This switch declares that the user does not attempt to compare diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi index 0a2ad9a745e..861914ac646 100644 --- a/gcc/doc/tm.texi +++ b/gcc/doc/tm.texi @@ -11936,6 +11936,12 @@ Address Sanitizer shadow memory address. NULL if Address Sanitizer is not supported by the target. @end deftypefn +@deftypefn {Target Hook} tree TARGET_VARTRACE_FUNC (tree @var{type}, bool @var{force}) +Return a builtin to call to trace variables of type TYPE or NULL if not supported +by the target. Ignore target configuration if FORCE is true. The builtin gets called with a +single argument of TYPE. +@end deftypefn + @deftypefn {Target Hook} {unsigned HOST_WIDE_INT} TARGET_MEMMODEL_CHECK (unsigned HOST_WIDE_INT @var{val}) Validate target specific memory model mask bits. When NULL no target specific memory model bits are allowed. diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in index f1ad80da467..756e84d0e07 100644 --- a/gcc/doc/tm.texi.in +++ b/gcc/doc/tm.texi.in @@ -8104,6 +8104,8 @@ and the associated definitions of those functions. @hook TARGET_ASAN_SHADOW_OFFSET +@hook TARGET_VARTRACE_FUNC + @hook TARGET_MEMMODEL_CHECK @hook TARGET_ATOMIC_TEST_AND_SET_TRUEVAL diff --git a/gcc/flag-types.h b/gcc/flag-types.h index 500f6638f36..6e5add2da9d 100644 --- a/gcc/flag-types.h +++ b/gcc/flag-types.h @@ -261,6 +261,15 @@ enum sanitize_code { | SANITIZE_BOUNDS_STRICT }; +/* Settings for flag_vartrace */ +enum flag_vartrace { + VARTRACE_LOCALS = 1 << 0, + VARTRACE_ARGS = 1 << 1, + VARTRACE_RETURNS = 1 << 2, + VARTRACE_READS = 1 << 3, + VARTRACE_WRITES = 1 << 4 +}; + /* Settings of flag_incremental_link. */ enum incremental_link { INCREMENTAL_LINK_NONE, diff --git a/gcc/opts.c b/gcc/opts.c index 318ed442057..c6c47b55a39 100644 --- a/gcc/opts.c +++ b/gcc/opts.c @@ -1872,6 +1872,65 @@ check_alignment_argument (location_t loc, const char *flag, const char *name) parse_and_check_align_values (flag, name, align_result, true, loc); } +/* Parse vartrace options in P, updating flags OPTS at LOC and return + updated flags. */ + +static int +parse_vartrace_options (const char *p, int opts, location_t loc) +{ + static struct { + const char *name; + int opt; + } vopts[] = + { + { "default", + VARTRACE_ARGS | VARTRACE_RETURNS | VARTRACE_READS + | VARTRACE_WRITES }, /* Keep as first entry. */ + { "all", + VARTRACE_ARGS | VARTRACE_RETURNS | VARTRACE_READS + | VARTRACE_WRITES | VARTRACE_LOCALS }, + { "args", VARTRACE_ARGS }, + { "returns", VARTRACE_RETURNS }, + { "reads", VARTRACE_READS }, + { "writes", VARTRACE_WRITES }, + { "locals", VARTRACE_LOCALS }, + { NULL, 0 } + }; + + if (*p == '=') + p++; + if (*p == 0) + return opts | vopts[0].opt; + + if (!strcmp (p, "off")) + return 0; + + while (*p) + { + unsigned len = strcspn (p, ","); + int i; + + for (i = 0; vopts[i].name; i++) + { + if (len == strlen (vopts[i].name) && !strncmp (p, vopts[i].name, len)) + { + opts |= vopts[i].opt; + break; + } + } + if (vopts[i].name == NULL) + { + error_at (loc, "invalid argument to %qs", "-fvartrace"); + break; + } + + p += len; + if (*p == ',') + p++; + } + return opts; +} + /* Handle target- and language-independent options. Return zero to generate an "unknown option" message. Only options that need extra handling need to be listed here; if you simply want @@ -2078,6 +2137,10 @@ common_handle_option (struct gcc_options *opts, case OPT__completion_: break; + case OPT_fvartrace: + opts->x_flag_vartrace = parse_vartrace_options (arg, opts->x_flag_vartrace, loc); + break; + case OPT_fsanitize_: opts->x_flag_sanitize = parse_sanitizer_options (arg, loc, code, diff --git a/gcc/passes.def b/gcc/passes.def index 24f212c8e31..36d14b0386a 100644 --- a/gcc/passes.def +++ b/gcc/passes.def @@ -394,6 +394,7 @@ along with GCC; see the file COPYING3. If not see NEXT_PASS (pass_asan_O0); NEXT_PASS (pass_tsan_O0); NEXT_PASS (pass_sanopt); + NEXT_PASS (pass_vartrace); NEXT_PASS (pass_cleanup_eh); NEXT_PASS (pass_lower_resx); NEXT_PASS (pass_nrv); diff --git a/gcc/target.def b/gcc/target.def index f9469d69cb0..b5d970870a4 100644 --- a/gcc/target.def +++ b/gcc/target.def @@ -4300,6 +4300,15 @@ supported by the target.", unsigned HOST_WIDE_INT, (void), NULL) +/* Defines the builtin to trace variables, or NULL. */ +DEFHOOK +(vartrace_func, + "Return a builtin to call to trace variables of type TYPE or NULL if not supported\n\ +by the target. Ignore target configuration if FORCE is true. The builtin gets called with a\n\ +single argument of TYPE.", + tree, (tree type, bool force), + NULL) + /* Functions relating to calls - argument passing, returns, etc. */ /* Members of struct call have no special macro prefix. */ HOOK_VECTOR (TARGET_CALLS, calls) diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h index af15adc8e0c..2cf31785a6f 100644 --- a/gcc/tree-pass.h +++ b/gcc/tree-pass.h @@ -423,6 +423,7 @@ extern gimple_opt_pass *make_pass_strlen (gcc::context *ctxt); extern gimple_opt_pass *make_pass_fold_builtins (gcc::context *ctxt); extern gimple_opt_pass *make_pass_post_ipa_warn (gcc::context *ctxt); extern gimple_opt_pass *make_pass_stdarg (gcc::context *ctxt); +extern gimple_opt_pass *make_pass_vartrace (gcc::context *ctxt); extern gimple_opt_pass *make_pass_early_warn_uninitialized (gcc::context *ctxt); extern gimple_opt_pass *make_pass_late_warn_uninitialized (gcc::context *ctxt); extern gimple_opt_pass *make_pass_cse_reciprocals (gcc::context *ctxt); diff --git a/gcc/tree-vartrace.c b/gcc/tree-vartrace.c new file mode 100644 index 00000000000..1ef81b743fc --- /dev/null +++ b/gcc/tree-vartrace.c @@ -0,0 +1,491 @@ +/* Insert instructions for data value tracing. + Copyright (C) 2017, 2018 Free Software Foundation, Inc. + Contributed by Andi Kleen. + +This file is part of GCC. + +GCC is free software; you can redistribute it and/or modify +it under the terms of the GNU General Public License as published by +the Free Software Foundation; either version 3, or (at your option) +any later version. + +GCC is distributed in the hope that it will be useful, +but WITHOUT ANY WARRANTY; without even the implied warranty of +MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +GNU General Public License for more details. + +You should have received a copy of the GNU General Public License +along with GCC; see the file COPYING3. If not see +<http://www.gnu.org/licenses/>. */ + +#include "config.h" +#include "system.h" +#include "coretypes.h" +#include "backend.h" +#include "target.h" +#include "tree.h" +#include "tree-iterator.h" +#include "tree-pass.h" +#include "basic-block.h" +#include "gimple.h" +#include "gimple-iterator.h" +#include "gimplify.h" +#include "gimplify-me.h" +#include "gimple-ssa.h" +#include "gimple-pretty-print.h" +#include "cfghooks.h" +#include "fold-const.h" +#include "ssa.h" +#include "tree-dfa.h" +#include "attribs.h" + +namespace { + +enum attrstate { force_off, force_on, neutral }; + +/* Can we trace with attributes ATTR. */ + +attrstate +supported_attr (tree attr) +{ + if (lookup_attribute ("no_vartrace", attr)) + return force_off; + if (lookup_attribute ("vartrace", attr)) + return force_on; + return neutral; +} + +/* Is tracing enabled for ARG considering S. */ + +attrstate +supported_op (tree arg, attrstate s) +{ + if (s != neutral) + return s; + if (DECL_P (arg)) + { + s = supported_attr (DECL_ATTRIBUTES (arg)); + if (s != neutral) + return s; + } + return supported_attr (TYPE_ATTRIBUTES (TREE_TYPE (arg))); +} + +/* Can we trace DECL. */ + +bool +supported_type (tree decl) +{ + tree type = TREE_TYPE (decl); + + return POINTER_TYPE_P (type) || INTEGRAL_TYPE_P (type); +} + +/* Return true if DECL is refering to a local variable. */ + +bool +is_local (tree decl) +{ + if (!(flag_vartrace & VARTRACE_LOCALS) && supported_op (decl, neutral) != force_on) + return false; + return auto_var_in_fn_p (decl, cfun->decl); +} + +/* Is T something we can log, FORCEing the type if needed. */ + +bool +supported_mem (tree t, bool force) +{ + if (!supported_type (t)) + return false; + + enum attrstate s = supported_op (t, neutral); + if (s == force_off) + return false; + if (s == force_on) + force = true; + + switch (TREE_CODE (t)) + { + case VAR_DECL: + if (DECL_ARTIFICIAL (t)) + return false; + if (is_local (t)) + return true; + return s == force_on || force; + + case ARRAY_REF: + t = TREE_OPERAND (t, 0); + s = supported_op (t, s); + if (s == force_off) + return false; + return supported_type (TREE_TYPE (t)); + + case COMPONENT_REF: + s = supported_op (TREE_OPERAND (t, 1), s); + t = TREE_OPERAND (t, 0); + if (s == neutral && is_local (t)) + return true; + s = supported_op (t, s); + if (s != neutral) + return s == force_on ? true : false; + return supported_mem (t, force); + + // support BIT_FIELD_REF? + + case VIEW_CONVERT_EXPR: + case TARGET_MEM_REF: + case MEM_REF: + return supported_mem (TREE_OPERAND (t, 0), force); + + case SSA_NAME: + if ((flag_vartrace & VARTRACE_LOCALS) + && SSA_NAME_VAR (t) + && !DECL_IGNORED_P (SSA_NAME_VAR (t))) + return true; + return force; + + default: + break; + } + + return false; +} + +/* Print debugging for inserting CODE at ORIG_STMT with type of VAL for WHY. */ + +void +log_trace_code (gimple *orig_stmt, gimple *code, tree val, const char *why) +{ + if (!dump_file) + return; + if (orig_stmt) + fprintf (dump_file, "BB%d ", gimple_bb (orig_stmt)->index); + fprintf (dump_file, "%s inserting ", why); + print_gimple_stmt (dump_file, code, 0, TDF_VOPS|TDF_MEMSYMS); + if (orig_stmt) + { + fprintf (dump_file, "orig "); + print_gimple_stmt (dump_file, orig_stmt, 2, + TDF_VOPS|TDF_MEMSYMS); + } + fprintf (dump_file, "type "); + print_generic_expr (dump_file, TREE_TYPE (val), TDF_SLIM); + fputc ('\n', dump_file); + fputc ('\n', dump_file); +} + +/* Insert variable tracing code for VAL before iterator GI, originally + for ORIG_STMT and optionally at LOC. Normally before ORIG_STMT, but + AFTER if true. Reason is WHY. Return trace var if successfull, + or NULL_TREE. */ + +tree +insert_trace (gimple_stmt_iterator *gi, tree val, gimple *orig_stmt, + const char *why, location_t loc = -1, bool after = false) +{ + if (loc == (location_t)-1) + loc = gimple_location (orig_stmt); + + tree func = targetm.vartrace_func (TREE_TYPE (val), false); + if (!func) + return NULL_TREE; + + tree tvar = val; + if (!is_gimple_reg (val)) + { + tvar = make_ssa_name (TREE_TYPE (val)); + gassign *assign = gimple_build_assign (tvar, unshare_expr (val)); + log_trace_code (orig_stmt, assign, val, "copy"); + gimple_set_location (assign, loc); + if (after) + gsi_insert_after (gi, assign, GSI_CONTINUE_LINKING); + else + gsi_insert_before (gi, assign, GSI_SAME_STMT); + update_stmt (assign); + } + + gcall *call = gimple_build_call (func, 1, tvar); + log_trace_code (NULL, call, tvar, why); + gimple_set_location (call, loc); + if (after) + gsi_insert_after (gi, call, GSI_CONTINUE_LINKING); + else + gsi_insert_before (gi, call, GSI_SAME_STMT); + update_stmt (call); + return tvar; +} + +/* Insert trace at GI for T in FUN if suitable memory or variable + reference. Always if FORCE. Originally on ORIG_STMT. Reason is + WHY. Insert after GI if AFTER. Returns trace variable or NULL_TREE. */ + +tree +instrument_mem (gimple_stmt_iterator *gi, tree t, bool force, + gimple *orig_stmt, const char *why, bool after = false) +{ + if (supported_mem (t, force)) + return insert_trace (gi, t, orig_stmt, why, -1, after); + return NULL_TREE; +} + +/* Instrument arguments for FUN. Return true if changed. */ + +bool +instrument_args (function *fun) +{ + gimple_stmt_iterator gi; + bool changed = false; + + /* Local tracing usually takes care of the argument too, when + they are read. This avoids redundant trace instructions. */ + if (flag_vartrace & VARTRACE_LOCALS) + return false; + + for (tree arg = DECL_ARGUMENTS (current_function_decl); + arg != NULL_TREE; + arg = DECL_CHAIN (arg)) + { + gi = gsi_start_bb (BASIC_BLOCK_FOR_FN (fun, NUM_FIXED_BLOCKS)); + tree type = TREE_TYPE (arg); + if (POINTER_TYPE_P (type) || INTEGRAL_TYPE_P (type)) + { + tree func = targetm.vartrace_func (TREE_TYPE (arg), false); + if (!func) + continue; + + if (!is_gimple_reg (arg)) + continue; + tree sarg = ssa_default_def (fun, arg); + if (!sarg) + continue; + + gimple_stmt_iterator egi = gsi_after_labels (single_succ (ENTRY_BLOCK_PTR_FOR_FN (fun))); + changed |= !!insert_trace (&egi, sarg, NULL, "arg", fun->function_start_locus); + } + } + return changed; +} + +/* Generate trace call for store GAS at GI, force if FORCE. Return true + if successfull. Return true if successfully inserted. */ + +bool +instrument_store (gimple_stmt_iterator *gi, gassign *gas, bool force) +{ + tree orig = gimple_assign_lhs (gas); + + if (!supported_mem (orig, force)) + return false; + + tree func = targetm.vartrace_func (TREE_TYPE (orig), false); + if (!func) + return false; + + /* Generate another reference to target. That can be racy, but is + guaranteed to have the debug location of the target. Better + would be to use the original value to avoid any races, but we + would need to somehow force the target location of the + builtin. */ + + tree tvar = make_ssa_name (TREE_TYPE (orig)); + gassign *assign = gimple_build_assign (tvar, unshare_expr (orig)); + log_trace_code (gas, assign, orig, "store copy"); + gimple_set_location (assign, gimple_location (gas)); + gsi_insert_after (gi, assign, GSI_CONTINUE_LINKING); + update_stmt (assign); + + gcall *tcall = gimple_build_call (func, 1, tvar); + log_trace_code (gas, tcall, tvar, "store"); + gimple_set_location (tcall, gimple_location (gas)); + gsi_insert_after (gi, tcall, GSI_CONTINUE_LINKING); + update_stmt (tcall); + return true; +} + +/* Instrument STMT at GI. Force if FORCE. Return true if changed. */ + +bool +instrument_assign (gimple_stmt_iterator *gi, gassign *gas, bool force) +{ + if (gimple_clobber_p (gas)) + return false; + bool changed = false; + tree tvar = instrument_mem (gi, gimple_assign_rhs1 (gas), + (flag_vartrace & VARTRACE_READS) || force, + gas, "assign load1"); + if (tvar) + { + gimple_assign_set_rhs1 (gas, tvar); + changed = true; + } + /* Handle operators in case they read locals. */ + if (gimple_num_ops (gas) > 2) + { + tvar = instrument_mem (gi, gimple_assign_rhs2 (gas), + (flag_vartrace & VARTRACE_READS) || force, + gas, "assign load2"); + if (tvar) + { + gimple_assign_set_rhs2 (gas, tvar); + changed = true; + } + } + // handle more ops? + + if (gimple_store_p (gas)) + changed |= instrument_store (gi, gas, + (flag_vartrace & VARTRACE_WRITES) || force); + + if (changed) + update_stmt (gas); + return changed; +} + +/* Instrument return at statement STMT at GI with FORCE. Return true + if changed. */ + +bool +instrument_return (gimple_stmt_iterator *gi, greturn *gret, bool force) +{ + tree rval = gimple_return_retval (gret); + + if (!rval) + return false; + if (DECL_P (rval) && DECL_BY_REFERENCE (rval)) + rval = build_simple_mem_ref (ssa_default_def (cfun, rval)); + if (supported_mem (rval, force)) + return !!insert_trace (gi, rval, gret, "return"); + return false; +} + +/* Instrument asm at GI in statement STMT with FORCE if needed. Return + true if changed. */ + +bool +instrument_asm (gimple_stmt_iterator *gi, gasm *stmt, bool force) +{ + bool changed = false; + + for (unsigned i = 0; i < gimple_asm_ninputs (stmt); i++) + changed |= !!instrument_mem (gi, TREE_VALUE (gimple_asm_input_op (stmt, i)), + force || (flag_vartrace & VARTRACE_READS), stmt, + "asm input"); + for (unsigned i = 0; i < gimple_asm_noutputs (stmt); i++) + { + tree o = TREE_VALUE (gimple_asm_output_op (stmt, i)); + if (supported_mem (o, force | (flag_vartrace & VARTRACE_WRITES))) + changed |= !!insert_trace (gi, o, stmt, "asm output", -1, true); + } + return changed; +} + +/* Insert vartrace calls for FUN. */ + +unsigned int +vartrace_execute (function *fun) +{ + basic_block bb; + gimple_stmt_iterator gi; + bool force = 0; + + if (lookup_attribute ("vartrace", TYPE_ATTRIBUTES (TREE_TYPE (fun->decl))) + || lookup_attribute ("vartrace", DECL_ATTRIBUTES (fun->decl))) + force = true; + + bool changed = false; + + if ((flag_vartrace & VARTRACE_ARGS) || force) + changed |= instrument_args (fun); + + FOR_EACH_BB_FN (bb, fun) + for (gi = gsi_start_bb (bb); !gsi_end_p (gi); gsi_next (&gi)) + { + gimple *stmt = gsi_stmt (gi); + switch (gimple_code (stmt)) + { + case GIMPLE_ASSIGN: + changed |= instrument_assign (&gi, as_a <gassign *> (stmt), force); + break; + case GIMPLE_RETURN: + changed |= instrument_return (&gi, as_a <greturn *> (stmt), + force || (flag_vartrace & VARTRACE_RETURNS)); + break; + + // for GIMPLE_CALL we use the argument logging in the callee + // we could optionally log in the caller too to handle all possible + // reads of a local/global when the callee is not instrumented + // possibly later we could also instrument copy and clear calls. + + case GIMPLE_SWITCH: + changed |= !!instrument_mem (&gi, gimple_switch_index (as_a <gswitch *> (stmt)), + force, stmt, "switch"); + break; + case GIMPLE_COND: + changed |= !!instrument_mem (&gi, gimple_cond_lhs (stmt), force, stmt, "if lhs"); + changed |= !!instrument_mem (&gi, gimple_cond_rhs (stmt), force, stmt, "if rhs"); + break; + + case GIMPLE_ASM: + changed |= instrument_asm (&gi, as_a<gasm *> (stmt), force); + break; + default: + // everything else that reads/writes variables should be lowered already + break; + } + } + + // for now, until we fix all cases that destroy ssa + return changed ? TODO_update_ssa : 0;; +} + +const pass_data pass_data_vartrace = +{ + GIMPLE_PASS, /* type */ + "vartrace", /* name */ + OPTGROUP_NONE, /* optinfo_flags */ + TV_NONE, /* tv_id */ + 0, /* properties_required */ + 0, /* properties_provided */ + 0, /* properties_destroyed */ + 0, /* todo_flags_start */ + 0, /* todo_flags_finish */ +}; + +class pass_vartrace : public gimple_opt_pass +{ +public: + pass_vartrace (gcc::context *ctxt) + : gimple_opt_pass (pass_data_vartrace, ctxt) + {} + + virtual opt_pass * clone () + { + return new pass_vartrace (m_ctxt); + } + + virtual bool gate (function *fun) + { + // check if vartrace is supported in backend + if (!targetm.vartrace_func + || targetm.vartrace_func (integer_type_node, false) == NULL) + return false; + + if (lookup_attribute ("no_vartrace", TYPE_ATTRIBUTES (TREE_TYPE (fun->decl))) + || lookup_attribute ("no_vartrace", DECL_ATTRIBUTES (fun->decl))) + return false; + + // need to run pass always to check for variable attributes + return true; + } + + virtual unsigned int execute (function *f) { return vartrace_execute (f); } +}; + +} // anon namespace + +gimple_opt_pass * +make_pass_vartrace (gcc::context *ctxt) +{ + return new pass_vartrace (ctxt); +}
From: Andi Kleen <ak@linux.intel.com> Add a new pass to automatically instrument changes to variables with the new PTWRITE instruction on x86. PTWRITE writes a 4 or 8 byte field into an Processor Trace log, which allows low over head logging of information. Essentially it's a hardware accelerated printf. This allows to reconstruct how values later from the log, which can be useful for debugging or other analysis of the program behavior. With the compiler support this can be done with without having to manually add instrumentation to the code. Using dwarf information this can be later mapped back to the variables. The decoder decodes the PTWRITE instructions using IP information in the log, and then looks up the argument in the debug information. Then this can be used to reconstruct the original variable name to display a value history for the variable. There are new options to enable instrumentation for different types, and also a new attribute to control analysis fine grained per function or variable level. The attributes can be set on both the variable and the type level, and also on structure fields. This allows to enable tracing only for specific code in large programs in a flexible matter. The pass is generic, but only the x86 backend enables the necessary hooks. When the backend enables the necessary hooks (with -mptwrite) there is an additional pass that looks through the code for attribute vartrace enabled functions or variables. The -fvartrace=locals option is experimental: it works, but it generates redundant ptwrites because the pass doesn't use the SSA information to minimize instrumentation. This could be optimized later. Currently the code can be tested with SDE, or on a Intel Gemini Lake system with a new enough Linux kernel (v4.10+) that supports PTWRITE for PT. Gemini Lake is used in low end laptops ("Intel Pentium Silver J5...... / Celeron N4... / Celeron J4...") Linux perf can be used to record the values perf record -e intel_pt/ptw=1,branch=0/ program perf script --itrace=crw -F +synth ... I have an experimential version of perf that can also use dwarf information to symbolize many[1] values back to their variable names. So far it is not in standard perf, but available at https://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-misc.git/log/?h=perf/var-resolve-4 It is currently not able to decode all variable locations to names, but a large subset. Longer term hopefully gdb will support this information too. The CPU can potentially generate very data high bandwidths when code doing a lot of computation is heavily instrumented. This can cause some data loss in both the CPU and also in perf logging the data when the disk cannot keep up. Running some larger workloads most workloads do not cause CPU level overflows, but I've seen it with -fvartrace with crafty, and with more workloads with -fvartrace-locals. Recommendation is to not fully instrument programs, but only areas of interest either at the file level or using the attributes. The other thing is that perf and the disk often cannot keep up with the data bandwidth for longer computations. In this case it's possible to use perf snapshot mode (add --snapshot to the command line above). The data will be only logged to a memory ring buffer then, and only dump the buffers on events of interest by sending SIGUSR2 to the perf binrary. In the future this will be hopefully better supported with core files and gdb. Passes bootstrap and test suite on x86_64-linux, also bootstrapped and tested gcc itself with full -fvartrace and -fvartrace=locals instrumentation. gcc/: 2018-11-15 Andi Kleen <ak@linux.intel.com> * Makefile.in: Add tree-vartrace.o. * common.opt: Add -fvartrace * opts.c (parse_vartrace_options): Add. (common_handle_option): Call parse_vartrace_options. * config/i386/i386.c (ix86_vartrace_func): Add. (TARGET_VARTRACE_FUNC): Add. * doc/extend.texi: Document vartrace/no_vartrace attributes. * doc/invoke.texi: Document -fvartrace. * doc/tm.texi (TARGET_VARTRACE_FUNC): Add. * passes.def: Add vartrace pass. * target.def (vartrace_func): Add. * tree-pass.h (make_pass_vartrace): Add. * tree-vartrace.c: New file to implement vartrace pass. gcc/c-family/: 2018-11-15 Andi Kleen <ak@linux.intel.com> * c-attribs.c (handle_vartrace_attribute, handle_no_vartrace_attribute): New functions. (attr_vartrace_exclusions): Add. config/: 2018-11-03 Andi Kleen <ak@linux.intel.com> * bootstrap-vartrace.mk: New. * bootstrap-vartrace-locals.mk: New. --- config/bootstrap-vartrace-locals.mk | 3 + config/bootstrap-vartrace.mk | 3 + gcc/Makefile.in | 1 + gcc/c-family/c-attribs.c | 77 +++++ gcc/common.opt | 8 + gcc/config/i386/i386.c | 32 ++ gcc/doc/extend.texi | 32 ++ gcc/doc/invoke.texi | 27 ++ gcc/doc/tm.texi | 6 + gcc/doc/tm.texi.in | 2 + gcc/flag-types.h | 9 + gcc/opts.c | 63 ++++ gcc/passes.def | 1 + gcc/target.def | 9 + gcc/tree-pass.h | 1 + gcc/tree-vartrace.c | 491 ++++++++++++++++++++++++++++ 16 files changed, 765 insertions(+) create mode 100644 config/bootstrap-vartrace-locals.mk create mode 100644 config/bootstrap-vartrace.mk create mode 100644 gcc/tree-vartrace.c