
Adjustment to DW_TAG_GNU_call_site patch for ICF debug

Message ID 20101223211226.GM16156@tyan-ft48-01.lab.bos.redhat.com
State New

Commit Message

Jakub Jelinek Dec. 23, 2010, 9:12 p.m. UTC
Hi!

This patch is an attempt to provide all info needed for ICF disambiguation
in DW_TAG_GNU_call_site DIEs, so far just for -O1 -g and above, not for -O0
-g (but that is possible to handle too, by setting some flag so that
dwarf2out doesn't expect call arg notes and instead looks at CALL_INSNs
alone).
For direct calls, I believe
http://gcc.gnu.org/ml/gcc-patches/2010-12/msg01793.html
http://gcc.gnu.org/ml/gcc-patches/2010-12/msg01794.html
provide everything the debugger needs to disambiguate; for most virtual
calls they don't, because most often there isn't any
DW_AT_GNU_call_site_target that can be provided.

Roland wanted the call site's DW_AT_abstract_origin in these cases to
contain a reference to the function pointer that is called, but for virtual
calls we don't have anything like that; all we have is the DW_TAG_subprogram
in the class type, with DW_AT_virtuality saying it is a virtual method.
If we referenced that, it would be ambiguous whether it is a direct call
to the virtual method (e.g. discovered through devirtualization) or just
an indirect call to the virtual method that could at runtime be a call to
a different method.

The patch below arranges in that case (as can be seen in the testcase below)
to emit DW_AT_GNU_call_site_target_clobbered (e.g. on x86_64, if this is
passed in %rdi and the OBJ_TYPE_REF token is 0, it is
DW_OP_breg5 0 DW_OP_deref DW_OP_deref), i.e. where the address can be found
at call time.  The debugger could check that %rdi in this case is the
artificial this parameter of the called function, assume that in C++ it
doesn't change within the function, and use it to discover the called
function's vtable.

And/or we could invent a new attribute which would contain what the
.debug_vcall section contained, i.e. the OBJ_TYPE_REF_TOKEN value.

I haven't seen .debug_dcall/.debug_vcall support patches for gdb anywhere,
and the patch doesn't disable generation of those sections (though, that
could be simplified now that the OBJ_TYPE_REF should be available from the
CALL_INSN).

Also note that at least ia64 would need some further changes, because its
call splitters recreate the call MEMs, throwing their MEM_EXPR on the
floor, so it isn't preserved until var-tracking/dwarf2out.

Comments/suggestions?

2010-12-23  Jakub Jelinek  <jakub@redhat.com>

	* calls.c (emit_call_1): Set MEM_EXPR on call's MEM.
	* var-tracking.c (prepare_call_arguments): Use MEM_EXPR on
	call's MEM.  Handle functions returning aggregate through a hidden
	first pointer.  For virtual calls add clobbered pc to call arguments
	chain.
	* dwarf2out.c (gen_subprogram_die): Emit
	DW_AT_GNU_call_site_target_clobbered if DW_AT_GNU_call_site_target
	can't be emitted.


	Jakub
__attribute__((noinline, noclone)) void
f1 (int i)
{
  asm volatile ("" : : "r" (i) : "memory");
}

__attribute__((noinline, noclone)) void
f2 (int i)
{
  asm volatile ("" : : "r" (i) : "memory");
}

__attribute__((noinline, noclone)) void
f3 (int i)
{
  asm volatile ("" : : "r" (i) : "memory");
}

__attribute__((noinline, noclone)) int
bar (int i)
{
  f1 (i * 4 + 2);
  f2 (i * 4 + 2);
  f3 (i * 4 + 2);
  return 4;
}

#ifdef RETURN_STRUCT
struct S
{
  S ();
  char c[1024];
};
#else
#define S void
#endif

struct A
{
  A () { int i = 4; baz (i); foo (); }
  void foo () __attribute__((noinline, noclone));
  virtual S baz (int &) __attribute__((noinline, noclone));
  virtual ~A () {}
};

struct B : A
{
  B () { int i = 5; baz (i); foo (); }
  virtual S baz (int &) __attribute__((noinline, noclone));
  virtual ~B () {}
};

struct C : A
{
  C () { int i = 6; baz (i); foo (); }
  virtual S baz (int &) __attribute__((noinline, noclone));
  virtual S baz2 (int &) __attribute__((noinline, noclone));
  virtual ~C () {}
};

struct D : B, C
{
  D () { int i = 7; baz (i); foo (); }
  void foo () __attribute__((noinline, noclone));
  virtual S baz (int &) __attribute__((noinline, noclone));
  virtual ~D () {}
};

void
A::foo ()
{
  int i = 8;
  baz (i);
}

#ifdef RETURN_STRUCT
S s;
#else
#define s
#endif

S
A::baz (int &x)
{
  asm volatile ("" : : "r" (x) : "memory");
  return s;
}

S
B::baz (int &x)
{
  asm volatile ("" : : "r" (x) : "memory");
  return s;
}

S
C::baz (int &x)
{
  asm volatile ("" : : "r" (x) : "memory");
  return s;
}

void
D::foo ()
{
  int i = 9;
  baz (i);
  baz2 (i);
}

S
D::baz (int &x)
{
  asm volatile ("" : : "r" (x) : "memory");
  return s;
}

D d;

int
main ()
{
  bar (5);
}

Comments

Cary Coutant Dec. 23, 2010, 9:28 p.m. UTC | #1
> The patch below arranges in that case (as can be seen on the testcase below)
> to have DW_AT_GNU_call_site_target_clobbered (e.g. on x86_64 if this is
> passed in %rdi and OBJ_TYPE_REF token is 0 it is
> DW_OP_breg5 0 DW_OP_deref DW_OP_deref), i.e. where at the call time can be
> the address found.  The debugger could check that %rdi in this case in the
> called function is the artificial this parameter, assume in C++ it doesn't
> change in the function and use that to discover the called function's
> vtable.
>
> And/or we could invent a new attribute which would contain what the
> .debug_vcall function contained, i.e. the OBJ_TYPE_REF_TOKEN value.

If we can access the pointer to the called function's vtable directly,
I don't think we need the vtable slot index that was stored in the
.debug_vcall table. I'll have to think about it some more, though, to
be sure. How reasonable is it to assume that the artificial this
parameter doesn't move around?

> I haven't seen .debug_dcall/.debug_vcall support patches for gdb anywhere,
> and the patch doesn't disable generation of those sections (though, that
> could be simplified now that the OBJ_TYPE_REF should be available from the
> CALL_INSN).

I started the work in gdb, but dropped it a while back and haven't
gotten back to it. There's no legacy yet that depends on those two
tables.

-cary
Jakub Jelinek Dec. 23, 2010, 11:06 p.m. UTC | #2
On Thu, Dec 23, 2010 at 01:28:33PM -0800, Cary Coutant wrote:
> > The patch below arranges in that case (as can be seen on the testcase below)
> > to have DW_AT_GNU_call_site_target_clobbered (e.g. on x86_64 if this is
> > passed in %rdi and OBJ_TYPE_REF token is 0 it is
> > DW_OP_breg5 0 DW_OP_deref DW_OP_deref), i.e. where at the call time can be
> > the address found.  The debugger could check that %rdi in this case in the
> > called function is the artificial this parameter, assume in C++ it doesn't
> > change in the function and use that to discover the called function's
> > vtable.
> >
> > And/or we could invent a new attribute which would contain what the
> > .debug_vcall function contained, i.e. the OBJ_TYPE_REF_TOKEN value.
> 
> If we can access the pointer to the called function's vtable directly,
> I don't think we need the vtable slot index that was stored in the
> .debug_vcall table. I'll have to think about it some more, though, to
> be sure. How reasonable is it to assume that the artificial this
> parameter doesn't move around?

I think there is no such guarantee, but the problem is IMHO the same for
DW_AT_GNU_call_site_target_clobbered and for .debug_vcall.
For C++ I think you can assume that the this pointer does not change during
the lifetime of a method, and you can hope that its .debug_loc description
is accurate.  With the entry_value stuff, in places where the this pointer
is no longer passed to anything else nor otherwise needed, it could be
DW_OP_GNU_entry_value in some parts of the method; but that might not be
usable, because checking that no tail call is involved would mean verifying
that the seen caller actually calls that method, and if the only way to
check that is to look at DW_AT_GNU_call_site_target_clobbered, which itself
needs the current this pointer, it is a chicken and egg issue.

For the disambiguation, the debugger would need to do extra checking on
DW_AT_GNU_call_site_target_clobbered, I guess: verify that the last deref
reads from within some vtable and derive the method name from that vtable
slot (because the pointer it actually reads at that offset might be the
ICF-optimized one, pointing to code used for multiple purposes).

	Jakub
Patch

--- gcc/calls.c.jj	2010-12-22 10:17:25.000000000 +0100
+++ gcc/calls.c	2010-12-22 14:09:52.000000000 +0100
@@ -256,7 +256,7 @@  emit_call_1 (rtx funexp, tree fntree ATT
 	     CUMULATIVE_ARGS *args_so_far ATTRIBUTE_UNUSED)
 {
   rtx rounded_stack_size_rtx = GEN_INT (rounded_stack_size);
-  rtx call_insn;
+  rtx call_insn, call, funmem;
   int already_popped = 0;
   HOST_WIDE_INT n_popped
     = targetm.calls.return_pops_args (fndecl, funtype, stack_size);
@@ -271,6 +271,12 @@  emit_call_1 (rtx funexp, tree fntree ATT
   if (GET_CODE (funexp) != SYMBOL_REF)
     funexp = memory_address (FUNCTION_MODE, funexp);
 
+  funmem = gen_rtx_MEM (FUNCTION_MODE, funexp);
+  if (fndecl && TREE_CODE (fndecl) == FUNCTION_DECL)
+    set_mem_expr (funmem, fndecl);
+  else if (fntree)
+    set_mem_expr (funmem, build_fold_indirect_ref (CALL_EXPR_FN (fntree)));
+
 #if defined (HAVE_sibcall_pop) && defined (HAVE_sibcall_value_pop)
   if ((ecf_flags & ECF_SIBCALL)
       && HAVE_sibcall_pop && HAVE_sibcall_value_pop
@@ -283,13 +289,11 @@  emit_call_1 (rtx funexp, tree fntree ATT
 	 if possible, for the sake of frame pointer elimination.  */
 
       if (valreg)
-	pat = GEN_SIBCALL_VALUE_POP (valreg,
-				     gen_rtx_MEM (FUNCTION_MODE, funexp),
-				     rounded_stack_size_rtx, next_arg_reg,
-				     n_pop);
+	pat = GEN_SIBCALL_VALUE_POP (valreg, funmem, rounded_stack_size_rtx,
+				     next_arg_reg, n_pop);
       else
-	pat = GEN_SIBCALL_POP (gen_rtx_MEM (FUNCTION_MODE, funexp),
-			       rounded_stack_size_rtx, next_arg_reg, n_pop);
+	pat = GEN_SIBCALL_POP (funmem, rounded_stack_size_rtx, next_arg_reg,
+			       n_pop);
 
       emit_call_insn (pat);
       already_popped = 1;
@@ -316,12 +320,11 @@  emit_call_1 (rtx funexp, tree fntree ATT
 	 if possible, for the sake of frame pointer elimination.  */
 
       if (valreg)
-	pat = GEN_CALL_VALUE_POP (valreg,
-				  gen_rtx_MEM (FUNCTION_MODE, funexp),
-				  rounded_stack_size_rtx, next_arg_reg, n_pop);
+	pat = GEN_CALL_VALUE_POP (valreg, funmem, rounded_stack_size_rtx,
+				  next_arg_reg, n_pop);
       else
-	pat = GEN_CALL_POP (gen_rtx_MEM (FUNCTION_MODE, funexp),
-			    rounded_stack_size_rtx, next_arg_reg, n_pop);
+	pat = GEN_CALL_POP (funmem, rounded_stack_size_rtx, next_arg_reg,
+			    n_pop);
 
       emit_call_insn (pat);
       already_popped = 1;
@@ -334,13 +337,12 @@  emit_call_1 (rtx funexp, tree fntree ATT
       && HAVE_sibcall && HAVE_sibcall_value)
     {
       if (valreg)
-	emit_call_insn (GEN_SIBCALL_VALUE (valreg,
-					   gen_rtx_MEM (FUNCTION_MODE, funexp),
+	emit_call_insn (GEN_SIBCALL_VALUE (valreg, funmem,
 					   rounded_stack_size_rtx,
 					   next_arg_reg, NULL_RTX));
       else
-	emit_call_insn (GEN_SIBCALL (gen_rtx_MEM (FUNCTION_MODE, funexp),
-				     rounded_stack_size_rtx, next_arg_reg,
+	emit_call_insn (GEN_SIBCALL (funmem, rounded_stack_size_rtx,
+				     next_arg_reg,
 				     GEN_INT (struct_value_size)));
     }
   else
@@ -350,13 +352,10 @@  emit_call_1 (rtx funexp, tree fntree ATT
   if (HAVE_call && HAVE_call_value)
     {
       if (valreg)
-	emit_call_insn (GEN_CALL_VALUE (valreg,
-					gen_rtx_MEM (FUNCTION_MODE, funexp),
-					rounded_stack_size_rtx, next_arg_reg,
-					NULL_RTX));
+	emit_call_insn (GEN_CALL_VALUE (valreg, funmem, rounded_stack_size_rtx,
+					next_arg_reg, NULL_RTX));
       else
-	emit_call_insn (GEN_CALL (gen_rtx_MEM (FUNCTION_MODE, funexp),
-				  rounded_stack_size_rtx, next_arg_reg,
+	emit_call_insn (GEN_CALL (funmem, rounded_stack_size_rtx, next_arg_reg,
 				  GEN_INT (struct_value_size)));
     }
   else
@@ -366,6 +365,19 @@  emit_call_1 (rtx funexp, tree fntree ATT
   /* Find the call we just emitted.  */
   call_insn = last_call_insn ();
 
+  /* Some targets create a fresh MEM instead of reusing the one provided
+     above.  Set its MEM_EXPR.  */
+  call = PATTERN (call_insn);
+  if (GET_CODE (call) == PARALLEL)
+    call = XVECEXP (call, 0, 0);
+  if (GET_CODE (call) == SET)
+    call = SET_SRC (call);
+  if (GET_CODE (call) == CALL
+      && MEM_P (XEXP (call, 0))
+      && MEM_EXPR (XEXP (call, 0)) == NULL_TREE
+      && MEM_EXPR (funmem) != NULL_TREE)
+    set_mem_expr (XEXP (call, 0), MEM_EXPR (funmem));
+
   /* Put the register usage information there.  */
   add_function_usage_to (call_insn, call_fusage);
 
--- gcc/var-tracking.c.jj	2010-12-22 15:09:56.000000000 +0100
+++ gcc/var-tracking.c	2010-12-23 10:11:00.000000000 +0100
@@ -5557,7 +5557,9 @@  prepare_call_arguments (basic_block bb, 
   rtx link, x;
   rtx prev, cur, next;
   rtx call = PATTERN (insn);
-  tree type = NULL_TREE, t;
+  rtx this_arg = NULL_RTX;
+  tree type = NULL_TREE, t, fndecl = NULL_TREE;
+  tree obj_type_ref = NULL_TREE;
   CUMULATIVE_ARGS args_so_far;
 
   memset (&args_so_far, 0, sizeof (args_so_far));
@@ -5565,27 +5567,91 @@  prepare_call_arguments (basic_block bb, 
     call = XVECEXP (call, 0, 0);
   if (GET_CODE (call) == SET)
     call = SET_SRC (call);
-  if (GET_CODE (call) == CALL
-      && MEM_P (XEXP (call, 0))
-      && GET_CODE (XEXP (XEXP (call, 0), 0)) == SYMBOL_REF)
-    {
-      rtx symbol = XEXP (XEXP (call, 0), 0);
-      if (SYMBOL_REF_DECL (symbol)
-	  && TREE_CODE (SYMBOL_REF_DECL (symbol)) == FUNCTION_DECL
-	  && TYPE_ARG_TYPES (TREE_TYPE (SYMBOL_REF_DECL (symbol))))
+  if (GET_CODE (call) == CALL && MEM_P (XEXP (call, 0)))
+    {
+      if (GET_CODE (XEXP (XEXP (call, 0), 0)) == SYMBOL_REF)
+	{
+	  rtx symbol = XEXP (XEXP (call, 0), 0);
+	  if (SYMBOL_REF_DECL (symbol))
+	    fndecl = SYMBOL_REF_DECL (symbol);
+	}
+      if (fndecl == NULL_TREE)
+	fndecl = MEM_EXPR (XEXP (call, 0));
+      if (fndecl
+	  && TREE_CODE (TREE_TYPE (fndecl)) != FUNCTION_TYPE
+	  && TREE_CODE (TREE_TYPE (fndecl)) != METHOD_TYPE)
+	fndecl = NULL_TREE;
+      if (fndecl && TYPE_ARG_TYPES (TREE_TYPE (fndecl)))
+	type = TREE_TYPE (fndecl);
+      if (fndecl && TREE_CODE (fndecl) != FUNCTION_DECL)
+	{
+	  if (TREE_CODE (fndecl) == INDIRECT_REF
+	      && TREE_CODE (TREE_OPERAND (fndecl, 0)) == OBJ_TYPE_REF)
+	    obj_type_ref = TREE_OPERAND (fndecl, 0);
+	  fndecl = NULL_TREE;
+	}
+      if (type)
 	{
-	  type = TREE_TYPE (SYMBOL_REF_DECL (symbol));
 	  for (t = TYPE_ARG_TYPES (type); t && t != void_list_node;
 	       t = TREE_CHAIN (t))
 	    if (TREE_CODE (TREE_VALUE (t)) == REFERENCE_TYPE
 		&& INTEGRAL_TYPE_P (TREE_TYPE (TREE_VALUE (t))))
 	      break;
-	  if (t == NULL || t == void_list_node)
+	  if ((t == NULL || t == void_list_node) && obj_type_ref == NULL_TREE)
 	    type = NULL;
 	  else
-	    INIT_CUMULATIVE_ARGS (args_so_far, type, NULL_RTX,
-				  SYMBOL_REF_DECL (symbol),
-				  list_length (TYPE_ARG_TYPES (type)));
+	    {
+	      int nargs = list_length (TYPE_ARG_TYPES (type));
+	      link = CALL_INSN_FUNCTION_USAGE (insn);
+#ifndef PCC_STATIC_STRUCT_RETURN
+	      if (aggregate_value_p (TREE_TYPE (type), type)
+		  && targetm.calls.struct_value_rtx (type, 0) == 0)
+		{
+		  tree struct_addr = build_pointer_type (TREE_TYPE (type));
+		  enum machine_mode mode = TYPE_MODE (struct_addr);
+		  rtx reg;
+		  INIT_CUMULATIVE_ARGS (args_so_far, type, NULL_RTX, fndecl,
+					nargs + 1);
+		  reg = targetm.calls.function_arg (&args_so_far, mode,
+						    struct_addr, true);
+		  targetm.calls.function_arg_advance (&args_so_far, mode,
+						      struct_addr, true);
+		  if (reg == NULL_RTX)
+		    {
+		      for (; link; link = XEXP (link, 1))
+			if (GET_CODE (XEXP (link, 0)) == USE
+			    && MEM_P (XEXP (XEXP (link, 0), 0)))
+			  {
+			    link = XEXP (link, 1);
+			    break;
+			  }
+		    }
+		}
+#endif
+	      else
+		INIT_CUMULATIVE_ARGS (args_so_far, type, NULL_RTX, fndecl,
+				      nargs);
+	      if (obj_type_ref && TYPE_ARG_TYPES (type) != void_list_node)
+		{
+		  enum machine_mode mode;
+		  t = TYPE_ARG_TYPES (type);
+		  mode = TYPE_MODE (TREE_VALUE (t));
+		  this_arg = targetm.calls.function_arg (&args_so_far, mode,
+							 TREE_VALUE (t), true);
+		  if (this_arg && !REG_P (this_arg))
+		    this_arg = NULL_RTX;
+		  else if (this_arg == NULL_RTX)
+		    {
+		      for (; link; link = XEXP (link, 1))
+			if (GET_CODE (XEXP (link, 0)) == USE
+			    && MEM_P (XEXP (XEXP (link, 0), 0)))
+			  {
+			    this_arg = XEXP (XEXP (link, 0), 0);
+			    break;
+			  }
+		    }
+		}
+	    }
 	}
     }
   t = type ? TYPE_ARG_TYPES (type) : NULL_TREE;
@@ -5733,6 +5799,20 @@  prepare_call_arguments (basic_block bb, 
 	    }
 	}
     }
+  if (this_arg)
+    {
+      enum machine_mode mode
+	= TYPE_MODE (TREE_TYPE (OBJ_TYPE_REF_EXPR (obj_type_ref)));
+      rtx clobbered = gen_rtx_MEM (mode, this_arg);
+      HOST_WIDE_INT token
+	= tree_low_cst (OBJ_TYPE_REF_TOKEN (obj_type_ref), 0);
+      if (token)
+	clobbered = plus_constant (clobbered, token * GET_MODE_SIZE (mode));
+      clobbered = gen_rtx_MEM (mode, clobbered);
+      x = gen_rtx_CONCAT (mode, gen_rtx_CLOBBER (VOIDmode, pc_rtx), clobbered);
+      call_arguments
+	= gen_rtx_EXPR_LIST (VOIDmode, x, call_arguments);
+    }
 }
 
 /* Callback for cselib_record_sets_hook, that records as micro
--- gcc/dwarf2out.c.jj	2010-12-22 12:32:47.000000000 +0100
+++ gcc/dwarf2out.c	2010-12-23 09:40:43.000000000 +0100
@@ -19418,7 +19418,7 @@  gen_subprogram_die (tree decl, dw_die_re
 	  for (ca_loc = call_arg_locations; ca_loc; ca_loc = ca_loc->next)
 	    {
 	      dw_die_ref die = NULL;
-	      rtx tloc = NULL_RTX;
+	      rtx tloc = NULL_RTX, tlocc = NULL_RTX;
 	      rtx arg, next_arg;
 
 	      for (arg = NOTE_VAR_LOCATION (ca_loc->call_arg_loc_note);
@@ -19447,6 +19447,13 @@  gen_subprogram_die (tree decl, dw_die_re
 		      tloc = XEXP (XEXP (arg, 0), 1);
 		      continue;
 		    }
+		  else if (GET_CODE (XEXP (XEXP (arg, 0), 0)) == CLOBBER
+			   && XEXP (XEXP (XEXP (arg, 0), 0), 0) == pc_rtx)
+		    {
+		      gcc_assert (ca_loc->symbol_ref == NULL_RTX);
+		      tlocc = XEXP (XEXP (arg, 0), 1);
+		      continue;
+		    }
 		  if (REG_P (XEXP (XEXP (arg, 0), 0)))
 		    reg = reg_loc_descriptor (XEXP (XEXP (arg, 0), 0),
 					      VAR_INIT_STATUS_INITIALIZED);
@@ -19480,13 +19487,23 @@  gen_subprogram_die (tree decl, dw_die_re
 	      if (die == NULL
 		  && (ca_loc->symbol_ref || tloc))
 		die = gen_call_site_die (decl, subr_die, ca_loc);
-	      if (die != NULL && tloc != NULL_RTX)
+	      if (die != NULL && (tloc != NULL_RTX || tlocc != NULL_RTX))
 		{
-		  dw_loc_descr_ref tval
-		    = mem_loc_descriptor (tloc, VOIDmode,
-					  VAR_INIT_STATUS_INITIALIZED);
+		  dw_loc_descr_ref tval = NULL;
+
+		  if (tloc != NULL_RTX)
+		    tval = mem_loc_descriptor (tloc, VOIDmode,
+					       VAR_INIT_STATUS_INITIALIZED);
 		  if (tval)
 		    add_AT_loc (die, DW_AT_GNU_call_site_target, tval);
+		  else if (tlocc != NULL_RTX)
+		    {
+		      tval = mem_loc_descriptor (tlocc, VOIDmode,
+						 VAR_INIT_STATUS_INITIALIZED);
+		      if (tval)
+			add_AT_loc (die, DW_AT_GNU_call_site_target_clobbered,
+				    tval);
+		    }
 		}
 	      if (die != NULL)
 		{