
[v4,4/6] btf: add -gprune-btf option

Message ID 20240611190145.115887-5-david.faust@oracle.com
State New
Series btf: refactor and add pruning option

Commit Message

David Faust June 11, 2024, 7:01 p.m. UTC
This patch adds a new option, -gprune-btf, to control BTF debug info
generation.

As the name implies, this option enables a kind of "pruning" of the BTF
information before it is emitted.  When enabled, rather than emitting
all type information translated from DWARF, only information for types
directly used in the source program is emitted.

The primary purpose of this pruning is to reduce the amount of
unnecessary BTF information emitted, especially for BPF programs.  It is
very common for BPF programs to include Linux kernel internal headers in
order to have access to kernel data structures.  However, doing so often
has the side effect of also adding type definitions for a large number
of types which are not actually used by nor relevant to the program.
In these cases, -gprune-btf commonly reduces the size of the resulting
BTF information by 10x or more, as seen on average when compiling Linux
kernel BPF selftests.  This both slims down the size of the resulting
object and reduces the time required by the BPF loader to verify the
program and its BTF information.

Note that the pruning implemented in this patch follows the same rules
as the BTF pruning performed unconditionally by LLVM's BPF backend when
generating BTF.  In particular, the main sources of pruning are:

  1) Only generate BTF for types used by variables and functions at the
     file scope.

     Note that the set of variables known to be "used" may differ
     slightly between LTO and non-LTO builds due to optimizations.  For
     non-LTO builds (and always for the BPF target), variables which are
     optimized away during compilation are considered to be unused, and
     they (along with their types) are pruned.  For LTO builds, such
     variables are not known to be optimized away by the time pruning
     occurs, so VAR records for them and information for their types may
     be present in the emitted BTF information.  This is a missed
     optimization that may be fixed in the future.

  2) Avoid emitting full BTF for struct and union types which are only
     pointed-to by members of other struct/union types.  In these cases,
     the full BTF_KIND_STRUCT or BTF_KIND_UNION which would normally
     be emitted is replaced with a BTF_KIND_FWD, as though the
     underlying type was a forward-declared struct or union type.
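
The two rules above can be illustrated with a small standalone model.  This
is only a sketch under simplifying assumptions: the toy `struct type` graph
and the names `add_used`, `drop_fixup`, `resolve_fixups` and `prune_example`
are hypothetical and are not GCC's data structures or functions.  The real
btf_add_used_type also handles typedefs, qualifiers, arrays, function
prototypes, bitfield slices and type ID assignment, all of which are omitted
here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy type graph; a hypothetical stand-in for GCC's ctf_dtdef_t nodes.  */
enum kind { K_INT, K_PTR, K_STRUCT, K_FWD };

#define MAX_MEMBERS 4
#define MAX_FIXUPS 8

struct type
{
  enum kind kind;
  const char *name;
  struct type *ref;                   /* Pointee, for K_PTR.  */
  struct type *members[MAX_MEMBERS];  /* Member types, for K_STRUCT.  */
  size_t n_members;
  bool used;                          /* Stand-in for the used-types set.  */
};

struct fixup { struct type *pointer; struct type *pointee; };

static struct fixup fixups[MAX_FIXUPS];
static size_t n_fixups;
static struct type forwards[MAX_FIXUPS];
static size_t n_forwards;

/* Forget the fixup recorded for POINTER, if any.  */
static void
drop_fixup (struct type *pointer)
{
  for (size_t i = 0; i < n_fixups; i++)
    if (fixups[i].pointer == pointer)
      {
        fixups[i] = fixups[--n_fixups];
        return;
      }
}

/* Mark T and the types it references as used.  CHECK_PTR is true once the
   walk has passed through a struct member.  */
static void
add_used (struct type *t, bool check_ptr)
{
  if (t == NULL)
    return;

  if (t->used)
    {
      /* A pointer first seen through a member may now have a direct use.  */
      if (t->kind == K_PTR && !check_ptr)
        {
          drop_fixup (t);
          add_used (t->ref, false);
        }
      return;
    }

  t->used = true;  /* Mark before recursing so cyclic graphs terminate.  */

  if (t->kind == K_PTR)
    {
      /* Rule 2: do not chase a member pointer into an unused struct.  */
      if (check_ptr && t->ref && t->ref->kind == K_STRUCT && !t->ref->used)
        {
          assert (n_fixups < MAX_FIXUPS);
          fixups[n_fixups++] = (struct fixup) { t, t->ref };
          return;
        }
      add_used (t->ref, check_ptr);
    }
  else if (t->kind == K_STRUCT)
    for (size_t i = 0; i < t->n_members; i++)
      add_used (t->members[i], true);
}

/* Replace pointees that were never used directly with forward nodes.  */
static void
resolve_fixups (void)
{
  for (size_t i = 0; i < n_fixups; i++)
    if (!fixups[i].pointee->used)
      {
        struct type *fwd = &forwards[n_forwards++];
        *fwd = (struct type) { .kind = K_FWD, .used = true,
                               .name = fixups[i].pointee->name };
        fixups[i].pointer->ref = fwd;
      }
}

/* Graph for: struct B { int x; };  struct C { int x; struct B *a; };  */
static struct type t_int = { .kind = K_INT, .name = "int" };
static struct type t_B = { .kind = K_STRUCT, .name = "B",
                           .members = { &t_int }, .n_members = 1 };
static struct type t_ptr_B = { .kind = K_PTR, .ref = &t_B };
static struct type t_C = { .kind = K_STRUCT, .name = "C",
                           .members = { &t_int, &t_ptr_B }, .n_members = 2 };
static struct type t_ptr_C = { .kind = K_PTR, .ref = &t_C };

/* Rule 1: the only root is the argument of a hypothetical
   "int foo (struct C *c)".  */
static void
prune_example (void)
{
  add_used (&t_ptr_C, false);
  resolve_fixups ();
}
```

In this model, calling `prune_example ()` leaves `t_C` marked as used and
`t_B` unused, with `t_ptr_B.ref` pointing at a `K_FWD` node named "B".  That
corresponds to the shape the btf-prune-2.c test in this patch checks for.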

gcc/
	* btfout.cc (btf_used_types): New hash set.
	(struct btf_fixup): New.
	(fixups, forwards): New vecs.
	(btf_output): Calculate num_types depending on debug_prune_btf.
	(btf_early_finish): New initialization for debug_prune_btf.
	(btf_add_used_type): New function.
	(btf_used_type_list_cb): Likewise.
	(btf_collect_pruned_types): Likewise.
	(btf_add_vars): Handle special case for variables in ".maps" section
	when generating BTF for BPF CO-RE target.
	(btf_late_finish): Use btf_collect_pruned_types when debug_prune_btf
	is in effect.  Move some initialization to btf_early_finish.
	(btf_finalize): Additional deallocation for debug_prune_btf.
	* common.opt (gprune-btf): New flag.
	* ctfc.cc (init_ctf_strtable): Make non-static.
	* ctfc.h (init_ctf_strtable, ctfc_delete_strtab): Make extern.
	* doc/invoke.texi (Debugging Options): Document -gprune-btf.

gcc/testsuite/
	* gcc.dg/debug/btf/btf-prune-1.c: New test.
	* gcc.dg/debug/btf/btf-prune-2.c: Likewise.
	* gcc.dg/debug/btf/btf-prune-3.c: Likewise.
	* gcc.dg/debug/btf/btf-prune-maps.c: Likewise.
---
 gcc/btfout.cc                                 | 358 +++++++++++++++++-
 gcc/common.opt                                |   4 +
 gcc/ctfc.cc                                   |   2 +-
 gcc/ctfc.h                                    |   3 +
 gcc/doc/invoke.texi                           |  20 +
 gcc/testsuite/gcc.dg/debug/btf/btf-prune-1.c  |  25 ++
 gcc/testsuite/gcc.dg/debug/btf/btf-prune-2.c  |  33 ++
 gcc/testsuite/gcc.dg/debug/btf/btf-prune-3.c  |  35 ++
 .../gcc.dg/debug/btf/btf-prune-maps.c         |  20 +
 9 files changed, 493 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-prune-1.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-prune-2.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-prune-3.c
 create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-prune-maps.c

Comments

David Faust June 24, 2024, 4:11 p.m. UTC | #1
Ping.

Richard: I changed the option name as you asked but forgot to CC you on
the updated patch.  Is the new option OK?

Indu: You had some minor comments on the prior version, which I have
addressed.  I am not sure whether you meant that the rest of the patch
was OK, or whether you have had a chance to review it.

Thanks!

archive: https://gcc.gnu.org/pipermail/gcc-patches/2024-June/654252.html

On 6/11/24 12:01, David Faust wrote:
> This patch adds a new option, -gprune-btf, to control BTF debug info
> generation.
> 
> As the name implies, this option enables a kind of "pruning" of the BTF
> information before it is emitted.  When enabled, rather than emitting
> all type information translated from DWARF, only information for types
> directly used in the source program is emitted.
> 
> The primary purpose of this pruning is to reduce the amount of
> unnecessary BTF information emitted, especially for BPF programs.  It is
> very common for BPF programs to include Linux kernel internal headers in
> order to have access to kernel data structures.  However, doing so often
> has the side effect of also adding type definitions for a large number
> of types which are not actually used by nor relevant to the program.
> In these cases, -gprune-btf commonly reduces the size of the resulting
> BTF information by 10x or more, as seen on average when compiling Linux
> kernel BPF selftests.  This both slims down the size of the resulting
> object and reduces the time required by the BPF loader to verify the
> program and its BTF information.
> 
> Note that the pruning implemented in this patch follows the same rules
> as the BTF pruning performed unconditionally by LLVM's BPF backend when
> generating BTF.  In particular, the main sources of pruning are:
> 
>   1) Only generate BTF for types used by variables and functions at the
>      file scope.
> 
>      Note that which variables are known to be "used" may differ
>      slightly between LTO and non-LTO builds due to optimizations.  For
>      non-LTO builds (and always for the BPF target), variables which are
>      optimized away during compilation are considered to be unused, and
>      they (along with their types) are pruned.  For LTO builds, such
>      variables are not known to be optimized away by the time pruning
>      occurs, so VAR records for them and information for their types may
>      be present in the emitted BTF information.  This is a missed
>      optimization that may be fixed in the future.
> 
>   2) Avoid emitting full BTF for struct and union types which are only
>      pointed-to by members of other struct/union types.  In these cases,
>      the full BTF_KIND_STRUCT or BTF_KIND_UNION which would normally
>      be emitted is replaced with a BTF_KIND_FWD, as though the
>      underlying type was a forward-declared struct or union type.
> 
> gcc/
> 	* btfout.cc (btf_used_types): New hash set.
> 	(struct btf_fixup): New.
> 	(fixups, forwards): New vecs.
> 	(btf_output): Calculate num_types depending on debug_prune_btf.
> 	(btf_early_finish): New initialization for debug_prune_btf.
> 	(btf_add_used_type): New function.
> 	(btf_used_type_list_cb): Likewise.
> 	(btf_collect_pruned_types): Likewise.
> 	(btf_add_vars): Handle special case for variables in ".maps" section
> 	when generating BTF for BPF CO-RE target.
> 	(btf_late_finish): Use btf_collect_pruned_types when debug_prune_btf
> 	is in effect.  Move some initialization to btf_early_finish.
> 	(btf_finalize): Additional deallocation for debug_prune_btf.
> 	* common.opt (gprune-btf): New flag.
> 	* ctfc.cc (init_ctf_strtable): Make non-static.
> 	* ctfc.h (init_ctf_strtable, ctfc_delete_strtab): Make extern.
> 	* doc/invoke.texi (Debugging Options): Document -gprune-btf.
> 
> gcc/testsuite/
> 	* gcc.dg/debug/btf/btf-prune-1.c: New test.
> 	* gcc.dg/debug/btf/btf-prune-2.c: Likewise.
> 	* gcc.dg/debug/btf/btf-prune-3.c: Likewise.
> 	* gcc.dg/debug/btf/btf-prune-maps.c: Likewise.
> ---
>  gcc/btfout.cc                                 | 358 +++++++++++++++++-
>  gcc/common.opt                                |   4 +
>  gcc/ctfc.cc                                   |   2 +-
>  gcc/ctfc.h                                    |   3 +
>  gcc/doc/invoke.texi                           |  20 +
>  gcc/testsuite/gcc.dg/debug/btf/btf-prune-1.c  |  25 ++
>  gcc/testsuite/gcc.dg/debug/btf/btf-prune-2.c  |  33 ++
>  gcc/testsuite/gcc.dg/debug/btf/btf-prune-3.c  |  35 ++
>  .../gcc.dg/debug/btf/btf-prune-maps.c         |  20 +
>  9 files changed, 493 insertions(+), 7 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-prune-1.c
>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-prune-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-prune-3.c
>  create mode 100644 gcc/testsuite/gcc.dg/debug/btf/btf-prune-maps.c
> 
> diff --git a/gcc/btfout.cc b/gcc/btfout.cc
> index 89f148de9650..34d8cec0a2e3 100644
> --- a/gcc/btfout.cc
> +++ b/gcc/btfout.cc
> @@ -828,7 +828,10 @@ output_btf_types (ctf_container_ref ctfc)
>  {
>    size_t i;
>    size_t num_types;
> -  num_types = ctfc->ctfc_types->elements ();
> +  if (debug_prune_btf)
> +    num_types = max_translated_id;
> +  else
> +    num_types = ctfc->ctfc_types->elements ();
>  
>    if (num_types)
>      {
> @@ -957,6 +960,212 @@ btf_add_func_records (ctf_container_ref ctfc)
>      }
>  }
>  
> +/* The set of types used directly in the source program, and any types manually
> +   marked as used.  This is the set of types which will be emitted when
> +   debug_prune_btf is set.  */
> +static GTY (()) hash_set<ctf_dtdef_ref> *btf_used_types;
> +
> +/* Fixup used to avoid unnecessary pointer chasing for types.  A fixup is
> +   created when a structure or union member is a pointer to another struct
> +   or union type.  In such cases, avoid emitting full type information for
> +   the pointee struct or union type (which may be quite large), unless that
> +   type is used directly elsewhere.  */
> +struct btf_fixup
> +{
> +  ctf_dtdef_ref pointer_dtd; /* Type node to which the fixup is applied.  */
> +  ctf_dtdef_ref pointee_dtd; /* Original type node referred to by pointer_dtd.
> +				If this concrete type is not otherwise used,
> +				then a forward is created.  */
> +};
> +
> +/* Stores fixups while processing types.  */
> +static vec<struct btf_fixup> fixups;
> +
> +/* For fixups where the underlying type is not used in the end, a BTF_KIND_FWD
> +   is created and emitted.  This vector stores them.  */
> +static GTY (()) vec<ctf_dtdef_ref, va_gc> *forwards;
> +
> +/* Recursively add type DTD and any types it references to the used set.
> +   Return a type that should be used for references to DTD - usually DTD itself,
> +   but may be NULL if DTD corresponds to a type which will not be emitted.
> +   CHECK_PTR is true if one of the predecessors in recursive calls is a struct
> +   or union member.  SEEN_PTR is true if CHECK_PTR is true AND one of the
> +   predecessors was a pointer type.  These two flags are used to avoid chasing
> +   pointers to struct/union only used from pointer members.  For such types, we
> +   will emit a forward instead of the full type information, unless
> +   CREATE_FIXUPS is false.  */
> +
> +static ctf_dtdef_ref
> +btf_add_used_type (ctf_container_ref ctfc, ctf_dtdef_ref dtd,
> +		   bool check_ptr, bool seen_ptr, bool create_fixups)
> +{
> +  if (dtd == NULL)
> +    return NULL;
> +
> +  uint32_t ctf_kind = CTF_V2_INFO_KIND (dtd->dtd_data.ctti_info);
> +  uint32_t kind = get_btf_kind (ctf_kind);
> +
> +  /* Check whether the type has already been added.  */
> +  if (btf_used_types->contains (dtd))
> +    {
> +      /* It's possible the type was already added as a fixup, but that we now
> +	 have a concrete use of it.  */
> +      switch (kind)
> +	{
> +	case BTF_KIND_PTR:
> +	case BTF_KIND_TYPEDEF:
> +	case BTF_KIND_CONST:
> +	case BTF_KIND_VOLATILE:
> +	case BTF_KIND_RESTRICT:
> +	  if (check_ptr)
> +	    /* Type was previously added as a fixup, and that's OK.  */
> +	    return dtd;
> +	  else
> +	    {
> +	      /* The type was previously added as a fixup, but now we have
> +		 a concrete use of it.  Remove the fixup.  */
> +	      for (size_t i = 0; i < fixups.length (); i++)
> +		if (fixups[i].pointer_dtd == dtd)
> +		  fixups.unordered_remove (i);
> +
> +	      /* Add the concrete base type.  */
> +	      dtd->ref_type = btf_add_used_type (ctfc, dtd->ref_type, check_ptr,
> +						 seen_ptr, create_fixups);
> +	      return dtd;
> +	    }
> +	default:
> +	  return dtd;
> +	}
> +    }
> +
> +  if (ctf_kind == CTF_K_SLICE)
> +    {
> +      /* Bitfield.  Add the underlying type to the used set, but leave
> +	 the reference to the bitfield.  The slice type won't be emitted,
> +	 but we need the information in it when writing out the bitfield
> +	 encoding.  */
> +      btf_add_used_type (ctfc, dtd->dtd_u.dtu_slice.cts_type,
> +			 check_ptr, seen_ptr, create_fixups);
> +      return dtd;
> +    }
> +
> +  /* Skip redundant definitions of void and types with no BTF encoding.  */
> +  if ((kind == BTF_KIND_INT && dtd->dtd_data.ctti_size == 0)
> +      || (kind == BTF_KIND_UNKN))
> +    return NULL;
> +
> +  /* Add the type itself, and assign its id.
> +     Do this before recursing to handle things like linked list structures.  */
> +  gcc_assert (ctfc->ctfc_nextid <= BTF_MAX_TYPE);
> +  dtd->dtd_type = ctfc->ctfc_nextid++;
> +  btf_used_types->add (dtd);
> +  ctf_add_string (ctfc, dtd->dtd_name, &(dtd->dtd_data.ctti_name), CTF_STRTAB);
> +  ctfc->ctfc_num_types++;
> +  ctfc->ctfc_num_vlen_bytes += btf_calc_num_vbytes (dtd);
> +
> +  /* Recursively add types referenced by this type.  */
> +  switch (kind)
> +    {
> +    case BTF_KIND_INT:
> +    case BTF_KIND_FLOAT:
> +    case BTF_KIND_FWD:
> +      /* Leaf kinds which do not refer to any other types.  */
> +      break;
> +
> +    case BTF_KIND_FUNC:
> +    case BTF_KIND_VAR:
> +      /* Root kinds; no type we are visiting may refer to these.  */
> +      gcc_unreachable ();
> +
> +    case BTF_KIND_PTR:
> +    case BTF_KIND_TYPEDEF:
> +    case BTF_KIND_CONST:
> +    case BTF_KIND_VOLATILE:
> +    case BTF_KIND_RESTRICT:
> +      {
> +	/* These type kinds refer to exactly one other type.  */
> +	if (check_ptr && !seen_ptr)
> +	  seen_ptr = (kind == BTF_KIND_PTR);
> +
> +	/* Try to avoid chasing pointers to struct/union types if the
> +	   underlying type isn't used.  */
> +	if (check_ptr && seen_ptr && create_fixups)
> +	  {
> +	    ctf_dtdef_ref ref = dtd->ref_type;
> +	    uint32_t ref_kind = btf_dtd_kind (ref);
> +
> +	    if ((ref_kind == BTF_KIND_STRUCT || ref_kind == BTF_KIND_UNION)
> +		&& !btf_used_types->contains (ref))
> +	      {
> +		struct btf_fixup fixup;
> +		fixup.pointer_dtd = dtd;
> +		fixup.pointee_dtd = ref;
> +		fixups.safe_push (fixup);
> +		break;
> +	      }
> +	  }
> +
> +	/* Add the type to which this type refers.  */
> +	dtd->ref_type = btf_add_used_type (ctfc, dtd->ref_type, check_ptr,
> +					   seen_ptr, create_fixups);
> +	break;
> +      }
> +    case BTF_KIND_ARRAY:
> +      {
> +	/* Add element and index types.  */
> +	ctf_arinfo_t *arr = &(dtd->dtd_u.dtu_arr);
> +	arr->ctr_contents = btf_add_used_type (ctfc, arr->ctr_contents, false,
> +					       false, create_fixups);
> +	arr->ctr_index = btf_add_used_type (ctfc, arr->ctr_index, false, false,
> +					    create_fixups);
> +	break;
> +      }
> +    case BTF_KIND_STRUCT:
> +    case BTF_KIND_UNION:
> +    case BTF_KIND_ENUM:
> +    case BTF_KIND_ENUM64:
> +      {
> +	/* Add members.  */
> +	ctf_dmdef_t *dmd;
> +	for (dmd = dtd->dtd_u.dtu_members;
> +	     dmd != NULL; dmd = (ctf_dmdef_t *) ctf_dmd_list_next (dmd))
> +	  {
> +	    /* Add member type for struct/union members.  For enums, only the
> +	       enumerator names are needed.  */
> +	    if (kind == BTF_KIND_STRUCT || kind == BTF_KIND_UNION)
> +	      dmd->dmd_type = btf_add_used_type (ctfc, dmd->dmd_type, true,
> +						 false, create_fixups);
> +	    ctf_add_string (ctfc, dmd->dmd_name, &(dmd->dmd_name_offset),
> +			    CTF_STRTAB);
> +	  }
> +	break;
> +      }
> +    case BTF_KIND_FUNC_PROTO:
> +      {
> +	/* Add return type.  */
> +	dtd->ref_type = btf_add_used_type (ctfc, dtd->ref_type, false, false,
> +					   create_fixups);
> +
> +	/* Add arg types.  */
> +	ctf_func_arg_t * farg;
> +	for (farg = dtd->dtd_u.dtu_argv;
> +	     farg != NULL; farg = (ctf_func_arg_t *) ctf_farg_list_next (farg))
> +	  {
> +	    farg->farg_type = btf_add_used_type (ctfc, farg->farg_type, false,
> +						 false, create_fixups);
> +	    /* Note: argument names are stored in the auxiliary string table,
> +	       since CTF does not include arg names.  That table has not been
> +	       cleared, so no need to re-add argument names here.  */
> +	  }
> +	break;
> +      }
> +    default:
> +      return NULL;
> +    }
> +
> +  return dtd;
> +}
> +
>  /* Initial entry point of BTF generation, called at early_finish () after
>     CTF information has possibly been output.  Translate all CTF information
>     to BTF, and do any processing that must be done early, such as creating
> @@ -975,6 +1184,26 @@ btf_early_finish (void)
>       be emitted before starting the translation to BTF.  */
>    btf_add_const_void (tu_ctfc);
>    btf_add_func_records (tu_ctfc);
> +
> +  /* These fields are reset to count BTF types etc.  */
> +  tu_ctfc->ctfc_num_types = 0;
> +  tu_ctfc->ctfc_num_vlen_bytes = 0;
> +  tu_ctfc->ctfc_vars_list_count = 0;
> +
> +  if (debug_prune_btf)
> +    {
> +      btf_used_types
> +	= hash_set<ctf_dtdef_ref>::create_ggc (tu_ctfc->ctfc_types->elements ());
> +      tu_ctfc->ctfc_nextid = 1;
> +      fixups.create (1);
> +
> +      /* Empty the string table, which was already populated with strings for
> +	 all types translated from DWARF.  We may only need a very small subset
> +	 of these strings; those will be re-added below.  */
> +      ctfc_delete_strtab (&tu_ctfc->ctfc_strtable);
> +      init_ctf_strtable (&tu_ctfc->ctfc_strtable);
> +      tu_ctfc->ctfc_strlen++;
> +    }
>  }
>  
>  /* Push a BTF datasec entry ENTRY into the datasec named SECNAME,
> @@ -1157,6 +1386,25 @@ btf_add_vars (ctf_container_ref ctfc)
>  
>        /* Add a BTF_KIND_DATASEC entry for the variable.  */
>        btf_datasec_add_var (ctfc, var, dvd);
> +
> +      const char *section = var->get_section ();
> +      if (section && (strcmp (section, ".maps") == 0) && debug_prune_btf)
> +	{
> +	  /* The .maps section has special meaning in BTF: it is used for BPF
> +	     map definitions.  These definitions should be structs.  We must
> +	     collect pointee types used in map members as though they are used
> +	     directly, effectively ignoring (from the pruning perspective) that
> +	     they are struct members.  */
> +	  ctf_dtdef_ref dtd = dvd->dvd_type;
> +	  uint32_t kind = btf_dtd_kind (dvd->dvd_type);
> +	  if (kind == BTF_KIND_STRUCT)
> +	    {
> +	      ctf_dmdef_t *dmd;
> +	      for (dmd = dtd->dtd_u.dtu_members;
> +		   dmd != NULL; dmd = (ctf_dmdef_t *) ctf_dmd_list_next (dmd))
> +		btf_add_used_type (ctfc, dmd->dmd_type, false, false, true);
> +	    }
> +	}
>      }
>  }
>  
> @@ -1255,6 +1503,86 @@ btf_assign_datasec_ids (ctf_container_ref ctfc)
>      }
>  }
>  
> +/* Callback used for assembling the only-used-types list.  Note that this is
> +   the same as btf_type_list_cb above, but the hash_set traverse requires a
> +   different function signature.  */
> +
> +static bool
> +btf_used_type_list_cb (const ctf_dtdef_ref& dtd, ctf_container_ref ctfc)
> +{
> +  ctfc->ctfc_types_list[dtd->dtd_type] = dtd;
> +  return true;
> +}
> +
> +/* Collect the set of types reachable from global variables and functions.
> +   This is the minimal set of types, used when generating pruned BTF.  */
> +
> +static void
> +btf_collect_pruned_types (ctf_container_ref ctfc)
> +{
> +  vec_alloc (forwards, 1);
> +
> +  /* Add types used from functions.  */
> +  ctf_dtdef_ref dtd;
> +  size_t i;
> +  FOR_EACH_VEC_ELT (*funcs, i, dtd)
> +    {
> +      btf_add_used_type (ctfc, dtd->ref_type, false, false, true);
> +      ctf_add_string (ctfc, dtd->dtd_name, &(dtd->dtd_data.ctti_name),
> +		      CTF_STRTAB);
> +    }
> +
> +  /* Add types used from global variables.  */
> +  for (i = 0; i < ctfc->ctfc_vars_list_count; i++)
> +    {
> +      ctf_dvdef_ref dvd = ctfc->ctfc_vars_list[i];
> +      btf_add_used_type (ctfc, dvd->dvd_type, false, false, true);
> +      ctf_add_string (ctfc, dvd->dvd_name, &(dvd->dvd_name_offset), CTF_STRTAB);
> +    }
> +
> +  /* Process fixups.  If the base type was never added, create a forward for it
> +     and adjust the reference to point to that.  If it was added, then nothing
> +     needs to change.  */
> +  for (i = 0; i < fixups.length (); i++)
> +    {
> +      struct btf_fixup *fx = &fixups[i];
> +      if (!btf_used_types->contains (fx->pointee_dtd))
> +	{
> +	  /* The underlying type is not used.  Create a forward.  */
> +	  ctf_dtdef_ref fwd = ggc_cleared_alloc<ctf_dtdef_t> ();
> +	  ctf_id_t id = ctfc->ctfc_nextid++;
> +	  gcc_assert (id <= BTF_MAX_TYPE);
> +
> +	  bool union_p = (btf_dtd_kind (fx->pointee_dtd) == BTF_KIND_UNION);
> +
> +	  fwd->dtd_name = fx->pointee_dtd->dtd_name;
> +	  fwd->dtd_data.ctti_info = CTF_TYPE_INFO (CTF_K_FORWARD, union_p, 0);
> +	  fwd->dtd_type = id;
> +	  ctfc->ctfc_num_types++;
> +	  ctfc->ctfc_num_vlen_bytes += btf_calc_num_vbytes (fwd);
> +	  ctf_add_string (ctfc, fwd->dtd_name, &(fwd->dtd_data.ctti_name),
> +			  CTF_STRTAB);
> +
> +	  /* Update the pointer to point to the forward.  */
> +	  fx->pointer_dtd->ref_type = fwd;
> +	  vec_safe_push (forwards, fwd);
> +	}
> +    }
> +
> +  /* Construct the resulting pruned type list.  */
> +  ctfc->ctfc_types_list
> +    = ggc_vec_alloc<ctf_dtdef_ref> (btf_used_types->elements () + 1
> +				    + vec_safe_length (forwards));
> +
> +  btf_used_types->traverse<ctf_container_ref, btf_used_type_list_cb> (ctfc);
> +
> +  /* Insert the newly created forwards into the regular types list too.  */
> +  FOR_EACH_VEC_ELT (*forwards, i, dtd)
> +    ctfc->ctfc_types_list[dtd->dtd_type] = dtd;
> +
> +  max_translated_id = btf_used_types->elements () + vec_safe_length (forwards);
> +}
> +
>  /* Late entry point for BTF generation, called from dwarf2out_finish ().
>     Complete and emit BTF information.  */
>  
> @@ -1266,13 +1594,22 @@ btf_finish (void)
>  
>    datasecs.create (0);
>  
> -  tu_ctfc->ctfc_num_types = 0;
> -  tu_ctfc->ctfc_num_vlen_bytes = 0;
> -  tu_ctfc->ctfc_vars_list_count = 0;
> -
>    btf_add_vars (tu_ctfc);
> -  btf_collect_translated_types (tu_ctfc);
> +  if (debug_prune_btf)
> +    {
> +      /* Collect pruned set of BTF types and prepare for emission.
> +	 This includes only types directly used in file-scope variables and
> +	 function return/argument types.  */
> +      btf_collect_pruned_types (tu_ctfc);
> +    }
> +  else
> +    {
> +      /* Collect all BTF types and prepare for emission.
> +	 This includes all types translated from DWARF.  */
> +      btf_collect_translated_types (tu_ctfc);
> +    }
>    btf_add_func_datasec_entries (tu_ctfc);
> +
>    btf_assign_var_ids (tu_ctfc);
>    btf_assign_func_ids (tu_ctfc);
>    btf_assign_datasec_ids (tu_ctfc);
> @@ -1305,6 +1642,15 @@ btf_finalize (void)
>    func_map->empty ();
>    func_map = NULL;
>  
> +  if (debug_prune_btf)
> +    {
> +      btf_used_types->empty ();
> +      btf_used_types = NULL;
> +
> +      fixups.release ();
> +      forwards = NULL;
> +    }
> +
>    ctf_container_ref tu_ctfc = ctf_get_tu_ctfc ();
>    ctfc_delete_container (tu_ctfc);
>    tu_ctfc = NULL;
> diff --git a/gcc/common.opt b/gcc/common.opt
> index f2bc47fdc5e6..849f6456ff85 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -3534,6 +3534,10 @@ gbtf
>  Common Driver RejectNegative JoinedOrMissing
>  Generate BTF debug information at default level.
>  
> +gprune-btf
> +Common Driver Var(debug_prune_btf) Init(0)
> +Generate pruned BTF when emitting BTF info.
> +
>  gdwarf
>  Common Driver JoinedOrMissing RejectNegative
>  Generate debug information in default version of DWARF format.
> diff --git a/gcc/ctfc.cc b/gcc/ctfc.cc
> index 8da37f260458..8f531ffebf88 100644
> --- a/gcc/ctfc.cc
> +++ b/gcc/ctfc.cc
> @@ -909,7 +909,7 @@ ctfc_get_dvd_srcloc (ctf_dvdef_ref dvd, ctf_srcloc_ref loc)
>  /* Initialize the CTF string table.
>     The first entry in the CTF string table (empty string) is added.  */
>  
> -static void
> +void
>  init_ctf_strtable (ctf_strtable_t * strtab)
>  {
>    strtab->ctstab_head = NULL;
> diff --git a/gcc/ctfc.h b/gcc/ctfc.h
> index d0b724817a7f..29267dc036d1 100644
> --- a/gcc/ctfc.h
> +++ b/gcc/ctfc.h
> @@ -369,6 +369,9 @@ extern unsigned int ctfc_get_num_ctf_vars (ctf_container_ref);
>  
>  extern ctf_strtable_t * ctfc_get_strtab (ctf_container_ref, int);
>  
> +extern void init_ctf_strtable (ctf_strtable_t *);
> +extern void ctfc_delete_strtab (ctf_strtable_t *);
> +
>  /* Get the length of the specified string table in the CTF container.  */
>  
>  extern size_t ctfc_get_strtab_len (ctf_container_ref, int);
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index a0b375646468..8479fd5cf2b8 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -522,6 +522,7 @@ Objective-C and Objective-C++ Dialects}.
>  @xref{Debugging Options,,Options for Debugging Your Program}.
>  @gccoptlist{-g  -g@var{level}  -gdwarf  -gdwarf-@var{version}
>  -gbtf -gctf  -gctf@var{level}
> +-gprune-btf -gno-prune-btf
>  -ggdb  -grecord-gcc-switches  -gno-record-gcc-switches
>  -gstrict-dwarf  -gno-strict-dwarf
>  -gas-loc-support  -gno-as-loc-support
> @@ -12003,6 +12004,25 @@ eBPF target.  On other targets, like x86, BTF debug information can be
>  generated along with DWARF debug information when both of the debug formats are
>  enabled explicitly via their respective command line options.
>  
> +@opindex gprune-btf
> +@opindex gno-prune-btf
> +@item -gprune-btf
> +@itemx -gno-prune-btf
> +Prune BTF information before emission.  When pruning, only type
> +information for types used by global variables and file-scope functions
> +will be emitted.  If compiling for the BPF target with BPF CO-RE
> +enabled, type information will also be emitted for types used in BPF
> +CO-RE relocations.  In addition, struct and union types which are only
> +referred to via pointers from members of other struct or union types
> +shall be pruned and replaced with BTF_KIND_FWD, as though those types
> +were only present in the input as forward declarations.
> +
> +This option substantially reduces the size of produced BTF information,
> +but at significant loss in the amount of detailed type information.
> +It is primarily useful when compiling for the BPF target, to minimize
> +the size of the resulting object, and to eliminate BTF information
> +which is not immediately relevant to the BPF program loading process.
> +
>  @opindex gctf
>  @item -gctf
>  @itemx -gctf@var{level}
> diff --git a/gcc/testsuite/gcc.dg/debug/btf/btf-prune-1.c b/gcc/testsuite/gcc.dg/debug/btf/btf-prune-1.c
> new file mode 100644
> index 000000000000..7115d074bd58
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/debug/btf/btf-prune-1.c
> @@ -0,0 +1,25 @@
> +/* Simple test of -gprune-btf option operation.
> +   Since 'struct foo' is not used, no BTF shall be emitted for it.  */
> +
> +/* { dg-do compile } */
> +/* { dg-options "-gbtf -gprune-btf -dA" } */
> +
> +/* No BTF info for 'struct foo' nor types used only by it.  */
> +/* { dg-final { scan-assembler-not "BTF_KIND_STRUCT 'foo'" } } */
> +/* { dg-final { scan-assembler-not "BTF_KIND_INT 'char'" } } */
> +
> +/* We should get BTF info for 'struct bar' since it is used.  */
> +/* { dg-final { scan-assembler "BTF_KIND_STRUCT 'bar'"} } */
> +
> +struct foo {
> +  int a;
> +  char c;
> +};
> +
> +struct bar {
> +  int x;
> +  long z[4];
> +};
> +
> +struct bar a_bar;
> +
> diff --git a/gcc/testsuite/gcc.dg/debug/btf/btf-prune-2.c b/gcc/testsuite/gcc.dg/debug/btf/btf-prune-2.c
> new file mode 100644
> index 000000000000..e6b4a1946f1b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/debug/btf/btf-prune-2.c
> @@ -0,0 +1,33 @@
> +/* Test that -gprune-btf does not chase pointer-to-struct members.  */
> +
> +/* { dg-do compile } */
> +/* { dg-options "-gbtf -gprune-btf -dA" } */
> +
> +/* Only use of B is via a pointer member of C.
> +   Full BTF for B is replaced with a forward.  */
> +/* { dg-final { scan-assembler-not "BTF_KIND_STRUCT 'B'" } } */
> +/* { dg-final { scan-assembler-times "TYPE \[0-9\]+ BTF_KIND_FWD 'B'" 1 } } */
> +
> +/* Detailed info for B is omitted, and A is otherwise unused.  */
> +/* { dg-final { scan-assembler-not "BTF_KIND_\[A-Z\]+ 'A'" } } */
> +
> +/* { dg-final { scan-assembler "BTF_KIND_STRUCT 'C'" } } */
> +
> +struct A;
> +
> +struct B {
> +  int x;
> +  int (*do_A_thing) (int, int);
> +  struct A *other;
> +};
> +
> +struct C {
> +  unsigned int x;
> +  struct B * a;
> +};
> +
> +int
> +foo (struct C *c)
> +{
> +  return c->x;
> +}
> diff --git a/gcc/testsuite/gcc.dg/debug/btf/btf-prune-3.c b/gcc/testsuite/gcc.dg/debug/btf/btf-prune-3.c
> new file mode 100644
> index 000000000000..16f8110fc781
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/debug/btf/btf-prune-3.c
> @@ -0,0 +1,35 @@
> +/* Test that -gprune-btf does not prune through array members.  */
> +
> +/* { dg-do compile } */
> +/* { dg-options "-gbtf -gprune-btf -dA" } */
> +
> +/* We expect full BTF information for each struct.  */
> +/* { dg-final { scan-assembler "TYPE \[0-9\]+ BTF_KIND_FWD 'file'" } } */
> +/* { dg-final { scan-assembler "TYPE \[0-9\]+ BTF_KIND_STRUCT 'A'" } } */
> +/* { dg-final { scan-assembler "TYPE \[0-9\]+ BTF_KIND_STRUCT 'B'" } } */
> +/* { dg-final { scan-assembler "TYPE \[0-9\]+ BTF_KIND_STRUCT 'C'" } } */
> +
> +struct file;
> +
> +struct A {
> +  void *private;
> +  long (*read)(struct file *, char *, unsigned long);
> +  long (*write)(struct file *, const char *, unsigned long);
> +};
> +
> +struct B {
> +  unsigned int x;
> +  struct A **as;
> +};
> +
> +struct C {
> +  struct A *arr_a[4];
> +  struct A *lone_a;
> +  unsigned int z;
> +};
> +
> +unsigned int
> +foo (struct B *b, struct C *c)
> +{
> +  return b->x + c->z;
> +}
> diff --git a/gcc/testsuite/gcc.dg/debug/btf/btf-prune-maps.c b/gcc/testsuite/gcc.dg/debug/btf/btf-prune-maps.c
> new file mode 100644
> index 000000000000..f3d870ac59ec
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/debug/btf/btf-prune-maps.c
> @@ -0,0 +1,20 @@
> +/* Test special meaning of .maps section for BTF when pruning.  For global
> +   variables of struct type placed in this section, we must treat members as
> +   though they are used directly, always collecting pointee types.
> +   Therefore, full type information for struct keep_me should be emitted.  */
> +
> +/* { dg-do compile } */
> +/* { dg-options "-gbtf -gprune-btf -dA" } */
> +
> +/* { dg-final { scan-assembler-not "BTF_KIND_FWD 'keep_me'" } } */
> +/* { dg-final { scan-assembler "BTF_KIND_STRUCT 'keep_me'" } } */
> +
> +struct keep_me {
> +  int a;
> +  char c;
> +};
> +
> +struct {
> +  int *key;
> +  struct keep_me *value;
> +} my_map __attribute__((section (".maps")));
Indu Bhagat June 24, 2024, 6:32 p.m. UTC | #2
On 6/24/24 09:11, David Faust wrote:
> Ping.
> 
> Richard: I changed the option name as you asked but forgot to CC you on
> the updated patch.  Is the new option OK?
> 
> Indu: You had some minor comments on the prior version which I have
> addressed, not sure whether you meant the rest of the patch was OK or
> not, or if you had a chance to review it.
> 

Hi David,

Thanks for making the change in the commit message to clearly state the
behavior of the option -gprune-btf with and without LTO.  I did take a
look at the V3 version of the patch and tested it a bit too.

While some gaps still remain in my understanding of the algorithm,
overall I think this patch as such looks good and makes forward
progress.

So, LGTM.

Thanks
Indu

> Thanks!
> 
> archive:https://gcc.gnu.org/pipermail/gcc-patches/2024-June/654252.html
Patch

diff --git a/gcc/btfout.cc b/gcc/btfout.cc
index 89f148de9650..34d8cec0a2e3 100644
--- a/gcc/btfout.cc
+++ b/gcc/btfout.cc
@@ -828,7 +828,10 @@  output_btf_types (ctf_container_ref ctfc)
 {
   size_t i;
   size_t num_types;
-  num_types = ctfc->ctfc_types->elements ();
+  if (debug_prune_btf)
+    num_types = max_translated_id;
+  else
+    num_types = ctfc->ctfc_types->elements ();
 
   if (num_types)
     {
@@ -957,6 +960,212 @@  btf_add_func_records (ctf_container_ref ctfc)
     }
 }
 
+/* The set of types used directly in the source program, and any types manually
+   marked as used.  This is the set of types which will be emitted when
+   debug_prune_btf is set.  */
+static GTY (()) hash_set<ctf_dtdef_ref> *btf_used_types;
+
+/* Fixup used to avoid unnecessary pointer chasing for types.  A fixup is
+   created when a structure or union member is a pointer to another struct
+   or union type.  In such cases, avoid emitting full type information for
+   the pointee struct or union type (which may be quite large), unless that
+   type is used directly elsewhere.  */
+struct btf_fixup
+{
+  ctf_dtdef_ref pointer_dtd; /* Type node to which the fixup is applied.  */
+  ctf_dtdef_ref pointee_dtd; /* Original type node referred to by pointer_dtd.
+				If this concrete type is not otherwise used,
+				then a forward is created.  */
+};
+
+/* Stores fixups while processing types.  */
+static vec<struct btf_fixup> fixups;
+
+/* For fixups where the underlying type is not used in the end, a BTF_KIND_FWD
+   is created and emitted.  This vector stores them.  */
+static GTY (()) vec<ctf_dtdef_ref, va_gc> *forwards;
+
+/* Recursively add type DTD and any types it references to the used set.
+   Return a type that should be used for references to DTD - usually DTD itself,
+   but may be NULL if DTD corresponds to a type which will not be emitted.
+   CHECK_PTR is true if one of the predecessors in recursive calls is a struct
+   or union member.  SEEN_PTR is true if CHECK_PTR is true AND one of the
+   predecessors was a pointer type.  These two flags are used to avoid chasing
+   pointers to struct/union only used from pointer members.  For such types, we
+   will emit a forward instead of the full type information, unless
+   CREATE_FIXUPS is false.  */
+
+static ctf_dtdef_ref
+btf_add_used_type (ctf_container_ref ctfc, ctf_dtdef_ref dtd,
+		   bool check_ptr, bool seen_ptr, bool create_fixups)
+{
+  if (dtd == NULL)
+    return NULL;
+
+  uint32_t ctf_kind = CTF_V2_INFO_KIND (dtd->dtd_data.ctti_info);
+  uint32_t kind = get_btf_kind (ctf_kind);
+
+  /* Check whether the type has already been added.  */
+  if (btf_used_types->contains (dtd))
+    {
+      /* It's possible the type was already added as a fixup, but that we now
+	 have a concrete use of it.  */
+      switch (kind)
+	{
+	case BTF_KIND_PTR:
+	case BTF_KIND_TYPEDEF:
+	case BTF_KIND_CONST:
+	case BTF_KIND_VOLATILE:
+	case BTF_KIND_RESTRICT:
+	  if (check_ptr)
+	    /* Type was previously added as a fixup, and that's OK.  */
+	    return dtd;
+	  else
+	    {
+	      /* The type was previously added as a fixup, but now we have
+		 a concrete use of it.  Remove the fixup.  */
+	      for (size_t i = 0; i < fixups.length (); i++)
+		if (fixups[i].pointer_dtd == dtd)
+		  fixups.unordered_remove (i);
+
+	      /* Add the concrete base type.  */
+	      dtd->ref_type = btf_add_used_type (ctfc, dtd->ref_type, check_ptr,
+						 seen_ptr, create_fixups);
+	      return dtd;
+	    }
+	default:
+	  return dtd;
+	}
+    }
+
+  if (ctf_kind == CTF_K_SLICE)
+    {
+      /* Bitfield.  Add the underlying type to the used set, but leave
+	 the reference to the bitfield.  The slice type won't be emitted,
+	 but we need the information in it when writing out the bitfield
+	 encoding.  */
+      btf_add_used_type (ctfc, dtd->dtd_u.dtu_slice.cts_type,
+			 check_ptr, seen_ptr, create_fixups);
+      return dtd;
+    }
+
+  /* Skip redundant definitions of void and types with no BTF encoding.  */
+  if ((kind == BTF_KIND_INT && dtd->dtd_data.ctti_size == 0)
+      || (kind == BTF_KIND_UNKN))
+    return NULL;
+
+  /* Add the type itself, and assign its id.
+     Do this before recursing to handle things like linked list structures.  */
+  gcc_assert (ctfc->ctfc_nextid <= BTF_MAX_TYPE);
+  dtd->dtd_type = ctfc->ctfc_nextid++;
+  btf_used_types->add (dtd);
+  ctf_add_string (ctfc, dtd->dtd_name, &(dtd->dtd_data.ctti_name), CTF_STRTAB);
+  ctfc->ctfc_num_types++;
+  ctfc->ctfc_num_vlen_bytes += btf_calc_num_vbytes (dtd);
+
+  /* Recursively add types referenced by this type.  */
+  switch (kind)
+    {
+    case BTF_KIND_INT:
+    case BTF_KIND_FLOAT:
+    case BTF_KIND_FWD:
+      /* Leaf kinds which do not refer to any other types.  */
+      break;
+
+    case BTF_KIND_FUNC:
+    case BTF_KIND_VAR:
+      /* Root kinds; no type we are visiting may refer to these.  */
+      gcc_unreachable ();
+
+    case BTF_KIND_PTR:
+    case BTF_KIND_TYPEDEF:
+    case BTF_KIND_CONST:
+    case BTF_KIND_VOLATILE:
+    case BTF_KIND_RESTRICT:
+      {
+	/* These type kinds refer to exactly one other type.  */
+	if (check_ptr && !seen_ptr)
+	  seen_ptr = (kind == BTF_KIND_PTR);
+
+	/* Try to avoid chasing pointers to struct/union types if the
+	   underlying type isn't used.  */
+	if (check_ptr && seen_ptr && create_fixups)
+	  {
+	    ctf_dtdef_ref ref = dtd->ref_type;
+	    uint32_t ref_kind = btf_dtd_kind (ref);
+
+	    if ((ref_kind == BTF_KIND_STRUCT || ref_kind == BTF_KIND_UNION)
+		&& !btf_used_types->contains (ref))
+	      {
+		struct btf_fixup fixup;
+		fixup.pointer_dtd = dtd;
+		fixup.pointee_dtd = ref;
+		fixups.safe_push (fixup);
+		break;
+	      }
+	  }
+
+	/* Add the type to which this type refers.  */
+	dtd->ref_type = btf_add_used_type (ctfc, dtd->ref_type, check_ptr,
+					   seen_ptr, create_fixups);
+	break;
+      }
+    case BTF_KIND_ARRAY:
+      {
+	/* Add element and index types.  */
+	ctf_arinfo_t *arr = &(dtd->dtd_u.dtu_arr);
+	arr->ctr_contents = btf_add_used_type (ctfc, arr->ctr_contents, false,
+					       false, create_fixups);
+	arr->ctr_index = btf_add_used_type (ctfc, arr->ctr_index, false, false,
+					    create_fixups);
+	break;
+      }
+    case BTF_KIND_STRUCT:
+    case BTF_KIND_UNION:
+    case BTF_KIND_ENUM:
+    case BTF_KIND_ENUM64:
+      {
+	/* Add members.  */
+	ctf_dmdef_t *dmd;
+	for (dmd = dtd->dtd_u.dtu_members;
+	     dmd != NULL; dmd = (ctf_dmdef_t *) ctf_dmd_list_next (dmd))
+	  {
+	    /* Add member type for struct/union members.  For enums, only the
+	       enumerator names are needed.  */
+	    if (kind == BTF_KIND_STRUCT || kind == BTF_KIND_UNION)
+	      dmd->dmd_type = btf_add_used_type (ctfc, dmd->dmd_type, true,
+						 false, create_fixups);
+	    ctf_add_string (ctfc, dmd->dmd_name, &(dmd->dmd_name_offset),
+			    CTF_STRTAB);
+	  }
+	break;
+      }
+    case BTF_KIND_FUNC_PROTO:
+      {
+	/* Add return type.  */
+	dtd->ref_type = btf_add_used_type (ctfc, dtd->ref_type, false, false,
+					   create_fixups);
+
+	/* Add arg types.  */
+	ctf_func_arg_t * farg;
+	for (farg = dtd->dtd_u.dtu_argv;
+	     farg != NULL; farg = (ctf_func_arg_t *) ctf_farg_list_next (farg))
+	  {
+	    farg->farg_type = btf_add_used_type (ctfc, farg->farg_type, false,
+						 false, create_fixups);
+	    /* Note: argument names are stored in the auxiliary string table,
+	       since CTF does not include arg names.  That table has not been
+	       cleared, so no need to re-add argument names here.  */
+	  }
+	break;
+      }
+    default:
+      return NULL;
+    }
+
+  return dtd;
+}
+
 /* Initial entry point of BTF generation, called at early_finish () after
    CTF information has possibly been output.  Translate all CTF information
    to BTF, and do any processing that must be done early, such as creating
@@ -975,6 +1184,26 @@  btf_early_finish (void)
      be emitted before starting the translation to BTF.  */
   btf_add_const_void (tu_ctfc);
   btf_add_func_records (tu_ctfc);
+
+  /* These fields are reset to count BTF types etc.  */
+  tu_ctfc->ctfc_num_types = 0;
+  tu_ctfc->ctfc_num_vlen_bytes = 0;
+  tu_ctfc->ctfc_vars_list_count = 0;
+
+  if (debug_prune_btf)
+    {
+      btf_used_types
+	= hash_set<ctf_dtdef_ref>::create_ggc (tu_ctfc->ctfc_types->elements ());
+      tu_ctfc->ctfc_nextid = 1;
+      fixups.create (1);
+
+      /* Empty the string table, which was already populated with strings for
+	 all types translated from DWARF.  We may only need a very small subset
+	 of these strings; those will be re-added below.  */
+      ctfc_delete_strtab (&tu_ctfc->ctfc_strtable);
+      init_ctf_strtable (&tu_ctfc->ctfc_strtable);
+      tu_ctfc->ctfc_strlen++;
+    }
 }
 
 /* Push a BTF datasec entry ENTRY into the datasec named SECNAME,
@@ -1157,6 +1386,25 @@  btf_add_vars (ctf_container_ref ctfc)
 
       /* Add a BTF_KIND_DATASEC entry for the variable.  */
       btf_datasec_add_var (ctfc, var, dvd);
+
+      const char *section = var->get_section ();
+      if (section && (strcmp (section, ".maps") == 0) && debug_prune_btf)
+	{
+	  /* The .maps section has special meaning in BTF: it is used for BPF
+	     map definitions.  These definitions should be structs.  We must
+	     collect pointee types used in map members as though they are used
+	     directly, effectively ignoring (from the pruning perspective) that
+	     they are struct members.  */
+	  ctf_dtdef_ref dtd = dvd->dvd_type;
+	  uint32_t kind = btf_dtd_kind (dvd->dvd_type);
+	  if (kind == BTF_KIND_STRUCT)
+	    {
+	      ctf_dmdef_t *dmd;
+	      for (dmd = dtd->dtd_u.dtu_members;
+		   dmd != NULL; dmd = (ctf_dmdef_t *) ctf_dmd_list_next (dmd))
+		btf_add_used_type (ctfc, dmd->dmd_type, false, false, true);
+	    }
+	}
     }
 }
 
@@ -1255,6 +1503,86 @@  btf_assign_datasec_ids (ctf_container_ref ctfc)
     }
 }
 
+/* Callback used for assembling the only-used-types list.  Note that this is
+   the same as btf_type_list_cb above, but the hash_set traverse requires a
+   different function signature.  */
+
+static bool
+btf_used_type_list_cb (const ctf_dtdef_ref& dtd, ctf_container_ref ctfc)
+{
+  ctfc->ctfc_types_list[dtd->dtd_type] = dtd;
+  return true;
+}
+
+/* Collect the set of types reachable from global variables and functions.
+   This is the minimal set of types, used when generating pruned BTF.  */
+
+static void
+btf_collect_pruned_types (ctf_container_ref ctfc)
+{
+  vec_alloc (forwards, 1);
+
+  /* Add types used from functions.  */
+  ctf_dtdef_ref dtd;
+  size_t i;
+  FOR_EACH_VEC_ELT (*funcs, i, dtd)
+    {
+      btf_add_used_type (ctfc, dtd->ref_type, false, false, true);
+      ctf_add_string (ctfc, dtd->dtd_name, &(dtd->dtd_data.ctti_name),
+		      CTF_STRTAB);
+    }
+
+  /* Add types used from global variables.  */
+  for (i = 0; i < ctfc->ctfc_vars_list_count; i++)
+    {
+      ctf_dvdef_ref dvd = ctfc->ctfc_vars_list[i];
+      btf_add_used_type (ctfc, dvd->dvd_type, false, false, true);
+      ctf_add_string (ctfc, dvd->dvd_name, &(dvd->dvd_name_offset), CTF_STRTAB);
+    }
+
+  /* Process fixups.  If the base type was never added, create a forward for it
+     and adjust the reference to point to that.  If it was added, then nothing
+     needs to change.  */
+  for (i = 0; i < fixups.length (); i++)
+    {
+      struct btf_fixup *fx = &fixups[i];
+      if (!btf_used_types->contains (fx->pointee_dtd))
+	{
+	  /* The underlying type is not used.  Create a forward.  */
+	  ctf_dtdef_ref fwd = ggc_cleared_alloc<ctf_dtdef_t> ();
+	  ctf_id_t id = ctfc->ctfc_nextid++;
+	  gcc_assert (id <= BTF_MAX_TYPE);
+
+	  bool union_p = (btf_dtd_kind (fx->pointee_dtd) == BTF_KIND_UNION);
+
+	  fwd->dtd_name = fx->pointee_dtd->dtd_name;
+	  fwd->dtd_data.ctti_info = CTF_TYPE_INFO (CTF_K_FORWARD, union_p, 0);
+	  fwd->dtd_type = id;
+	  ctfc->ctfc_num_types++;
+	  ctfc->ctfc_num_vlen_bytes += btf_calc_num_vbytes (fwd);
+	  ctf_add_string (ctfc, fwd->dtd_name, &(fwd->dtd_data.ctti_name),
+			  CTF_STRTAB);
+
+	  /* Update the pointer to point to the forward.  */
+	  fx->pointer_dtd->ref_type = fwd;
+	  vec_safe_push (forwards, fwd);
+	}
+    }
+
+  /* Construct the resulting pruned type list.  */
+  ctfc->ctfc_types_list
+    = ggc_vec_alloc<ctf_dtdef_ref> (btf_used_types->elements () + 1
+				    + vec_safe_length (forwards));
+
+  btf_used_types->traverse<ctf_container_ref, btf_used_type_list_cb> (ctfc);
+
+  /* Insert the newly created forwards into the regular types list too.  */
+  FOR_EACH_VEC_ELT (*forwards, i, dtd)
+    ctfc->ctfc_types_list[dtd->dtd_type] = dtd;
+
+  max_translated_id = btf_used_types->elements () + vec_safe_length (forwards);
+}
+
 /* Late entry point for BTF generation, called from dwarf2out_finish ().
    Complete and emit BTF information.  */
 
@@ -1266,13 +1594,22 @@  btf_finish (void)
 
   datasecs.create (0);
 
-  tu_ctfc->ctfc_num_types = 0;
-  tu_ctfc->ctfc_num_vlen_bytes = 0;
-  tu_ctfc->ctfc_vars_list_count = 0;
-
   btf_add_vars (tu_ctfc);
-  btf_collect_translated_types (tu_ctfc);
+  if (debug_prune_btf)
+    {
+      /* Collect pruned set of BTF types and prepare for emission.
+	 This includes only types directly used in file-scope variables and
+	 function return/argument types.  */
+      btf_collect_pruned_types (tu_ctfc);
+    }
+  else
+    {
+      /* Collect all BTF types and prepare for emission.
+	 This includes all types translated from DWARF.  */
+      btf_collect_translated_types (tu_ctfc);
+    }
   btf_add_func_datasec_entries (tu_ctfc);
+
   btf_assign_var_ids (tu_ctfc);
   btf_assign_func_ids (tu_ctfc);
   btf_assign_datasec_ids (tu_ctfc);
@@ -1305,6 +1642,15 @@  btf_finalize (void)
   func_map->empty ();
   func_map = NULL;
 
+  if (debug_prune_btf)
+    {
+      btf_used_types->empty ();
+      btf_used_types = NULL;
+
+      fixups.release ();
+      forwards = NULL;
+    }
+
   ctf_container_ref tu_ctfc = ctf_get_tu_ctfc ();
   ctfc_delete_container (tu_ctfc);
   tu_ctfc = NULL;
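
The marking walk in btf_add_used_type and the fixup pass in
btf_collect_pruned_types can be modeled in a few dozen lines of plain C.
The sketch below is only a toy under stated assumptions: every name in it
(toy_type, toy_add_used, toy_resolve_fixups and so on) is hypothetical, it
handles just three kinds, and it sets a flag where the patch allocates a new
BTF_KIND_FWD node and redirects the pointer to it.  The type graph is modeled
loosely on btf-prune-2.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified type model; GCC's ctf_dtdef_t is far richer.  */
enum toy_kind { TOY_INT, TOY_PTR, TOY_STRUCT };

#define TOY_MAX_MEMBERS 4
#define TOY_MAX_FIXUPS 16

struct toy_type
{
  enum toy_kind kind;
  const char *name;
  struct toy_type *ref;                       /* Pointee, for TOY_PTR.  */
  struct toy_type *members[TOY_MAX_MEMBERS];  /* For TOY_STRUCT.  */
  size_t n_members;
  bool used;     /* Full type information would be emitted.  */
  bool forward;  /* Only a forward declaration would be emitted.  */
};

struct toy_fixup
{
  struct toy_type *pointer;  /* Pointer type whose pointee was deferred.  */
  struct toy_type *pointee;  /* Struct that may end up as a forward.  */
};

static struct toy_fixup toy_fixups[TOY_MAX_FIXUPS];
static size_t toy_n_fixups;

/* Mark T and the types it references as used.  CHECK_PTR is true when the
   walk arrived here through a struct member.  */
void
toy_add_used (struct toy_type *t, bool check_ptr)
{
  if (t == NULL)
    return;

  if (t->used)
    {
      /* A pointer first seen through a member may now have a direct use,
         in which case its pointee must become concrete.  */
      if (t->kind == TOY_PTR && !check_ptr)
        toy_add_used (t->ref, false);
      return;
    }

  /* Mark before recursing so self-referential structs terminate.  */
  t->used = true;

  switch (t->kind)
    {
    case TOY_INT:
      break;
    case TOY_PTR:
      if (check_ptr && t->ref != NULL
          && t->ref->kind == TOY_STRUCT && !t->ref->used)
        {
          /* Do not chase the pointer; decide later.  */
          assert (toy_n_fixups < TOY_MAX_FIXUPS);
          toy_fixups[toy_n_fixups].pointer = t;
          toy_fixups[toy_n_fixups].pointee = t->ref;
          toy_n_fixups++;
          break;
        }
      toy_add_used (t->ref, check_ptr);
      break;
    case TOY_STRUCT:
      for (size_t i = 0; i < t->n_members; i++)
        toy_add_used (t->members[i], true);
      break;
    }
}

/* Any deferred pointee that never gained a direct use becomes a forward.  */
void
toy_resolve_fixups (void)
{
  for (size_t i = 0; i < toy_n_fixups; i++)
    if (!toy_fixups[i].pointee->used)
      toy_fixups[i].pointee->forward = true;
}

/* struct B { int x; };  struct C { int x; struct B *a; };  */
struct toy_type toy_int = { .kind = TOY_INT, .name = "int" };
struct toy_type toy_B
  = { .kind = TOY_STRUCT, .name = "B", .members = { &toy_int },
      .n_members = 1 };
struct toy_type toy_ptr_B = { .kind = TOY_PTR, .name = "B *", .ref = &toy_B };
struct toy_type toy_C
  = { .kind = TOY_STRUCT, .name = "C", .members = { &toy_int, &toy_ptr_B },
      .n_members = 2 };
struct toy_type toy_ptr_C = { .kind = TOY_PTR, .name = "C *", .ref = &toy_C };
```

Calling toy_add_used (&toy_ptr_C, false) followed by toy_resolve_fixups ()
models a function taking a struct C *: C is marked used, while B is reached
only through a pointer member and is left as a forward.  Calling
toy_add_used (&toy_ptr_B, false) before resolving would instead keep B in
full, which mirrors the "concrete use" branch of the real function.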
diff --git a/gcc/common.opt b/gcc/common.opt
index f2bc47fdc5e6..849f6456ff85 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -3534,6 +3534,10 @@  gbtf
 Common Driver RejectNegative JoinedOrMissing
 Generate BTF debug information at default level.
 
+gprune-btf
+Common Driver Var(debug_prune_btf) Init(0)
+Generate pruned BTF when emitting BTF info.
+
 gdwarf
 Common Driver JoinedOrMissing RejectNegative
 Generate debug information in default version of DWARF format.
diff --git a/gcc/ctfc.cc b/gcc/ctfc.cc
index 8da37f260458..8f531ffebf88 100644
--- a/gcc/ctfc.cc
+++ b/gcc/ctfc.cc
@@ -909,7 +909,7 @@  ctfc_get_dvd_srcloc (ctf_dvdef_ref dvd, ctf_srcloc_ref loc)
 /* Initialize the CTF string table.
    The first entry in the CTF string table (empty string) is added.  */
 
-static void
+void
 init_ctf_strtable (ctf_strtable_t * strtab)
 {
   strtab->ctstab_head = NULL;
diff --git a/gcc/ctfc.h b/gcc/ctfc.h
index d0b724817a7f..29267dc036d1 100644
--- a/gcc/ctfc.h
+++ b/gcc/ctfc.h
@@ -369,6 +369,9 @@  extern unsigned int ctfc_get_num_ctf_vars (ctf_container_ref);
 
 extern ctf_strtable_t * ctfc_get_strtab (ctf_container_ref, int);
 
+extern void init_ctf_strtable (ctf_strtable_t *);
+extern void ctfc_delete_strtab (ctf_strtable_t *);
+
 /* Get the length of the specified string table in the CTF container.  */
 
 extern size_t ctfc_get_strtab_len (ctf_container_ref, int);
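
These two functions are exported so that btf_early_finish can empty the string
table and re-add only the strings that survive pruning.  The sketch below
illustrates that reset-and-re-add idea with a toy string table in which the
empty string always sits at offset 0, as the comment on init_ctf_strtable
describes.  The toy_strtab type and its functions are hypothetical and do not
mirror GCC's ctf_strtable_t; in particular the toy does no deduplication.

```c
#include <assert.h>
#include <string.h>

#define TOY_STRTAB_CAP 256

struct toy_strtab
{
  char data[TOY_STRTAB_CAP];
  size_t len;  /* Bytes in use, including NUL terminators.  */
};

/* Reset TAB so that it holds only the empty string at offset 0.  */
void
toy_strtab_init (struct toy_strtab *tab)
{
  tab->data[0] = '\0';
  tab->len = 1;
}

/* Append STR and return its offset.  NULL and "" map to offset 0.  */
size_t
toy_strtab_add (struct toy_strtab *tab, const char *str)
{
  if (str == NULL || *str == '\0')
    return 0;

  size_t n = strlen (str) + 1;
  assert (tab->len + n <= TOY_STRTAB_CAP);

  size_t off = tab->len;
  memcpy (tab->data + off, str, n);
  tab->len += n;
  return off;
}
```

Populating the table with every type name, calling toy_strtab_init again, and
then adding only the names of used types leaves a table that contains just
those names, with offsets counted from the fresh empty-string entry.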
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index a0b375646468..8479fd5cf2b8 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -522,6 +522,7 @@  Objective-C and Objective-C++ Dialects}.
 @xref{Debugging Options,,Options for Debugging Your Program}.
 @gccoptlist{-g  -g@var{level}  -gdwarf  -gdwarf-@var{version}
 -gbtf -gctf  -gctf@var{level}
+-gprune-btf -gno-prune-btf
 -ggdb  -grecord-gcc-switches  -gno-record-gcc-switches
 -gstrict-dwarf  -gno-strict-dwarf
 -gas-loc-support  -gno-as-loc-support
@@ -12003,6 +12004,25 @@  eBPF target.  On other targets, like x86, BTF debug information can be
 generated along with DWARF debug information when both of the debug formats are
 enabled explicitly via their respective command line options.
 
+@opindex gprune-btf
+@opindex gno-prune-btf
+@item -gprune-btf
+@itemx -gno-prune-btf
+Prune BTF information before emission.  When pruning, only type
+information for types used by global variables and file-scope functions
+will be emitted.  If compiling for the BPF target with BPF CO-RE
+enabled, type information will also be emitted for types used in BPF
+CO-RE relocations.  In addition, struct and union types which are only
+referred to via pointers from members of other struct or union types
+shall be pruned and replaced with BTF_KIND_FWD, as though those types
+were only present in the input as forward declarations.
+
+This option substantially reduces the size of produced BTF information,
+but at significant loss in the amount of detailed type information.
+It is primarily useful when compiling for the BPF target, to minimize
+the size of the resulting object, and to eliminate BTF information
+which is not immediately relevant to the BPF program loading process.
+
 @opindex gctf
 @item -gctf
 @itemx -gctf@var{level}
diff --git a/gcc/testsuite/gcc.dg/debug/btf/btf-prune-1.c b/gcc/testsuite/gcc.dg/debug/btf/btf-prune-1.c
new file mode 100644
index 000000000000..7115d074bd58
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/debug/btf/btf-prune-1.c
@@ -0,0 +1,25 @@ 
+/* Simple test of -gprune-btf option operation.
+   Since 'struct foo' is not used, no BTF shall be emitted for it.  */
+
+/* { dg-do compile } */
+/* { dg-options "-gbtf -gprune-btf -dA" } */
+
+/* No BTF info for 'struct foo' nor types used only by it.  */
+/* { dg-final { scan-assembler-not "BTF_KIND_STRUCT 'foo'" } } */
+/* { dg-final { scan-assembler-not "BTF_KIND_INT 'char'" } } */
+
+/* We should get BTF info for 'struct bar' since it is used.  */
+/* { dg-final { scan-assembler "BTF_KIND_STRUCT 'bar'"} } */
+
+struct foo {
+  int a;
+  char c;
+};
+
+struct bar {
+  int x;
+  long z[4];
+};
+
+struct bar a_bar;
+
diff --git a/gcc/testsuite/gcc.dg/debug/btf/btf-prune-2.c b/gcc/testsuite/gcc.dg/debug/btf/btf-prune-2.c
new file mode 100644
index 000000000000..e6b4a1946f1b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/debug/btf/btf-prune-2.c
@@ -0,0 +1,33 @@ 
+/* Test that -gprune-btf does not chase pointer-to-struct members.  */
+
+/* { dg-do compile } */
+/* { dg-options "-gbtf -gprune-btf -dA" } */
+
+/* Only use of B is via a pointer member of C.
+   Full BTF for B is replaced with a forward.  */
+/* { dg-final { scan-assembler-not "BTF_KIND_STRUCT 'B'" } } */
+/* { dg-final { scan-assembler-times "TYPE \[0-9\]+ BTF_KIND_FWD 'B'" 1 } } */
+
+/* Detailed info for B is omitted, and A is otherwise unused.  */
+/* { dg-final { scan-assembler-not "BTF_KIND_\[A-Z\]+ 'A'" } } */
+
+/* { dg-final { scan-assembler "BTF_KIND_STRUCT 'C'" } } */
+
+struct A;
+
+struct B {
+  int x;
+  int (*do_A_thing) (int, int);
+  struct A *other;
+};
+
+struct C {
+  unsigned int x;
+  struct B * a;
+};
+
+int
+foo (struct C *c)
+{
+  return c->x;
+}
diff --git a/gcc/testsuite/gcc.dg/debug/btf/btf-prune-3.c b/gcc/testsuite/gcc.dg/debug/btf/btf-prune-3.c
new file mode 100644
index 000000000000..16f8110fc781
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/debug/btf/btf-prune-3.c
@@ -0,0 +1,35 @@ 
+/* Test that -gprune-btf does not prune through array members.  */
+
+/* { dg-do compile } */
+/* { dg-options "-gbtf -gprune-btf -dA" } */
+
+/* We expect full BTF information for each struct.  */
+/* { dg-final { scan-assembler "TYPE \[0-9\]+ BTF_KIND_FWD 'file'" } } */
+/* { dg-final { scan-assembler "TYPE \[0-9\]+ BTF_KIND_STRUCT 'A'" } } */
+/* { dg-final { scan-assembler "TYPE \[0-9\]+ BTF_KIND_STRUCT 'B'" } } */
+/* { dg-final { scan-assembler "TYPE \[0-9\]+ BTF_KIND_STRUCT 'C'" } } */
+
+struct file;
+
+struct A {
+  void *private;
+  long (*read)(struct file *, char *, unsigned long);
+  long (*write)(struct file *, const char *, unsigned long);
+};
+
+struct B {
+  unsigned int x;
+  struct A **as;
+};
+
+struct C {
+  struct A *arr_a[4];
+  struct A *lone_a;
+  unsigned int z;
+};
+
+unsigned int
+foo (struct B *b, struct C *c)
+{
+  return b->x + c->z;
+}
diff --git a/gcc/testsuite/gcc.dg/debug/btf/btf-prune-maps.c b/gcc/testsuite/gcc.dg/debug/btf/btf-prune-maps.c
new file mode 100644
index 000000000000..f3d870ac59ec
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/debug/btf/btf-prune-maps.c
@@ -0,0 +1,20 @@ 
+/* Test special meaning of .maps section for BTF when pruning.  For global
+   variables of struct type placed in this section, we must treat members as
+   though they are used directly, always collecting pointee types.
+   Therefore, full type information for struct keep_me should be emitted.  */
+
+/* { dg-do compile } */
+/* { dg-options "-gbtf -gprune-btf -dA" } */
+
+/* { dg-final { scan-assembler-not "BTF_KIND_FWD 'keep_me'" } } */
+/* { dg-final { scan-assembler "BTF_KIND_STRUCT 'keep_me'" } } */
+
+struct keep_me {
+  int a;
+  char c;
+};
+
+struct {
+  int *key;
+  struct keep_me *value;
+} my_map __attribute__((section (".maps")));
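
For background on why the .maps case matters: BPF map definitions are commonly
written with helper macros that turn each attribute into a pointer member whose
only purpose is its type.  The sketch below uses MAP_UINT and MAP_TYPE as
hypothetical stand-ins modeled on the __uint and __type helpers in libbpf's
bpf_helpers.h; the struct and function names are likewise invented for
illustration.  Since the key and value types are reachable only through
pointer members, the usual fixup rule would reduce struct keep_me to a
forward, which is what the special case in btf_add_vars avoids.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for libbpf's __uint and __type helpers: each map
   attribute becomes a pointer member that exists only for its type.  */
#define MAP_UINT(name, val) int (*name)[val]
#define MAP_TYPE(name, val) __typeof__ (val) *name

/* Apply the section name only where the object format accepts it as-is.  */
#ifdef __ELF__
# define MAPS_SECTION __attribute__ ((section (".maps")))
#else
# define MAPS_SECTION
#endif

struct keep_me
{
  int a;
  char c;
};

struct example_map
{
  MAP_UINT (max_entries, 16);        /* int (*max_entries)[16]  */
  MAP_TYPE (key, int);               /* int *key  */
  MAP_TYPE (value, struct keep_me);  /* struct keep_me *value  */
};

struct example_map my_map MAPS_SECTION;

/* Recover the encoded entry count from the array type behind the pointer.
   sizeof does not evaluate its operand, so the null pointer is never read.  */
size_t
example_max_entries (void)
{
  size_t n = sizeof (*my_map.max_entries) / sizeof (int);
  assert (n > 0);
  return n;
}

/* Size of the value type, known only through the pointee of 'value'.  */
size_t
example_value_size (void)
{
  return sizeof (*my_map.value);
}
```

Everything a consumer could learn about this map, such as the entry count or
the layout of the value, sits behind a pointer member, so a forward
declaration in place of struct keep_me would lose it.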