diff mbox series

[RFA] vec: Fix --enable-gather-detailed-mem-stats

Message ID 30daf345-6fd4-5fb4-d0a6-754b48d9b441@redhat.com
State New
Headers show
Series [RFA] vec: Fix --enable-gather-detailed-mem-stats | expand

Commit Message

Jason Merrill Oct. 4, 2021, 6:27 p.m. UTC
When r12-4038 introduced the global auto_vec save_opt_decoded_options, 
it broke compilers configured with --enable-gather-detailed-mem-stats, 
due to the memory descriptors getting discarded before the auto_vec was 
destroyed.  Attached below are two approaches to making this work, 
either by using the init_priority attribute, or turning vec_mem_desc 
into a singleton function.  I prefer the first one, primarily because it 
doesn't require auto_vec variables to force immediate allocation.  It 
relies on a G++ extension, but I figure that's OK for code that is only 
exercised with a debugging configure flag.

Thoughts?  Either one OK for trunk?

Comments

Iain Sandoe Oct. 4, 2021, 6:37 p.m. UTC | #1
Hi Jason,

> On 4 Oct 2021, at 19:27, Jason Merrill via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> 
> When r12-4038 introduced the global auto_vec save_opt_decoded_options, it broke compilers configured with --enable-gather-detailed-mem-stats, due to the memory descriptors getting discarded before the auto_vec was destroyed.  Attached below are two approaches to making this work, either by using the init_priority attribute, or turning vec_mem_desc into a singleton function.  I prefer the first one, primarily because it doesn't require auto_vec variables to force immediate allocation.  It relies on a G++ extension, but I figure that's OK for code that is only exercised with a debugging configure flag.

I suspect the problem is not necessarily that it’s a G++ extension, (init priority has some support elsewhere) - but that some targets (with non-binutils ld) cannot support it [between multiple TUs at least] even with a native G++ (Darwin at least cannot).  OTOH, there are worse broken things from this than a gathering of stats…
Iain

> Thoughts?  Either one OK for trunk?<0001-vec-Fix-enable-gather-detailed-mem-stats.patch><0002-vec-make-vec_mem_desc-a-singleton-function.patch>
Jason Merrill Oct. 4, 2021, 9:27 p.m. UTC | #2
On 10/4/21 14:37, Iain Sandoe wrote:
> Hi Jason,
> 
>> On 4 Oct 2021, at 19:27, Jason Merrill via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>>
>> When r12-4038 introduced the global auto_vec save_opt_decoded_options, it broke compilers configured with --enable-gather-detailed-mem-stats, due to the memory descriptors getting discarded before the auto_vec was destroyed.  Attached below are two approaches to making this work, either by using the init_priority attribute, or turning vec_mem_desc into a singleton function.  I prefer the first one, primarily because it doesn't require auto_vec variables to force immediate allocation.  It relies on a G++ extension, but I figure that's OK for code that is only exercised with a debugging configure flag.
> 
> I suspect the problem is not necessarily that it’s a G++ extension, (init priority has some support elsewhere) - but that some targets (with non-binutils ld) cannot support it [between multiple TUs at least] even with a native G++ (Darwin at least cannot).  OTOH, there are worse broken things from this than a gathering of stats…

Hmm, that was previously handled for other linkers with the collect2 
wrapper.  I haven't followed what has happened with collect2 in recent 
years, does Darwin not use it?

Jason
Iain Sandoe Oct. 4, 2021, 10:22 p.m. UTC | #3
> On 4 Oct 2021, at 22:27, Jason Merrill via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> 
> On 10/4/21 14:37, Iain Sandoe wrote:
>> Hi Jason,
>>> On 4 Oct 2021, at 19:27, Jason Merrill via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>>> 
>>> When r12-4038 introduced the global auto_vec save_opt_decoded_options, it broke compilers configured with --enable-gather-detailed-mem-stats, due to the memory descriptors getting discarded before the auto_vec was destroyed.  Attached below are two approaches to making this work, either by using the init_priority attribute, or turning vec_mem_desc into a singleton function.  I prefer the first one, primarily because it doesn't require auto_vec variables to force immediate allocation.  It relies on a G++ extension, but I figure that's OK for code that is only exercised with a debugging configure flag.
>> I suspect the problem is not necessarily that it’s a G++ extension, (init priority has some support elsewhere) - but that some targets (with non-binutils ld) cannot support it [between multiple TUs at least] even with a native G++ (Darwin at least cannot).  OTOH, there are worse broken things from this than a gathering of stats…
> 
> Hmm, that was previously handled for other linkers with the collect2 wrapper.  I haven't followed what has happened with collect2 in recent years, does Darwin not use it?

It does use collect2, but init_priority is, nevertheless disabled for the target; I will investigate some more.
thanks for the head’s up,
Iain
Richard Biener Oct. 5, 2021, 11:27 a.m. UTC | #4
On Mon, Oct 4, 2021 at 8:28 PM Jason Merrill via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> When r12-4038 introduced the global auto_vec save_opt_decoded_options,
> it broke compilers configured with --enable-gather-detailed-mem-stats,
> due to the memory descriptors getting discarded before the auto_vec was
> destroyed.  Attached below are two approaches to making this work,
> either by using the init_priority attribute, or turning vec_mem_desc
> into a singleton function.  I prefer the first one, primarily because it
> doesn't require auto_vec variables to force immediate allocation.  It
> relies on a G++ extension, but I figure that's OK for code that is only
> exercised with a debugging configure flag.
>
> Thoughts?  Either one OK for trunk?

Hmm, isn't the way to fix this to turn the global auto_vec into
vec<> *x and allocate it at runtime (thus explicitly mange its
lifetime?).  We don't want global CTORs/DTORs in general
because of startup cost and of course those pesky ordering issues...

Richard.
Richard Biener Oct. 5, 2021, 11:29 a.m. UTC | #5
On Tue, Oct 5, 2021 at 1:27 PM Richard Biener
<richard.guenther@gmail.com> wrote:
>
> On Mon, Oct 4, 2021 at 8:28 PM Jason Merrill via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
> >
> > When r12-4038 introduced the global auto_vec save_opt_decoded_options,
> > it broke compilers configured with --enable-gather-detailed-mem-stats,
> > due to the memory descriptors getting discarded before the auto_vec was
> > destroyed.  Attached below are two approaches to making this work,
> > either by using the init_priority attribute, or turning vec_mem_desc
> > into a singleton function.  I prefer the first one, primarily because it
> > doesn't require auto_vec variables to force immediate allocation.  It
> > relies on a G++ extension, but I figure that's OK for code that is only
> > exercised with a debugging configure flag.
> >
> > Thoughts?  Either one OK for trunk?
>
> Hmm, isn't the way to fix this to turn the global auto_vec into
> vec<> *x and allocate it at runtime (thus explicitly mange its
> lifetime?).  We don't want global CTORs/DTORs in general
> because of startup cost and of course those pesky ordering issues...

Oh, and maybe we can make

 static mem_alloc_description <vec_usage> vec_mem_desc;

statically initialized with some C++?  (constexpr? constinit? whatever?)

> Richard.
Jason Merrill Oct. 5, 2021, 6:44 p.m. UTC | #6
On 10/5/21 07:27, Richard Biener wrote:
> On Mon, Oct 4, 2021 at 8:28 PM Jason Merrill via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>>
>> When r12-4038 introduced the global auto_vec save_opt_decoded_options,
>> it broke compilers configured with --enable-gather-detailed-mem-stats,
>> due to the memory descriptors getting discarded before the auto_vec was
>> destroyed.  Attached below are two approaches to making this work,
>> either by using the init_priority attribute, or turning vec_mem_desc
>> into a singleton function.  I prefer the first one, primarily because it
>> doesn't require auto_vec variables to force immediate allocation.  It
>> relies on a G++ extension, but I figure that's OK for code that is only
>> exercised with a debugging configure flag.
>>
>> Thoughts?  Either one OK for trunk?
> 
> Hmm, isn't the way to fix this to turn the global auto_vec into
> vec<> *x and allocate it at runtime (thus explicitly mange its
> lifetime?).  We don't want global CTORs/DTORs in general
> because of startup cost and of course those pesky ordering issues...

That is the usual approach, yes.  I was giving up on that, but perhaps 
it's better to stick with it.  Martin, want to make that fix for 
save_opt_decoded_options?

> Oh, and maybe we can make
> 
>  static mem_alloc_description <vec_usage> vec_mem_desc;
> 
> statically initialized with some C++?  (constexpr? constinit? whatever?
It can't be statically initialized, because it needs to allocate 
multiple maps.

Jason
Martin Liška Oct. 7, 2021, 10:38 a.m. UTC | #7
On 10/5/21 20:44, Jason Merrill wrote:
> That is the usual approach, yes.  I was giving up on that, but perhaps it's better to stick with it.  Martin, want to make that fix for save_opt_decoded_options?

Yes, I'm going to install the following patch once it survives regression tests
and bootstrap.

Martin
diff mbox series

Patch

From 7641bceec2cb53e76c57bd70c38f390610bb82b6 Mon Sep 17 00:00:00 2001
From: Jason Merrill <jason@redhat.com>
Date: Fri, 1 Oct 2021 16:09:57 -0400
Subject: [PATCH 2/2] vec: make vec_mem_desc a singleton function
To: gcc-patches@gcc.gnu.org

When r12-4038 introduced the global auto_vec save_opt_decoded_options, it
broke compilers configured with --enable-gather-detailed-mem-stats due to
the memory descriptors getting discarded before the auto_vec was destroyed.
We can fix the ordering of construction/destruction by turning vec_mem_desc
into a singleton function.

For this to work, the construction of save_opt_decoded_options needs to
allocate immediately, so that allocation calls vec_mem_desc.

gcc/ChangeLog:

	* vec.c (vec_mem_desc): Change to function.
	(vec_prefix::register_overhead): Adjust.
	(vec_prefix::release_overhead): Adjust.
	(dump_vec_loc_statistics): Adjust.
	* toplev.c (save_opt_decoded_options): Allocate.
---
 gcc/toplev.c |  7 ++++++-
 gcc/vec.c    | 28 +++++++++++++++-------------
 2 files changed, 21 insertions(+), 14 deletions(-)

diff --git a/gcc/toplev.c b/gcc/toplev.c
index ec9f998a49b..727a9972dd0 100644
--- a/gcc/toplev.c
+++ b/gcc/toplev.c
@@ -122,7 +122,12 @@  struct cl_decoded_option *save_decoded_options;
 unsigned int save_decoded_options_count;
 
 /* Vector of saved Optimization decoded command line options.  */
-auto_vec<cl_decoded_option> save_opt_decoded_options;
+auto_vec<cl_decoded_option> save_opt_decoded_options
+#if GATHER_STATISTICS
+/* Force allocation so vec_mem_desc is called.  */
+(1)
+#endif
+;
 
 /* Used to enable -fvar-tracking, -fweb and -frename-registers according
    to optimize in process_options ().  */
diff --git a/gcc/vec.c b/gcc/vec.c
index 715460397c6..4858753223c 100644
--- a/gcc/vec.c
+++ b/gcc/vec.c
@@ -110,13 +110,15 @@  public:
   size_t m_element_size;
 };
 
-/* Vector memory description.  */
-#if GATHER_STATISTICS
-/* With --enable-gather-detailed-mem-stats, this needs to be constructed before
-   any auto_vec variables.  The number 237 is entirely arbitrary.  */
-[[gnu::init_priority (237)]]
-#endif
-static mem_alloc_description <vec_usage> vec_mem_desc;
+/* Vector memory description.  This is a singleton function rather than a
+   variable so that the object's [cd]tor are properly ordered relative to any
+   auto_vec variables.  */
+static mem_alloc_description <vec_usage> &
+vec_mem_desc ()
+{
+  static mem_alloc_description<vec_usage> m;
+  return m;
+}
 
 /* Account the overhead.  */
 
@@ -124,10 +126,10 @@  void
 vec_prefix::register_overhead (void *ptr, size_t elements,
 			       size_t element_size MEM_STAT_DECL)
 {
-  vec_mem_desc.register_descriptor (ptr, VEC_ORIGIN, false
+  vec_mem_desc ().register_descriptor (ptr, VEC_ORIGIN, false
 				    FINAL_PASS_MEM_STAT);
   vec_usage *usage
-    = vec_mem_desc.register_instance_overhead (elements * element_size, ptr);
+    = vec_mem_desc ().register_instance_overhead (elements * element_size, ptr);
   usage->m_element_size = element_size;
   usage->m_items += elements;
   if (usage->m_items_peak < usage->m_items)
@@ -140,10 +142,10 @@  void
 vec_prefix::release_overhead (void *ptr, size_t size, size_t elements,
 			      bool in_dtor MEM_STAT_DECL)
 {
-  if (!vec_mem_desc.contains_descriptor_for_instance (ptr))
-    vec_mem_desc.register_descriptor (ptr, VEC_ORIGIN,
+  if (!vec_mem_desc ().contains_descriptor_for_instance (ptr))
+    vec_mem_desc ().register_descriptor (ptr, VEC_ORIGIN,
 				      false FINAL_PASS_MEM_STAT);
-  vec_usage *usage = vec_mem_desc.release_instance_overhead (ptr, size,
+  vec_usage *usage = vec_mem_desc ().release_instance_overhead (ptr, size,
 							     in_dtor);
   usage->m_items -= elements;
 }
@@ -178,7 +180,7 @@  vec_prefix::calculate_allocation_1 (unsigned alloc, unsigned desired)
 void
 dump_vec_loc_statistics (void)
 {
-  vec_mem_desc.dump (VEC_ORIGIN);
+  vec_mem_desc ().dump (VEC_ORIGIN);
 }
 
 #if CHECKING_P
-- 
2.27.0