diff mbox series

[v2] Remove upper limit on tunable MALLOC_MMAP_THRESHOLD

Message ID 1635803327-25346-1-git-send-email-patrick.mcgehearty@oracle.com
State New
Headers show
Series [v2] Remove upper limit on tunable MALLOC_MMAP_THRESHOLD | expand

Commit Message

Patrick McGehearty Nov. 1, 2021, 9:48 p.m. UTC
Differences in Version 2:

Version 1 of this patch was made on glibc 2.28 malloc/malloc.c.
My mistake. This version is made against today's (Nov 1, 2021)
upstream version of glibc, malloc/malloc.c.

The current limit on MALLOC_MMAP_THRESHOLD is either 1 Mbyte (for
32-bit apps) or 32 Mbytes (for 64-bit apps).  This value was set by a
patch dated 2006 (15 years ago).  Attempts to set the threshold higher
are currently ignored.

The default behavior is appropriate for many highly parallel
applications where many processes or threads are sharing RAM. In other
situations where the number of active processes or threads closely
matches the number of cores, a much higher limit may be desired by the
application designer. By today's standards on personal computers and
small servers, 2 Gbytes of RAM per core is commonly available. On
larger systems 4 Gbytes or more of RAM is sometimes available.
Instead of raising the limit to match current needs, this patch
proposes to remove the limit of the tunable, leaving the decision up
to the user of a tunable to judge the best value for their needs.

This patch does not change any of the defaults for malloc tunables,
retaining the current behavior of the dynamic malloc mmap threshold.

malloc/
	malloc.c changed do_set_mmap_threshold to remove test
	for HEAP_MAX_SIZE.
---
 malloc/malloc.c | 15 +++++----------
 1 file changed, 5 insertions(+), 10 deletions(-)

Comments

DJ Delorie Nov. 2, 2021, 12:27 a.m. UTC | #1
If mmap_threshold is greater than heap_max, what happens when you
allocate a chunk between those two sizes?
Patrick McGehearty Nov. 9, 2021, 10:33 p.m. UTC | #2
If a chunk smaller than the mmap_threshold is requested,
then MORECORE [typically sbrk()] is called and HEAP_MAX
is not considered by the malloc code.  Heaps are only used
for mmap()ed allocations, not sbrk()'ed allocations,
so far as I can tell in reading the code.

Setting a higher default HEAP_MAX would mean the main arena
would by default be larger which might cause issues for
applications with many, many threads. For that reason,
I did not consider changing the default HEAP_MAX as part
of this patch.

It might be desirable to also allow HEAP_MAX to be set by
the user before the first call to malloc, but I see that
as a separate task.

-  patrick

On 11/1/2021 7:27 PM, DJ Delorie wrote:
> If mmap_threshold is greater than heap_max, what happens when you
> allocate a chunk between those two sizes?
>
DJ Delorie Nov. 10, 2021, 12:36 a.m. UTC | #3
Patrick McGehearty <patrick.mcgehearty@oracle.com> writes:
> If a chunk smaller than the mmap_threshold is requested,
> then MORECORE [typically sbrk()] is called and HEAP_MAX
> is not considered by the malloc code.  Heaps are only used
> for mmap()ed allocations, not sbrk()'ed allocations,
> so far as I can tell in reading the code.

There are two types of arenas: the sbrk-based arena (limited in size by
ulimit), and zero or more mmap-based arenas (limited by HEAP_MAX).  The
sbrk-based one is used when the program is single threaded; the
mmap-based ones are used when the program is multi-threaded.  Your
original email made it sound like you were concerned with the
multi-threaded case, where the mmap-based heaps are used.

In either case, a malloc() request may be satisfied by pulling a free
chunk out of either type of arena (possibly growing the arena if needed
and possible), or by calling mmap() directly to satisfy that one
request.

I would think mmap_threshold should still apply if you're using the
mmap'd heaps, so you can reserve the heaps for smaller chunks, but that
is meaningless if mmap_threshold is larger than the heap size.  I could
not find an obvious place in the code where mmap_threshold is used to
bypass the mmap'd heaps, though.

So while I have no problems with allowing larger mmap_threshold settings
for the sbrk-based arena, I still wonder what happens to requests that
go through an mmap-based arena that are larger than HEAP_MAX but still
under the mmap_threshold.

Of course, I've spent more time typing this response than it would take
to write a test program and see what happens ;-)

> It might be desirable to also allow HEAP_MAX to be set by
> the user before the first call to malloc, but I see that
> as a separate task.

Our current implementation requires that the heap size be a compile-time
constant, but... yeah.
Patrick McGehearty Nov. 25, 2021, 12:52 a.m. UTC | #4
After studying the main malloc code paths, I believe I can
summarize the malloc behavior when the request is below
MALLOC_MMAP_THRESHOLD and above HEAP_MAX  as follows
(omitting much code for failure paths, etc):

malloc trys to find an arena with enough memory.
If success, allow, return.
If no arena is usable, call sysmalloc to get more system memory.
[We are concerned with large allocations where no arena
has enough memory, so we continue with:]

sysmalloc does the following:

If there is no usable arena or (the request exceeds mmap
threshold and we have not hit the mmap section limit),
then we "try_mmap".
[But we are asking about the case where we exceed HEAP_MAX_SIZE
but not mmap_threshold, meaning we will bypass try_mmap here.]

If we don't try_mmap and sysmalloc was called without an arena, then
we return 0 (i.e. fail).

If the above cases do not apply, then we have two branches, one for
a non-main arena and one for the main arena.

[I believe the following is the case the reviewer was concerned about.]
If the arena is not the main arena (I'm guessing this case only
applies to multi-threads apps), then we either try to grow_heap (but
not this time because our request is too large) or we call
new_heap. The new_heap call will fail because our request exceeds
HEAP_MAX_SIZE. At this point, if we have not yet tried mmap, we jump
back to try_mmap.

The try_mmap: label bypasses the test for mmap_threshold as the label
is below the if clause. As far as I can tell from reading the code,
following this path, the mmap call succeeds and a single chunk is
allocated to fulfill the request.

[This case is where we agree that all is as expected.]
If the arena is the main arena (i.e. single threaded app), then
we call MORECORE (usually sbrk) to extend the arena region as desired.

- patrick


On 11/9/2021 6:36 PM, DJ Delorie wrote:
> Patrick McGehearty <patrick.mcgehearty@oracle.com> writes:
>> If a chunk smaller than the mmap_threshold is requested,
>> then MORECORE [typically sbrk()] is called and HEAP_MAX
>> is not considered by the malloc code.  Heaps are only used
>> for mmap()ed allocations, not sbrk()'ed allocations,
>> so far as I can tell in reading the code.
> There are two types of arenas: the sbrk-based arena (limited in size by
> ulimit), and zero or more mmap-based arenas (limited by HEAP_MAX).  The
> sbrk-based one is used when the program is single threaded; the
> mmap-based ones are used when the program is multi-threaded.  Your
> original email made it sound like you were concerned with the
> multi-threaded case, where the mmap-based heaps are used.
>
> In either case, a malloc() request may be satisfied by pulling a free
> chunk out of either type of arena (possibly growing the arena if needed
> and possible), or by calling mmap() directly to satisfy that one
> request.
>
> I would think mmap_threshold should still apply if you're using the
> mmap'd heaps, so you can reserve the heaps for smaller chunks, but that
> is meaningless if mmap_threshold is larger than the heap size.  I could
> not find an obvious place in the code where mmap_threshold is used to
> bypass the mmap'd heaps, though.
>
> So while I have no problems with allowing larger mmap_threshold settings
> for the sbrk-based arena, I still wonder what happens to requests that
> go through an mmap-based arena that are larger than HEAP_MAX but still
> under the mmap_threshold.
>
> Of course, I've spent more time typing this response than it would take
> to write a test program and see what happens ;-)
>
>> It might be desirable to also allow HEAP_MAX to be set by
>> the user before the first call to malloc, but I see that
>> as a separate task.
> Our current implementation requires that the heap size be a compile-time
> constant, but... yeah.
>
DJ Delorie Nov. 29, 2021, 8:42 p.m. UTC | #5
Ok, based on all that, I think it's safe to increase the threshold -
although, given your description, setting it bigger than the mmap'd heap
size will still be kinda pointless in multi-threaded programs.  Safe,
but pointless.

I wonder if any of these details or caveats need to be documented,
either in the code or the manual...
Patrick McGehearty Nov. 29, 2021, 9:35 p.m. UTC | #6
I thought about documentation issues.
The change in behavior for single threaded programs will
match the current documentation more closely. No change needed there.

I decided not to suggest documentation changes for the multi-thread
case as other threads have wanted to avoid tying documentation to
implementation details. If/when we change how MAX_HEAP_SIZE is handled
as described in the next paragraph, the current documentation would
be fine.

Longer term, it probably would be useful to change the max heap
size to be a tunable variable instead of a fixed value.
That change might also have the max heap size changed appropriately
when the MALLOC_MMAP_THRESHOLD is set beyond the default value.
Of course documentation would need to be added about the new
tunable for MAX_HEAP_SIZE.

- patrick


On 11/29/2021 2:42 PM, DJ Delorie wrote:
> Ok, based on all that, I think it's safe to increase the threshold -
> although, given your description, setting it bigger than the mmap'd heap
> size will still be kinda pointless in multi-threaded programs.  Safe,
> but pointless.
>
> I wonder if any of these details or caveats need to be documented,
> either in the code or the manual...
>
Patrick McGehearty Dec. 7, 2021, 7:51 p.m. UTC | #7
DJ, does your final comments mean this patch is
OK to commit?

I should say that I intend to also change HEAP_MAX to be
a tunable to handle the issues we discussed as a separate
patch. It does not look to be time consuming. I just
can't predict when I'll have it ready because it's behind
a couple of other things in my task list and I can't
predict what new work tasks might cause further delays.

- patrick

On 11/29/2021 2:42 PM, DJ Delorie wrote:
> Ok, based on all that, I think it's safe to increase the threshold -
> although, given your description, setting it bigger than the mmap'd heap
> size will still be kinda pointless in multi-threaded programs.  Safe,
> but pointless.
>
> I wonder if any of these details or caveats need to be documented,
> either in the code or the manual...
>
DJ Delorie Dec. 7, 2021, 8:35 p.m. UTC | #8
Patrick McGehearty <patrick.mcgehearty@oracle.com> writes:
> DJ, does your final comments mean this patch is
> OK to commit?

Sorry, yes.  Please commit.

> I should say that I intend to also change HEAP_MAX to be
> a tunable to handle the issues we discussed as a separate
> patch. It does not look to be time consuming. I just
> can't predict when I'll have it ready because it's behind
> a couple of other things in my task list and I can't
> predict what new work tasks might cause further delays.

Please pay attention to the performance metrics on that one; you're
replacing a compiled-in constant (with associated masks and shifts,
which may not be obviously dependent) with a runtime computation, and we
do those computations a lot.  Otherwise, good luck ;-)
diff mbox series

Patch

diff --git a/malloc/malloc.c b/malloc/malloc.c
index 095d97a..c1c4ef6 100644
--- a/malloc/malloc.c
+++ b/malloc/malloc.c
@@ -5216,16 +5216,11 @@  do_set_top_pad (size_t value)
 static __always_inline int
 do_set_mmap_threshold (size_t value)
 {
-  /* Forbid setting the threshold too high.  */
-  if (value <= HEAP_MAX_SIZE / 2)
-    {
-      LIBC_PROBE (memory_mallopt_mmap_threshold, 3, value, mp_.mmap_threshold,
-		  mp_.no_dyn_threshold);
-      mp_.mmap_threshold = value;
-      mp_.no_dyn_threshold = 1;
-      return 1;
-    }
-  return 0;
+  LIBC_PROBE (memory_mallopt_mmap_threshold, 3, value, mp_.mmap_threshold,
+	      mp_.no_dyn_threshold);
+  mp_.mmap_threshold = value;
+  mp_.no_dyn_threshold = 1;
+  return 1;
 }
 
 static __always_inline int