Message ID | 1635803327-25346-1-git-send-email-patrick.mcgehearty@oracle.com |
---|---|
State | New |
Headers | show |
Series | [v2] Remove upper limit on tunable MALLOC_MMAP_THRESHOLD | expand |
If mmap_threshold is greater than heap_max, what happens when you allocate a chunk between those two sizes?
If a chunk smaller than the mmap_threshold is requested, then MORECORE [typically sbrk()] is called and HEAP_MAX is not considered by the malloc code. Heaps are only used for mmap()ed allocations, not sbrk()'ed allocations, so far as I can tell in reading the code. Setting a higher default HEAP_MAX would mean the main arena would by default be larger which might cause issues for applications with many, many threads. For that reason, I did not consider changing the default HEAP_MAX as part of this patch. It might be desirable to also allow HEAP_MAX to be set by the user before the first call to malloc, but I see that as a separate task. - patrick On 11/1/2021 7:27 PM, DJ Delorie wrote: > If mmap_threshold is greater than heap_max, what happens when you > allocate a chunk between those two sizes? >
Patrick McGehearty <patrick.mcgehearty@oracle.com> writes: > If a chunk smaller than the mmap_threshold is requested, > then MORECORE [typically sbrk()] is called and HEAP_MAX > is not considered by the malloc code. Heaps are only used > for mmap()ed allocations, not sbrk()'ed allocations, > so far as I can tell in reading the code. There are two types of arenas: the sbrk-based arena (limited in size by ulimit), and zero or more mmap-based arenas (limited by HEAP_MAX). The sbrk-based one is used when the program is single threaded; the mmap-based ones are used when the program is multi-threaded. Your original email made it sound like you were concerned with the multi-threaded case, where the mmap-based heaps are used. In either case, a malloc() request may be satisfied by pulling a free chunk out of either type of arena (possibly growing the arena if needed and possible), or by calling mmap() directly to satisfy that one request. I would think mmap_threshold should still apply if you're using the mmap'd heaps, so you can reserve the heaps for smaller chunks, but that is meaningless if mmap_threshold is larger than the heap size. I could not find an obvious place in the code where mmap_threshold is used to bypass the mmap'd heaps, though. So while I have no problems with allowing larger mmap_threshold settings for the sbrk-based arena, I still wonder what happens to requests that go through an mmap-based arena that are larger than HEAP_MAX but still under the mmap_threshold. Of course, I've spent more time typing this response than it would take to write a test program and see what happens ;-) > It might be desirable to also allow HEAP_MAX to be set by > the user before the first call to malloc, but I see that > as a separate task. Our current implementation requires that the heap size be a compile-time constant, but... yeah.
After studying the main malloc code paths, I believe I can summarize the malloc behavior when the request is below MALLOC_MMAP_THRESHOLD and above HEAP_MAX as follows (omitting much code for failure paths, etc): malloc trys to find an arena with enough memory. If success, allow, return. If no arena is usable, call sysmalloc to get more system memory. [We are concerned with large allocations where no arena has enough memory, so we continue with:] sysmalloc does the following: If there is no usable arena or (the request exceeds mmap threshold and we have not hit the mmap section limit), then we "try_mmap". [But we are asking about the case where we exceed HEAP_MAX_SIZE but not mmap_threshold, meaning we will bypass try_mmap here.] If we don't try_mmap and sysmalloc was called without an arena, then we return 0 (i.e. fail). If the above cases do not apply, then we have two branches, one for a non-main arena and one for the main arena. [I believe the following is the case the reviewer was concerned about.] If the arena is not the main arena (I'm guessing this case only applies to multi-threads apps), then we either try to grow_heap (but not this time because our request is too large) or we call new_heap. The new_heap call will fail because our request exceeds HEAP_MAX_SIZE. At this point, if we have not yet tried mmap, we jump back to try_mmap. The try_mmap: label bypasses the test for mmap_threshold as the label is below the if clause. As far as I can tell from reading the code, following this path, the mmap call succeeds and a single chunk is allocated to fulfill the request. [This case is where we agree that all is as expected.] If the arena is the main arena (i.e. single threaded app), then we call MORECORE (usually sbrk) to extend the arena region as desired. - patrick On 11/9/2021 6:36 PM, DJ Delorie wrote: > Patrick McGehearty <patrick.mcgehearty@oracle.com> writes: >> If a chunk smaller than the mmap_threshold is requested, >> then MORECORE [typically sbrk()] is called and HEAP_MAX >> is not considered by the malloc code. Heaps are only used >> for mmap()ed allocations, not sbrk()'ed allocations, >> so far as I can tell in reading the code. > There are two types of arenas: the sbrk-based arena (limited in size by > ulimit), and zero or more mmap-based arenas (limited by HEAP_MAX). The > sbrk-based one is used when the program is single threaded; the > mmap-based ones are used when the program is multi-threaded. Your > original email made it sound like you were concerned with the > multi-threaded case, where the mmap-based heaps are used. > > In either case, a malloc() request may be satisfied by pulling a free > chunk out of either type of arena (possibly growing the arena if needed > and possible), or by calling mmap() directly to satisfy that one > request. > > I would think mmap_threshold should still apply if you're using the > mmap'd heaps, so you can reserve the heaps for smaller chunks, but that > is meaningless if mmap_threshold is larger than the heap size. I could > not find an obvious place in the code where mmap_threshold is used to > bypass the mmap'd heaps, though. > > So while I have no problems with allowing larger mmap_threshold settings > for the sbrk-based arena, I still wonder what happens to requests that > go through an mmap-based arena that are larger than HEAP_MAX but still > under the mmap_threshold. > > Of course, I've spent more time typing this response than it would take > to write a test program and see what happens ;-) > >> It might be desirable to also allow HEAP_MAX to be set by >> the user before the first call to malloc, but I see that >> as a separate task. > Our current implementation requires that the heap size be a compile-time > constant, but... yeah. >
Ok, based on all that, I think it's safe to increase the threshold - although, given your description, setting it bigger than the mmap'd heap size will still be kinda pointless in multi-threaded programs. Safe, but pointless. I wonder if any of these details or caveats need to be documented, either in the code or the manual...
I thought about documentation issues. The change in behavior for single threaded programs will match the current documentation more closely. No change needed there. I decided not to suggest documentation changes for the multi-thread case as other threads have wanted to avoid tying documentation to implementation details. If/when we change how MAX_HEAP_SIZE is handled as described in the next paragraph, the current documentation would be fine. Longer term, it probably would be useful to change the max heap size to be a tunable variable instead of a fixed value. That change might also have the max heap size changed appropriately when the MALLOC_MMAP_THRESHOLD is set beyond the default value. Of course documentation would need to be added about the new tunable for MAX_HEAP_SIZE. - patrick On 11/29/2021 2:42 PM, DJ Delorie wrote: > Ok, based on all that, I think it's safe to increase the threshold - > although, given your description, setting it bigger than the mmap'd heap > size will still be kinda pointless in multi-threaded programs. Safe, > but pointless. > > I wonder if any of these details or caveats need to be documented, > either in the code or the manual... >
DJ, does your final comments mean this patch is OK to commit? I should say that I intend to also change HEAP_MAX to be a tunable to handle the issues we discussed as a separate patch. It does not look to be time consuming. I just can't predict when I'll have it ready because it's behind a couple of other things in my task list and I can't predict what new work tasks might cause further delays. - patrick On 11/29/2021 2:42 PM, DJ Delorie wrote: > Ok, based on all that, I think it's safe to increase the threshold - > although, given your description, setting it bigger than the mmap'd heap > size will still be kinda pointless in multi-threaded programs. Safe, > but pointless. > > I wonder if any of these details or caveats need to be documented, > either in the code or the manual... >
Patrick McGehearty <patrick.mcgehearty@oracle.com> writes: > DJ, does your final comments mean this patch is > OK to commit? Sorry, yes. Please commit. > I should say that I intend to also change HEAP_MAX to be > a tunable to handle the issues we discussed as a separate > patch. It does not look to be time consuming. I just > can't predict when I'll have it ready because it's behind > a couple of other things in my task list and I can't > predict what new work tasks might cause further delays. Please pay attention to the performance metrics on that one; you're replacing a compiled-in constant (with associated masks and shifts, which may not be obviously dependent) with a runtime computation, and we do those computations a lot. Otherwise, good luck ;-)
diff --git a/malloc/malloc.c b/malloc/malloc.c index 095d97a..c1c4ef6 100644 --- a/malloc/malloc.c +++ b/malloc/malloc.c @@ -5216,16 +5216,11 @@ do_set_top_pad (size_t value) static __always_inline int do_set_mmap_threshold (size_t value) { - /* Forbid setting the threshold too high. */ - if (value <= HEAP_MAX_SIZE / 2) - { - LIBC_PROBE (memory_mallopt_mmap_threshold, 3, value, mp_.mmap_threshold, - mp_.no_dyn_threshold); - mp_.mmap_threshold = value; - mp_.no_dyn_threshold = 1; - return 1; - } - return 0; + LIBC_PROBE (memory_mallopt_mmap_threshold, 3, value, mp_.mmap_threshold, + mp_.no_dyn_threshold); + mp_.mmap_threshold = value; + mp_.no_dyn_threshold = 1; + return 1; } static __always_inline int