Message ID | DB6PR0801MB20534AC3908A760CBE1EA173834B0@DB6PR0801MB2053.eurprd08.prod.outlook.com |
---|---|
State | New |
Series | Add single-threaded fast path to malloc |
DJ Delorie wrote:
> +      /* Check that the top of the bin is not the record we are going to
> +         add (i.e., double free).  */
> +      if (__builtin_expect (old == p, 0))
> +        malloc_printerr ("double free or corruption (fasttop)");
>
> By moving this out of the multi-thread loop, you're removing a check
> against some of the top chunks in the case where the loop happens more
> than once.

Sure, my goal was to share the checks between the single- and multi-threaded
cases and also keep the loop as simple as possible to reduce the chance of
contention. How often do we iterate more than once? And what are the chances
that when there is contention, the block at the top of the bin is the same
as the one we're trying to insert, but the older one we checked isn't?

If we want to detect double free in almost all cases, wouldn't it be much
easier to check the in-use bit? I can't make much sense of the check on the
next chunk either - that's only done if have_lock == true, so rarely
executed.

> +  if (SINGLE_THREAD_P)
>     {
>
> If you set have_lock to zero here, you can omit the last two chunks of
> this patch.

To true, you mean - yes, I can do that.

Wilco
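For readers following the discussion: the contended path being debated is essentially a lock-free stack push with a double-free check repeated on every retry. A minimal sketch of that shape (hypothetical: it uses GCC's `__atomic` builtins in place of glibc's internal `catomic_compare_and_exchange_val_rel`, and `assert` in place of `malloc_printerr`):

```c
#include <assert.h>
#include <stddef.h>

/* A chunk only needs its forward pointer for this sketch.  */
struct chunk { struct chunk *fd; };

/* Lock-free push of P onto the fastbin *FB, with the top-of-bin
   double-free check re-done on every CAS retry (the pre-patch shape
   DJ is defending).  */
static void fastbin_push (struct chunk **fb, struct chunk *p)
{
  struct chunk *old = __atomic_load_n (fb, __ATOMIC_RELAXED);
  do
    {
      /* Freeing the chunk that is already on top is a double free.  */
      assert (old != p);
      p->fd = old;
    }
  /* On failure, OLD is reloaded with the current top and we retry,
     which is why the check above runs once per iteration.  */
  while (!__atomic_compare_exchange_n (fb, &old, p, 1,
                                       __ATOMIC_RELEASE, __ATOMIC_RELAXED));
}
```

Hoisting the check out of the loop, as the patch does, means a retry iteration no longer re-checks the freshly reloaded top - which is exactly the window DJ is pointing at and whose practical exploitability Wilco is questioning.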
DJ Delorie wrote:
> Wilco Dijkstra <Wilco.Dijkstra@arm.com> writes:
> > How often do we iterate more than once? And what are the chances that
> > when there is contention, the block at the top of the bin is the same
> > as the one we're trying to insert but the older one we checked isn't?
>
> Sadly, these are the cases that Bad Guys can create and take advantage
> of to infect systems. Removing a test should never be done without
> extreme forethought and consideration.

If that's the case, then why isn't a double free checked everywhere? I can
free a block many times without getting any errors - it just gets added to
the tcache without any consistency checks! Freeing a block into the fastbin
when it is already in the fastbin is not detected either. Even repeatedly
freeing the same block after the undetected double free goes completely
undetected, despite the supposed check...

So I think any security feature needs to be well designed and supported.
Doing checks that are completely ineffective doesn't make sense - that just
adds unnecessary overhead while providing no actual security benefit (in
fact it gives a false sense of security, which is even worse...).

Wilco
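Wilco's point about the check's blind spot can be demonstrated directly: the "fasttop" test only compares against the current top of the bin, so a free sequence of A, B, A slips through and corrupts the list into a cycle. A small stand-alone simulation (hypothetical names, not glibc code - `push_with_fasttop_check` returns false where glibc would call `malloc_printerr`):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct chunk { struct chunk *fd; };

static struct chunk *bin;   /* top of a single simulated fastbin */

/* Push P, applying only the "fasttop" check: a double free is flagged
   only when P is already the very top of the bin.  */
static bool push_with_fasttop_check (struct chunk *p)
{
  if (bin == p)
    return false;            /* immediate double free: detected */
  p->fd = bin;
  bin = p;
  return true;               /* anything else: accepted unchecked */
}
```

Freeing A, then B, then A again passes every check, after which the bin is the cycle A -> B -> A and two later allocations return the same chunk - the classic fastbin-dup primitive.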
DJ Delorie wrote:
> > +  if (SINGLE_THREAD_P)
> >     {
>
> If you set have_lock to zero here, you can omit the last two chunks of
> this patch.

Here is the updated version with that change and the original check:

This patch adds single-threaded fast paths to _int_free. Bypass the
explicit locking for larger allocations.

Passes GLIBC tests, OK for commit?

ChangeLog:
2017-10-19  Wilco Dijkstra  <wdijkstr@arm.com>

	* malloc/malloc.c (_int_free): Add SINGLE_THREAD_P paths.

diff --git a/malloc/malloc.c b/malloc/malloc.c
index e220fba83b0f9dc515aef562bdfca6a3ad13d3ea..ca5cfff3a1b1882ae608219fdec973b7f13cbb21 100644
--- a/malloc/malloc.c
+++ b/malloc/malloc.c
@@ -4195,24 +4195,34 @@ _int_free (mstate av, mchunkptr p, int have_lock)
     /* Atomically link P to its fastbin: P->FD = *FB; *FB = P;  */
     mchunkptr old = *fb, old2;
-    unsigned int old_idx = ~0u;
-    do
+
+    if (SINGLE_THREAD_P)
       {
-	/* Check that the top of the bin is not the record we are going to add
-	   (i.e., double free).  */
+	/* Check that the top of the bin is not the record we are going to
+	   add (i.e., double free).  */
 	if (__builtin_expect (old == p, 0))
 	  malloc_printerr ("double free or corruption (fasttop)");
-	/* Check that size of fastbin chunk at the top is the same as
-	   size of the chunk that we are adding.  We can dereference OLD
-	   only if we have the lock, otherwise it might have already been
-	   deallocated.  See use of OLD_IDX below for the actual check.  */
-	if (have_lock && old != NULL)
-	  old_idx = fastbin_index(chunksize(old));
-	p->fd = old2 = old;
+	p->fd = old;
+	*fb = p;
       }
-    while ((old = catomic_compare_and_exchange_val_rel (fb, p, old2)) != old2);
-
-    if (have_lock && old != NULL && __builtin_expect (old_idx != idx, 0))
+    else
+      do
+	{
+	  /* Check that the top of the bin is not the record we are going to
+	     add (i.e., double free).  */
+	  if (__builtin_expect (old == p, 0))
+	    malloc_printerr ("double free or corruption (fasttop)");
+	  p->fd = old2 = old;
+	}
+      while ((old = catomic_compare_and_exchange_val_rel (fb, p, old2))
+	     != old2);
+
+    /* Check that size of fastbin chunk at the top is the same as
+       size of the chunk that we are adding.  We can dereference OLD
+       only if we have the lock, otherwise it might have already been
+       allocated again.  */
+    if (have_lock && old != NULL
+	&& __builtin_expect (fastbin_index (chunksize (old)) != idx, 0))
       malloc_printerr ("invalid fastbin entry (free)");
   }
@@ -4221,6 +4231,11 @@ _int_free (mstate av, mchunkptr p, int have_lock)
   */
   else if (!chunk_is_mmapped(p)) {
+
+    /* If we're single-threaded, don't lock the arena.  */
+    if (SINGLE_THREAD_P)
+      have_lock = true;
+
     if (!have_lock)
       __libc_lock_lock (av->mutex);
diff --git a/malloc/malloc.c b/malloc/malloc.c
index 13247c64e2be0779050dfbd0fd25f205ba7184f7..c00df205c6004ee5b5d0aee9ffd5130b3c8f9e9f 100644
--- a/malloc/malloc.c
+++ b/malloc/malloc.c
@@ -4188,24 +4188,29 @@ _int_free (mstate av, mchunkptr p, int have_lock)
     /* Atomically link P to its fastbin: P->FD = *FB; *FB = P;  */
     mchunkptr old = *fb, old2;
-    unsigned int old_idx = ~0u;
-    do
+
+    /* Check that the top of the bin is not the record we are going to
+       add (i.e., double free).  */
+    if (__builtin_expect (old == p, 0))
+      malloc_printerr ("double free or corruption (fasttop)");
+
+    if (SINGLE_THREAD_P)
       {
-	/* Check that the top of the bin is not the record we are going to add
-	   (i.e., double free).  */
-	if (__builtin_expect (old == p, 0))
-	  malloc_printerr ("double free or corruption (fasttop)");
-	/* Check that size of fastbin chunk at the top is the same as
-	   size of the chunk that we are adding.  We can dereference OLD
-	   only if we have the lock, otherwise it might have already been
-	   deallocated.  See use of OLD_IDX below for the actual check.  */
-	if (have_lock && old != NULL)
-	  old_idx = fastbin_index(chunksize(old));
-	p->fd = old2 = old;
+	p->fd = old;
+	*fb = p;
       }
-    while ((old = catomic_compare_and_exchange_val_rel (fb, p, old2)) != old2);
-
-    if (have_lock && old != NULL && __builtin_expect (old_idx != idx, 0))
+    else
+      do
+	p->fd = old2 = old;
+      while ((old = catomic_compare_and_exchange_val_rel (fb, p, old2))
+	     != old2);
+
+    /* Check that size of fastbin chunk at the top is the same as
+       size of the chunk that we are adding.  We can dereference OLD
+       only if we have the lock, otherwise it might have already been
+       allocated again.  */
+    if (have_lock && old != NULL
+	&& __builtin_expect (fastbin_index (chunksize (old)) != idx, 0))
       malloc_printerr ("invalid fastbin entry (free)");
   }
@@ -4214,7 +4219,7 @@ _int_free (mstate av, mchunkptr p, int have_lock)
   */
   else if (!chunk_is_mmapped(p)) {
-    if (!have_lock)
+    if (!SINGLE_THREAD_P && !have_lock)
       __libc_lock_lock (av->mutex);

     nextchunk = chunk_at_offset(p, size);
@@ -4329,7 +4334,7 @@ _int_free (mstate av, mchunkptr p, int have_lock)
     }
   }
-  if (!have_lock)
+  if (!SINGLE_THREAD_P && !have_lock)
     __libc_lock_unlock (av->mutex);
 }
 /*
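The `!SINGLE_THREAD_P && !have_lock` pattern in the final version elides the arena lock entirely while the process is known to be single-threaded. A stand-alone sketch of the same idea (hypothetical: a plain flag and a pthread mutex stand in for glibc's thread-descriptor-based `SINGLE_THREAD_P` and internal `__libc_lock_*` macros, with a counter added purely as instrumentation):

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Stand-in for glibc's SINGLE_THREAD_P; glibc reads a field of the
   thread descriptor, so no extra load from a shared global occurs.  */
static bool single_thread_p = true;

static pthread_mutex_t arena_lock = PTHREAD_MUTEX_INITIALIZER;
static int locks_taken;   /* instrumentation for this sketch only */

/* The shape of the committed change: take the lock only when the
   process is multi-threaded and the caller doesn't already hold it.  */
static void arena_op (bool have_lock)
{
  if (!single_thread_p && !have_lock)
    {
      pthread_mutex_lock (&arena_lock);
      locks_taken++;
    }

  /* ... consolidate the freed chunk into the arena ... */

  if (!single_thread_p && !have_lock)
    pthread_mutex_unlock (&arena_lock);
}
```

Note the flag is only checked, never written, on this path: once a second thread is created it flips permanently to false, so the lock/unlock decisions at entry and exit always agree.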