From patchwork Thu Aug 22 02:59:16 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Guo, Wangyang"
X-Patchwork-Id: 1975214
From: Wangyang Guo
To: libc-alpha@sourceware.org
Cc: Noah Goldstein, Tianyou Li, Wangyang Guo
Subject: [PATCH 1/6] malloc: Split _int_free() into 3 sub functions
Date: Thu, 22 Aug 2024 10:59:16 +0800
Message-ID: <20240822025921.3120998-2-wangyang.guo@intel.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240822025921.3120998-1-wangyang.guo@intel.com>
References: <20240822025921.3120998-1-wangyang.guo@intel.com>

Split _int_free() into 3 smaller functions for flexible combination:
* _int_free_check -- sanity check for free
* tcache_free -- free memory to tcache (quick path)
* _int_free_chunk -- free memory chunk (slow path)

Signed-off-by: Wangyang Guo
---
 malloc/malloc.c | 120 ++++++++++++++++++++++++++++--------------------
 1 file changed, 71 insertions(+), 49 deletions(-)

diff --git a/malloc/malloc.c b/malloc/malloc.c
index bcb6e5b83c..b2373b2212 100644
--- a/malloc/malloc.c
+++ b/malloc/malloc.c
@@ -1086,7 +1086,9 @@ typedef struct malloc_chunk* mchunkptr; /* Internal routines.
*/ static void* _int_malloc(mstate, size_t); -static void _int_free(mstate, mchunkptr, int); +static void _int_free (mstate, mchunkptr, int); +static void _int_free_check (mstate, mchunkptr, INTERNAL_SIZE_T); +static void _int_free_chunk (mstate, mchunkptr, INTERNAL_SIZE_T, int); static void _int_free_merge_chunk (mstate, mchunkptr, INTERNAL_SIZE_T); static INTERNAL_SIZE_T _int_free_create_chunk (mstate, mchunkptr, INTERNAL_SIZE_T,
@@ -3206,6 +3208,49 @@ tcache_next (tcache_entry *e) return (tcache_entry *) REVEAL_PTR (e->next); } +static inline bool +tcache_free (mchunkptr p, INTERNAL_SIZE_T size) +{ + bool done = false; + size_t tc_idx = csize2tidx (size); + if (tcache != NULL && tc_idx < mp_.tcache_bins) + { + /* Check to see if it's already in the tcache. */ + tcache_entry *e = (tcache_entry *) chunk2mem (p); + + /* This test succeeds on double free. However, we don't 100% + trust it (it also matches random payload data at a 1 in + 2^<size_t bits> chance), so verify it's not an unlikely + coincidence before aborting. */ + if (__glibc_unlikely (e->key == tcache_key)) + { + tcache_entry *tmp; + size_t cnt = 0; + LIBC_PROBE (memory_tcache_double_free, 2, e, tc_idx); + for (tmp = tcache->entries[tc_idx]; + tmp; + tmp = REVEAL_PTR (tmp->next), ++cnt) + { + if (cnt >= mp_.tcache_count) + malloc_printerr ("free(): too many chunks detected in tcache"); + if (__glibc_unlikely (!aligned_OK (tmp))) + malloc_printerr ("free(): unaligned chunk detected in tcache 2"); + if (tmp == e) + malloc_printerr ("free(): double free detected in tcache 2"); + /* If we get here, it was a coincidence. We've wasted a + few cycles, but don't abort.
*/ + } + } + + if (tcache->counts[tc_idx] < mp_.tcache_count) + { + tcache_put (p, tc_idx); + done = true; + } + } + return done; +} + static void tcache_thread_shutdown (void) {
@@ -4490,14 +4535,9 @@ _int_malloc (mstate av, size_t bytes) ------------------------------ free ------------------------------ */ -static void -_int_free (mstate av, mchunkptr p, int have_lock) +static inline void +_int_free_check (mstate av, mchunkptr p, INTERNAL_SIZE_T size) { - INTERNAL_SIZE_T size; /* its size */ - mfastbinptr *fb; /* associated fastbin */ - - size = chunksize (p); - /* Little security check which won't hurt performance: the allocator never wraps around at the end of the address space. Therefore we can exclude some size values which might appear
@@ -4510,48 +4550,13 @@ _int_free (mstate av, mchunkptr p, int have_lock) if (__glibc_unlikely (size < MINSIZE || !aligned_OK (size))) malloc_printerr ("free(): invalid size"); - check_inuse_chunk(av, p); - -#if USE_TCACHE - { - size_t tc_idx = csize2tidx (size); - if (tcache != NULL && tc_idx < mp_.tcache_bins) - { - /* Check to see if it's already in the tcache. */ - tcache_entry *e = (tcache_entry *) chunk2mem (p); - - /* This test succeeds on double free. However, we don't 100% - trust it (it also matches random payload data at a 1 in - 2^<size_t bits> chance), so verify it's not an unlikely - coincidence before aborting. */ - if (__glibc_unlikely (e->key == tcache_key)) - { - tcache_entry *tmp; - size_t cnt = 0; - LIBC_PROBE (memory_tcache_double_free, 2, e, tc_idx); - for (tmp = tcache->entries[tc_idx]; - tmp; - tmp = REVEAL_PTR (tmp->next), ++cnt) - { - if (cnt >= mp_.tcache_count) - malloc_printerr ("free(): too many chunks detected in tcache"); - if (__glibc_unlikely (!aligned_OK (tmp))) - malloc_printerr ("free(): unaligned chunk detected in tcache 2"); - if (tmp == e) - malloc_printerr ("free(): double free detected in tcache 2"); - /* If we get here, it was a coincidence. We've wasted a - few cycles, but don't abort.
*/ - } - } + check_inuse_chunk (av, p); +} - if (tcache->counts[tc_idx] < mp_.tcache_count) - { - tcache_put (p, tc_idx); - return; - } - } - } -#endif +static void +_int_free_chunk (mstate av, mchunkptr p, INTERNAL_SIZE_T size, int have_lock) +{ + mfastbinptr *fb; /* associated fastbin */ /* If eligible, place chunk on a fastbin so it can be found
@@ -4657,6 +4662,23 @@ _int_free (mstate av, mchunkptr p, int have_lock) } } +static void +_int_free (mstate av, mchunkptr p, int have_lock) +{ + INTERNAL_SIZE_T size; /* its size */ + + size = chunksize (p); + + _int_free_check (av, p, size); + +#if USE_TCACHE + if (tcache_free (p, size)) + return; +#endif + + _int_free_chunk (av, p, size, have_lock); +} + /* Try to merge chunk P of SIZE bytes with its neighbors. Put the resulting chunk on the appropriate bin list. P must not be on a bin list yet, and it can be in use. */

From patchwork Thu Aug 22 02:59:17 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Guo, Wangyang"
X-Patchwork-Id: 1975219
From: Wangyang Guo
To: libc-alpha@sourceware.org
Cc: Noah Goldstein, Tianyou Li, Wangyang Guo
Subject: [PATCH 2/6] malloc: Avoid func call for tcache quick path in free()
Date: Thu, 22 Aug 2024 10:59:17 +0800
Message-ID: <20240822025921.3120998-3-wangyang.guo@intel.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240822025921.3120998-1-wangyang.guo@intel.com>
References: <20240822025921.3120998-1-wangyang.guo@intel.com>

Tcache is an important optimization for accelerating free(), so the
code on this path should be kept as simple as possible. This commit
removes the function call made when free() takes the tcache code path.

Result of bench-malloc-thread benchmark

Test Platform: Xeon-8380
Ratio: New / Original time_per_iteration (Lower is Better)

Threads#   | Ratio
-----------|------
1 thread   | 0.904
4 threads  | 0.919

The performance data shows that this improves the bench-malloc-thread
benchmark by ~10% in the single-thread scenario and by ~8% in the
multi-thread scenario.

Signed-off-by: Wangyang Guo
---
 malloc/malloc.c | 12 +++++++++++-
 1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/malloc/malloc.c b/malloc/malloc.c
index b2373b2212..4ec6c5db35 100644
--- a/malloc/malloc.c
+++ b/malloc/malloc.c
@@ -3440,7 +3440,17 @@ __libc_free (void *mem) (void)tag_region (chunk2mem (p), memsize (p)); ar_ptr = arena_for_chunk (p); - _int_free (ar_ptr, p, 0); + INTERNAL_SIZE_T size = chunksize (p); + +#if USE_TCACHE + _int_free_check (ar_ptr, p, size); + if (tcache_free (p, size)) + { + __set_errno (err); + return; + } +#endif + _int_free_chunk (ar_ptr, p, size, 0); } __set_errno (err);

From patchwork Thu Aug 22 02:59:18 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Guo, Wangyang"
X-Patchwork-Id: 1975215
From: Wangyang Guo
To: libc-alpha@sourceware.org
Cc: Noah Goldstein, Tianyou Li, Wangyang Guo
Subject: [PATCH 3/6] malloc: Arena is not needed for tcache path in free()
Date: Thu, 22 Aug 2024 10:59:18 +0800
Message-ID: <20240822025921.3120998-4-wangyang.guo@intel.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240822025921.3120998-1-wangyang.guo@intel.com>
References: <20240822025921.3120998-1-wangyang.guo@intel.com>

The arena is not needed by _int_free_check() in non-DEBUG mode. This
commit defers the arena lookup until _int_free_chunk() is called, which
speeds up the tcache path. When DEBUG is enabled, the arena is obtained
from p in do_check_inuse_chunk().

Result of bench-malloc-thread benchmark

Test Platform: Xeon-8380
Ratio: New / Original time_per_iteration (Lower is Better)

Threads#   | Ratio
-----------|------
1 thread   | 0.994
4 threads  | 0.968

The data shows a ~3% performance gain in the multi-thread scenario.

Signed-off-by: Wangyang Guo
---
 malloc/malloc.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/malloc/malloc.c b/malloc/malloc.c
index 4ec6c5db35..030aff093b 100644
--- a/malloc/malloc.c
+++ b/malloc/malloc.c
@@ -2143,6 +2143,9 @@ do_check_inuse_chunk (mstate av, mchunkptr p) { mchunkptr next; + if (av == NULL) + av = arena_for_chunk (p); + do_check_chunk (av, p); if (chunk_is_mmapped (p))
@@ -3439,17 +3442,20 @@ __libc_free (void *mem) /* Mark the chunk as belonging to the library again. */ (void)tag_region (chunk2mem (p), memsize (p)); - ar_ptr = arena_for_chunk (p); INTERNAL_SIZE_T size = chunksize (p); #if USE_TCACHE - _int_free_check (ar_ptr, p, size); + /* av is not needed for _int_free_check in non-DEBUG mode, + in DEBUG mode, av will fetch from p in do_check_inuse_chunk.
*/ + _int_free_check (NULL, p, size); if (tcache_free (p, size)) { __set_errno (err); return; } #endif + + ar_ptr = arena_for_chunk (p); _int_free_chunk (ar_ptr, p, size, 0); }

From patchwork Thu Aug 22 02:59:19 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Guo, Wangyang"
X-Patchwork-Id: 1975216
From: Wangyang Guo
To: libc-alpha@sourceware.org
Cc: Noah Goldstein, Tianyou Li, Wangyang Guo
Subject: [PATCH 4/6] benchtests: Add calloc function test to bench-malloc-thread
Date: Thu, 22 Aug 2024 10:59:19 +0800
Message-ID: <20240822025921.3120998-5-wangyang.guo@intel.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240822025921.3120998-1-wangyang.guo@intel.com>
References: <20240822025921.3120998-1-wangyang.guo@intel.com>

Signed-off-by: Wangyang Guo
---
 benchtests/bench-malloc-thread.c | 114 ++++++++++++++++++++-----------
 1 file changed, 76 insertions(+), 38 deletions(-)

diff --git a/benchtests/bench-malloc-thread.c b/benchtests/bench-malloc-thread.c
index 46fdabd30c..e8429cec10 100644
--- a/benchtests/bench-malloc-thread.c
+++ b/benchtests/bench-malloc-thread.c
@@ -123,6 +123,8 @@ alarm_handler (int signum) timeout = true; } +typedef size_t (*loop_func_t)(void **); + /* Allocate and free blocks in a random order.
*/ static size_t malloc_benchmark_loop (void **ptr_arr) @@ -145,10 +147,32 @@ malloc_benchmark_loop (void **ptr_arr) return iters; } +static size_t +calloc_benchmark_loop (void **ptr_arr) +{ + unsigned int offset_state = 0, block_state = 0; + size_t iters = 0; + + while (!timeout) + { + unsigned int next_idx = get_random_offset (&offset_state); + unsigned int next_block = get_random_block_size (&block_state); + + free (ptr_arr[next_idx]); + + ptr_arr[next_idx] = calloc (1, next_block); + + iters++; + } + + return iters; +} + struct thread_args { size_t iters; void **working_set; + loop_func_t benchmark_loop; timing_t elapsed; }; @@ -161,7 +185,7 @@ benchmark_thread (void *arg) timing_t start, stop; TIMING_NOW (start); - iters = malloc_benchmark_loop (thread_set); + iters = args->benchmark_loop (thread_set); TIMING_NOW (stop); TIMING_DIFF (args->elapsed, start, stop); @@ -171,7 +195,7 @@ benchmark_thread (void *arg) } static timing_t -do_benchmark (size_t num_threads, size_t *iters) +do_benchmark (loop_func_t benchmark_loop, size_t num_threads, size_t *iters) { timing_t elapsed = 0; @@ -183,7 +207,7 @@ do_benchmark (size_t num_threads, size_t *iters) memset (working_set, 0, sizeof (working_set)); TIMING_NOW (start); - *iters = malloc_benchmark_loop (working_set); + *iters = benchmark_loop (working_set); TIMING_NOW (stop); TIMING_DIFF (elapsed, start, stop); @@ -201,6 +225,7 @@ do_benchmark (size_t num_threads, size_t *iters) for (size_t i = 0; i < num_threads; i++) { args[i].working_set = working_set[i]; + args[i].benchmark_loop = benchmark_loop; pthread_create(&threads[i], NULL, benchmark_thread, &args[i]); } @@ -214,6 +239,47 @@ do_benchmark (size_t num_threads, size_t *iters) return elapsed; } +static void +bench_function (json_ctx_t *json_ctx, size_t num_threads, + const char *func_name, loop_func_t benchmark_loop) +{ + timing_t cur; + size_t iters = 0; + double d_total_s, d_total_i; + + init_random_values (); + + json_attr_object_begin (json_ctx, func_name); + 
+ json_attr_object_begin (json_ctx, ""); + + timeout = false; + alarm (BENCHMARK_DURATION); + + cur = do_benchmark (benchmark_loop, num_threads, &iters); + + struct rusage usage; + getrusage(RUSAGE_SELF, &usage); + + d_total_s = cur; + d_total_i = iters; + + json_attr_double (json_ctx, "duration", d_total_s); + json_attr_double (json_ctx, "iterations", d_total_i); + json_attr_double (json_ctx, "time_per_iteration", d_total_s / d_total_i); + json_attr_double (json_ctx, "max_rss", usage.ru_maxrss); + + json_attr_double (json_ctx, "threads", num_threads); + json_attr_double (json_ctx, "min_size", MIN_ALLOCATION_SIZE); + json_attr_double (json_ctx, "max_size", MAX_ALLOCATION_SIZE); + json_attr_double (json_ctx, "random_seed", RAND_SEED); + + json_attr_object_end (json_ctx); + + json_attr_object_end (json_ctx); + +} + static void usage(const char *name) { fprintf (stderr, "%s: \n", name); @@ -223,10 +289,8 @@ static void usage(const char *name) int main (int argc, char **argv) { - timing_t cur; - size_t iters = 0, num_threads = 1; + size_t num_threads = 1; json_ctx_t json_ctx; - double d_total_s, d_total_i; struct sigaction act; if (argc == 1) @@ -246,48 +310,22 @@ main (int argc, char **argv) else usage(argv[0]); - init_random_values (); - - json_init (&json_ctx, 0, stdout); - - json_document_begin (&json_ctx); - - json_attr_string (&json_ctx, "timing_type", TIMING_TYPE); - - json_attr_object_begin (&json_ctx, "functions"); - - json_attr_object_begin (&json_ctx, "malloc"); - - json_attr_object_begin (&json_ctx, ""); - memset (&act, 0, sizeof (act)); act.sa_handler = &alarm_handler; sigaction (SIGALRM, &act, NULL); - alarm (BENCHMARK_DURATION); - - cur = do_benchmark (num_threads, &iters); - - struct rusage usage; - getrusage(RUSAGE_SELF, &usage); + json_init (&json_ctx, 0, stdout); - d_total_s = cur; - d_total_i = iters; + json_document_begin (&json_ctx); - json_attr_double (&json_ctx, "duration", d_total_s); - json_attr_double (&json_ctx, "iterations", d_total_i); - 
json_attr_double (&json_ctx, "time_per_iteration", d_total_s / d_total_i); - json_attr_double (&json_ctx, "max_rss", usage.ru_maxrss); + json_attr_string (&json_ctx, "timing_type", TIMING_TYPE); - json_attr_double (&json_ctx, "threads", num_threads); - json_attr_double (&json_ctx, "min_size", MIN_ALLOCATION_SIZE); - json_attr_double (&json_ctx, "max_size", MAX_ALLOCATION_SIZE); - json_attr_double (&json_ctx, "random_seed", RAND_SEED); + json_attr_object_begin (&json_ctx, "functions"); - json_attr_object_end (&json_ctx); + bench_function (&json_ctx, num_threads, "malloc", malloc_benchmark_loop); - json_attr_object_end (&json_ctx); + bench_function (&json_ctx, num_threads, "calloc", calloc_benchmark_loop); json_attr_object_end (&json_ctx); From patchwork Thu Aug 22 02:59:20 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Guo, Wangyang" X-Patchwork-Id: 1975221 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.a=rsa-sha256 header.s=Intel header.b=Re3n1CS5; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=sourceware.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=libc-alpha-bounces~incoming=patchwork.ozlabs.org@sourceware.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4Wq7MM3vlgz1ybW for ; Thu, 22 Aug 2024 13:03:59 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id BC5BE38708D3 for ; 
Thu, 22 Aug 2024 03:03:57 +0000 (GMT) X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.13]) by sourceware.org (Postfix) with ESMTPS id 5D7ED38708F7 for ; Thu, 22 Aug 2024 03:02:37 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 5D7ED38708F7 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=intel.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 5D7ED38708F7 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=192.198.163.13 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1724295759; cv=none; b=d+h/FL3+jS3cZUAa2t49UPZwNgUWanbNBNiaEq4c3L//gOtUPMOSjINB5uWzH3WMuSgByL2eu3w/t2BrMRC++8Z4bb26jBhqCF9WXrGszvGwuQEA/kGR8RmF8W/G/2KacM5gFm7nh8hnJ0w6fccw9EOXj0xZMouW4WYXHevrUYA= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1724295759; c=relaxed/simple; bh=TAAEGXCno3Mh0LcApjO9x2v9uPwC0Mcdo20syTvg7qs=; h=DKIM-Signature:From:To:Subject:Date:Message-ID:MIME-Version; b=x7FJPWVfsfZQr6HXYM/Noi3vlfQKLC4F+D1nOgE7Xm1N/0svAE4Rl29wTbaUxtWPJtitUW2/fllZddD4hsivsZ+TmBFIUi0yDrranc6oYvrHtAhtGKAZLEPOGEI9WRurF/FO80LoUlDOq56dgB6oRcyuF8pHVLz+CscksNaYuwY= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1724295757; x=1755831757; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=TAAEGXCno3Mh0LcApjO9x2v9uPwC0Mcdo20syTvg7qs=; b=Re3n1CS5ZtXHoD0VQ/JlbmW3ggqnKcOvA18m6aFXKz4ZmbaF0M5kmPqs GaTsJEH6Rn+Ql8KZ7zZG25PWVGUczaTa9TNFSJoH6GpTbNgwbXF4EGkJC T86kjDihpD95xioI7EXgDXlGTaMPqPMwNGulsItEJFJSdMqFTXQwh/96U Ilueus2qDRDChG/1p0O9g944MdVCFts14GH9ucrLL9BPXuDCCaya9YVPS jnZ7lO9PqipDs69PpWrCpAHM1ZtmihW4Suz5tXYIhK5GlYdwzlrEae0gu BL+BU3RDWALZHUCU4n6Ar/lFLwBmA4jzVl9fZ7PDX/aJXpxl45KunfEnv g==; 
X-CSE-ConnectionGUID: BjmkyB1XRr6lo87kACfqUQ== X-CSE-MsgGUID: EfECJDmzS1S4ROKxVG3v7w== X-IronPort-AV: E=McAfee;i="6700,10204,11171"; a="25581836" X-IronPort-AV: E=Sophos;i="6.10,165,1719903600"; d="scan'208";a="25581836" Received: from orviesa005.jf.intel.com ([10.64.159.145]) by fmvoesa107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 Aug 2024 20:02:37 -0700 X-CSE-ConnectionGUID: N5k7Ce0sQbaDBc2p+eiB8g== X-CSE-MsgGUID: 1CDYTgg2RHa+HH1Ts09SDQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.10,165,1719903600"; d="scan'208";a="66181842" Received: from linux-pnp-server-11.sh.intel.com ([10.239.176.178]) by orviesa005.jf.intel.com with ESMTP; 21 Aug 2024 20:02:35 -0700 From: Wangyang Guo To: libc-alpha@sourceware.org Cc: Noah Goldstein , Tianyou Li , Wangyang Guo Subject: [PATCH 5/6] malloc: Add tcache path for calloc Date: Thu, 22 Aug 2024 10:59:20 +0800 Message-ID: <20240822025921.3120998-6-wangyang.guo@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240822025921.3120998-1-wangyang.guo@intel.com> References: <20240822025921.3120998-1-wangyang.guo@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-12.1 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: libc-alpha-bounces~incoming=patchwork.ozlabs.org@sourceware.org This commit adds tcache support to calloc(), which can significantly improve the performance of small allocations, especially in multi-threaded scenarios. clear_mem() is also split out as a helper function so that the clearing code can be reused. 
Result of bench-malloc-thread benchmark Test Platform: Xeon-8380 Bench Function: calloc Ratio: New / Original time_per_iteration (Lower is Better) Threads# | Ratio -----------|------ 1 thread | 0.724 4 threads | 0.534 Signed-off-by: Wangyang Guo --- malloc/malloc.c | 111 ++++++++++++++++++++++++++++++------------------ 1 file changed, 70 insertions(+), 41 deletions(-) diff --git a/malloc/malloc.c b/malloc/malloc.c index 030aff093b..19fdd72444 100644 --- a/malloc/malloc.c +++ b/malloc/malloc.c @@ -3755,16 +3755,55 @@ __libc_pvalloc (size_t bytes) return _mid_memalign (pagesize, rounded_bytes, address); } +static __always_inline void * +clear_mem (void *mem, INTERNAL_SIZE_T csz) +{ + INTERNAL_SIZE_T *d; + unsigned long clearsize, nclears; + + /* Unroll clear of <= 36 bytes (72 if 8byte sizes). We know that + contents have an odd number of INTERNAL_SIZE_T-sized words; + minimally 3. */ + d = (INTERNAL_SIZE_T *) mem; + clearsize = csz - SIZE_SZ; + nclears = clearsize / sizeof (INTERNAL_SIZE_T); + assert (nclears >= 3); + + if (nclears > 9) + return memset (d, 0, clearsize); + + else + { + *(d + 0) = 0; + *(d + 1) = 0; + *(d + 2) = 0; + if (nclears > 4) + { + *(d + 3) = 0; + *(d + 4) = 0; + if (nclears > 6) + { + *(d + 5) = 0; + *(d + 6) = 0; + if (nclears > 8) + { + *(d + 7) = 0; + *(d + 8) = 0; + } + } + } + } + + return mem; +} + void * __libc_calloc (size_t n, size_t elem_size) { mstate av; - mchunkptr oldtop; - INTERNAL_SIZE_T sz, oldtopsize; + mchunkptr oldtop, p; + INTERNAL_SIZE_T sz, oldtopsize, csz; void *mem; - unsigned long clearsize; - unsigned long nclears; - INTERNAL_SIZE_T *d; ptrdiff_t bytes; if (__glibc_unlikely (__builtin_mul_overflow (n, elem_size, &bytes))) @@ -3780,6 +3819,29 @@ __libc_calloc (size_t n, size_t elem_size) MAYBE_INIT_TCACHE (); +#if USE_TCACHE + /* int_free also calls request2size, be careful to not pad twice. 
*/ + size_t tbytes = checked_request2size (bytes); + if (tbytes == 0) + { + __set_errno (ENOMEM); + return NULL; + } + size_t tc_idx = csize2tidx (tbytes); + + if (tc_idx < mp_.tcache_bins + && tcache != NULL + && tcache->counts[tc_idx] > 0) + { + mem = tcache_get (tc_idx); + p = mem2chunk (mem); + if (__glibc_unlikely (mtag_enabled)) + return tag_new_zero_region (mem, memsize (p)); + csz = chunksize (p); + return clear_mem (mem, csz); + } +#endif + if (SINGLE_THREAD_P) av = &main_arena; else @@ -3834,7 +3896,7 @@ __libc_calloc (size_t n, size_t elem_size) if (mem == 0) return 0; - mchunkptr p = mem2chunk (mem); + p = mem2chunk (mem); /* If we are using memory tagging, then we need to set the tags regardless of MORECORE_CLEARS, so we zero the whole block while @@ -3842,7 +3904,7 @@ __libc_calloc (size_t n, size_t elem_size) if (__glibc_unlikely (mtag_enabled)) return tag_new_zero_region (mem, memsize (p)); - INTERNAL_SIZE_T csz = chunksize (p); + csz = chunksize (p); /* Two optional cases in which clearing not necessary */ if (chunk_is_mmapped (p)) @@ -3861,40 +3923,7 @@ __libc_calloc (size_t n, size_t elem_size) } #endif - /* Unroll clear of <= 36 bytes (72 if 8byte sizes). We know that - contents have an odd number of INTERNAL_SIZE_T-sized words; - minimally 3. 
*/ - d = (INTERNAL_SIZE_T *) mem; - clearsize = csz - SIZE_SZ; - nclears = clearsize / sizeof (INTERNAL_SIZE_T); - assert (nclears >= 3); - - if (nclears > 9) - return memset (d, 0, clearsize); - - else - { - *(d + 0) = 0; - *(d + 1) = 0; - *(d + 2) = 0; - if (nclears > 4) - { - *(d + 3) = 0; - *(d + 4) = 0; - if (nclears > 6) - { - *(d + 5) = 0; - *(d + 6) = 0; - if (nclears > 8) - { - *(d + 7) = 0; - *(d + 8) = 0; - } - } - } - } - - return mem; + return clear_mem (mem, csz); } #endif /* IS_IN (libc) */ From patchwork Thu Aug 22 02:59:21 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Guo, Wangyang" X-Patchwork-Id: 1975217 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@legolas.ozlabs.org Authentication-Results: legolas.ozlabs.org; dkim=pass (2048-bit key; unprotected) header.d=intel.com header.i=@intel.com header.a=rsa-sha256 header.s=Intel header.b=OB6O6+Gn; dkim-atps=neutral Authentication-Results: legolas.ozlabs.org; spf=pass (sender SPF authorized) smtp.mailfrom=sourceware.org (client-ip=8.43.85.97; helo=server2.sourceware.org; envelope-from=libc-alpha-bounces~incoming=patchwork.ozlabs.org@sourceware.org; receiver=patchwork.ozlabs.org) Received: from server2.sourceware.org (server2.sourceware.org [8.43.85.97]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature ECDSA (secp384r1) server-digest SHA384) (No client certificate requested) by legolas.ozlabs.org (Postfix) with ESMTPS id 4Wq7Lz6Y4Cz1ybW for ; Thu, 22 Aug 2024 13:03:39 +1000 (AEST) Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 2E8F53870C06 for ; Thu, 22 Aug 2024 03:03:38 +0000 (GMT) X-Original-To: libc-alpha@sourceware.org Delivered-To: libc-alpha@sourceware.org Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.13]) by sourceware.org (Postfix) with ESMTPS id 
D79AC38708E8 for ; Thu, 22 Aug 2024 03:02:38 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org D79AC38708E8 Authentication-Results: sourceware.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=intel.com ARC-Filter: OpenARC Filter v1.0.0 sourceware.org D79AC38708E8 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=192.198.163.13 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1724295760; cv=none; b=JlRUqSplGV96bFxlx0ZtOaAMVqclTRhcoZZ/JozEUnAasB74Pzxj1PDWYZgK8UnQmuCKOhj+Gh64GY6htH9SVxlo5LmQGzCq2txxGS3MVzFjH7aI/AMBiPC09fifWnYTegDMFRjVm1plnmNyYnGWfZRIesD4O9W0XXPCb9cFNBU= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1724295760; c=relaxed/simple; bh=qCQmdR5VtkceZps2nOaI9JWENe7XhR1cKy3ICZ0ChJw=; h=DKIM-Signature:From:To:Subject:Date:Message-ID:MIME-Version; b=mBVH9DVdEVegTA1CoBeIOgmP9vzwzlePtj0kt1tledsoh3OR+0wdIectDdyIaHBejHGVelU6Y3ylxQB+bxzjmEgsXCsmm7EXbC65gVDdWYMMa1Fm11q82zF0lj6/wNtcUP3xBuCq48wN97SXs118T2kbQbzAJoj++qbrmWzMphw= ARC-Authentication-Results: i=1; server2.sourceware.org DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1724295759; x=1755831759; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=qCQmdR5VtkceZps2nOaI9JWENe7XhR1cKy3ICZ0ChJw=; b=OB6O6+Gnv7LeOgHei9IyVrRGJ/5IgFcjbMB71kSpnKsCM81blW+BvbQS +jK5LiR7d9vM/xzIbDd/PLuMgyXbVEGBjME1Ii36qwTMxrBKsRGTBBDsW u4WqwyNWDGU1bAP5jc9mRrm5G44m0mdC+iKLoTqils2iPbXwjl+/KgpOs EQexNb36W4cIygPpShmVFVaSNrunXUeLi4aYl9VguSR8d7dVI2BWIVKDb eAWlvR0U+AKstuCbA7mH+KAei3AjyWqrj1PGZecf8FDKjehjvQyrtjW6I VUwErQkZsFFVmm4eb70IOC0A7dVaRaqknfgOm967dzC3hzeGCglXAe8ZR g==; X-CSE-ConnectionGUID: pOHJJyaXSiOzZT87C2xr2g== X-CSE-MsgGUID: TEFRcqThQXKCbEEjPj5vhg== X-IronPort-AV: E=McAfee;i="6700,10204,11171"; a="25581840" X-IronPort-AV: E=Sophos;i="6.10,165,1719903600"; d="scan'208";a="25581840" Received: from 
orviesa005.jf.intel.com ([10.64.159.145]) by fmvoesa107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 Aug 2024 20:02:38 -0700 X-CSE-ConnectionGUID: y294rvrzQG+rD1nclYO7uw== X-CSE-MsgGUID: S0+ZpDoKT2i2x/RvlJ7m+g== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.10,165,1719903600"; d="scan'208";a="66181846" Received: from linux-pnp-server-11.sh.intel.com ([10.239.176.178]) by orviesa005.jf.intel.com with ESMTP; 21 Aug 2024 20:02:37 -0700 From: Wangyang Guo To: libc-alpha@sourceware.org Cc: Noah Goldstein , Tianyou Li , Wangyang Guo Subject: [PATCH 6/6] malloc: Fix tst-safe-linking failure after enable tcache Date: Thu, 22 Aug 2024 10:59:21 +0800 Message-ID: <20240822025921.3120998-7-wangyang.guo@intel.com> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20240822025921.3120998-1-wangyang.guo@intel.com> References: <20240822025921.3120998-1-wangyang.guo@intel.com> MIME-Version: 1.0 X-Spam-Status: No, score=-12.1 required=5.0 tests=BAYES_00, DKIMWL_WL_HIGH, DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, DKIM_VALID_EF, GIT_PATCH_0, SPF_HELO_NONE, SPF_NONE, TXREP, T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: libc-alpha@sourceware.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Libc-alpha mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: libc-alpha-bounces~incoming=patchwork.ozlabs.org@sourceware.org Previously, calloc() was used as a way to bypass tcache in memory allocation and trigger the safe-linking check in the fastbin path. With tcache enabled in calloc(), the test needs extra workarounds to bypass tcache. 
Signed-off-by: Wangyang Guo --- malloc/tst-safe-linking.c | 81 ++++++++++++++++++++++++++++++++------- 1 file changed, 68 insertions(+), 13 deletions(-) diff --git a/malloc/tst-safe-linking.c b/malloc/tst-safe-linking.c index 01dd07004d..5302575ad1 100644 --- a/malloc/tst-safe-linking.c +++ b/malloc/tst-safe-linking.c @@ -111,22 +111,37 @@ test_fastbin (void *closure) int i; int mask = ((int *)closure)[0]; size_t size = TCACHE_ALLOC_SIZE; + void * ps[TCACHE_FILL_COUNT]; + void * pps[TCACHE_FILL_COUNT]; printf ("++ fastbin ++\n"); + /* Populate the fastbin list. */ + void * volatile a = calloc (1, size); + void * volatile b = calloc (1, size); + void * volatile c = calloc (1, size); + printf ("a=%p, b=%p, c=%p\n", a, b, c); + + /* Chunks for later tcache filling from fastbins. */ + for (i = 0; i < TCACHE_FILL_COUNT; ++i) + { + void * volatile p = calloc (1, size); + pps[i] = p; + } + /* Take the tcache out of the game. */ for (i = 0; i < TCACHE_FILL_COUNT; ++i) { void * volatile p = calloc (1, size); - printf ("p=%p\n", p); - free (p); + ps[i] = p; } - /* Populate the fastbin list. */ - void * volatile a = calloc (1, size); - void * volatile b = calloc (1, size); - void * volatile c = calloc (1, size); - printf ("a=%p, b=%p, c=%p\n", a, b, c); + for (i = 0; i < TCACHE_FILL_COUNT; ++i) + { + free (ps[i]); + } + + /* Free abc will return to fastbin in FIFO order. */ free (a); free (b); free (c); @@ -136,11 +151,43 @@ test_fastbin (void *closure) memset (c, mask & 0xFF, size); printf ("After: c=%p, c[0]=%p\n", c, ((void **)c)[0]); + /* Filling fastbins, will be copied to tcache later. */ + for (i = 0; i < TCACHE_FILL_COUNT; ++i) + { + free (pps[i]); + } + + /* Drain out tcache to make sure later alloc from fastbins. */ + for (i = 0; i < TCACHE_FILL_COUNT; ++i) + { + void * volatile p = calloc (1, size); + ps[i] = p; + } + + /* This line will also filling tcache with remain pps and c. 
*/ + pps[TCACHE_FILL_COUNT - 1] = calloc (1, size); + + /* Tcache is FILO, now the first one is c, take it out. */ c = calloc (1, size); printf ("Allocated: c=%p\n", c); + + /* Drain out remain pps from tcache. */ + for (i = 0; i < TCACHE_FILL_COUNT - 1; ++i) + { + void * volatile p = calloc (1, size); + pps[i] = p; + } + /* This line will trigger the Safe-Linking check. */ b = calloc (1, size); printf ("b=%p\n", b); + + /* Free previous pointers. */ + for (i = 0; i < TCACHE_FILL_COUNT; ++i) + { + free (ps[i]); + free (pps[i]); + } } /* Try corrupting the fastbin list and trigger a consolidate. */ @@ -150,21 +197,29 @@ test_fastbin_consolidate (void *closure) int i; int mask = ((int*)closure)[0]; size_t size = TCACHE_ALLOC_SIZE; + void * ps[TCACHE_FILL_COUNT]; printf ("++ fastbin consolidate ++\n"); + /* Populate the fastbin list. */ + void * volatile a = calloc (1, size); + void * volatile b = calloc (1, size); + void * volatile c = calloc (1, size); + printf ("a=%p, b=%p, c=%p\n", a, b, c); + /* Take the tcache out of the game. */ for (i = 0; i < TCACHE_FILL_COUNT; ++i) { void * volatile p = calloc (1, size); - free (p); + ps[i] = p; } - /* Populate the fastbin list. */ - void * volatile a = calloc (1, size); - void * volatile b = calloc (1, size); - void * volatile c = calloc (1, size); - printf ("a=%p, b=%p, c=%p\n", a, b, c); + for (i = 0; i < TCACHE_FILL_COUNT; ++i) + { + free (ps[i]); + } + + /* Free abc will return to fastbin. */ free (a); free (b); free (c);