From patchwork Thu Oct 12 09:35:09 2017
X-Patchwork-Submitter: Wilco Dijkstra
X-Patchwork-Id: 824733
From: Wilco Dijkstra
To: "libc-alpha@sourceware.org"
CC: nd
Subject: [PATCH 0/5] Add single-threaded fast path to malloc
Date: Thu, 12 Oct 2017 09:35:09 +0000

This patch series significantly speeds up malloc by adding fast paths for
single-threaded applications. When only one thread is running we can execute
a simpler, faster path. Doing this at a high level means we only need a
single check, which bypasses multiple locks, atomic instructions and related
complexity.

The speedup for fastbin allocations on AArch64 is about 2.4x. The number of
atomic operations is now zero in single-threaded scenarios.

bench-malloc-thread is 11% faster for 1 thread and 5% faster with 8-32
threads. The new bench-malloc-simple shows a speedup of ~2x for small blocks
on AArch64 in the new single-threaded paths, while the multi-threaded path
ranges from a few percent faster to a few percent slower. On x64 the gain is
~1.5x and 1.1x respectively. Below are a few typical results for 16-byte
blocks; the first set is without the series, the second with it applied:

    "malloc_block_size_0016": {
      "st_num_allocs_0025_time": 53.7072,
      "st_num_allocs_0100_time": 56.3621,
      "st_num_allocs_0400_time": 56.9713,
      "st_num_allocs_1000_time": 57.5048,
      "mt_num_allocs_0025_time": 93.9841,
      "mt_num_allocs_0100_time": 108.32,
      "mt_num_allocs_0400_time": 112.172,
      "mt_num_allocs_1000_time": 113.499
    },

    "malloc_block_size_0016": {
      "st_num_allocs_0025_time": 37.4045,
      "st_num_allocs_0100_time": 36.7491,
      "st_num_allocs_0400_time": 37.0647,
      "st_num_allocs_1000_time": 37.4376,
      "mt_num_allocs_0025_time": 84.2352,
      "mt_num_allocs_0100_time": 97.7985,
      "mt_num_allocs_0400_time": 101.453,
      "mt_num_allocs_1000_time": 102.858
    },

Patch 1: Inline tcache functions
Patch 2: Add single-threaded path to malloc/realloc/calloc/memalloc
Patch 3: Add single-threaded path to _int_free
Patch 4: Fix deadlock in _int_free consistency check
Patch 5: Add single-threaded path to _int_malloc

 malloc/malloc.c | 202 +++++++++++++++++++++++++++++++++++++++-----------------
 1 file changed, 141 insertions(+), 61 deletions(-)
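
To illustrate the idea of a single high-level check that skips the locking, here is a minimal sketch. It is not the code from the patches: the toy free list, the single_threaded flag and the names toy_free/toy_alloc are hypothetical stand-ins, and a mutex is used for the multi-threaded path purely to keep the example short.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag: true until the process creates a second thread.
   In this sketch it is a plain variable set by the caller.  */
static bool single_threaded = true;

/* A toy LIFO free list standing in for one bin of free chunks.  */
struct chunk { struct chunk *next; };

struct bin
{
  struct chunk *head;
  pthread_mutex_t lock;
};

static struct bin toy_bin = { NULL, PTHREAD_MUTEX_INITIALIZER };

/* Plain push and pop with no synchronisation at all.  */
static void
push_unlocked (struct bin *b, struct chunk *c)
{
  c->next = b->head;
  b->head = c;
}

static struct chunk *
pop_unlocked (struct bin *b)
{
  struct chunk *c = b->head;
  if (c != NULL)
    b->head = c->next;
  return c;
}

/* One check at the top selects the path; the single-threaded branch
   touches no lock and executes no atomic instruction.  */
static void
toy_free (struct bin *b, struct chunk *c)
{
  if (single_threaded)
    {
      push_unlocked (b, c);
      return;
    }
  pthread_mutex_lock (&b->lock);
  push_unlocked (b, c);
  pthread_mutex_unlock (&b->lock);
}

static struct chunk *
toy_alloc (struct bin *b)
{
  if (single_threaded)
    return pop_unlocked (b);

  pthread_mutex_lock (&b->lock);
  struct chunk *c = pop_unlocked (b);
  pthread_mutex_unlock (&b->lock);
  return c;
}
```

In this sketch the flag would have to be cleared before a second thread can start running, so that no thread ever takes the unlocked branch concurrently with another; how that transition is detected is outside the scope of the example.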