From patchwork Fri Jan 5 12:22:44 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Wilco Dijkstra
X-Patchwork-Id: 856029
From: Wilco Dijkstra
To: GCC Patches
CC: nd
Subject: [PATCH][AArch64] Use LDP/STP in shrinkwrapping
Date: Fri, 5 Jan 2018 12:22:44 +0000

The shrinkwrap optimization added late in GCC 7 allows each callee-save
to be delayed and done only across the blocks which need that particular
callee-save.  Although this reduces unnecessary memory traffic on code
paths that need few callee-saves, it typically uses LDR/STR rather than
LDP/STP.  As a result the number of LDP/STP instructions is reduced by
~7%, which means more memory accesses and a codesize increase of ~1.0%
on average.

To improve this, if a particular callee-save must be saved/restored,
also add the adjacent callee-save to allow use of LDP/STP.  This
significantly reduces codesize (for example gcc_r, povray_r, parest_r,
xalancbmk_r are 1% smaller).  This is a simple fix which can be backported.
A more advanced approach would scan blocks for pairs of callee-saves,
but that requires a rewrite of all the callee-save code which is too
late at this stage.

An example epilog in a shrinkwrapped function before:

	ldp	x21, x22, [sp,#16]
	ldr	x23, [sp,#32]
	ldr	x24, [sp,#40]
	ldp	x25, x26, [sp,#48]
	ldr	x27, [sp,#64]
	ldr	x28, [sp,#72]
	ldr	x30, [sp,#80]
	ldr	d8, [sp,#88]
	ldp	x19, x20, [sp],#96
	ret

And after this patch:

	ldr	d8, [sp,#88]
	ldp	x21, x22, [sp,#16]
	ldp	x23, x24, [sp,#32]
	ldp	x25, x26, [sp,#48]
	ldp	x27, x28, [sp,#64]
	ldr	x30, [sp,#80]
	ldp	x19, x20, [sp],#96
	ret

Passes bootstrap, OK for commit (and backport to GCC7)?

ChangeLog:
2018-01-05  Wilco Dijkstra

	* config/aarch64/aarch64.c (aarch64_components_for_bb):
	Increase LDP/STP opportunities by adding adjacent callee-saves.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 9735fc18402dd8fe2fa4022eef4c0522814a0552..da21032b19413d0361b8d30b51a31124eaaa31a1 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -3503,7 +3503,22 @@ aarch64_components_for_bb (basic_block bb)
         && (bitmap_bit_p (in, regno)
             || bitmap_bit_p (gen, regno)
             || bitmap_bit_p (kill, regno)))
-      bitmap_set_bit (components, regno);
+      {
+        unsigned regno2, offset, offset2;
+        bitmap_set_bit (components, regno);
+
+        /* If there is a callee-save at an adjacent offset, add it too
+           to increase the use of LDP/STP.  */
+        offset = cfun->machine->frame.reg_offset[regno];
+        regno2 = ((offset & 8) == 0) ? regno + 1 : regno - 1;
+
+        if (regno2 <= LAST_SAVED_REGNUM)
+          {
+            offset2 = cfun->machine->frame.reg_offset[regno2];
+            if ((offset & ~8) == (offset2 & ~8))
+              bitmap_set_bit (components, regno2);
+          }
+      }
 
   return components;
 }
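
For reviewers who want to check the offset arithmetic in isolation, here
is a minimal standalone sketch of the pairing test.  It is not GCC code:
the 8-entry reg_offset table, the NOT_SAVED sentinel and the
pair_partner helper are all hypothetical, and the explicit checks for
unsaved registers are an assumption of this toy model rather than
something taken from the patch.  It only illustrates that bit 3 of the
offset selects the neighbouring register and that masking with ~8
compares the 16-byte slot.

```c
#include <assert.h>

/* Toy model only: these names are hypothetical, not GCC's.  */
#define NUM_REGS 8
#define NOT_SAVED (-1)   /* assumed sentinel: register has no save slot */

/* Byte offset of each register's 8-byte save slot, or NOT_SAVED.
   Register 3 is not saved, so register 4 takes the next free slot.  */
static const int reg_offset[NUM_REGS] = {
  16, 24, 32, NOT_SAVED, 40, 48, 56, 64
};

/* Return the register that can share an LDP/STP with REGNO, or -1.
   Mirrors the test in the patch: bit 3 of the offset says whether the
   candidate partner is the next or the previous register, and both
   offsets must fall in the same 16-byte aligned slot.  */
static int
pair_partner (int regno)
{
  assert (regno >= 0 && regno < NUM_REGS);

  int offset = reg_offset[regno];
  if (offset == NOT_SAVED)
    return -1;

  /* Low half of a slot pairs upwards, high half pairs downwards.  */
  int regno2 = ((offset & 8) == 0) ? regno + 1 : regno - 1;
  if (regno2 < 0 || regno2 >= NUM_REGS)
    return -1;

  int offset2 = reg_offset[regno2];
  if (offset2 == NOT_SAVED)
    return -1;

  /* Clearing bit 3 maps both halves of a slot to the same value.  */
  return ((offset & ~8) == (offset2 & ~8)) ? regno2 : -1;
}
```

With this table, pair_partner (0) and pair_partner (5) find partners 1
and 6, while pair_partner (2) finds none because register 3 is unsaved.
Registers 2 and 4 do share a 16-byte slot in this toy layout, but they
are not neighbouring register numbers, so the simple check does not
pair them.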