From patchwork Tue Jul 19 15:32:41 2016
X-Patchwork-Submitter: Wilco Dijkstra
X-Patchwork-Id: 650258
From: Wilco Dijkstra
To: GCC Patches
CC: nd, James Greenhalgh
Subject: [PATCH 3/3][AArch64] Improve zero extend
Date: Tue, 19 Jul 2016 15:32:41 +0000

On AArch64 the UXTB and UXTH instructions are aliases of UBFM, which does a
shift as part of its operation. An AND immediate is a simpler operation and
might be faster on some implementations, so it is better to emit an AND
instead of UBFM.

Benchmarking showed no difference on implementations where UBFM has the same
performance as AND, and minor speedups across several benchmarks on an
implementation where UBFM is slower than AND.

Bootstrapped and tested on aarch64-none-elf.

2016-07-19  Kristina Martsenko
2016-07-19  Wilco Dijkstra

* config/aarch64/aarch64.md (zero_extend2_aarch64): Change output
  statement and type.
  (qihi2_aarch64): Likewise, and split into two.
  (extendqihi2_aarch64): New.
  (zero_extendqihi2_aarch64): New.
* config/aarch64/iterators.md (ldrxt): Remove.
* config/aarch64/aarch64.c (aarch64_rtx_costs): Change cost of
  uxtb/uxth.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index c7249e8e98905bea4879bb2e2ee81d51a1004faa..e98e41521bfa8f807248b0147843de9e1f3447e3 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -6886,8 +6886,8 @@ cost_plus:
 	    }
 	  else
 	    {
-	      /* UXTB/UXTH.  */
-	      *cost += extra_cost->alu.extend;
+	      /* We generate an AND instead of UXTB/UXTH.  */
+	      *cost += extra_cost->alu.logical;
 	    }
 	}
       return false;
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md
index 64f9ca1c4d1bec64cef769c9dbef9e4b5b00ba9e..5e8b1a815515eabc7e69c75574c2c300f50a6fe4 100644
--- a/gcc/config/aarch64/aarch64.md
+++ b/gcc/config/aarch64/aarch64.md
@@ -1580,10 +1580,10 @@
         (zero_extend:GPI (match_operand:SHORT 1 "nonimmediate_operand" "r,m,m")))]
   ""
   "@
-   uxt\t%0, %w1
+   and\t%0, %1,
    ldr\t%w0, %1
    ldr\t%0, %1"
-  [(set_attr "type" "extend,load1,load1")]
+  [(set_attr "type" "logic_imm,load1,load1")]
 )
 
 (define_expand "qihi2"
@@ -1592,16 +1592,26 @@
   ""
 )
 
-(define_insn "*qihi2_aarch64"
+(define_insn "*extendqihi2_aarch64"
   [(set (match_operand:HI 0 "register_operand" "=r,r")
-        (ANY_EXTEND:HI (match_operand:QI 1 "nonimmediate_operand" "r,m")))]
+        (sign_extend:HI (match_operand:QI 1 "nonimmediate_operand" "r,m")))]
   ""
   "@
-   xtb\t%w0, %w1
-   b\t%w0, %1"
+   sxtb\t%w0, %w1
+   ldrsb\t%w0, %1"
   [(set_attr "type" "extend,load1")]
 )
 
+(define_insn "*zero_extendqihi2_aarch64"
+  [(set (match_operand:HI 0 "register_operand" "=r,r")
+        (zero_extend:HI (match_operand:QI 1 "nonimmediate_operand" "r,m")))]
+  ""
+  "@
+   and\t%w0, %w1, 255
+   ldrb\t%w0, %1"
+  [(set_attr "type" "logic_imm,load1")]
+)
+
 ;; -------------------------------------------------------------------
 ;; Simple arithmetic
 ;; -------------------------------------------------------------------
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index e8fbb1281dec2e8f37f58ef2ced792dd62e3b5aa..ef48ffda6f98a2d4aa29daaca206fef2bafcda48 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -888,9 +888,6 @@
 ;; Similar, but when not(op)
 (define_code_attr nlogical [(and "bic") (ior "orn") (xor "eon")])
 
-;; Sign- or zero-extending load
-(define_code_attr ldrxt [(sign_extend "ldrs") (zero_extend "ldr")])
-
 ;; Sign- or zero-extending data-op
 (define_code_attr su [(sign_extend "s") (zero_extend "u")
 		      (sign_extract "s") (zero_extract "u")
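
For readers who want to convince themselves that the substitution is value-preserving, here is a minimal C sketch of the equivalence the patch relies on: zero-extending a byte or halfword gives the same result as an AND with 0xff or 0xffff, and as an unsigned bitfield extract starting at bit 0. The function names (zext8_cast, zext8_and, bitfield_extract, forms_agree and so on) are hypothetical and exist only for this illustration; none of them come from GCC, and the sketch says nothing about which instruction a given compiler actually emits.

```c
#include <assert.h>
#include <stdint.h>

/* Zero-extension as written in source: narrow, then widen implicitly. */
static inline uint32_t zext8_cast(uint32_t x)  { return (uint8_t)x; }
static inline uint32_t zext16_cast(uint32_t x) { return (uint16_t)x; }

/* The same value computed as an AND with an immediate mask, which is the
   form the patch emits for register operands. */
static inline uint32_t zext8_and(uint32_t x)  { return x & 0xffu; }
static inline uint32_t zext16_and(uint32_t x) { return x & 0xffffu; }

/* A simplified model of an unsigned bitfield extract: shift right by lsb,
   then keep the low `width` bits. Assumes lsb < 32 and 1 <= width <= 32.
   UXTB corresponds to lsb = 0, width = 8; UXTH to lsb = 0, width = 16. */
static inline uint32_t bitfield_extract(uint32_t x, unsigned lsb, unsigned width)
{
    uint32_t mask = (width >= 32) ? UINT32_MAX : ((1u << width) - 1u);
    return (x >> lsb) & mask;
}

/* Returns 1 when all three formulations agree for x, 0 otherwise. */
static inline int forms_agree(uint32_t x)
{
    int byte_ok = zext8_cast(x) == zext8_and(x)
               && zext8_and(x) == bitfield_extract(x, 0, 8);
    int half_ok = zext16_cast(x) == zext16_and(x)
               && zext16_and(x) == bitfield_extract(x, 0, 16);
    return byte_ok && half_ok;
}
```

Calling forms_agree on any 32-bit value should return 1. Because the shift amount is zero in the UXTB/UXTH case, the shift contributes nothing to the result, which is why a plain AND immediate can stand in for it.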