From patchwork Mon Oct 24 14:28:07 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Wilco Dijkstra
X-Patchwork-Id: 685883
From: Wilco Dijkstra
To: GCC Patches
CC: nd, Richard Earnshaw
Subject: [PATCH][ARM] Avoid partial overlaps in DImode shifts
Date: Mon, 24 Oct 2016 14:28:07 +0000

With -mfpu=neon, DImode shifts are expanded after reload. The DImode source
and destination registers can overlap either fully or partially. However, the
shift expansion code can only handle the full-overlap case, and generates
incorrect code for partial overlaps. The fix is to add new variants that
support either full overlap or no overlap.

Bootstrap & regress on arm-linux-gnueabihf OK. This will need backporting
to all active branches.

ChangeLog:
2016-10-20  Wilco Dijkstra

    gcc/
        PR target/78041
        * config/arm/neon.md (ashldi3_neon): Add "r 0 i" and "&r r i" variants.
        Remove partial overlap check for shift by 1.
        (ashldi3_neon): Likewise.

    testsuite/
        * gcc.target/arm/pr78041.c: New test.

diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 05323334ffd81aeff33ee407b96c788d123b3fe3..59316de004107913c1db0951ced4d584999fc099 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -1143,12 +1143,12 @@
 )
 
 (define_insn_and_split "ashldi3_neon"
-  [(set (match_operand:DI 0 "s_register_operand" "= w, w,?&r,?r, ?w,w")
-        (ashift:DI (match_operand:DI 1 "s_register_operand" " 0w, w, 0r, r, 0w,w")
-                   (match_operand:SI 2 "general_operand" "rUm, i, r, i,rUm,i")))
-   (clobber (match_scratch:SI 3 "= X, X,?&r, X, X,X"))
-   (clobber (match_scratch:SI 4 "= X, X,?&r, X, X,X"))
-   (clobber (match_scratch:DI 5 "=&w, X, X, X, &w,X"))
+  [(set (match_operand:DI 0 "s_register_operand" "= w, w,?&r,?r,?&r, ?w,w")
+        (ashift:DI (match_operand:DI 1 "s_register_operand" " 0w, w, 0r, 0, r, 0w,w")
+                   (match_operand:SI 2 "general_operand" "rUm, i, r, i, i,rUm,i")))
+   (clobber (match_scratch:SI 3 "= X, X,?&r, X, X, X,X"))
+   (clobber (match_scratch:SI 4 "= X, X,?&r, X, X, X,X"))
+   (clobber (match_scratch:DI 5 "=&w, X, X, X, X, &w,X"))
    (clobber (reg:CC_C CC_REGNUM))]
   "TARGET_NEON"
   "#"
@@ -1180,9 +1180,11 @@
       }
     else
       {
-        if (operands[2] == CONST1_RTX (SImode)
-            && (!reg_overlap_mentioned_p (operands[0], operands[1])
-                || REGNO (operands[0]) == REGNO (operands[1])))
+        /* The shift expanders support either full overlap or no overlap.  */
+        gcc_assert (!reg_overlap_mentioned_p (operands[0], operands[1])
+                    || REGNO (operands[0]) == REGNO (operands[1]));
+
+        if (operands[2] == CONST1_RTX (SImode))
           /* This clobbers CC.  */
           emit_insn (gen_arm_ashldi3_1bit (operands[0], operands[1]));
         else
@@ -1191,8 +1193,8 @@
       }
     DONE;
   }"
-  [(set_attr "arch" "neon_for_64bits,neon_for_64bits,*,*,avoid_neon_for_64bits,avoid_neon_for_64bits")
-   (set_attr "opt" "*,*,speed,speed,*,*")
+  [(set_attr "arch" "neon_for_64bits,neon_for_64bits,*,*,*,avoid_neon_for_64bits,avoid_neon_for_64bits")
+   (set_attr "opt" "*,*,speed,speed,speed,*,*")
    (set_attr "type" "multiple")]
 )
 
@@ -1241,12 +1243,12 @@
 ;; ashrdi3_neon
 ;; lshrdi3_neon
 (define_insn_and_split "<shift>di3_neon"
-  [(set (match_operand:DI 0 "s_register_operand" "= w, w,?&r,?r,?w,?w")
-        (RSHIFTS:DI (match_operand:DI 1 "s_register_operand" " 0w, w, 0r, r,0w, w")
-                    (match_operand:SI 2 "reg_or_int_operand" " r, i, r, i, r, i")))
-   (clobber (match_scratch:SI 3 "=2r, X, &r, X,2r, X"))
-   (clobber (match_scratch:SI 4 "= X, X, &r, X, X, X"))
-   (clobber (match_scratch:DI 5 "=&w, X, X, X,&w, X"))
+  [(set (match_operand:DI 0 "s_register_operand" "= w, w,?&r,?r,?&r,?w,?w")
+        (RSHIFTS:DI (match_operand:DI 1 "s_register_operand" " 0w, w, 0r, 0, r,0w, w")
+                    (match_operand:SI 2 "reg_or_int_operand" " r, i, r, i, i, r, i")))
+   (clobber (match_scratch:SI 3 "=2r, X, &r, X, X,2r, X"))
+   (clobber (match_scratch:SI 4 "= X, X, &r, X, X, X, X"))
+   (clobber (match_scratch:DI 5 "=&w, X, X, X, X,&w, X"))
    (clobber (reg:CC CC_REGNUM))]
   "TARGET_NEON"
   "#"
@@ -1282,9 +1284,11 @@
       }
     else
       {
-        if (operands[2] == CONST1_RTX (SImode)
-            && (!reg_overlap_mentioned_p (operands[0], operands[1])
-                || REGNO (operands[0]) == REGNO (operands[1])))
+        /* The shift expanders support either full overlap or no overlap.  */
+        gcc_assert (!reg_overlap_mentioned_p (operands[0], operands[1])
+                    || REGNO (operands[0]) == REGNO (operands[1]));
+
+        if (operands[2] == CONST1_RTX (SImode))
           /* This clobbers CC.  */
           emit_insn (gen_arm_<shift>di3_1bit (operands[0], operands[1]));
         else
@@ -1295,8 +1299,8 @@
 
     DONE;
   }"
-  [(set_attr "arch" "neon_for_64bits,neon_for_64bits,*,*,avoid_neon_for_64bits,avoid_neon_for_64bits")
-   (set_attr "opt" "*,*,speed,speed,*,*")
+  [(set_attr "arch" "neon_for_64bits,neon_for_64bits,*,*,*,avoid_neon_for_64bits,avoid_neon_for_64bits")
+   (set_attr "opt" "*,*,speed,speed,speed,*,*")
    (set_attr "type" "multiple")]
 )
 
diff --git a/gcc/testsuite/gcc.target/arm/pr78041.c b/gcc/testsuite/gcc.target/arm/pr78041.c
new file mode 100644
index 0000000000000000000000000000000000000000..340ab5cb433b28ca7d47e236fee93581e7c195c4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pr78041.c
@@ -0,0 +1,20 @@
+/* { dg-require-effective-target arm_thumb2_ok } */
+/* { dg-require-effective-target arm_neon_ok } */
+/* { dg-options "-fno-inline -mthumb -O1 -mfpu=neon -w" } */
+
+extern void abort (void);
+
+register long long x asm ("r1");
+
+long long f (void)
+{
+  return x << 5;
+}
+
+int main ()
+{
+  x = 0x0100000001;
+  if (f () != 0x2000000020)
+    abort ();
+  return 0;
+}
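
For readers who want to see why a partial overlap breaks a shift expanded in core registers, here is a small model in plain C. It is only a sketch under stated assumptions: the register array `reg`, the helpers `shl64_in_regs` and `run`, and the particular order of the three word operations are hypothetical and chosen for illustration; they are not the code in neon.md or in the ARM back end. The model treats a DImode value as a pair of adjacent 32-bit registers (low word first) and shifts it left without a temporary.

```c
#include <assert.h>
#include <stdint.h>

/* Toy register file standing in for r0..r3 (an assumption of this sketch). */
static uint32_t reg[4];

/* Hypothetical helper: shift the 64-bit value held in the pair
   reg[src] (low word) / reg[src + 1] (high word) left by n bits and
   write the result to the pair starting at reg[dst].  No temporary is
   used, so the sequence is only safe when the two pairs are identical
   or disjoint.  */
static void shl64_in_regs(int dst, int src, unsigned n)
{
    assert(n > 0 && n < 32);
    assert(dst >= 0 && dst <= 2 && src >= 0 && src <= 2);

    reg[dst + 1] = reg[src + 1] << n;      /* out_high  = in_high << n        */
    reg[dst + 1] |= reg[src] >> (32 - n);  /* out_high |= in_low >> (32 - n)  */
    reg[dst] = reg[src] << n;              /* out_low   = in_low << n         */
}

/* Load VALUE into the source pair, run the shift, read back the result. */
static uint64_t run(int dst, int src, uint64_t value, unsigned n)
{
    reg[src] = (uint32_t) value;
    reg[src + 1] = (uint32_t) (value >> 32);
    shl64_in_regs(dst, src, n);
    return ((uint64_t) reg[dst + 1] << 32) | reg[dst];
}
```

With the values from pr78041.c, `run(1, 1, 0x0100000001ULL, 5)` (full overlap) and `run(2, 0, 0x0100000001ULL, 5)` (no overlap) both return 0x2000000020 in this model. `run(0, 1, 0x0100000001ULL, 5)`, with the source in r1:r2 and the destination in r0:r1, returns a different value, because the first store overwrites the source low word before it has been read. That is the kind of case the new gcc_assert and the added "r 0 i" and "&r r i" alternatives are meant to rule out.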