From patchwork Fri May 27 13:46:14 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kyrill Tkachov
X-Patchwork-Id: 627184
Message-ID: <57484FA6.1060600@foss.arm.com>
Date: Fri, 27 May 2016 14:46:14 +0100
From: Kyrill Tkachov
To: GCC Patches
CC: Ramana Radhakrishnan, Richard Earnshaw
Subject: Re: [PATCH][ARM] Tie operand 1 to operand 0 in AESMC pattern when fusing AES/AESMC
References: <573EE122.9030002@foss.arm.com>
In-Reply-To: <573EE122.9030002@foss.arm.com>

On 20/05/16 11:04, Kyrill Tkachov wrote:
> Hi all,
>
> The recent -frename-registers change exposed a deficiency in the way we fuse AESE/AESMC instruction
> pairs in arm.
>
> Basically we want to enforce:
>     AESE Vn, _
>     AESMC Vn, Vn
>
> to enable the fusion, but regrename comes along and renames the output Vn register in AESMC to something
> else, killing the fusion in the hardware.
>
> The solution in this patch is to add an alternative that ties the input and output registers in the AESMC pattern
> and enable that alternative when the fusion is enabled.
>
> With this patch I've confirmed that the above preferred register sequence is kept even with -frename-registers
> when tuning for a cpu that enables the fusion and that the chain is broken by regrename otherwise and have
> seen the appropriate improvement in a proprietary benchmark (that I cannot name) that exercises this sequence.
>
> Bootstrapped and tested on arm-none-linux-gnueabihf.
>
> Ok for trunk?
>

Following James's feedback on the AArch64 version, this slightly modified version uses the enum type
for the argument of the new function.

Is this ok instead?

Thanks,
Kyrill

2016-05-27  Kyrylo Tkachov

    * config/arm/arm.c (arm_fusion_enabled_p): New function.
    * config/arm/arm-protos.h (arm_fusion_enabled_p): Declare prototype.
    * config/arm/crypto.md (crypto_, CRYPTO_UNARY): Add "=w,0" alternative.
    Enable it when AES/AESMC fusion is enabled.

diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index cf221d6793eaf0959f2713fe0903a5d8602ec2f4..12a781de13f2f7816cc2b16b04835d87c83f7abb 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -320,6 +320,7 @@ extern int vfp3_const_double_for_bits (rtx);

 extern void arm_emit_coreregs_64bit_shift (enum rtx_code, rtx, rtx, rtx, rtx, rtx);
+extern bool arm_fusion_enabled_p (tune_params::fuse_ops);
 extern bool arm_valid_symbolic_address_p (rtx);
 extern bool arm_validize_comparison (rtx *, rtx *, rtx *);
 #endif /* RTX_CODE */
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 5110d9e989d605a9e2c262e6007b89a1c7dc7080..39a24c06c123b86883134368ef39794abf11898b 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -29704,6 +29704,13 @@ aarch_macro_fusion_pair_p (rtx_insn* prev, rtx_insn* curr)
   return false;
 }

+/* Return true iff the instruction fusion described by OP is enabled.  */
+bool
+arm_fusion_enabled_p (tune_params::fuse_ops op)
+{
+  return current_tune->fusible_ops & op;
+}
+
 /* Implement the TARGET_ASAN_SHADOW_OFFSET hook.  */

 static unsigned HOST_WIDE_INT
diff --git a/gcc/config/arm/crypto.md b/gcc/config/arm/crypto.md
index c6f17270b1dbaf6dc43eb1e9b8a182dbb0f5a1e1..0f510f069408471fcbf6751f161e984f39929813 100644
--- a/gcc/config/arm/crypto.md
+++ b/gcc/config/arm/crypto.md
@@ -18,14 +18,27 @@
 ;; along with GCC; see the file COPYING3.  If not see
 ;; .

+
+;; When AES/AESMC fusion is enabled we want the register allocation to
+;; look like:
+;;    AESE Vn, _
+;;    AESMC Vn, Vn
+;; So prefer to tie operand 1 to operand 0 when fusing.
+
 (define_insn "crypto_"
-  [(set (match_operand: 0 "register_operand" "=w")
+  [(set (match_operand: 0 "register_operand" "=w,w")
         (unspec: [(match_operand: 1
-                       "register_operand" "w")]
+                       "register_operand" "0,w")]
          CRYPTO_UNARY))]
   "TARGET_CRYPTO"
   ".\\t%q0, %q1"
-  [(set_attr "type" "")]
+  [(set_attr "type" "")
+   (set_attr_alternative "enabled"
+     [(if_then_else (match_test
+                       "arm_fusion_enabled_p (tune_params::FUSE_AES_AESMC)")
+                     (const_string "yes" )
+                     (const_string "no"))
+      (const_string "yes")])]
 )

 (define_insn "crypto_"
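
For readers without a GCC tree to hand, the bitmask test that gates the tied alternative can be sketched standalone. This is only an illustration, not the code in arm.c: `tune_params_sketch`, `current_tune_sketch` and `fusion_enabled_p` are hypothetical stand-ins for `tune_params`, `current_tune` and `arm_fusion_enabled_p`, and the enumerator values are assumed rather than taken from the patch.

```cpp
#include <cassert>

// Hypothetical stand-in for GCC's tune_params; only the fields needed here.
struct tune_params_sketch {
  // Assumed encoding: one bit per fusible instruction pair.
  enum fuse_ops : unsigned {
    FUSE_NOTHING   = 0,
    FUSE_MOVW_MOVT = 1u << 0,
    FUSE_AES_AESMC = 1u << 1
  };
  unsigned fusible_ops;  // bitwise OR of the fuse_ops this tuning enables
};

// Hypothetical stand-in for the backend's current_tune pointer.
static const tune_params_sketch *current_tune_sketch = nullptr;

// Mirrors the shape of arm_fusion_enabled_p: true iff the bit for OP is set
// in the active tuning's fusible_ops mask.
bool fusion_enabled_p(tune_params_sketch::fuse_ops op) {
  return (current_tune_sketch->fusible_ops & op) != 0;
}
```

With a tuning whose mask includes the AES/AESMC bit the predicate is true, so the "=w,w"/"0,w" pattern would keep its tied first alternative enabled; with any other tuning only the untied second alternative remains.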