From patchwork Wed May 21 23:44:17 2014
X-Patchwork-Submitter: Kugan Vivekanandarajah
X-Patchwork-Id: 351350
Message-ID: <537D3A51.2090006@linaro.org>
Date: Thu, 22 May 2014 09:44:17 +1000
From: Kugan
To: "gcc-patches@gcc.gnu.org"
CC: Marcus Shawcroft, Richard Earnshaw
Subject: [RFC][AArch64] Define TARGET_SPILL_CLASS

Compiling some applications with -mgeneral-regs-only produces better code (it runs faster) than compiling without it. The difference is that without -mgeneral-regs-only, floating-point registers are also used during register allocation.
IRA/LRA then has to move those values back to core registers before performing operations on them. I experimented with TARGET_SPILL_CLASS (as in the attached patch) to make the floating-point register class just a spill class for integer pseudos. This benefits the application that had the issue, but overall performance on SPEC2000 is neutral (some benchmarks improve a lot while others regress). I am looking at whether I can make it perform better overall; any suggestions are welcome.

The attached experimental patch passes regression testing, but 168.wupwise and 187.facerec now miscompare. I am looking at fixing this as well.

Thanks,
Kugan

gcc/

2014-05-22  Kugan Vivekanandarajah

	* config/aarch64/aarch64.c (generic_regmove_cost): Adjust GP2FP
	and FP2GP costs.
	(aarch64_spill_class): New function.
	(TARGET_SPILL_CLASS): Define.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index a3147ee..16d1b51 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -184,8 +184,8 @@ __extension__
 static const struct cpu_regmove_cost generic_regmove_cost =
 {
   NAMED_PARAM (GP2GP, 1),
-  NAMED_PARAM (GP2FP, 2),
-  NAMED_PARAM (FP2GP, 2),
+  NAMED_PARAM (GP2FP, 5),
+  NAMED_PARAM (FP2GP, 5),
   /* We currently do not provide direct support for TFmode Q->Q move.
      Therefore we need to raise the cost above 2 in order to have
      reload handle the situation.  */
@@ -4882,6 +4882,18 @@ aarch64_register_move_cost (enum machine_mode mode ATTRIBUTE_UNUSED,
   return regmove_cost->FP2FP;
 }
 
+/* Return class of registers which could be used for a pseudo of MODE
+   and of class RCLASS for spilling instead of memory.  */
+static reg_class_t
+aarch64_spill_class (reg_class_t rclass, enum machine_mode mode)
+{
+  if ((GET_MODE_CLASS (mode) == MODE_INT)
+      && reg_class_subset_p (rclass, GENERAL_REGS))
+    return FP_REGS;
+  return NO_REGS;
+}
+
 static int
 aarch64_memory_move_cost (enum machine_mode mode ATTRIBUTE_UNUSED,
			   reg_class_t rclass ATTRIBUTE_UNUSED,
@@ -8431,6 +8443,9 @@ aarch64_cannot_change_mode_class (enum machine_mode from,
 #undef TARGET_SECONDARY_RELOAD
 #define TARGET_SECONDARY_RELOAD aarch64_secondary_reload
 
+#undef TARGET_SPILL_CLASS
+#define TARGET_SPILL_CLASS aarch64_spill_class
+
 #undef TARGET_SHIFT_TRUNCATION_MASK
 #define TARGET_SHIFT_TRUNCATION_MASK aarch64_shift_truncation_mask