From patchwork Wed Dec 11 16:49:10 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vladimir Makarov
X-Patchwork-Id: 300232
Message-ID: <52A89786.20605@redhat.com>
Date: Wed, 11 Dec 2013 11:49:10 -0500
From: Vladimir Makarov
To: GCC Patches
Subject: RFA: patch to fix PR59466 (inefficient address generation on ppc)

  The following patch fixes PR59466

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59466

  LRA on PPC sometimes generates inefficient code

    addis 9,2,.LC77@toc@ha
    addi 9,9,.LC77@toc@l
    ld 9,0(9)

instead of

    addis 9,2,.LC77@toc@ha
    ld 9,.LC77@toc@l(9)

  I cannot create a small test for this.  The smallest file in which I
found it is bzip2.c from SPEC2000.

  LRA generates a move insn with an invalid memory address,
[unspec[`*.LC29',%2:DI] 45], but it can handle it (make it valid) very
efficiently afterwards by trying numerous different transformations.
The PPC target machinery, through validize_mem, just puts the whole
address into a pseudo.

  I could prevent the use of validize_mem in rs6000.c, but I prefer to
do it in change_address_1, as other targets might have the same problem
and it is better to have a one-for-all solution.

  Still, this does not fully solve the problem, as the insn

    r257:DI=[unspec[`*.LC29',%2:DI] 45]

can't be recognized: the *movdi... pattern has operand predicates that
reject the memory because of its invalid address.  To fix this, a change
in general_operand is made.
As LRA cannot work properly with an insn that is not recognized, I added
an assert for this in lra_set_insn_recog_data to catch this situation
earlier.

  Again, LRA has very good code of its own for legitimizing addresses,
and it is better to use it.

  After applying the patch I see a code size reduction on SPEC2000.

  Before the patch (sizes are relative to reload-generated code):

----------------CFP2000-----------------
-0.471%      27171      27043 168.wupwise
-0.457%      14006      13942 171.swim
-0.392%      24515      24419 172.mgrid
 0.226%      85079      85271 173.applu
 0.751%     728891     734363 177.mesa
 0.194%     214357     214773 178.galgel
-0.295%      21683      21619 179.art
 0.412%      31089      31217 183.equake
-0.520%      79976      79560 187.facerec
 0.000%     152504     152504 188.ammp
 0.000%      43758      43758 189.lucas
-0.181%    1062265    1060337 191.fma3d
 1.035%    1041684    1052468 200.sixtrack
-0.105%     151944     151784 301.apsi
Average = 0.0139775%
----------------CINT2000-----------------
 0.252%      76242      76434 164.gzip
 0.172%     186152     186472 175.vpr
-0.215%    2084612    2080132 176.gcc
 0.000%      16716      16716 181.mcf
 0.085%     225316     225508 186.crafty
-0.015%     210100     210068 197.parser
 0.622%     433635     436332 252.eon
-0.298%     762014     759742 253.perlbmk
 0.000%     904784     904784 254.gap
-0.285%     706432     704416 255.vortex
 0.220%      58297      58425 256.bzip2
 0.314%     265334     266166 300.twolf
Average = 0.070863%

  After the patch:

----------------CFP2000-----------------
-0.589%      27171      27011 168.wupwise
-0.457%      14006      13942 171.swim
-0.392%      24515      24419 172.mgrid
-0.113%      85079      84983 173.applu
 0.654%     728891     733659 177.mesa
 0.060%     214357     214485 178.galgel
-0.295%      21683      21619 179.art
-0.412%      31089      30961 183.equake
-0.520%      79976      79560 187.facerec
 0.000%     152504     152504 188.ammp
 0.000%      43758      43758 189.lucas
-0.317%    1062265    1058897 191.fma3d
 0.356%    1041684    1045396 200.sixtrack
-0.105%     151944     151784 301.apsi
Average = -0.152103%
----------------CINT2000-----------------
 0.084%      76242      76306 164.gzip
-0.052%     186152     186056 175.vpr
-0.284%    2084612    2078692 176.gcc
 0.000%      16716      16716 181.mcf
-0.312%     225316     224612 186.crafty
-0.091%     210100     209908 197.parser
 0.622%     433635     436332 252.eon
-0.340%     762014     759422 253.perlbmk
 0.000%     904784     904784 254.gap
-0.335%     706432     704064 255.vortex
 0.110%      58297      58361 256.bzip2
-0.241%     265334     264694 300.twolf
Average = -0.070023%

  A code size reduction for PPC in this case also means faster generated
code.  I see it, but I cannot provide a reliable SPEC2000 rate change.

  The patch was successfully bootstrapped and tested on i686, x86_64,
and PPC64.

  Ok to commit?

2013-12-11  Vladimir Makarov

        PR rtl-optimization/59466
        * emit-rtl.c (change_address_1): Don't validate address for LRA.
        * recog.c (general_operand): Accept any memory for LRA.
        * lra.c (lra_set_insn_recog_data): Add an assert.

Index: emit-rtl.c
===================================================================
--- emit-rtl.c (revision 205870)
+++ emit-rtl.c (working copy)
@@ -1951,7 +1951,9 @@ change_address_1 (rtx memref, enum machi
       && (!validate || memory_address_addr_space_p (mode, addr, as)))
     return memref;
 
-  if (validate)
+  /* Don't validate address for LRA.  LRA can make the address valid
+     by itself in most efficient way.  */
+  if (validate && !lra_in_progress)
     {
       if (reload_in_progress || reload_completed)
         gcc_assert (memory_address_addr_space_p (mode, addr, as));
Index: lra.c
===================================================================
--- lra.c (revision 205870)
+++ lra.c (working copy)
@@ -1072,9 +1072,16 @@ lra_set_insn_recog_data (rtx insn)
       nop = asm_noperands (PATTERN (insn));
       data->operand_loc = data->dup_loc = NULL;
       if (nop < 0)
-        /* Its is a special insn like USE or CLOBBER.  */
-        data->insn_static_data = insn_static_data
-          = get_static_insn_data (-1, 0, 0, 1);
+        {
+          /* Its is a special insn like USE or CLOBBER.  We should
+             recognize any regular insn otherwise LRA can do nothing
+             with this insn.  */
+          gcc_assert (GET_CODE (PATTERN (insn)) == USE
+                      || GET_CODE (PATTERN (insn)) == CLOBBER
+                      || GET_CODE (PATTERN (insn)) == ASM_INPUT);
+          data->insn_static_data = insn_static_data
+            = get_static_insn_data (-1, 0, 0, 1);
+        }
       else
         {
           /* expand_asm_operands makes sure there aren't too many
Index: recog.c
===================================================================
--- recog.c (revision 205870)
+++ recog.c (working copy)
@@ -1,3 +1,4 @@
+
 /* Subroutines used by or related to instruction recognition.
    Copyright (C) 1987-2013 Free Software Foundation, Inc.
 
@@ -1021,8 +1022,12 @@ general_operand (rtx op, enum machine_mo
       if (! volatile_ok && MEM_VOLATILE_P (op))
         return 0;
 
-      /* Use the mem's mode, since it will be reloaded thus.  */
-      if (memory_address_addr_space_p (GET_MODE (op), y, MEM_ADDR_SPACE (op)))
+      /* Use the mem's mode, since it will be reloaded thus.  LRA can
+         generate move insn with invalid addresses which is made valid
+         and efficiently calculated by LRA through further numerous
+         transformations.  */
+      if (lra_in_progress
+          || memory_address_addr_space_p (GET_MODE (op), y, MEM_ADDR_SPACE (op)))
         return 1;
     }
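
For readers who want the effect of the emit-rtl.c and recog.c hunks in
isolation, here is a minimal toy model in C.  It is only a sketch and
not GCC code: struct pass_state, should_validate_address,
mem_operand_accepted and self_check are hypothetical names introduced
for illustration, and the legitimacy of an address is reduced to a
plain boolean instead of a call to memory_address_addr_space_p.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: hypothetical stand-ins, not GCC's real data
   structures or API.  */
struct pass_state
{
  bool lra_in_progress;   /* models GCC's lra_in_progress flag */
};

/* Models the patched test in change_address_1: validation of the new
   address is skipped while LRA is running.  */
static bool
should_validate_address (const struct pass_state *st, bool validate)
{
  return validate && !st->lra_in_progress;
}

/* Models the patched MEM branch of general_operand: while LRA is
   running any memory operand is accepted; otherwise the address must
   already be legitimate.  */
static bool
mem_operand_accepted (const struct pass_state *st, bool address_is_legitimate)
{
  return st->lra_in_progress || address_is_legitimate;
}

/* Small self-check exercising both helpers.  */
static void
self_check (void)
{
  struct pass_state in_lra = { true };
  struct pass_state outside = { false };

  assert (!should_validate_address (&in_lra, true));
  assert (should_validate_address (&outside, true));
  assert (mem_operand_accepted (&in_lra, false));
  assert (!mem_operand_accepted (&outside, false));
}
```

In this model, a memory operand with a not-yet-legitimate address is
rejected outside LRA and accepted during LRA, which is the behaviour
the patch relies on so that LRA can legitimize the address itself.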