From patchwork Fri Dec 20 04:57:36 2013 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Law X-Patchwork-Id: 303878 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 06CB32C02A6 for ; Fri, 20 Dec 2013 15:57:47 +1100 (EST) DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :message-id:date:from:mime-version:to:subject:content-type; q= dns; s=default; b=Efyaq5yzXuUvsNtX8ZRQLfmCdzC451GbqwccDn4Cba5GQo ROsEHfG4XJViBAICqdtAvkeexw8vJ4GWE15kAk+5wWHNhG6x6ugV1mqQmeL8bxKO QVy8RYdwBYPrv6QGdiHBOc6YNhNBkYmb07+FR5euQIsw1W4ACiwFAhrFOE1ek= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :message-id:date:from:mime-version:to:subject:content-type; s= default; bh=x2wrQ2nw0Mx1/9W0GPrfQBLEW4w=; b=hrF/sw0/t8Ln3zYtFLur zEtAUYgA4RQD6qluR9Gb4vjIfgy0657uf8Tzo479KsHY0Cr2g3aol9MBPDupJC+r 5GtKDhxHjo4L6wgPzPqZx7h+sYXtysj1a6Nt88AXt3jyyNqgbtfumPBkJx1YKRrC 8bm3KCT9Eatvbn8r2vMj5dE= Received: (qmail 7067 invoked by alias); 20 Dec 2013 04:57:40 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 7056 invoked by uid 89); 20 Dec 2013 04:57:39 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-1.1 required=5.0 tests=AWL, BAYES_00, RP_MATCHES_RCVD, SPF_HELO_PASS, SPF_PASS autolearn=ham version=3.3.2 X-HELO: mx1.redhat.com Received: from mx1.redhat.com (HELO mx1.redhat.com) (209.132.183.28) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with ESMTP; Fri, 20 Dec 2013 04:57:38 +0000 Received: from int-mx02.intmail.prod.int.phx2.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) by mx1.redhat.com (8.14.4/8.14.4) with ESMTP id rBK4vbR5011012 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK) for ; Thu, 19 Dec 2013 23:57:37 -0500 Received: from stumpy.slc.redhat.com (ovpn-113-52.phx2.redhat.com [10.3.113.52]) by int-mx02.intmail.prod.int.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id rBK4vaYn009508 for ; Thu, 19 Dec 2013 23:57:36 -0500 Message-ID: <52B3CE40.6090103@redhat.com> Date: Thu, 19 Dec 2013 21:57:36 -0700 From: Jeff Law User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.2.0 MIME-Version: 1.0 To: gcc-patches Subject: [RFA][PATCH][middle-end/53623] Improve extension elimination X-IsSubscribed: yes PR53623 is a case where transformations by the tree-ssa optimizers are inhibiting the RTL optimizers. As noted in c#5 we have something like this: (set (reg:HI %ax) (mem:HI (whatever))) (set (reg:DI %rdx) (sign_extend:DI (reg:HI %ax)) (set (reg:DI %rax) (zero_extend:DI (reg:QI %al)) The first extension can't be combined with the memory ref by REE because the source & destination registers are different. The second extention can't be combined either because of the existence of the first extension. What we'd really like is something like this: (set (reg:DI %rdx) (sign_extend:DI (mem:HI (whatever)))) (set (reg:DI %rax) (zero_extend:DI (reg:QI %rdx)) Jakub had the idea that we could extend REE to handle elimination of extentions where the source and destination registers are not the same. That's precisely what this patch does. Rather than create the desired form directly, we instead generate this: (set (reg:DI %rdx) (sign_extend:DI (mem:HI (whatever)))) (set (reg:DI %rax) (reg:DI $rdx) (set (reg:DI %rax) (zero_extend:DI (reg:QI %rax)) With the structure of REE, that's actually far simpler to generate and the hard register copy propagation which runs after REE will propagate away the unnecessary copy. The extension is pretty simple. First we remove the check that the source & destination registers must be the same in add_removable_extension. That gets the first extension on the list of things to consider. combine_reaching_defs has all the checks to ensure this is safe. When the source & destination in the extension are not the same register, we require that: 1. There only be one reaching def for the source register of the extention. This means we only have to fix up that single reaching def rather than multiple reaching defs. We could certainly handle multiple reaching defs if we felt the need in the future, it's just more work. 2. We require both insns to not have already been modified. This is probably overly conservative too. 3. We require both insns to be in the same block. This simplifies the checks is #4. 4. We require the destination of the extension to be neither set nor modified between the memory reference and the extension. [ All other tests for REE must also be passed, those are additional tests for this specific case. ] Assuming all those tests pass, we allow combination of the extension with reaching def statement. We change the destination of the reaching def to be the destination of the extension and delete the now dead extension. The problem is that there may be other uses of the original destination of the memory load. So we copy from the new destination of the load back into the old destination of the load. And it "just works". It triggers all over the place in a bootstrap. A slightly older version triggered 24000 times during a bootstrap (yes, 24000). Bootstrapped and regression tested on x86_64-unknown-linux-gnu. Obviously fixes 53623 (P2 regression). OK for the trunk? Jeff * ree.c (combine_set_extension): Handle case where source and destination registers in an extension insn are different. (combine_reaching_defs): Allow source and destination registers in extension to be different under limited circumstances. (add_removable_extension): Remove restriction that the source and destination registers in the extension are the same. (find_and_remove_re): Emit a copy from the extension's destination to its source after the defining insn if the source and destination registers are different. testsuite/ * gcc.target/i386/pr53623.c: New test. diff --git a/gcc/ree.c b/gcc/ree.c index 9938e98..63ad86c 100644 --- a/gcc/ree.c +++ b/gcc/ree.c @@ -282,9 +282,21 @@ static bool combine_set_extension (ext_cand *cand, rtx curr_insn, rtx *orig_set) { rtx orig_src = SET_SRC (*orig_set); - rtx new_reg = gen_rtx_REG (cand->mode, REGNO (SET_DEST (*orig_set))); rtx new_set; + /* If the extension's source/destination registers are not the same + then we need to change the original load to reference the destination + of the extension. Then we need to emit a copy from that destination + to the original destination of the load. */ + rtx new_reg; + bool copy_needed + = REGNO (SET_DEST (PATTERN (cand->insn))) + != REGNO (XEXP (SET_SRC (PATTERN (cand->insn)), 0)); + if (copy_needed) + new_reg = gen_rtx_REG (cand->mode, REGNO (SET_DEST (PATTERN (cand->insn)))); + else + new_reg = gen_rtx_REG (cand->mode, REGNO (SET_DEST (*orig_set))); + /* Merge constants by directly moving the constant into the register under some conditions. Recall that RTL constants are sign-extended. */ if (GET_CODE (orig_src) == CONST_INT @@ -342,7 +354,8 @@ combine_set_extension (ext_cand *cand, rtx curr_insn, rtx *orig_set) if (dump_file) { fprintf (dump_file, - "Tentatively merged extension with definition:\n"); + "Tentatively merged extension with definition %s:\n", + (copy_needed) ? "(copy needed)" : ""); print_rtl_single (dump_file, curr_insn); } return true; @@ -662,6 +675,45 @@ combine_reaching_defs (ext_cand *cand, const_rtx set_pat, ext_state *state) if (!outcome) return false; + /* If the destination operand of the extension is a different + register than the source operand, then additional restrictions + are needed. */ + if ((REGNO (SET_DEST (PATTERN (cand->insn))) + != REGNO (XEXP (SET_SRC (PATTERN (cand->insn)), 0)))) + { + /* In theory we could handle more than one reaching def, it + just makes the code to update the insn stream more complex. */ + if (state->defs_list.length () != 1) + return false; + + /* We require the candidate not already be modified. This may + be overly conservative. */ + if (state->modified[INSN_UID (cand->insn)].kind != EXT_MODIFIED_NONE) + return false; + + /* There's only one reaching def. */ + rtx def_insn = state->defs_list[0]; + + /* The defining statement must not have been modified either. */ + if (state->modified[INSN_UID (def_insn)].kind != EXT_MODIFIED_NONE) + return false; + + /* The defining statement and candidate insn must be in the same block. + This is merely to keep the test for safety and updating the insn + stream simple. */ + if (BLOCK_FOR_INSN (cand->insn) != BLOCK_FOR_INSN (def_insn)) + return false; + + /* The destination register of the extension insn must not be + used or set between the def_insn and cand->insn exclusive. */ + if (reg_used_between_p (SET_DEST (PATTERN (cand->insn)), + def_insn, cand->insn) + || reg_set_between_p (SET_DEST (PATTERN (cand->insn)), + def_insn, cand->insn)) + return false; + } + + /* If cand->insn has been already modified, update cand->mode to a wider mode if possible, or punt. */ if (state->modified[INSN_UID (cand->insn)].kind != EXT_MODIFIED_NONE) @@ -778,8 +830,7 @@ add_removable_extension (const_rtx expr, rtx insn, if (REG_P (dest) && (code == SIGN_EXTEND || code == ZERO_EXTEND) - && REG_P (XEXP (src, 0)) - && REGNO (dest) == REGNO (XEXP (src, 0))) + && REG_P (XEXP (src, 0))) { struct df_link *defs, *def; ext_cand *cand; @@ -863,6 +914,7 @@ find_and_remove_re (void) int num_re_opportunities = 0, num_realized = 0, i; vec reinsn_list; auto_vec reinsn_del_list; + auto_vec reinsn_copy_list; ext_state state; /* Construct DU chain to get all reaching definitions of each @@ -899,11 +951,41 @@ find_and_remove_re (void) if (dump_file) fprintf (dump_file, "Eliminated the extension.\n"); num_realized++; - reinsn_del_list.safe_push (curr_cand->insn); + if (REGNO (SET_DEST (PATTERN (curr_cand->insn))) + != REGNO (XEXP (SET_SRC (PATTERN (curr_cand->insn)), 0))) + { + reinsn_copy_list.safe_push (curr_cand->insn); + reinsn_copy_list.safe_push (state.defs_list[0]); + } + reinsn_del_list.safe_push (curr_cand->insn); state.modified[INSN_UID (curr_cand->insn)].deleted = 1; } } + /* The copy list contains pairs of insns which describe copies we + need to insert into the INSN stream. + + The first insn in each pair is the extension insn, from which + we derive the source and destination of the copy. + + The second insn in each pair is the memory reference where the + extension will ultimately happen. We emit the new copy + immediately after this insn. + + It may first appear that the arguments for the copy are reversed. + Remember that the memory reference will be changed to refer to the + destination of the extention. So we're actually emitting a copy + from the new destination to the old destination. */ + for (unsigned int i = 0; i < reinsn_copy_list.length (); i += 2) + { + rtx curr_insn = reinsn_copy_list[i]; + rtx pat = PATTERN (curr_insn); + rtx new_reg = gen_rtx_REG (GET_MODE (SET_DEST (pat)), + REGNO (XEXP (SET_SRC (pat), 0))); + rtx set = gen_rtx_SET (VOIDmode, new_reg, SET_DEST (pat)); + emit_insn_after (set, reinsn_copy_list[i + 1]); + } + /* Delete all useless extensions here in one sweep. */ FOR_EACH_VEC_ELT (reinsn_del_list, i, curr_insn) delete_insn (curr_insn); diff --git a/gcc/testsuite/gcc.target/i386/pr53623.c b/gcc/testsuite/gcc.target/i386/pr53623.c new file mode 100644 index 0000000..5375b61 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/pr53623.c @@ -0,0 +1,25 @@ +/* { dg-do compile { target { x86_64-*-* } } } */ +/* { dg-options "-O2 -fdump-rtl-ree" } */ + + +#include + +typedef (*inst_t)(int64_t rdi, int64_t rsi, int64_t rdx); + +int16_t code[256]; +inst_t dispatch[256]; + +void an_inst(int64_t rdi, int64_t rsi, int64_t rdx) { + rdx = code[rdx]; + uint8_t inst = (uint8_t) rdx; + rdx >>= 8; + dispatch[inst](rdi, rsi, rdx); +} + +int main(void) { + return 0; +} + +/* { dg-final { scan-rtl-dump "copy needed" "ree" } } */ +/* { dg-final { cleanup-rtl-dump "ree" } } */ +