From patchwork Sat Feb 28 16:38:03 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Georg-Johann Lay X-Patchwork-Id: 444650 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from sourceware.org (server1.sourceware.org [209.132.180.131]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ozlabs.org (Postfix) with ESMTPS id 16BFB140119 for ; Sun, 1 Mar 2015 03:38:21 +1100 (AEDT) Authentication-Results: ozlabs.org; dkim=pass reason="1024-bit key; unprotected key" header.d=gcc.gnu.org header.i=@gcc.gnu.org header.b=VGaFY9kf; dkim-adsp=none (unprotected policy); dkim-atps=neutral DomainKey-Signature: a=rsa-sha1; c=nofws; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :message-id:date:from:mime-version:to:cc:subject:references :in-reply-to:content-type:content-transfer-encoding; q=dns; s= default; b=Z4tFQNA8zHEIawuuQRh7ndQWMoIc4asPqZ/plunpRswJ+KBryRCWk QwZ4ZVGjq7Ka/RMsPCK7mhTATF7OyhmkTbkz9TfnN6tvD65riRWYA2m+T8dN57S2 WpNoLwvbwL9vZkUPycuPcp+8P4xkXrvvrqykP78pb4yBQPQnetjqN4= DKIM-Signature: v=1; a=rsa-sha1; c=relaxed; d=gcc.gnu.org; h=list-id :list-unsubscribe:list-archive:list-post:list-help:sender :message-id:date:from:mime-version:to:cc:subject:references :in-reply-to:content-type:content-transfer-encoding; s=default; bh=rT79A56Ny+gUadV9FzAWTyYH/pI=; b=VGaFY9kfXbPtm0V6TjvjYTKQQSr3 n31tyX7qnooCnGpUJDMAJXQ2XP7dlvMBtfd7OxGCY8wjAruvIAjdiLoKdT0oQIkr 3NBPzIV7XqF0sF+pkRc5Gd1RhiyB0pSXafpdreD9bFssMJ/vTpf+OqUTdCnFiGKu g60CMi82Xc2g0HM= Received: (qmail 99674 invoked by alias); 28 Feb 2015 16:38:13 -0000 Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk List-Id: List-Unsubscribe: List-Archive: List-Post: List-Help: Sender: gcc-patches-owner@gcc.gnu.org Delivered-To: mailing list gcc-patches@gcc.gnu.org Received: (qmail 99663 invoked by uid 89); 28 Feb 2015 16:38:12 -0000 Authentication-Results: sourceware.org; auth=none X-Virus-Found: No X-Spam-SWARE-Status: No, score=-0.1 required=5.0 tests=AWL, BAYES_50, RCVD_IN_DNSWL_NONE autolearn=ham version=3.3.2 X-HELO: mo4-p00-ob.smtp.rzone.de Received: from mo4-p00-ob.smtp.rzone.de (HELO mo4-p00-ob.smtp.rzone.de) (81.169.146.218) by sourceware.org (qpsmtpd/0.93/v0.84-503-g423c35a) with (AES256-GCM-SHA384 encrypted) ESMTPS; Sat, 28 Feb 2015 16:38:08 +0000 X-RZG-AUTH: :LXoWVUeid/7A29J/hMvvT3ol15ykJcYwTPbBBR62PQx1xqvTHw== X-RZG-CLASS-ID: mo00 Received: from [192.168.0.22] (ip5b43a95f.dynamic.kabel-deutschland.de [91.67.169.95]) by smtp.strato.de (RZmta 37.3 DYNA|AUTH) with ESMTPSA id z05c60r1SGc482b (using TLSv1.2 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (Client did not present a certificate); Sat, 28 Feb 2015 17:38:04 +0100 (CET) Message-ID: <54F1EEEB.6060905@gjlay.de> Date: Sat, 28 Feb 2015 17:38:03 +0100 From: Georg-Johann Lay User-Agent: Mozilla/5.0 (X11; Linux i686; rv:31.0) Gecko/20100101 Thunderbird/31.4.0 MIME-Version: 1.0 To: Steven Bosscher CC: GCC Patches , Denis Chertykov Subject: Re: [patch, avr] Take 2: Fix PR64331: insn output and insn length computation rely on REG_DEAD notes. References: <54EB06C0.2080503@gjlay.de> <54EF7591.3030900@gjlay.de> In-Reply-To: X-IsSubscribed: yes Am 02/26/2015 um 11:45 PM schrieb Steven Bosscher: > On Thu, Feb 26, 2015 at 8:35 PM, Georg-Johann Lay wrote: >> Take #2 introduces a new, avr-specific rtl pass whose sole purpose is to >> rectify notes. The pass is scheduled right before cfg does down (right >> before .*free_cfg) so that cfg and hence df machinery is available. >> >> Regression tests look fine and for the test case the patches produce correct >> code and correct insn length. > > Sorry for party-pooping, but it seems to me that the real bug is that > the target is lying to reload. > > IIUC the AVR port selects the insn alternative after register > allocation (after reload). It bases its selection on REG_DEAD notes. > In PR64331 an alternative is used that clobbers r20 that has a > REG_DEAD note, but r20 is not actually dead because hardreg-cprop has > propagated it forward without adjusting the note. It's not actually about constraint alternatives. Let me give an example: Testing HI for 0. The usual sequence would be cc0 = reg.low == 0 cc0 = cc0 && reg.high == 0 which costs 2 instructions. If reg is unused after, then ORing can be used and cc0 will be set as a side effect. This costs 1 insn: cc0 = (reg.low |= reg.high) Using alternatives would double their number, i.e. 14 instead of 7 for *cmphi. These constraint alternatives would have to express 1) reg-alloc, please, use alternative #1 (with clobber of reg) if the register is unused after 2) reg-alloc, please, use alternative #2 (not clobbering reg) if the register is used after If we assume for a moment that we have a *testhi insn and the old *tsthi had just 1 alternative: (define_insn "*tsthi" [(set (cc0) (compare (match_operand:ALL2 0 "register_operand" "r") (const_int 0)))] ...) Then the new one would be something like (define_insn "*tsthi" [(set (cc0) (compare (match_operand:ALL2 0 "register_operand" "r,r") (const_int 0))) (clobber (match_scratch:HI 1 "=0,X")] ...) But how can I express 1) and 2) ? reg-alloc's choices appear to be random from BE's perspective, and it fails with much simpler tasks to produce good results. The first alternative which matches is the 1st, i.e. it would always clobber op 0 which results in up to 4 instructions and higher register pressure if that register is still needed after the insn. Making alternative 1 more expensive presumably results always in 2nd alternateive, i.e. 2 instructions instead of 1 if op 0 is no more needed. Similar if we swap alternatives #1 and #2: Then "r","X" will always match and appear to be cheaper than "r","=0". Currently, my preferred approach is a new drop-in replacement for the old reg_unused_after which uses clobbers to decide whether or not op 0 is still needed. That way, reg-alloc can work like before and there is no need to implement dozens of new constraint alternatives across the md files. The clobbers, in turn, are added by a target-pass which translates deadness information to clobbers. The preferred placement of that pass is the same as for the proposed new avr pass from patch take #2. Below you find a quick hack that outlines the idea: New insn attribute "dead" tags insns which are worth to be taken into account for such dead-->clobber translation and provide the operand number. The new pass scans all insns and translates dead info to clobbers as requested. This needs much more work, of course, like adjusting peepholes and splitters to the new insns with their additional operand. avr_reg_unused_after_df (the old reg_unused_after) should be rewritten to use DF info instead of scanning by hand. Some of the insn templates which used reg_unused_after are move insns; for these insns it's a bit different. > The "normal" way of things is that the insn alternative is selected in > reload (or in LRA) and that the clobbers are added as necessary. In Would you outline a concrete example? In particular, how to avoid clobbers if the operand is no more needed after the insn, and use clobber alternative if the value is no more needed? I.e. how would you express 1) and 2) from above in terms of insn constraints? > PR64331, an alternative for insn r17 would be selected that has a > CLOBBER for r20, prevent hardreg-cprop from propagating r20. > > Selecting insns based on REG-notes is dangerous business. Lying to > reload and to post-RA passes is a mortal sin, the compiler will punish > you. There is no guarantee that nothing will change between your new > pass to recompute notes, and the final pass that emits the insns. Ya. However, for 4.9 I'd still propose patch take #2. It definitely improves matters, does nothing wrong, and throwing out reg_unused_after completely will lead to considerable increase of code size and run time. And for 4.9 a "big change" like outlined above is way too intrusive for already released version, imo. For 5.0 I am also still proposing the patchh. It does nothing wrong and is the base for a complete solution like the one just outlined. Notice that the patch below is based on the proposed patch take #2. Johann + (clobber (match_operand:ALL2 3 "scratch_operand" ""))] "" { switch (which_alternative) @@ -4626,6 +4631,7 @@ return avr_out_compare (insn, operands, NULL); } [(set_attr "cc" "compare") + (set_attr "dead" "0") (set_attr "length" "1,2,2,3,4,2,4") (set_attr "adjust_len" "tsthi,tsthi,*,*,*,compare,compare")]) @@ -4633,7 +4639,8 @@ [(set (cc0) (compare (match_operand:PSI 0 "register_operand" "r,r,d ,r ,d,r") (match_operand:PSI 1 "nonmemory_operand" "L,r,s ,s ,M,n"))) - (clobber (match_scratch:QI 2 "=X,X,&d,&d ,X,&d"))] + (clobber (match_scratch:QI 2 "=X,X,&d,&d ,X,&d")) + (clobber (match_operand:PSI 3 "scratch_operand" ""))] "" { switch (which_alternative) @@ -4666,7 +4673,8 @@ [(set (cc0) (compare (match_operand:ALL4 0 "register_operand" "r ,r ,d,r ,r") (match_operand:ALL4 1 "nonmemory_operand" "Y00,r ,M,M ,n Ynn"))) - (clobber (match_scratch:QI 2 "=X ,X ,X,&d,&d"))] + (clobber (match_scratch:QI 2 "=X ,X ,X,&d,&d")) + (clobber (match_operand:ALL4 3 "scratch_operand" ""))] "" { if (0 == which_alternative) @@ -4677,6 +4685,7 @@ return avr_out_compare (insn, operands, NULL); } [(set_attr "cc" "compare") + (set_attr "dead" "0") (set_attr "length" "4,4,4,5,8") (set_attr "adjust_len" "tstsi,*,compare,compare,compare")]) @@ -4706,7 +4715,8 @@ [(parallel [(set (cc0) (compare (match_operand:ORDERED234 1 "register_operand" "") (match_operand:ORDERED234 2 "nonmemory_operand" ""))) - (clobber (match_scratch:QI 4 ""))]) + (clobber (match_scratch:QI 4 "")) + (clobber (scratch:ORDERED234))]) (set (pc) (if_then_else (match_operator 0 "ordered_comparison_operator" [(cc0) diff --git a/gcc/config/avr/avr.c b/gcc/config/avr/avr.c index 827b280..2510739 100644 --- a/gcc/config/avr/avr.c +++ b/gcc/config/avr/avr.c @@ -83,6 +83,7 @@ #include "builtins.h" #include "context.h" #include "tree-pass.h" +#include "rtl-iter.h" /* Maximal allowed offset for an address in the LD command */ #define MAX_LD_OFFSET(MODE) (64 - (signed)GET_MODE_SIZE (MODE)) @@ -331,6 +332,40 @@ avr_to_int_mode (rtx x) } +static int avr_reg_unused_after_df (rtx_insn *insn, rtx reg); + +static void +avr_rest_of_recompute_notes (void) +{ + for (rtx_insn *insn = get_insns (); insn; insn = NEXT_INSN (insn)) + { + int opno, scratchno; + + if (!NONDEBUG_INSN_P (insn) + || recog_memoized (insn) < 0 + || (-1 == (opno = (int8_t) get_attr_dead (insn)))) + continue; + + extract_insn (insn); + + scratchno = recog_data.n_operands - 1; + + rtx op = recog_data.operand[opno]; + rtx *pscratch = recog_data.operand_loc[scratchno]; + + avr_dump ("\ninsn with dead attr for op %d %r --> op %d %r:\n%r\n", + opno, op, scratchno, *pscratch, insn); + + if (REG_P (op) + && avr_reg_unused_after_df (insn, op)) + { + *pscratch = op; + } + } +} + + static const pass_data avr_pass_data_recompute_notes = { RTL_PASS, // type @@ -359,6 +394,8 @@ public: df_note_add_problem (); df_analyze (); + avr_rest_of_recompute_notes (); + return 0; } }; // avr_pass_recompute_notes @@ -8731,6 +8768,24 @@ avr_adjust_insn_length (rtx_insn *insn, int len) int reg_unused_after (rtx_insn *insn, rtx reg) { + subrtx_iterator::array_type array; + + FOR_EACH_SUBRTX (iter, array, PATTERN (insn), ALL) + { + const_rtx x = *iter; + + if (CLOBBER == GET_CODE (x) + && rtx_equal_p (reg, XEXP (x, 0))) + return true; + } + + return false; +} + + +static int +avr_reg_unused_after_df (rtx_insn *insn, rtx reg) +{ return (dead_or_set_p (insn, reg) || (REG_P(reg) && _reg_unused_after (insn, reg))); } diff --git a/gcc/config/avr/avr.md b/gcc/config/avr/avr.md index 1b39ddb..bc8f701 100644 --- a/gcc/config/avr/avr.md +++ b/gcc/config/avr/avr.md @@ -238,6 +238,9 @@ ] (const_int 0))) +(define_attr "dead" "" + (const_int 255)) + ;; Define mode iterators (define_mode_iterator QIHI [QI HI]) (define_mode_iterator QIHI2 [QI HI]) @@ -4518,6 +4521,7 @@ "cp __zero_reg__,%A0 cpc __zero_reg__,%B0" [(set_attr "cc" "compare") + (set_attr "dead" "0") (set_attr "length" "2")]) (define_insn "*negated_tstpsi" @@ -4598,7 +4602,8 @@ [(set (cc0) (compare (match_operand:ALL2 0 "register_operand" "!w ,r ,r,d ,r ,d,r") (match_operand:ALL2 1 "nonmemory_operand" "Y00,Y00,r,s ,s ,M,n Ynn"))) - (clobber (match_scratch:QI 2 "=X ,X ,X,&d,&d ,X,&d"))] + (clobber (match_scratch:QI 2 "=X ,X ,X,&d,&d ,X,&d"))