[avr] Run pass avr-fuse-add a second time

Message ID	ebce7b58-329c-4a43-b949-309bac619280@gjlay.de
State	New
Headers	show Return-Path: <gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org> DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 125CA385842A Content-Type: multipart/mixed; boundary="------------6iKlBTVvwnE85lYJgfhqtxfG" Message-ID: <ebce7b58-329c-4a43-b949-309bac619280@gjlay.de> Date: Fri, 30 Aug 2024 14:09:57 +0200 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird From: Georg-Johann Lay <avr@gjlay.de> Content-Language: en-US To: "gcc-patches@gcc.gnu.org" <gcc-patches@gcc.gnu.org> Cc: Denis Chertykov <chertykov@gmail.com> Subject: [patch,avr] Run pass avr-fuse-add a second time Content-Transfer-Encoding: 7bit Precedence: list Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org
Series	[avr] Run pass avr-fuse-add a second time \| expand [avr] Run pass avr-fuse-add a second time

diff --git a/gcc/config/avr/avr-passes.cc b/gcc/config/avr/avr-passes.cc index 8b018ff6a05..8a71b57ada1 100644 --- a/gcc/config/avr/avr-passes.cc +++ b/gcc/config/avr/avr-passes.cc @@ -1009,7 +1009,7 @@ static const pass_data avr_pass_data_fuse_add = RTL_PASS, // type "", // name (will be patched) OPTGROUP_NONE, // optinfo_flags - TV_DF_SCAN, // tv_id + TV_MACH_DEP, // tv_id 0, // properties_required 0, // properties_provided 0, // properties_destroyed @@ -1026,6 +1026,13 @@ public: this->name = name; } + // Cloning is required because we are running one instance of the pass + // before peephole2. and a second one after cprop_hardreg. + opt_pass * clone () final override + { + return make_avr_pass_fuse_add (m_ctxt); + } + bool gate (function *) final override { return optimize && avr_fuse_add > 0; @@ -1503,8 +1510,8 @@ avr_pass_fuse_add::fuse_mem_add (Mem_Insn &mem, Add_Insn &add) - PLUS insn of that kind. - Indirect loads and stores. In almost all cases, combine opportunities arise from the preparation - done by `avr_split_tiny_move', but in some rare cases combinations are - found for the ordinary cores, too. + done by `avr_split_fake_addressing_move', but in some rare cases combinations + are found for the ordinary cores, too. As we consider at most one Mem insn per try, there may still be missed optimizations like POST_INC + PLUS + POST_INC might be performed as PRE_DEC + PRE_DEC for two adjacent locations. */ @@ -1714,7 +1721,7 @@ public: core's capabilities. This sets the stage for pass .avr-fuse-add. */ bool -avr_split_tiny_move (rtx_insn * /*insn*/, rtx *xop) +avr_split_fake_addressing_move (rtx_insn * /*insn*/, rtx *xop) { bool store_p = false; rtx mem, reg_or_0; diff --git a/gcc/config/avr/avr-passes.def b/gcc/config/avr/avr-passes.def index 748260edaef..cd89d673727 100644 --- a/gcc/config/avr/avr-passes.def +++ b/gcc/config/avr/avr-passes.def @@ -20,12 +20,33 @@ /* A post reload optimization pass that fuses PLUS insns with CONST_INT addend with a load or store insn to get POST_INC or PRE_DEC addressing. It can also fuse two PLUSes to a single one, which may occur due to - splits from `avr_split_tiny_move'. We do this in an own pass because - it can find more cases than peephole2, for example when there are - unrelated insns between the interesting ones. */ + splits from `avr_split_fake_addressing_move'. We do this in an own + pass because it can find more cases than peephole2, for example when + there are unrelated insns between the interesting ones. */ INSERT_PASS_BEFORE (pass_peephole2, 1, avr_pass_fuse_add); +/* There are cases where avr-fuse-add doesn't find POST_INC cases because + the RTL code at that time is too long-winded, and moves registers back and + forth (which seems to be the same reason for why pass auto_inc_dec cannot + find POST_INC, either). Some of that long-windedness is cleaned up very + late in pass cprop_hardreg, which opens up new opportunities to find post + increments. An example is the following function from AVR-LibC's qsort: + + void swapfunc (char *a, char *b, int n) + { + do + { + char tmp = *a; + *a++ = *b; + *b++ = tmp; + } while (--n > 0); + } + + Hence, run avr-fuse-add twice; the second time after cprop_hardreg. */ + +INSERT_PASS_AFTER (pass_cprop_hardreg, 1, avr_pass_fuse_add); + /* An analysis pass that runs prior to prologue / epilogue generation. Computes cfun->machine->gasisr.maybe which is used in prologue and epilogue generation provided -mgas-isr-prologues is on. */ @@ -47,9 +68,9 @@ INSERT_PASS_BEFORE (pass_free_cfg, 1, avr_pass_recompute_notes); tries to fix such situations by operating on the original mode. This reduces code size and register pressure. - The assertion is that the code generated by casesi is unaltered and a + The assertion is that the code generated by casesi is unaltered and a sign-extend or zero-extend from QImode or HImode precedes the casesi - insns withaout any insns in between. */ + insns without any insns in between. */ INSERT_PASS_AFTER (pass_expand, 1, avr_pass_casesi); diff --git a/gcc/config/avr/avr-protos.h b/gcc/config/avr/avr-protos.h index 289c80cab55..d6f4abbac66 100644 --- a/gcc/config/avr/avr-protos.h +++ b/gcc/config/avr/avr-protos.h @@ -179,7 +179,7 @@ extern rtl_opt_pass *make_avr_pass_casesi (gcc::context *); extern rtl_opt_pass *make_avr_pass_ifelse (gcc::context *); #ifdef RTX_CODE extern bool avr_casei_sequence_check_operands (rtx *xop); -extern bool avr_split_tiny_move (rtx_insn *insn, rtx *operands); +extern bool avr_split_fake_addressing_move (rtx_insn *insn, rtx *operands); #endif /* RTX_CODE */ /* From avr-log.cc */ diff --git a/gcc/config/avr/avr.md b/gcc/config/avr/avr.md index f477ac170ea..520f1fe41a2 100644 --- a/gcc/config/avr/avr.md +++ b/gcc/config/avr/avr.md @@ -993,41 +993,10 @@ (define_peephole2 (clobber (reg:CC REG_CC))])]) -;; For LPM loads from AS1 we split -;; R = *Z -;; to -;; R = *Z++ -;; Z = Z - sizeof (R) -;; -;; so that the second instruction can be optimized out. - -(define_split ; "split-lpmx" - [(set (match_operand:HISI 0 "register_operand" "") - (match_operand:HISI 1 "memory_operand" ""))] - "reload_completed - && AVR_HAVE_LPMX - && avr_mem_flash_p (operands[1]) - && REG_P (XEXP (operands[1], 0)) - && !reg_overlap_mentioned_p (XEXP (operands[1], 0), operands[0])" - [(set (match_dup 0) - (match_dup 2)) - (set (match_dup 3) - (plus:HI (match_dup 3) - (match_dup 4)))] - { - rtx addr = XEXP (operands[1], 0); - - operands[2] = replace_equiv_address (operands[1], - gen_rtx_POST_INC (Pmode, addr)); - operands[3] = addr; - operands[4] = gen_int_mode (-<SIZE>, HImode); - }) - - ;; Legitimate address and stuff allows way more addressing modes than ;; Reduced Tiny actually supports. Split them now so that we get ;; closer to real instructions which may result in some optimization -;; opportunities. +;; opportunities. This applies also to fake X + offset addressing. (define_split [(parallel [(set (match_operand:MOVMODE 0 "nonimmediate_operand") (match_operand:MOVMODE 1 "general_operand")) @@ -1040,7 +1009,7 @@ (define_split && (MEM_P (operands[0]) || MEM_P (operands[1]))" [(scratch)] { - if (avr_split_tiny_move (curr_insn, operands)) + if (avr_split_fake_addressing_move (curr_insn, operands)) DONE; FAIL; })

[avr] Run pass avr-fuse-add a second time

Commit Message

Comments

Patch