Message ID | m3sjvcnu4v.fsf@redhat.com |
---|---|
State | New |
Headers | show |
On 02/25/2011 05:03 AM, Nick Clifton wrote: >> +/* A special mode used to distinguish the Lcc branch instruction > + from the Bcc branch instruction. */ > +CC_MODE (CC_LCC); Why not just use an unspec on the lcc insn. I think you want to preserve the existing CCmode, so that we DTRT with modes that do not have the overflow bit. > + /* What we're doing here is looking for a conditional branch which > + branches backwards without crossing any non-debug labels. I.e. > + the loop has to enter from the top. */ Ug. Surely we can re-use the existing loop detection code to find the inner-most loops. The existence of normal branches within the loop ought not make any sort of a difference. You'll have to re-establish the CFG, which just got broken down, but that shouldn't be any different from several other targets. Then use flow_loops_find and the rest of the bits in cfgloop.h. r~
Hi Richard, Thanks for the review. Attached is a revised patch which addresses the issues that you raised: >>> +/* A special mode used to distinguish the Lcc branch instruction >> + from the Bcc branch instruction. */ >> +CC_MODE (CC_LCC); > > Why not just use an unspec on the lcc insn. I should have done. The CC_LCC mode was a holdover from when I was trying to get the doloop patterns to work. The doloop pattern recognizer does not accept UNSPECs, so I had to use a unique CC mode instead. The revised patch uses an UNSPEC. >> + /* What we're doing here is looking for a conditional branch which >> + branches backwards without crossing any non-debug labels. I.e. >> + the loop has to enter from the top. */ > > Ug. Surely we can re-use the existing loop detection code to find the > inner-most loops. We can, but it is a lot more complicated than just performing a simple scan. Still I have tried my best to follow up on this. One thing that does worry me however is that I encountered a segmentation fault whilst trying to free the loop data. (See a comment in the patch for more details). I tried to follow the example of other passes that use the loop information, but I must have gotten something wrong. Any pointers on how to fix this are greatly appreciated. Cheers Nick gcc/ChangeLog 2011-03-08 Nick Clifton <nickc@redhat.com> * config/mn10300/mn10300.c: Include cfgloop.h. (DUMP): New macro. (mn10300_insert_setlb_lcc): New function. Inserts a SETLB and a Lcc or a FLcc insn into the instruction stream. (mn10300_block_contains_call): New function. Returns true if the given basic block contains a CALL insn. (mn10300_loop_contains_call_insn): New function. Returns true if the given loop contains a CALL insn. (mn10300_scan_for_setlb_lcc): New function. Finds opportunities to use the SETLB and Lcc or FLcc insns. (mn10300_reorg): Invoke mn10300_scan_for_setlb_lcc. (TARGET_FLAGS): Add MASK_ALLOW_SETLB. * config/mn10300/mn10300.opt (msetlb): New option. Used to disable the SETLB optimization. * config/mn10300/mn10300.h (TARGET_CPU_CPP_BUILTINS): Add __SETLB__ or __NO_SETLB__. * config/mn10300/mn10300.md (UNSPEC_SETLB): New constant. (cmpsi): Make visible. (setlb): New pattern. (Lcc): New pattern. (FLcc): New pattern. * doc/invoke.texi: Document -mno-setlb option.
> + /* If the comparison has not already been split out of the branch > + then do so now. */ > + if (REGNO (cmp_reg) != CC_REG) > + { > + rtx cmp; > + > + cmp = emit_insn_before (gen_cmpsi (cmp_reg, XEXP (comparison, 1)), branch); > + > + DUMP ("Extracted comparison from branch:", cmp); > + } This should never ever fire. These should have been split already, at least with optimization, on which this pass is dependant. Just assert here, if you like. > +mn10300_block_contains_call (struct basic_block_def * block) basic_block block > + for (insn = block->il.rtl->head_; > + insn != NULL_RTX; > + insn = NEXT_INSN (insn)) > + { > + if (CALL_P (insn)) > + return true; > + > + if (insn == block->il.rtl->end_) > + break; > + } FOR_BB_INSNS (block, insn) if (CALL_P (insn)) return true; > + df_analyze (); > + if (flow_loops_find (& loops) > 0) > + { > + unsigned int i; > + loop_p loop; > + > + FOR_EACH_VEC_ELT (loop_p, loops.larray, i, loop) You need compute_bb_for_insn, and to set up current_loops. Then you can use FOR_EACH_LOOP(liter, loop, LI_ONLY_INNERMOST) instead of iterating over the array by hand. This will eliminate at least the fake loop and non-innermost loop check that you do by hand. Of course, current_loops must be cleared at the end. > + else if (loop->header != loop->latch) > + reason = "it is not a simple do-while loop"; Why? A loop with multiple blocks should be fine with Lcc. I do see that you've got loop->header and loop->latch backward further below; perhaps that was your reason for this? > + rtx branch = loop->header->il.rtl->end_; BB_END (loop->latch) > + if (single_set (branch) == NULL_RTX > + || GET_CODE (SET_SRC (single_set (branch))) != IF_THEN_ELSE) !any_condjump_p (branch) > + label = loop->latch->il.rtl->head_; BB_HEAD (loop->header) > + /* If necessary, extract the label from the branch insn. */ > + if (! LABEL_P (label)) This should never ever fire -- how else can the loop have been created? Again, you can just assert here. > + /* FIXME: Calling flow_loops_free() appears to be the correct thing to do, > + but it results in a seg-fault when building regex.c in the target libiberty > + library. I have no idea why; so I have disabled the call for now. */ > + /* flow_loops_free (& loops); */ I'm having a look into this. It's definitely non-obvious... r~
Hi Richard, Thanks for the patch review. Sorry for not responding earlier, but some other work got in the way. I have attached a revised patch which addresses all of the points you raised except one: >> + else if (loop->header != loop->latch >> + reason = "it is not a simple do-while loop"; > > Why? A loop with multiple blocks should be fine with Lcc. The problem I found here was that I did not know how to find all of the test-and-loop-back insns if the loop spanned multiple blocks. The revised patch includes a comment saying that loops spanning multiple blocks could be supported, but I would like to leave that for a future patch rather than trying to get it all working now. Cheers Nick
On 05/03/2011 05:13 AM, Nick Clifton wrote: > The problem I found here was that I did not know how to find all of > the test-and-loop-back insns if the loop spanned multiple blocks. > The revised patch includes a comment saying that loops spanning > multiple blocks could be supported, but I would like to leave that > for a future patch rather than trying to get it all working now. The patch is ok. As for the finding the branches, I'm pretty sure that /* Head of the cyclic list of the exits of the loop. */ struct loop_exit *exits; means what it says and contains all of the edges that exit the block. But doing that in a separate patch is fine. r~
Index: gcc/doc/invoke.texi =================================================================== --- gcc/doc/invoke.texi (revision 170489) +++ gcc/doc/invoke.texi (working copy) @@ -741,7 +741,7 @@ -mno-am33 -mam33 -mam33-2 -mam34 @gol -mtune=@var{cpu-type} @gol -mreturn-pointer-on-d0 @gol --mno-crt0 -mrelax -mliw} +-mno-crt0 -mrelax -mliw -msetlb} @emph{PDP-11 Options} @gccoptlist{-mfpu -msoft-float -mac0 -mno-ac0 -m40 -m45 -m10 @gol @@ -15071,6 +15071,18 @@ instructions. This option defines the preprocessor macro @samp{__NO_LIW__}. +@item -msetlb +@opindex msetlb +Allow the compiler to generate the @emph{SETLB} and @emph{Lcc} +instructions if the target is the @samp{AM33} or later. This is the +default. This option defines the preprocessor macro @samp{__SETLB__}. + +@item -mnosetlb +@opindex mnosetlb +Do not allow the compiler to generate @emph{SETLB} or @emph{Lcc} +instructions. This option defines the preprocessor macro +@samp{__NO_SETLB__}. + @end table @node PDP-11 Options Index: gcc/config/mn10300/mn10300.opt =================================================================== --- gcc/config/mn10300/mn10300.opt (revision 170489) +++ gcc/config/mn10300/mn10300.opt (working copy) @@ -54,3 +54,7 @@ mliw Target Report Mask(ALLOW_LIW) Allow gcc to generate LIW instructions + +msetlb +Target Report Mask(ALLOW_SETLB) +Allow gcc to generate the SETLB and Lcc instructions Index: gcc/config/mn10300/mn10300-modes.def =================================================================== --- gcc/config/mn10300/mn10300-modes.def (revision 170489) +++ gcc/config/mn10300/mn10300-modes.def (working copy) @@ -22,3 +22,7 @@ CC_MODE (CCZN); CC_MODE (CCZNC); CC_MODE (CC_FLOAT); + +/* A special mode used to distinguish the Lcc branch instruction + from the Bcc branch instruction. */ +CC_MODE (CC_LCC); Index: gcc/config/mn10300/predicates.md =================================================================== --- gcc/config/mn10300/predicates.md (revision 170489) +++ gcc/config/mn10300/predicates.md (working copy) @@ -56,6 +56,8 @@ { if (REGNO (op) != CC_REG) return false; + if (GET_MODE (op) == CC_LCCmode) + return false; if (GET_MODE (op) == CC_FLOATmode) return false; return GET_MODE_CLASS (GET_MODE (op)) == MODE_CC; Index: gcc/config/mn10300/mn10300.h =================================================================== --- gcc/config/mn10300/mn10300.h (revision 170489) +++ gcc/config/mn10300/mn10300.h (working copy) @@ -54,6 +55,8 @@ builtin_define (TARGET_ALLOW_LIW ? \ "__LIW__" : "__NO_LIW__");\ \ + builtin_define (TARGET_ALLOW_SETLB ? \ + "__SETLB__" : "__NO_SETLB__");\ } \ while (0) Index: gcc/config/mn10300/mn10300.md =================================================================== --- gcc/config/mn10300/mn10300.md (revision 170489) +++ gcc/config/mn10300/mn10300.md (working copy) @@ -42,6 +42,8 @@ ;; This is used to encode LIW patterns. (UNSPEC_LIW 8) + ;; This is for the low overhead loop instructions. + (UNSPEC_SETLB 9) ]) (include "predicates.md") @@ -1375,7 +1377,7 @@ DONE; }) -(define_insn "*cmpsi" +(define_insn "cmpsi" [(set (reg CC_REG) (compare (match_operand:SI 0 "register_operand" "r,r,r") (match_operand:SI 1 "nonmemory_operand" "r,O,i")))] @@ -2152,3 +2154,37 @@ [(set (attr "timings") (if_then_else (eq_attr "cpu" "am34") (const_int 13) (const_int 12)))] ) + +;; Note - in theory the doloop patterns could be used here to express +;; the SETLB and Lcc instructions. In practice this does not work because +;; the acceptable forms of the doloop patterns do not include UNSPECs +;; and without them gcc's basic block reordering code can duplicate the +;; doloop_end pattern, leading to bogus multiple decrements of the loop +;; counter. + +(define_insn "setlb" + [(unspec [(const_int 0)] UNSPEC_SETLB)] + "TARGET_AM33 && TARGET_ALLOW_SETLB" + "setlb" +) + +(define_insn "Lcc" + [(set (pc) + (if_then_else (match_operator 0 "comparison_operator" + [(reg:CC_LCC CC_REG) (const_int 0)]) + (label_ref (match_operand 1 "" "")) + (pc)))] + "TARGET_AM33 && TARGET_ALLOW_SETLB" + "L%b0 # loop back to: %1" +) + +;; (define_insn "FLcc" +;; [(set (pc) +;; (if_then_else (match_operator 0 "comparison_operator" +;; [(reg:CC_FLCC CC_REG) (const_int 0)]) +;; (label_ref (match_operand 1 "" "")) +;; (pc)))] +;; "TARGET_AM33 && TARGET_ALLOW_SETLB" +;; "FL%b0 # loop back to: %1" +;; [(set (attr "timings") (if_then_else (eq_attr "cpu" "am34") (const_int 44) (const_int 11)))] +;; ) Index: gcc/config/mn10300/mn10300.c =================================================================== --- gcc/config/mn10300/mn10300.c (revision 170489) +++ gcc/config/mn10300/mn10300.c (working copy) @@ -2674,6 +2674,7 @@ return CC_FLAG_Z | CC_FLAG_N | CC_FLAG_C; case CCZNmode: return CC_FLAG_Z | CC_FLAG_N; + case CC_LCCmode: case CC_FLOATmode: return -1; default: @@ -3134,11 +3135,82 @@ } } +/* Replace the JUMP insn with a Lcc insn that goes to LABEL. + Insert a SETLB insn before LABEL. */ + static void +mn10300_insert_setlb_lcc (rtx label, rtx jump) +{ + rtx lcc, comparison, cmp_reg; + + emit_insn_before (gen_setlb (), label); + + comparison = XEXP (SET_SRC (PATTERN (jump)), 0); + cmp_reg = XEXP (comparison, 0); + gcc_assert (REG_P (cmp_reg)); + /* FIXME: We do not currently support the FLcc instruction. */ + gcc_assert (GET_MODE (cmp_reg) != CC_FLOATmode); + + /* If the comparison has not already been split out of the jump + then do so now. */ + if (REGNO (cmp_reg) != CC_REG) + emit_insn_before (gen_cmpsi (cmp_reg, XEXP (comparison, 1)), jump); + + lcc = gen_Lcc (comparison, label); + lcc = emit_jump_insn_before (lcc, jump); + mark_jump_label (PATTERN (lcc), lcc, 0); + delete_insn (jump); +} + +static void +mn10300_scan_for_setlb_lcc (void) +{ + rtx last_code_label = NULL_RTX, r; + + /* What we're doing here is looking for a conditional branch which + branches backwards without crossing any non-debug labels. I.e. + the loop has to enter from the top. */ + + for (r = get_insns (); r != NULL_RTX; r = next_nonnote_nondebug_insn (r)) + { + switch (GET_CODE (r)) + { + case CODE_LABEL: + last_code_label = r; + break; + case NOTE: + /* This can happen if the first insn is a NOTE. */ + case INSN: + /* These are safe to ignore. */ + break; + case JUMP_INSN: + /* Check these. */ + if (GET_CODE (PATTERN (r)) == SET + && GET_CODE (SET_SRC (PATTERN (r))) == IF_THEN_ELSE) + { + if (last_code_label != NULL_RTX + && JUMP_LABEL (r) == last_code_label) + { + if (LABEL_NUSES (last_code_label) == 1) + mn10300_insert_setlb_lcc (last_code_label, r); + } + } + break; + default: + last_code_label = NULL_RTX; + break; + } + } +} + +static void mn10300_reorg (void) { if (TARGET_AM33) { + if (TARGET_ALLOW_SETLB) + mn10300_scan_for_setlb_lcc (); + if (TARGET_ALLOW_LIW) mn10300_bundle_liw (); } @@ -3176,7 +3248,7 @@ #define TARGET_ASM_OUTPUT_ADDR_CONST_EXTRA mn10300_asm_output_addr_const_extra #undef TARGET_DEFAULT_TARGET_FLAGS -#define TARGET_DEFAULT_TARGET_FLAGS MASK_MULT_BUG | MASK_PTR_A0D0 | MASK_ALLOW_LIW +#define TARGET_DEFAULT_TARGET_FLAGS MASK_MULT_BUG | MASK_PTR_A0D0 | MASK_ALLOW_LIW | MASK_ALLOW_SETLB #undef TARGET_HANDLE_OPTION #define TARGET_HANDLE_OPTION mn10300_handle_option #undef TARGET_OPTION_OVERRIDE