Message ID | 20100820135046.GC702@tyan-ft48-01.lab.bos.redhat.com |
---|---|
State | New |
Headers | show |
On 08/20/2010 06:50 AM, Jakub Jelinek wrote: > 2010-08-20 Jakub Jelinek <jakub@redhat.com> > Paolo Bonzini <bonzini@gnu.org> > > * simplify-rtx.c (simplify_unary_operation_1): Optimize > (sign_extend (zero_extend ()) and > ({sign,zero}_extend (lshiftrt (ashift X (const_int I)) (const_int I))). Ok. r~
On Fri, Aug 20, 2010 at 6:50 AM, Jakub Jelinek <jakub@redhat.com> wrote: > On Fri, Aug 20, 2010 at 11:46:48AM +0200, Paolo Bonzini wrote: >> On 08/19/2010 06:33 PM, Jakub Jelinek wrote: >> >+ /* (sign_extend:M (sign_extend:N<X>)) is (sign_extend:M<X>). */ >> >+ if (GET_CODE (op) == SIGN_EXTEND) >> >+ return simplify_gen_unary (SIGN_EXTEND, mode, XEXP (op, 0), >> >+ GET_MODE (XEXP (op, 0))); >> >> Maybe >> >> /* (sign_extend:M (sign_extend:N<X>)) is (sign_extend:M<X>). >> (sign_extend:M (zero_extend:N<X>)) is (zero_extend:M<X>). */ >> if (GET_CODE (op) == SIGN_EXTEND || GET_CODE (op) == ZERO_EXTEND) >> { >> gcc_assert (GET_MODE (op) != mode); >> return simplify_gen_unary (GET_CODE (op), mode, XEXP (op, 0), >> GET_MODE (XEXP (op, 0))); >> } >> >> ? > > Good idea. > > On Thu, Aug 19, 2010 at 07:58:24PM +0200, Bernd Schmidt wrote: >> > ZERO_EXTEND is expanded into AND instead of LSHIFTRT/ASHIFT by combine, >> > and >> > two nested ANDs should be easily optimized. If you really want, I can >> > handle the ZERO_EXTEND (AND <X> mask) case as a follow-up. >> >> Oh, in expand_compound_operation? I was thinking about finding such >> shifts in a pair of insns (happens on Thumb). Also, what about callers >> other than combine? > > Ok. The following patch thus adds optimization of sign_extend of zero_extend, > sign_extend of lshiftrt/ashift and zero_extend of lshiftrt/ashift. > > Bootstrapped/regtested on x86_64-linux and i686-linux (yes,rtl checking) > with additional patchlet to gather statistics and in both > bootstraps/regtests together this optimization happened (the fact that > simplify-rtx.c simplified it doesn't still mean the result was usable in > whatever pass asked for it) times: > (sign_extend (sign_extend)) 2059 > (sign_extend (zero_extend)) 87019 > (sign_extend (ashiftrt)) 1485 > (zero_extend (zero_extend)) 15648 > lshiftrt wasn't seen, but it is very well possible it shows up on other > targets. Ok for trunk? > > 2010-08-20 Jakub Jelinek <jakub@redhat.com> > Paolo Bonzini <bonzini@gnu.org> > > * simplify-rtx.c (simplify_unary_operation_1): Optimize > (sign_extend (zero_extend ()) and > ({sign,zero}_extend (lshiftrt (ashift X (const_int I)) (const_int I))). > Can we optimize out sign_extend in: (insn:TI 9 7 10 (set (reg:SI 0 ax [orig:68 D.6819 ] [68]) (zero_extend:SI (vec_select:QI (reg:V16QI 21 xmm0 [orig:64 x ] [64]) (parallel [ (const_int 4 [0x4]) ])))) /export/build/gnu/gcc/build-x86_64-linux/gcc/include/smmintrin.h:442 1681 {*sse4_1_pextrb} (expr_list:REG_DEAD (reg:V16QI 21 xmm0 [orig:64 x ] [64]) (nil))) (insn:TI 10 9 18 (set (reg:DI 0 ax [orig:67 D.6819 ] [67]) (sign_extend:DI (reg:SI 0 ax [orig:68 D.6819 ] [68]))) foo.c:3 126 {extendsidi2_rex64} (nil))
On Fri, Aug 20, 2010 at 09:56:41AM -0700, H.J. Lu wrote: > Can we optimize out sign_extend in: > > (insn:TI 9 7 10 (set (reg:SI 0 ax [orig:68 D.6819 ] [68]) > (zero_extend:SI (vec_select:QI (reg:V16QI 21 xmm0 [orig:64 x ] [64]) > (parallel [ > (const_int 4 [0x4]) > ])))) > /export/build/gnu/gcc/build-x86_64-linux/gcc/include/smmintrin.h:442 1681 > {*sse4_1_pextrb} > (expr_list:REG_DEAD (reg:V16QI 21 xmm0 [orig:64 x ] [64]) > (nil))) > > (insn:TI 10 9 18 (set (reg:DI 0 ax [orig:67 D.6819 ] [67]) > (sign_extend:DI (reg:SI 0 ax [orig:68 D.6819 ] [68]))) foo.c:3 126 > {extendsidi2_rex64} > (nil)) I guess only if there is a [(set (match_operand:DI 0 "register_operand" "=r") (zero_extend:DI (vec_select:QI (match_operand:V16QI 1 "register_operand" "x") (parallel [(match_operand:SI 2 "const_0_to_15_operand" "n")]))))] pattern for TARGET_64BIT. Not sure what exactly is pextrb ..., %ecx insn doing to the upper 32 bits of %rcx, if it clears them, such an insn would be nice to have in sse.md (similarly for pextrw/pextrd). If there is such a pattern, I think with my simplify_* changes the combiner will do its job. Jakub
On Fri, Aug 20, 2010 at 10:27 AM, Jakub Jelinek <jakub@redhat.com> wrote: > On Fri, Aug 20, 2010 at 09:56:41AM -0700, H.J. Lu wrote: >> Can we optimize out sign_extend in: >> >> (insn:TI 9 7 10 (set (reg:SI 0 ax [orig:68 D.6819 ] [68]) >> (zero_extend:SI (vec_select:QI (reg:V16QI 21 xmm0 [orig:64 x ] [64]) >> (parallel [ >> (const_int 4 [0x4]) >> ])))) >> /export/build/gnu/gcc/build-x86_64-linux/gcc/include/smmintrin.h:442 1681 >> {*sse4_1_pextrb} >> (expr_list:REG_DEAD (reg:V16QI 21 xmm0 [orig:64 x ] [64]) >> (nil))) >> >> (insn:TI 10 9 18 (set (reg:DI 0 ax [orig:67 D.6819 ] [67]) >> (sign_extend:DI (reg:SI 0 ax [orig:68 D.6819 ] [68]))) foo.c:3 126 >> {extendsidi2_rex64} >> (nil)) > > I guess only if there is a > [(set (match_operand:DI 0 "register_operand" "=r") > (zero_extend:DI > (vec_select:QI > (match_operand:V16QI 1 "register_operand" "x") > (parallel [(match_operand:SI 2 "const_0_to_15_operand" "n")]))))] > pattern for TARGET_64BIT. > Not sure what exactly is > pextrb ..., %ecx > insn doing to the upper 32 bits of %rcx, if it clears them, such an insn > would be nice to have in sse.md (similarly for pextrw/pextrd). > If there is such a pattern, I think with my simplify_* changes the combiner > will do its job. > We can certain add them. I will give it a try.
On 08/20/2010 07:27 PM, Jakub Jelinek wrote: > Not sure what exactly is > pextrb ..., %ecx > insn doing to the upper 32 bits of %rcx, if it clears them Probably yes like every other 32-bit writeback on x86_64. Paolo
--- gcc/simplify-rtx.c.jj 2010-08-20 12:33:30.000000000 +0200 +++ gcc/simplify-rtx.c 2010-08-20 13:11:15.000000000 +0200 @@ -1010,15 +1010,22 @@ simplify_unary_operation_1 (enum rtx_cod && GET_MODE_SIZE (mode) <= GET_MODE_SIZE (GET_MODE (XEXP (op, 0)))) return rtl_hooks.gen_lowpart_no_emit (mode, op); - /* (sign_extend:M (sign_extend:N <X>)) is (sign_extend:M <X>). */ - if (GET_CODE (op) == SIGN_EXTEND) - return simplify_gen_unary (SIGN_EXTEND, mode, XEXP (op, 0), - GET_MODE (XEXP (op, 0))); + /* (sign_extend:M (sign_extend:N <X>)) is (sign_extend:M <X>). + (sign_extend:M (zero_extend:N <X>)) is (zero_extend:M <X>). */ + if (GET_CODE (op) == SIGN_EXTEND || GET_CODE (op) == ZERO_EXTEND) + { + gcc_assert (GET_MODE_BITSIZE (mode) + > GET_MODE_BITSIZE (GET_MODE (op))); + return simplify_gen_unary (GET_CODE (op), mode, XEXP (op, 0), + GET_MODE (XEXP (op, 0))); + } /* (sign_extend:M (ashiftrt:N (ashift <X> (const_int I)) (const_int I))) is (sign_extend:M (subreg:O <X>)) if there is mode with - GET_MODE_BITSIZE (N) - I bits. */ - if (GET_CODE (op) == ASHIFTRT + GET_MODE_BITSIZE (N) - I bits. + (sign_extend:M (lshiftrt:N (ashift <X> (const_int I)) (const_int I))) + is similarly (zero_extend:M (subreg:O <X>)). */ + if ((GET_CODE (op) == ASHIFTRT || GET_CODE (op) == LSHIFTRT) && GET_CODE (XEXP (op, 0)) == ASHIFT && CONST_INT_P (XEXP (op, 1)) && XEXP (XEXP (op, 0), 1) == XEXP (op, 1) @@ -1027,11 +1034,15 @@ simplify_unary_operation_1 (enum rtx_cod enum machine_mode tmode = mode_for_size (GET_MODE_BITSIZE (GET_MODE (op)) - INTVAL (XEXP (op, 1)), MODE_INT, 1); + gcc_assert (GET_MODE_BITSIZE (mode) + > GET_MODE_BITSIZE (GET_MODE (op))); if (tmode != BLKmode) { rtx inner = rtl_hooks.gen_lowpart_no_emit (tmode, XEXP (XEXP (op, 0), 0)); - return simplify_gen_unary (SIGN_EXTEND, mode, inner, tmode); + return simplify_gen_unary (GET_CODE (op) == ASHIFTRT + ? SIGN_EXTEND : ZERO_EXTEND, + mode, inner, tmode); } } @@ -1066,6 +1077,26 @@ simplify_unary_operation_1 (enum rtx_cod return simplify_gen_unary (ZERO_EXTEND, mode, XEXP (op, 0), GET_MODE (XEXP (op, 0))); + /* (zero_extend:M (lshiftrt:N (ashift <X> (const_int I)) (const_int I))) + is (zero_extend:M (subreg:O <X>)) if there is mode with + GET_MODE_BITSIZE (N) - I bits. */ + if (GET_CODE (op) == LSHIFTRT + && GET_CODE (XEXP (op, 0)) == ASHIFT + && CONST_INT_P (XEXP (op, 1)) + && XEXP (XEXP (op, 0), 1) == XEXP (op, 1) + && GET_MODE_BITSIZE (GET_MODE (op)) > INTVAL (XEXP (op, 1))) + { + enum machine_mode tmode + = mode_for_size (GET_MODE_BITSIZE (GET_MODE (op)) + - INTVAL (XEXP (op, 1)), MODE_INT, 1); + if (tmode != BLKmode) + { + rtx inner = + rtl_hooks.gen_lowpart_no_emit (tmode, XEXP (XEXP (op, 0), 0)); + return simplify_gen_unary (ZERO_EXTEND, mode, inner, tmode); + } + } + #if defined(POINTERS_EXTEND_UNSIGNED) && !defined(HAVE_ptr_extend) /* As we do not know which address space the pointer is refering to, we can do this only if the target does not support different pointer