Message ID | 20110420160959.GR17079@tyan-ft48-01.lab.bos.redhat.com |
---|---|
State | New |
On 04/20/2011 09:09 AM, Jakub Jelinek wrote:
> Hi!
>
> This splitter allows us to optimize (x {* {2,4,8},<< {1,2,3}}) {|,^} y
> for constant integer y <= {1ULL,3ULL,7ULL} using lea{l,q} (| or ^ in
> that case, when the low bits are known to be all 0, is like plus).
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2011-04-20 Jakub Jelinek <jakub@redhat.com>
>
>     PR target/48688
>     * config/i386/i386.md (*lea_general_4): New define_insn_and_split.

Any chance you could do this in combine instead? Shift-and-add patterns
are a fairly common architectural feature...

r~
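The claim in the quoted description, that | and ^ behave like plus when the low bits of the shifted value are known to be zero, can be checked with a small sketch. The helper names below are hypothetical and not part of the patch; the sketch assumes 32-bit unsigned arithmetic and only covers shifts 1..3 with constants strictly below 2^n.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not from the patch: returns 1 when the OR, XOR
   and PLUS forms agree for this x, shift n and constant y. */
int or_xor_plus_agree(uint32_t x, unsigned n, uint32_t y)
{
    uint32_t shifted = x << n;       /* low n bits are zero */
    uint32_t via_or  = shifted | y;
    uint32_t via_xor = shifted ^ y;
    uint32_t via_add = shifted + y;  /* the x * 2^n + y form an lea computes */
    return via_or == via_add && via_xor == via_add;
}

/* Checks every shift in 1..3 and every y < 2^n against a few sample x. */
int check_all_small_constants(void)
{
    static const uint32_t samples[] = { 0u, 1u, 5u, 0x7fffffffu, 0xffffffffu };
    for (unsigned n = 1; n <= 3; n++)
        for (uint32_t y = 0; y < (1u << n); y++)
            for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
                if (!or_xor_plus_agree(samples[i], n, y))
                    return 0;
    return 1;
}

/* Convenience wrapper that aborts if any combination disagrees. */
void self_check(void)
{
    assert(check_all_small_constants());
}
```

Because y only occupies bit positions that the shift has cleared, no carry can occur, so all three operators produce the same bits.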
On Wed, Apr 20, 2011 at 10:27:36AM -0700, Richard Henderson wrote:
> On 04/20/2011 09:09 AM, Jakub Jelinek wrote:
> > Hi!
> >
> > This splitter allows us to optimize (x {* {2,4,8},<< {1,2,3}}) {|,^} y
> > for constant integer y <= {1ULL,3ULL,7ULL} using lea{l,q} (| or ^ in
> > that case, when the low bits are known to be all 0, is like plus).
> >
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> >
> > 2011-04-20 Jakub Jelinek <jakub@redhat.com>
> >
> >     PR target/48688
> >     * config/i386/i386.md (*lea_general_4): New define_insn_and_split.
>
> Any chance you could do this in combine instead? Shift-and-add patterns
> are a fairly common architectural feature...

I've tried to do it in simplify-rtx.c, unfortunately combine.c does exactly
opposite canonicalization and thus it results in endless recursion:

      /* If we are adding two things that have no bits in common, convert
         the addition into an IOR. This will often be further simplified,
         for example in cases like ((a & 1) + (a & 2)), which can
         become a & 3. */

      if (GET_MODE_BITSIZE (mode) <= HOST_BITS_PER_WIDE_INT
          && (nonzero_bits (XEXP (x, 0), mode)
              & nonzero_bits (XEXP (x, 1), mode)) == 0)
        {
          /* Try to simplify the expression further. */
          rtx tor = simplify_gen_binary (IOR, mode, XEXP (x, 0), XEXP (x, 1));
          temp = combine_simplify_rtx (tor, mode, in_dest, 0);

          /* If we could, great. If not, do not go ahead with the IOR
             replacement, since PLUS appears in many special purpose
             address arithmetic instructions. */
          if (GET_CODE (temp) != CLOBBER && temp != tor)
            return temp;
        }

So at least it can't be done in simplify_binary_operation_1.

    Jakub
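The canonicalization quoted from combine.c rests on a simple arithmetic fact: adding two values with no set bits in common cannot carry, so PLUS and IOR give the same result. The sketch below is a value-level analogue only. GCC's nonzero_bits works on what is statically known about an RTL expression, not on runtime values, and the function names here are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the nonzero_bits test in the quoted code:
   true when the two values share no set bits. */
int no_bits_in_common(uint32_t a, uint32_t b)
{
    return (a & b) == 0;
}

/* Computes a + b, going through IOR when the operands are disjoint. */
uint32_t add_via_ior_when_disjoint(uint32_t a, uint32_t b)
{
    if (no_bits_in_common(a, b))
        return a | b;   /* no carries are possible, so this equals a + b */
    return a + b;
}

/* The example from the combine.c comment: (a & 1) + (a & 2). */
uint32_t masked_sum(uint32_t a)
{
    return add_via_ior_when_disjoint(a & 1u, a & 2u);
}

/* Confirms the comment's claim that the sum collapses to a & 3. */
void self_check(void)
{
    for (uint32_t a = 0; a < 16; a++)
        assert(masked_sum(a) == (a & 3u));
}
```

A transformation in the other direction (IOR to PLUS for the same disjoint operands) would undo this rewrite, which is consistent with the recursion described in the message.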
On Wed, Apr 20, 2011 at 9:09 AM, Jakub Jelinek <jakub@redhat.com> wrote:
> Hi!
>
> This splitter allows us to optimize (x {* {2,4,8},<< {1,2,3}}) {|,^} y
> for constant integer y <= {1ULL,3ULL,7ULL} using lea{l,q} (| or ^ in
> that case, when the low bits are known to be all 0, is like plus).
>
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> 2011-04-20 Jakub Jelinek <jakub@redhat.com>
>
>     PR target/48688
>     * config/i386/i386.md (*lea_general_4): New define_insn_and_split.
>
>     * gcc.target/i386/pr48688.c: New test.
>
> --- gcc/config/i386/i386.md.jj 2011-04-19 14:08:55.000000000 +0200
> +++ gcc/config/i386/i386.md 2011-04-20 14:34:50.000000000 +0200
> @@ -6646,6 +6646,40 @@ (define_insn_and_split "*lea_general_3_z
>  }
>    [(set_attr "type" "lea")
>     (set_attr "mode" "SI")])
> +
> +(define_insn_and_split "*lea_general_4"
> +  [(set (match_operand:SWI 0 "register_operand" "=r")
> +        (any_or:SWI (ashift:SWI (match_operand:SWI 1 "index_register_operand" "l")
> +                                (match_operand:SWI 2 "const_int_operand" "n"))
> +                    (match_operand 3 "const_int_operand" "n")))]
> +  "(<MODE>mode == DImode
> +    || <MODE>mode == SImode
> +    || !TARGET_PARTIAL_REG_STALL
> +    || optimize_function_for_size_p (cfun))
> +   && ((unsigned HOST_WIDE_INT) INTVAL (operands[2])) - 1 < 3
> +   && ((unsigned HOST_WIDE_INT) INTVAL (operands[3])
> +       <= ((unsigned HOST_WIDE_INT) 1 << INTVAL (operands[2])))"
> +  "#"
> +  "&& reload_completed"
> +  [(const_int 0)]
> +{
> +  rtx pat;
> +  if (<MODE>mode != DImode)
> +    operands[0] = gen_lowpart (SImode, operands[0]);
> +  operands[1] = gen_lowpart (Pmode, operands[1]);
> +  operands[2] = GEN_INT (1 << INTVAL (operands[2]));
> +  pat = plus_constant (gen_rtx_MULT (Pmode, operands[1], operands[2]),
> +                       INTVAL (operands[3]));
> +  if (Pmode != SImode && <MODE>mode != DImode)
> +    pat = gen_rtx_SUBREG (SImode, pat, 0);
> +  emit_insn (gen_rtx_SET (VOIDmode, operands[0], pat));
> +  DONE;
> +}
> +  [(set_attr "type" "lea")
> +   (set (attr "mode")
> +      (if_then_else (eq (symbol_ref "<MODE>mode == DImode") (const_int 0))
> +        (const_string "SI")
> +        (const_string "DI")))])
>
>  ;; Subtract instructions

I don't think this pattern is correct. See:

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49281

H.J.
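One boundary worth checking against the stated intent (y <= {1,3,7}) is the constant-range test in the insn condition, which is written with `<=` against `1 << shift`. The sketch below mirrors only that part of the condition in plain C. The function names are hypothetical, and whether this boundary case is what the linked bug reports is not stated in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the constant-range part of the posted condition:
   shift must be 1..3 and the constant must be <= 1 << shift. */
int posted_condition_accepts(unsigned shift, uint64_t y)
{
    return (uint64_t) shift - 1 < 3 && y <= ((uint64_t) 1 << shift);
}

/* True when OR and PLUS give the same result for this input. */
int or_matches_plus(uint32_t x, unsigned shift, uint32_t y)
{
    return ((x << shift) | y) == ((x << shift) + y);
}

/* y == 1 << shift passes the range test, yet OR and PLUS can differ. */
void show_boundary(void)
{
    assert(posted_condition_accepts(3, 8));
    assert(!or_matches_plus(1u, 3, 8u));  /* (1 << 3) | 8 is 8, (1 << 3) + 8 is 16 */
    assert(or_matches_plus(1u, 3, 7u));   /* 7 fits below bit 3, so both give 15 */
}
```

Under this reading, a constant equal to 2^shift overlaps the lowest bit of the shifted operand, so the no-carry argument from the patch description no longer applies to it.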
--- gcc/config/i386/i386.md.jj 2011-04-19 14:08:55.000000000 +0200
+++ gcc/config/i386/i386.md 2011-04-20 14:34:50.000000000 +0200
@@ -6646,6 +6646,40 @@ (define_insn_and_split "*lea_general_3_z
 }
   [(set_attr "type" "lea")
    (set_attr "mode" "SI")])
+
+(define_insn_and_split "*lea_general_4"
+  [(set (match_operand:SWI 0 "register_operand" "=r")
+        (any_or:SWI (ashift:SWI (match_operand:SWI 1 "index_register_operand" "l")
+                                (match_operand:SWI 2 "const_int_operand" "n"))
+                    (match_operand 3 "const_int_operand" "n")))]
+  "(<MODE>mode == DImode
+    || <MODE>mode == SImode
+    || !TARGET_PARTIAL_REG_STALL
+    || optimize_function_for_size_p (cfun))
+   && ((unsigned HOST_WIDE_INT) INTVAL (operands[2])) - 1 < 3
+   && ((unsigned HOST_WIDE_INT) INTVAL (operands[3])
+       <= ((unsigned HOST_WIDE_INT) 1 << INTVAL (operands[2])))"
+  "#"
+  "&& reload_completed"
+  [(const_int 0)]
+{
+  rtx pat;
+  if (<MODE>mode != DImode)
+    operands[0] = gen_lowpart (SImode, operands[0]);
+  operands[1] = gen_lowpart (Pmode, operands[1]);
+  operands[2] = GEN_INT (1 << INTVAL (operands[2]));
+  pat = plus_constant (gen_rtx_MULT (Pmode, operands[1], operands[2]),
+                       INTVAL (operands[3]));
+  if (Pmode != SImode && <MODE>mode != DImode)
+    pat = gen_rtx_SUBREG (SImode, pat, 0);
+  emit_insn (gen_rtx_SET (VOIDmode, operands[0], pat));
+  DONE;
+}
+  [(set_attr "type" "lea")
+   (set (attr "mode")
+      (if_then_else (eq (symbol_ref "<MODE>mode == DImode") (const_int 0))
+        (const_string "SI")
+        (const_string "DI")))])
 
 ;; Subtract instructions
 
--- gcc/testsuite/gcc.target/i386/pr48688.c.jj 2011-04-20 14:55:37.000000000 +0200
+++ gcc/testsuite/gcc.target/i386/pr48688.c 2011-04-20 14:57:03.000000000 +0200
@@ -0,0 +1,24 @@
+/* PR target/48688 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+int fn1 (int x) { return (x << 3) | 5; }
+int fn2 (int x) { return (x * 8) | 5; }
+int fn3 (int x) { return (x << 3) + 5; }
+int fn4 (int x) { return (x * 8) + 5; }
+int fn5 (int x) { return (x << 3) ^ 5; }
+int fn6 (int x) { return (x * 8) ^ 5; }
+long fn7 (long x) { return (x << 3) | 5; }
+long fn8 (long x) { return (x * 8) | 5; }
+long fn9 (long x) { return (x << 3) + 5; }
+long fn10 (long x) { return (x * 8) + 5; }
+long fn11 (long x) { return (x << 3) ^ 5; }
+long fn12 (long x) { return (x * 8) ^ 5; }
+long fn13 (unsigned x) { return (x << 3) | 5; }
+long fn14 (unsigned x) { return (x * 8) | 5; }
+long fn15 (unsigned x) { return (x << 3) + 5; }
+long fn16 (unsigned x) { return (x * 8) + 5; }
+long fn17 (unsigned x) { return (x << 3) ^ 5; }
+long fn18 (unsigned x) { return (x * 8) ^ 5; }
+
+/* { dg-final { scan-assembler-not "\[ \t\]x?or\[bwlq\]\[ \t\]" } } */