Message ID | AANLkTik3nNlcrZ9AotlSCO8jB_Ox7iH70mZOHuQyZehb@mail.gmail.com |
---|---|
State | New |
Headers | show |
On 06/24/2010 12:01 AM, H.J. Lu wrote: > On Wed, Jun 23, 2010 at 2:22 PM, H.J. Lu<hjl.tools@gmail.com> wrote: >> On Wed, Jun 23, 2010 at 12:50 PM, H.J. Lu<hjl.tools@gmail.com> wrote: >>> On Wed, Jun 23, 2010 at 12:36 PM, Paolo Bonzini<bonzini@gnu.org> wrote: >>>> On 06/23/2010 09:35 PM, H.J. Lu wrote: >>>>> >>>>> On Wed, Jun 23, 2010 at 12:16 PM, Paolo Bonzini<bonzini@gnu.org> wrote: >>>>>> >>>>>> On 06/23/2010 08:24 PM, H.J. Lu wrote: >>>>>>> >>>>>>> [(set (match_operand:HI 0 "register_operand" "=a") >>>>>>> (div:HI >>>>>>> (match_operand:HI 1 "register_operand" "0") >>>>>>> (match_operand:QI 2 "nonimmediate_operand" "qm"))) >>>>>>> (clobber (reg:CC FLAGS_REG))] >>>>>> >>>>>> Maybe this: >>>>>> >>>>>> [(set (match_operand:QI 0 "register_operand" "=a") >>>>>> (subreg:QI >>>>>> (div:HI >>>>>> (match_operand:HI 1 "register_operand" "0") >>>>>> (match_operand:QI 2 "nonimmediate_operand" "qm")) 0)) >>>>>> (clobber (reg:CC FLAGS_REG))] >>>>>> >>>>> >>>>> It doesn't make a big difference since only lower 8bit >>>>> is set by it. >>>> >>>> For full divmod you have to shift mod and ior of course. I understood that >>>> you were talking of *<u>divqi3. >>>> >>> >>> I can't do shift/ior on QI output from *<u>divqi3. I have >>> >> Here is a different approach. It uses UNSPEC for 8bit divmod. >> The generated code is the same. > > An updated patch. No need for *movqi_extzv_3 and > *movqi_extzv_3_rex64. I don't understand exactly what the problem was; clearly it couldn't work as long as this pattern was cheating blatantly: (define_insn "*<u>divqi3" [(set (match_operand:QI 0 "register_operand" "=a") (any_div:QI (match_operand:HI 1 "register_operand" "0") (match_operand:QI 2 "nonimmediate_operand" "qm"))) Anyway this patch is IMO much nicer than the first ones, so if Uros is okay I don't think it's useful to pursue a more accurate representation. Just make sure that REG_EQUAL notes are added to the two extractions. Paolo
On Wed, Jun 23, 2010 at 3:39 PM, Paolo Bonzini <bonzini@gnu.org> wrote: > On 06/24/2010 12:01 AM, H.J. Lu wrote: >> >> On Wed, Jun 23, 2010 at 2:22 PM, H.J. Lu<hjl.tools@gmail.com> wrote: >>> >>> On Wed, Jun 23, 2010 at 12:50 PM, H.J. Lu<hjl.tools@gmail.com> wrote: >>>> >>>> On Wed, Jun 23, 2010 at 12:36 PM, Paolo Bonzini<bonzini@gnu.org> wrote: >>>>> >>>>> On 06/23/2010 09:35 PM, H.J. Lu wrote: >>>>>> >>>>>> On Wed, Jun 23, 2010 at 12:16 PM, Paolo Bonzini<bonzini@gnu.org> >>>>>> wrote: >>>>>>> >>>>>>> On 06/23/2010 08:24 PM, H.J. Lu wrote: >>>>>>>> >>>>>>>> [(set (match_operand:HI 0 "register_operand" "=a") >>>>>>>> (div:HI >>>>>>>> (match_operand:HI 1 "register_operand" "0") >>>>>>>> (match_operand:QI 2 "nonimmediate_operand" "qm"))) >>>>>>>> (clobber (reg:CC FLAGS_REG))] >>>>>>> >>>>>>> Maybe this: >>>>>>> >>>>>>> [(set (match_operand:QI 0 "register_operand" "=a") >>>>>>> (subreg:QI >>>>>>> (div:HI >>>>>>> (match_operand:HI 1 "register_operand" "0") >>>>>>> (match_operand:QI 2 "nonimmediate_operand" "qm")) 0)) >>>>>>> (clobber (reg:CC FLAGS_REG))] >>>>>>> >>>>>> >>>>>> It doesn't make a big difference since only lower 8bit >>>>>> is set by it. >>>>> >>>>> For full divmod you have to shift mod and ior of course. I understood >>>>> that >>>>> you were talking of *<u>divqi3. >>>>> >>>> >>>> I can't do shift/ior on QI output from *<u>divqi3. I have >>>> >>> Here is a different approach. It uses UNSPEC for 8bit divmod. >>> The generated code is the same. >> >> An updated patch. No need for *movqi_extzv_3 and >> *movqi_extzv_3_rex64. > > I don't understand exactly what the problem was; clearly it couldn't work as > long as this pattern was cheating blatantly: Upper 8bit registers aren't real registers. You can't do RA with them. > (define_insn "*<u>divqi3" > [(set (match_operand:QI 0 "register_operand" "=a") > (any_div:QI > (match_operand:HI 1 "register_operand" "0") > (match_operand:QI 2 "nonimmediate_operand" "qm"))) > > Anyway this patch is IMO much nicer than the first ones, so if Uros is okay > I don't think it's useful to pursue a more accurate representation. Just > make sure that REG_EQUAL notes are added to the two extractions. > What are REG_EQUAL notes used for? As far as the rest of gcc is concerned, there are no upper 8bit registers. But you can access bits 8-15 of HI, SI, DI registers via XXX_extract.
On 06/24/2010 01:13 AM, H.J. Lu wrote: >> I don't understand exactly what the problem was; clearly it couldn't work as >> long as this pattern was cheating blatantly: > > Upper 8bit registers aren't real registers. You can't > do RA with them. Please read what other people write. I'm saying that you should have used a HI destination just like you did with your new UNSPEC, which is acceptable. The pattern I quoted below used a QI destination; then magically you attempted to extract bit 8..15 of it with an unspec, or something like that. For anything like that the optimizers are going to bite back sooner or later. And you'd deserve it. >> (define_insn "*<u>divqi3" >> [(set (match_operand:QI 0 "register_operand" "=a") >> (any_div:QI >> (match_operand:HI 1 "register_operand" "0") >> (match_operand:QI 2 "nonimmediate_operand" "qm"))) >> >> Anyway this patch is IMO much nicer than the first ones, so if Uros is okay >> I don't think it's useful to pursue a more accurate representation. Just >> make sure that REG_EQUAL notes are added to the two extractions. >> > > What are REG_EQUAL notes used for? > As far as the rest of gcc is concerned, there are no upper 8bit > registers. But you can access bits 8-15 of HI, SI, > DI registers via XXX_extract. Understood. Using an unspec is fine for me, even though it's not an approval. But that's completely orthogonal to putting a REG_EQUAL note _on the two regs that you extract out of AX_. The notes' value should be one of (subreg:QI (udiv:HI (...) (zero_extend (...)))) (subreg:QI (div:HI (...) (sign_extend (...)))) (subreg:QI (umod:HI (...) (zero_extend (...)))) (subreg:QI (mod:HI (...) (sign_extend (...)))) But I think that with over ten years of GCC experience you do not need anyone to tell you this. In fact, regarding "what are REG_EQUAL notes used for?" my first thought was RTFM, but anyway: they are used by CSE, fwprop, GCSE and combine to simplify and eliminate redundant expressions. Paolo
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index ab90d73..68c9e47 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -104,6 +104,8 @@ UNSPEC_REP UNSPEC_LD_MPIC ; load_macho_picbase UNSPEC_TRUNC_NOOP + UNSPEC_DIVQI + UNSPEC_UDIVQI ;; For SSE/MMX support: UNSPEC_FIX_NOTRUNC @@ -760,6 +762,10 @@ ;; Used in signed and unsigned divisions. (define_code_iterator any_div [div udiv]) +(define_code_attr extend_code + [(div "SIGN_EXTEND") (udiv "ZERO_EXTEND")]) +(define_code_attr extract_code + [(div "SIGN_EXTRACT") (udiv "ZERO_EXTRACT")]) ;; Instruction prefix for signed and unsigned operations. (define_code_attr sgnprefix [(sign_extend "i") (zero_extend "") @@ -7305,17 +7311,6 @@ ;; Divide instructions -(define_insn "<u>divqi3" - [(set (match_operand:QI 0 "register_operand" "=a") - (any_div:QI - (match_operand:HI 1 "register_operand" "0") - (match_operand:QI 2 "nonimmediate_operand" "qm"))) - (clobber (reg:CC FLAGS_REG))] - "TARGET_QIMODE_MATH" - "<sgnprefix>div{b}\t%2" - [(set_attr "type" "idiv") - (set_attr "mode" "QI")]) - ;; The patterns that match these are at the end of this file. (define_expand "divxf3" @@ -7352,6 +7347,83 @@ ;; Divmod instructions. +(define_expand "<u>divmodqi4" + [(parallel [(set (match_operand:QI 0 "register_operand" "") + (any_div:QI + (match_operand:QI 1 "register_operand" "") + (match_operand:QI 2 "nonimmediate_operand" ""))) + (set (match_operand:QI 3 "register_operand" "") + (mod:QI (match_dup 1) (match_dup 2))) + (clobber (reg:CC FLAGS_REG))])] + "TARGET_QIMODE_MATH" +{ + rtx div, clobber, set; + rtx operand0, operand1; + + operand0 = gen_reg_rtx (HImode); + operand1 = gen_reg_rtx (HImode); + clobber = gen_rtx_CLOBBER (VOIDmode, gen_rtx_REG (CCmode, FLAGS_REG)); + + /* Properly extend operands[1] to HImode. */ + set = gen_rtx_SET (VOIDmode, operand1, + gen_rtx_<extend_code> (HImode, operands[1])); + if (<extend_code> == SIGN_EXTEND) + emit_insn (set); + else + emit_insn (gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, set, clobber))); + + /* Generate 8bit divide. Result is in AX. */ + div = gen_rtx_UNSPEC (HImode, gen_rtvec (2, operand1, operands[2]), + (<extend_code> == SIGN_EXTEND + ? UNSPEC_DIVQI : UNSPEC_UDIVQI)); + div = gen_rtx_SET (VOIDmode, operand0, div); + emit_insn (gen_rtx_PARALLEL (VOIDmode, gen_rtvec (2, div, clobber))); + + /* Extract remainder from AH. */ + if (<extend_code> == SIGN_EXTEND) + operand1 = gen_rtx_<extract_code> (QImode, operand0, + GEN_INT (8), GEN_INT (8)); + else + { + operand1 = gen_rtx_<extract_code> (SImode, operand0, + GEN_INT (8), GEN_INT (8)); + operand1 = simplify_gen_subreg (QImode, operand1, SImode, 0); + } + emit_move_insn (operands[3], operand1); + + /* Extract quotient from AL. */ + operand1 = simplify_gen_subreg (QImode, operand0, HImode, 0); + emit_move_insn (operands[0], operand1); + DONE; +}) + +;; Divide AX by r/m8, with result stored in +;; AL <- Quotient +;; AH <- Remainder +(define_insn "*divqi3" + [(set (match_operand:HI 0 "register_operand" "=a") + (unspec:HI + [(match_operand:HI 1 "register_operand" "0") + (match_operand:QI 2 "nonimmediate_operand" "qm")] + UNSPEC_DIVQI)) + (clobber (reg:CC FLAGS_REG))] + "TARGET_QIMODE_MATH" + "idiv{b}\t%2" + [(set_attr "type" "idiv") + (set_attr "mode" "QI")]) + +(define_insn "*udivqi3" + [(set (match_operand:HI 0 "register_operand" "=a") + (unspec:HI + [(match_operand:HI 1 "register_operand" "0") + (match_operand:QI 2 "nonimmediate_operand" "qm")] + UNSPEC_UDIVQI)) + (clobber (reg:CC FLAGS_REG))] + "TARGET_QIMODE_MATH" + "div{b}\t%2" + [(set_attr "type" "idiv") + (set_attr "mode" "QI")]) + (define_expand "divmod<mode>4" [(parallel [(set (match_operand:SWIM248 0 "register_operand" "") (div:SWIM248 --- /dev/null 2010-06-16 10:11:06.602750711 -0700 +++ gcc/gcc/testsuite/gcc.target/i386/umod-1.c 2010-06-21 12:01:25.705950180 -0700 @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mtune=atom" } */ + +unsigned char +foo (unsigned char x, unsigned char y) +{ + return x % y; +} + +/* { dg-final { scan-assembler-times "divb" 1 } } */ +/* { dg-final { scan-assembler-not "divw" } } */ --- /dev/null 2010-06-16 10:11:06.602750711 -0700 +++ gcc/gcc/testsuite/gcc.target/i386/umod-2.c 2010-06-21 12:01:17.857932744 -0700 @@ -0,0 +1,14 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mtune=atom" } */ + +extern unsigned char z; + +unsigned char +foo (unsigned char x, unsigned char y) +{ + z = x/y; + return x % y; +} + +/* { dg-final { scan-assembler-times "divb" 1 } } */ +/* { dg-final { scan-assembler-not "divw" } } */ --- /dev/null 2010-06-16 10:11:06.602750711 -0700 +++ gcc/gcc/testsuite/gcc.target/i386/umod-3.c 2010-06-21 12:02:36.809962702 -0700 @@ -0,0 +1,21 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -mtune=atom" } */ + +extern void abort (void); +extern void exit (int); + +unsigned char cx = 7; + +int +main () +{ + unsigned char cy; + + cy = cx / 6; if (cy != 1) abort (); + cy = cx % 6; if (cy != 1) abort (); + + exit(0); +} + +/* { dg-final { scan-assembler-times "divb" 1 } } */ +/* { dg-final { scan-assembler-not "divw" } } */