Message ID | AANLkTinxr9MYzOtgiyPEBtkTu9tjm=NOussw3G8qrDek@mail.gmail.com |
---|---|
State | New |
Headers | show |
On Thu, 2010-12-16 at 14:45 -0800, Carrot Wei wrote: > Hi > > Compile the following c code with options -march=armv7-a -mthumb -Os > > int foo (int s) > { > return s == 1; > } > > GCC 4.6 generates: > > 00000000 <foo>: > 0: f1a0 0301 sub.w r3, r0, #1 // A > 4: 4258 negs r0, r3 > 6: eb40 0003 adc.w r0, r0, r3 // B > a: 4770 bx lr > > Notice that instructions A and B are 32 bits. In thumb2 we can use subs and adcs > instead so they will be 16 bits. > This sequence already contains an instruction that sets the flags as a necessary part of the sequence. Why doesn't it also generate flag-corrupting variants of the other two instructions when the registers selected are suitable? It seems silly to force the compiler to do yet more work to clean up this code. R. > The root cause is the following peephole2 generates 32 bit instructions: > > 8731 ;; Attempt to improve the sequence generated by the compare_scc splitters > 8732 ;; not to use conditional execution. > 8733 (define_peephole2 > 8734 [(set (reg:CC CC_REGNUM) > 8735 (compare:CC (match_operand:SI 1 "register_operand" "") > 8736 (match_operand:SI 2 "arm_rhs_operand" ""))) > 8737 (cond_exec (ne (reg:CC CC_REGNUM) (const_int 0)) > 8738 (set (match_operand:SI 0 "register_operand" "") > (const_int 0))) > 8739 (cond_exec (eq (reg:CC CC_REGNUM) (const_int 0)) > 8740 (set (match_dup 0) (const_int 1))) > 8741 (match_scratch:SI 3 "r")] > 8742 "TARGET_32BIT" > 8743 [(set (match_dup 3) (minus:SI (match_dup 1) (match_dup 2))) > 8744 (parallel > 8745 [(set (reg:CC CC_REGNUM) > 8746 (compare:CC (const_int 0) (match_dup 3))) > 8747 (set (match_dup 0) (minus:SI (const_int 0) (match_dup 3)))]) > 8748 (set (match_dup 0) > 8749 (plus:SI (plus:SI (match_dup 0) (match_dup 3)) > 8750 (geu:SI (reg:CC CC_REGNUM) (const_int 0))))]) > 8751 > > This patch modifies the peephole2 to generate 16 bit instructions, and passed > regression test on arm qemu. > > thanks > Guozhi > > > ChangeLog: > 2010-12-16 Wei Guozhi <carrot@google.com> > > PR target/46975 > * config/arm/arm.md (*addsi3_carryin_compare0_<optab>): New pattern. > (peephole2 for conditional move): Generate 16 bit instructions. > > ChangeLog: > 2010-12-16 Wei Guozhi <carrot@google.com> > > PR target/46975 > * gcc.target/arm/pr46975.c: New testcase. > > > Index: arm.md > =================================================================== > --- arm.md (revision 167861) > +++ arm.md (working copy) > @@ -974,6 +974,21 @@ > (const_string "alu_shift_reg")))] > ) > > +(define_insn "*addsi3_carryin_compare0_<optab>" > + [(set (reg:CC CC_REGNUM) > + (compare:CC > + (plus:SI (plus:SI (match_operand:SI 1 "s_register_operand" "%r") > + (match_operand:SI 2 "arm_rhs_operand" "rI")) > + (LTUGEU:SI (reg:<cnb> CC_REGNUM) (const_int 0))) > + (const_int 0))) > + (set (match_operand:SI 0 "s_register_operand" "=r") > + (plus:SI (plus:SI (match_dup 1) (match_dup 2)) > + (LTUGEU:SI (reg:<cnb> CC_REGNUM) (const_int 0))))] > + "TARGET_32BIT" > + "adc%.\\t%0, %1, %2" > + [(set_attr "conds" "set")] > + ) > + > (define_expand "incscc" > [(set (match_operand:SI 0 "s_register_operand" "=r,r") > (plus:SI (match_operator:SI 2 "arm_comparison_operator" > @@ -8748,14 +8763,22 @@ > (set (match_dup 0) (const_int 1))) > (match_scratch:SI 3 "r")] > "TARGET_32BIT" > - [(set (match_dup 3) (minus:SI (match_dup 1) (match_dup 2))) > + [(parallel > + [(set (reg:CC CC_REGNUM) > + (compare:CC (match_dup 1) (match_dup 2))) > + (set (match_dup 3) (minus:SI (match_dup 1) (match_dup 2)))]) > (parallel > [(set (reg:CC CC_REGNUM) > (compare:CC (const_int 0) (match_dup 3))) > (set (match_dup 0) (minus:SI (const_int 0) (match_dup 3)))]) > - (set (match_dup 0) > - (plus:SI (plus:SI (match_dup 0) (match_dup 3)) > - (geu:SI (reg:CC CC_REGNUM) (const_int 0))))]) > + (parallel > + [(set (reg:CC CC_REGNUM) > + (compare:CC (plus:SI (plus:SI (match_dup 0) (match_dup 3)) > + (geu:SI (reg:CC CC_REGNUM) (const_int 0))) > + (const_int 0))) > + (set (match_dup 0) > + (plus:SI (plus:SI (match_dup 0) (match_dup 3)) > + (geu:SI (reg:CC CC_REGNUM) (const_int 0))))])]) > > (define_insn "*cond_move" > [(set (match_operand:SI 0 "s_register_operand" "=r,r,r") > > > Index: pr46975.c > =================================================================== > --- pr46975.c (revision 0) > +++ pr46975.c (revision 0) > @@ -0,0 +1,9 @@ > +/* { dg-options "-mthumb -Os" } */ > +/* { dg-require-effective-target arm_thumb2_ok } */ > +/* { dg-final { scan-assembler "subs" } } */ > +/* { dg-final { scan-assembler "adcs" } } */ > + > +int foo (int s) > +{ > + return s == 1; > +}
On Fri, Dec 17, 2010 at 4:18 AM, Richard Earnshaw <rearnsha@arm.com> wrote: > > On Thu, 2010-12-16 at 14:45 -0800, Carrot Wei wrote: >> Hi >> >> Compile the following c code with options -march=armv7-a -mthumb -Os >> >> int foo (int s) >> { >> return s == 1; >> } >> >> GCC 4.6 generates: >> >> 00000000 <foo>: >> 0: f1a0 0301 sub.w r3, r0, #1 // A >> 4: 4258 negs r0, r3 >> 6: eb40 0003 adc.w r0, r0, r3 // B >> a: 4770 bx lr >> >> Notice that instructions A and B are 32 bits. In thumb2 we can use subs and adcs >> instead so they will be 16 bits. >> > > This sequence already contains an instruction that sets the flags as a > necessary part of the sequence. Why doesn't it also generate > flag-corrupting variants of the other two instructions when the That is the root cause of this problem. The old pattern simply didn't generate the flag setting version. > registers selected are suitable? It seems silly to force the compiler > to do yet more work to clean up this code. > This patch doesn't force the compiler to do more work. It directly modifies the original peephole2 pattern to generate flag setting instructions as you suggested. And the new insn pattern is for instruction adcs which didn't exist previously. thanks Guozhi
Ping On Sat, Dec 18, 2010 at 3:30 AM, Carrot Wei <carrot@google.com> wrote: > > On Fri, Dec 17, 2010 at 4:18 AM, Richard Earnshaw <rearnsha@arm.com> wrote: > > > > On Thu, 2010-12-16 at 14:45 -0800, Carrot Wei wrote: > >> Hi > >> > >> Compile the following c code with options -march=armv7-a -mthumb -Os > >> > >> int foo (int s) > >> { > >> return s == 1; > >> } > >> > >> GCC 4.6 generates: > >> > >> 00000000 <foo>: > >> 0: f1a0 0301 sub.w r3, r0, #1 // A > >> 4: 4258 negs r0, r3 > >> 6: eb40 0003 adc.w r0, r0, r3 // B > >> a: 4770 bx lr > >> > >> Notice that instructions A and B are 32 bits. In thumb2 we can use subs and adcs > >> instead so they will be 16 bits. > >> > > > > This sequence already contains an instruction that sets the flags as a > > necessary part of the sequence. Why doesn't it also generate > > flag-corrupting variants of the other two instructions when the > > That is the root cause of this problem. The old pattern simply didn't > generate the flag setting version. > > > registers selected are suitable? It seems silly to force the compiler > > to do yet more work to clean up this code. > > > > This patch doesn't force the compiler to do more work. It directly > modifies the original peephole2 pattern to generate flag setting > instructions as you suggested. And the new insn pattern is for > instruction adcs which didn't exist previously. > > thanks > Guozhi
Ping On Mon, Jan 10, 2011 at 12:30 PM, Carrot Wei <carrot@google.com> wrote: > Ping > > On Sat, Dec 18, 2010 at 3:30 AM, Carrot Wei <carrot@google.com> wrote: >> >> On Fri, Dec 17, 2010 at 4:18 AM, Richard Earnshaw <rearnsha@arm.com> wrote: >> > >> > On Thu, 2010-12-16 at 14:45 -0800, Carrot Wei wrote: >> >> Hi >> >> >> >> Compile the following c code with options -march=armv7-a -mthumb -Os >> >> >> >> int foo (int s) >> >> { >> >> return s == 1; >> >> } >> >> >> >> GCC 4.6 generates: >> >> >> >> 00000000 <foo>: >> >> 0: f1a0 0301 sub.w r3, r0, #1 // A >> >> 4: 4258 negs r0, r3 >> >> 6: eb40 0003 adc.w r0, r0, r3 // B >> >> a: 4770 bx lr >> >> >> >> Notice that instructions A and B are 32 bits. In thumb2 we can use subs and adcs >> >> instead so they will be 16 bits. >> >> >> > >> > This sequence already contains an instruction that sets the flags as a >> > necessary part of the sequence. Why doesn't it also generate >> > flag-corrupting variants of the other two instructions when the >> >> That is the root cause of this problem. The old pattern simply didn't >> generate the flag setting version. >> >> > registers selected are suitable? It seems silly to force the compiler >> > to do yet more work to clean up this code. >> > >> >> This patch doesn't force the compiler to do more work. It directly >> modifies the original peephole2 pattern to generate flag setting >> instructions as you suggested. And the new insn pattern is for >> instruction adcs which didn't exist previously. >> >> thanks >> Guozhi >
Ping On Sat, Dec 18, 2010 at 3:30 AM, Carrot Wei <carrot@google.com> wrote: > On Fri, Dec 17, 2010 at 4:18 AM, Richard Earnshaw <rearnsha@arm.com> wrote: >> >> On Thu, 2010-12-16 at 14:45 -0800, Carrot Wei wrote: >>> Hi >>> >>> Compile the following c code with options -march=armv7-a -mthumb -Os >>> >>> int foo (int s) >>> { >>> return s == 1; >>> } >>> >>> GCC 4.6 generates: >>> >>> 00000000 <foo>: >>> 0: f1a0 0301 sub.w r3, r0, #1 // A >>> 4: 4258 negs r0, r3 >>> 6: eb40 0003 adc.w r0, r0, r3 // B >>> a: 4770 bx lr >>> >>> Notice that instructions A and B are 32 bits. In thumb2 we can use subs and adcs >>> instead so they will be 16 bits. >>> >> >> This sequence already contains an instruction that sets the flags as a >> necessary part of the sequence. Why doesn't it also generate >> flag-corrupting variants of the other two instructions when the > > That is the root cause of this problem. The old pattern simply didn't > generate the flag setting version. > >> registers selected are suitable? It seems silly to force the compiler >> to do yet more work to clean up this code. >> > > This patch doesn't force the compiler to do more work. It directly > modifies the original peephole2 pattern to generate flag setting > instructions as you suggested. And the new insn pattern is for > instruction adcs which didn't exist previously. > > thanks > Guozhi >
Index: arm.md =================================================================== --- arm.md (revision 167861) +++ arm.md (working copy) @@ -974,6 +974,21 @@ (const_string "alu_shift_reg")))] ) +(define_insn "*addsi3_carryin_compare0_<optab>" + [(set (reg:CC CC_REGNUM) + (compare:CC + (plus:SI (plus:SI (match_operand:SI 1 "s_register_operand" "%r") + (match_operand:SI 2 "arm_rhs_operand" "rI")) + (LTUGEU:SI (reg:<cnb> CC_REGNUM) (const_int 0))) + (const_int 0))) + (set (match_operand:SI 0 "s_register_operand" "=r") + (plus:SI (plus:SI (match_dup 1) (match_dup 2)) + (LTUGEU:SI (reg:<cnb> CC_REGNUM) (const_int 0))))] + "TARGET_32BIT" + "adc%.\\t%0, %1, %2" + [(set_attr "conds" "set")] + ) + (define_expand "incscc" [(set (match_operand:SI 0 "s_register_operand" "=r,r") (plus:SI (match_operator:SI 2 "arm_comparison_operator" @@ -8748,14 +8763,22 @@ (set (match_dup 0) (const_int 1))) (match_scratch:SI 3 "r")] "TARGET_32BIT" - [(set (match_dup 3) (minus:SI (match_dup 1) (match_dup 2))) + [(parallel + [(set (reg:CC CC_REGNUM) + (compare:CC (match_dup 1) (match_dup 2))) + (set (match_dup 3) (minus:SI (match_dup 1) (match_dup 2)))]) (parallel [(set (reg:CC CC_REGNUM) (compare:CC (const_int 0) (match_dup 3))) (set (match_dup 0) (minus:SI (const_int 0) (match_dup 3)))]) - (set (match_dup 0) - (plus:SI (plus:SI (match_dup 0) (match_dup 3)) - (geu:SI (reg:CC CC_REGNUM) (const_int 0))))]) + (parallel + [(set (reg:CC CC_REGNUM) + (compare:CC (plus:SI (plus:SI (match_dup 0) (match_dup 3)) + (geu:SI (reg:CC CC_REGNUM) (const_int 0))) + (const_int 0))) + (set (match_dup 0) + (plus:SI (plus:SI (match_dup 0) (match_dup 3)) + (geu:SI (reg:CC CC_REGNUM) (const_int 0))))])]) (define_insn "*cond_move" [(set (match_operand:SI 0 "s_register_operand" "=r,r,r") Index: pr46975.c =================================================================== --- pr46975.c (revision 0) +++ pr46975.c (revision 0) @@ -0,0 +1,9 @@ +/* { dg-options "-mthumb -Os" } */ +/* { dg-require-effective-target arm_thumb2_ok } */ +/* { dg-final { scan-assembler "subs" } } */ +/* { dg-final { scan-assembler "adcs" } } */ + +int foo (int s) +{ + return s == 1; +}