Message ID | 000001d00546$33445430$99ccfc90$@arm.com |
---|---|
State | New |
Headers | show |
> The patch tries to use REG_EQUAL to get more precise info for nonzero_bits, > which helps to remove unnecessary zero_extend. > > Here is an example when compiling Coremark, we have rtx like, > > (insn 1244 386 388 47 (set (reg:SI 263 [ D.5767 ]) > (reg:SI 384 [ D.5767 ])) 786 {*thumb2_movsi_insn} > (expr_list:REG_EQUAL (zero_extend:SI (mem:QI (reg/v/f:SI 271 [ memblock > ]) [0 *memblock_13(D)+0 S1 A8])) > (nil))) > > from "reg:SI 384", we can only know it is a 32-bit value. But from > REG_EQUAL, we can know it is an 8-bit value. Then for the following rtx seq, > > (insn 409 407 410 50 (set (reg:SI 308) > (plus:SI (reg:SI 263 [ D.5767 ]) > (const_int -48 [0xffffffffffffffd0]))) core_state.c:170 4 > {*arm_addsi3} > (nil)) > (insn 410 409 411 50 (set (reg:SI 309) > (zero_extend:SI (subreg:QI (reg:SI 308) 0))) core_state.c:170 812 > {thumb2_zero_extendqisi2_v6} > (expr_list:REG_DEAD (reg:SI 308) > (nil))) > > the zero_extend for r309 can be optimized by combine pass. This sounds like a good idea. > 2014-11-21 Zhenqiang Chen <zhenqiang.chen@arm.com> > > * combine.c (set_nonzero_bits_and_sign_copies): Try REG_EQUAL note. > > diff --git a/gcc/combine.c b/gcc/combine.c > index 6a7d16b..68a883b 100644 > --- a/gcc/combine.c > +++ b/gcc/combine.c > @@ -1713,7 +1713,15 @@ set_nonzero_bits_and_sign_copies (rtx x, const_rtx > set, void *data) > > /* Don't call nonzero_bits if it cannot change anything. */ > if (rsp->nonzero_bits != ~(unsigned HOST_WIDE_INT) 0) > - rsp->nonzero_bits |= nonzero_bits (src, nonzero_bits_mode); > + { > + rtx reg_equal = insn ? find_reg_note (insn, REG_EQUAL, > NULL_RTX) > + : NULL_RTX; > + if (reg_equal) > + rsp->nonzero_bits |= nonzero_bits (XEXP (reg_equal, 0), > + nonzero_bits_mode); > + else > + rsp->nonzero_bits |= nonzero_bits (src, nonzero_bits_mode); > + } > num = num_sign_bit_copies (SET_SRC (set), GET_MODE (x)); > if (rsp->sign_bit_copies == 0 > > || rsp->sign_bit_copies > num) Use find_reg_equal_equiv_note instead. Are you sure that this won't yield inferior results in very peculiar cases? IOW, why not use both sources? Note that 'src' is massaged just above if SHORT_IMMEDIATES_SIGN_EXTEND is defined so we should probably do it for the note datum too, for example by factoring the code into a function and invoking it. Why not do the same for num_sign_bit_copies?
> -----Original Message----- > From: Eric Botcazou [mailto:ebotcazou@adacore.com] > Sent: Saturday, November 22, 2014 6:15 PM > To: Zhenqiang Chen > Cc: gcc-patches@gcc.gnu.org > Subject: Re: [PATCH, combine] Try REG_EQUAL for nonzero_bits > > > The patch tries to use REG_EQUAL to get more precise info for > > nonzero_bits, which helps to remove unnecessary zero_extend. > > > > Here is an example when compiling Coremark, we have rtx like, > > > > (insn 1244 386 388 47 (set (reg:SI 263 [ D.5767 ]) > > (reg:SI 384 [ D.5767 ])) 786 {*thumb2_movsi_insn} > > (expr_list:REG_EQUAL (zero_extend:SI (mem:QI (reg/v/f:SI 271 [ > > memblock > > ]) [0 *memblock_13(D)+0 S1 A8])) > > (nil))) > > > > from "reg:SI 384", we can only know it is a 32-bit value. But from > > REG_EQUAL, we can know it is an 8-bit value. Then for the following > > rtx seq, > > > > (insn 409 407 410 50 (set (reg:SI 308) > > (plus:SI (reg:SI 263 [ D.5767 ]) > > (const_int -48 [0xffffffffffffffd0]))) core_state.c:170 4 > > {*arm_addsi3} > > (nil)) > > (insn 410 409 411 50 (set (reg:SI 309) > > (zero_extend:SI (subreg:QI (reg:SI 308) 0))) core_state.c:170 > > 812 {thumb2_zero_extendqisi2_v6} > > (expr_list:REG_DEAD (reg:SI 308) > > (nil))) > > > > the zero_extend for r309 can be optimized by combine pass. > > This sounds like a good idea. > > > 2014-11-21 Zhenqiang Chen <zhenqiang.chen@arm.com> > > > > * combine.c (set_nonzero_bits_and_sign_copies): Try REG_EQUAL > note. > > > > diff --git a/gcc/combine.c b/gcc/combine.c index 6a7d16b..68a883b > > 100644 > > --- a/gcc/combine.c > > +++ b/gcc/combine.c > > @@ -1713,7 +1713,15 @@ set_nonzero_bits_and_sign_copies (rtx x, > > const_rtx set, void *data) > > > > /* Don't call nonzero_bits if it cannot change anything. */ > > if (rsp->nonzero_bits != ~(unsigned HOST_WIDE_INT) 0) > > - rsp->nonzero_bits |= nonzero_bits (src, nonzero_bits_mode); > > + { > > + rtx reg_equal = insn ? find_reg_note (insn, REG_EQUAL, > > NULL_RTX) > > + : NULL_RTX; > > + if (reg_equal) > > + rsp->nonzero_bits |= nonzero_bits (XEXP (reg_equal, 0), > > + nonzero_bits_mode); > > + else > > + rsp->nonzero_bits |= nonzero_bits (src, > nonzero_bits_mode); > > + } > > num = num_sign_bit_copies (SET_SRC (set), GET_MODE (x)); > > if (rsp->sign_bit_copies == 0 > > > > || rsp->sign_bit_copies > num) > > Use find_reg_equal_equiv_note instead. Are you sure that this won't yield > inferior results in very peculiar cases? IOW, why not use both sources? Thanks for the comments. I will compare the two nonzero_bits from src and REG_EQUAL. Then select the smaller one. > Note that 'src' is massaged just above if SHORT_IMMEDIATES_SIGN_EXTEND > is defined so we should probably do it for the note datum too, for example > by factoring the code into a function and invoking it. > > Why not do the same for num_sign_bit_copies? Do you know why it use " SET_SRC (set)" other than "src" for num_sign_bit_copies? If it is "src", I should do the same for num_sign_bit_copies with REG_EQUAL info. Thanks! -Zhenqiang
> Thanks for the comments. I will compare the two nonzero_bits from src and > REG_EQUAL. Then select the smaller one. They are masks so you can simply AND them before ORing the result. > Do you know why it use " SET_SRC (set)" other than "src" for > num_sign_bit_copies? > > If it is "src", I should do the same for num_sign_bit_copies with REG_EQUAL > info. Probably historical reasons, let's not try to change that now. You can apply the same treatment to num_sign_bit_copies (you will need a comparison here) while preserving the "src" vs "SET_SRC (set)" discrepancy.
diff --git a/gcc/combine.c b/gcc/combine.c index 6a7d16b..68a883b 100644 --- a/gcc/combine.c +++ b/gcc/combine.c @@ -1713,7 +1713,15 @@ set_nonzero_bits_and_sign_copies (rtx x, const_rtx set, void *data) /* Don't call nonzero_bits if it cannot change anything. */ if (rsp->nonzero_bits != ~(unsigned HOST_WIDE_INT) 0) - rsp->nonzero_bits |= nonzero_bits (src, nonzero_bits_mode); + { + rtx reg_equal = insn ? find_reg_note (insn, REG_EQUAL, NULL_RTX) + : NULL_RTX; + if (reg_equal) + rsp->nonzero_bits |= nonzero_bits (XEXP (reg_equal, 0), + nonzero_bits_mode); + else + rsp->nonzero_bits |= nonzero_bits (src, nonzero_bits_mode); + } num = num_sign_bit_copies (SET_SRC (set), GET_MODE (x));