Fix PR 61225

On 17 July 2014 11:10, Jeff Law <law@redhat.com> wrote:
> On 05/22/14 03:52, Zhenqiang Chen wrote:
>>
>> On 21 May 2014 20:43, Steven Bosscher <stevenb.gcc@gmail.com> wrote:
>>>
>>> On Wed, May 21, 2014 at 11:58 AM, Zhenqiang Chen wrote:
>>>>
>>>> Hi,
>>>>
>>>> The patch fixes the gcc.target/i386/pr49095.c FAIL in PR61225. The
>>>> test case tends to check a peephole2 optimization, which optimizes the
>>>> following sequence
>>>>
>>>>      2: bx:SI=ax:SI
>>>>      25: ax:SI=[bx:SI]
>>>>      7: {ax:SI=ax:SI-0x1;clobber flags:CC;}
>>>>      8: [bx:SI]=ax:SI
>>>>      9: flags:CCZ=cmp(ax:SI,0)
>>>> to
>>>>     2: bx:SI=ax:SI
>>>>     41: {flags:CCZ=cmp([bx:SI]-0x1,0);[bx:SI]=[bx:SI]-0x1;}
>>>>
>>>> The enhanced shrink-wrapping, which calls copyprop_hardreg_forward
>>>> changes the INSN 25 to
>>>>
>>>>      25: ax:SI=[ax:SI]
>>>>
>>>> Then peephole2 can not optimize it since two memory_operands look like
>>>> different.
>>>>
>>>> To fix it, the patch adds another peephole2 rule to read one more
>>>> insn. From the register copy, it knows the address is the same.
>>>
>>>
>>> That is one complex peephole2 to deal with a transformation like this.
>>> It seems to be like it's a too specific solution for a bigger problem.
>>>
>>> Could you please try one of the following solutions instead:
>>>
>>> 1. Track register values for peephole2 and try different alternatives
>>> based on known register equivalences? E.g. in your example, perhaps
>>> there is already a REG_EQUAL/REG_EQUIV note available on insn 25 after
>>> copyprop_hardreg_forward, to annotate that [ax:SI] is equivalent to
>>> [bx:SI] at that point (or if that information is not available, it is
>>> not very difficult to make it available). Then you could try applying
>>> peephole2 on the original pattern but also on patterns modified with
>>> the known equivalences (i.e. try peephole2 on multiple equivalent
>>> patterns for the same insn). This may expose other peephole2
>>> opportunities, not just the specific one your patch addresses.
>>
>>
>> Patch is updated according to the comment. There is no REG_EQUAL. So I
>> add it when replace_oldest_value_reg.
>>
>> ChangeLog:
>> 2014-05-22  Zhenqiang Chen  <zhenqiang.chen@linaro.org>
>>
>>          Part of PR rtl-optimization/61225
>>          * config/i386/i386-protos.h (ix86_peephole2_rtx_equal_p): New
>> proto.
>>          * config/i386/i386.c (ix86_peephole2_rtx_equal_p): New function.
>>          * regcprop.c (replace_oldest_value_reg): Add REG_EQUAL note when
>>          propagating to SET.
>
> I can't help but wonder why the new 4 insn combination code isn't presenting
> this as a nice big fat insn to the x86 backend which would eliminate the
> need for the peep2.
>
> But, assuming there's a fundamental reason why that's not kicking in...

Current combine pass can only handle

I0 -> I1 -> I2 -> I3.
I0, I1 -> I2, I2 -> I3.
I0 -> I2; I1, I2 -> I3.
I0 -> I1; I1, I2 -> I3.

For the case, it is
I1 -> I2 -> I3; I2 -> INSN

I3 and INSN looks like not related. But INSN is a COMPARE to set CC
and I3 can also set CC. I3 and INSN can be combined together as one
instruction to set CC.

The following patch enhances combine pass to handle the case.

A new parameter is added for try_combine to accept INSN as
TO_COMBINED_INSN. It reuses the 3-insn combine method to combine I1 ->
I2 -> I3. If there is TO_COMBINED_INSN, combine I2 ->
TO_COMBINED_INSN. Then create an new insn "parallel (TO_COMBINED_INSN,
I3)". refer_same_reg_p is some check to make sure the change is safe.

Bootstrap and no make check regression on X86-64 and i686.

X86-64 bootstrap logs show 358 cases were combined by the patch.

Ok for trunk?

Thanks!
-Zhenqiang

ChangeLog
2014-08-04  Zhenqiang Chen  <zhenqiang.chen@linaro.org>

        Part of PR rtl-optimization/61225
        * combine.c (refer_same_reg_p): New function.
        (combine_instructions): Handle I1 -> I2 -> I3; I2 -> insn.
        (try_combine): Add one more parameter TO_COMBINED_INSN, which is
        used to create a new insn parallel (TO_COMBINED_INSN, I3).

testsuite/ChangeLog:
2014-08-04  Zhenqiang Chen  <zhenqiang.chen@linaro.org>

        * gcc.target/i386/pr61225.c: New test.

+/* { dg-final { cleanup-rtl-dump "combine" } } */

> In replace_oldest_value_reg, why not use reg_overlap_mentioned_p to
> determine if the REGNO of NEW_RTX is modified by INSN?  I'd look to avoid
> some of those calls to single_set (insn).  Just call it once and reuse the
> value.
>
> Shouldn't you be ensuring the REG_EQUAL note is unique?  I think we have a
> routine to avoid creating a note that already exists.
>
> Don't you have to ensure that the value in the REG_EQUAL note has not
> changed?  A REG_EQUAL note denotes an equivalence that holds at the single
> insn where it appears.  If you want to use the value elsewhere you'd have to
> ensure the value hasn't been changed.  If RTX referred to by the REG_EQUAL
> note is a MEM, this can be relatively difficult due to aliasing issues.
>
> Jeff
>
>
>
>

Fix PR 61225

Commit Message

Comments

Patch