diff mbox

RFA: patch to fix a bad code generation for PR64110 -- new constraints addition

Message ID 54B7015B.4030008@redhat.com
State New
Headers show

Commit Message

Vladimir Makarov Jan. 14, 2015, 11:52 p.m. UTC
The problem of unexpected code generation is discussed on

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64110

   The following patch introduces 2 new constraints '^' and '$'  which 
are analogous to '?' and '!' but disfavor given alternative when *the 
operand with the new constraint* needs a reload ('?' and '!' disfavor 
the alternative if *any* operand needs a reload).  I hope the new 
constraints will be useful for other insns and targets.

   The patch was successfully bootstrapped and tested on x86-64.

   I just need an approval for changes in sse.md, stmt.c, and genoutput.c

Thanks.

2015-01-14  Vladimir Makarov  <vmakarov@redhat.com>

         PR rtl-optimization/64110
         * stmt.c (parse_output_constraint): Process '^' and '$'.
         (parse_input_constraint): Ditto.
         * lra-constraints.c (process_alt_operands): Process the new
         constraints.
         * ira-costs.c (record_reg_classes): Process the new constraint
         '^'.
         * genoutput.c (indep_constraints): Add '^' and '$'.
         * config/i386/sse.md (*vec_dup<mode>): Use '$' instead of '!'.
         * doc/md.texi: Add description of the new constraints.

2015-01-14  Vladimir Makarov  <vmakarov@redhat.com>

         PR rtl-optimization/64110
         * gcc.target/i386/pr64110.c: Add scan-assembler.

Comments

Jeff Law Jan. 15, 2015, 4:43 a.m. UTC | #1
On 01/14/15 16:52, Vladimir Makarov wrote:
>    The problem of unexpected code generation is discussed on
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64110
>
>    The following patch introduces 2 new constraints '^' and '$'  which
> are analogous to '?' and '!' but disfavor given alternative when *the
> operand with the new constraint* needs a reload ('?' and '!' disfavor
> the alternative if *any* operand needs a reload).  I hope the new
> constraints will be useful for other insns and targets.
Right.  This gives us finer grained control over when to disparage an 
alternative.

Reloading some of the operands in an alternative may not be a big deal, 
but there may be other operands in the alternative that if a reload was 
needed for that operand would be so bad that we'd want to reject the 
entire alternative.

The example I had in mind when I read Vlad's analysis in the BZ were the 
old movb and addb patterns on the PA.  Basically we have some side 
effect like addition/subtraction/register copy along with a conditional 
jump.

(define_insn ""
   [(set (pc)
         (if_then_else
           (match_operator 2 "movb_comparison_operator"
            [(match_operand:SI 1 "register_operand" "r,r,r,r") 
(const_int 0)])
           (label_ref (match_operand 3 "" ""))
           (pc)))
    (set (match_operand:SI 0 "reg_before_reload_operand" "=!r,!*f,*Q,!*q")
         (match_dup 1))]

Needing a reload for operand 1 really isn't a big deal here, but 
reloading operand 0 is a disaster.  This would be a good place to use 
the new constraint modifiers.

I can distinctly recall running into similar issues on other ports 
through the years.  I wouldn't be at all surprised if a notable 
percentage of the "!" and "?"s that appear in our machine descriptions 
would be better off as "^" and "$".


>
> 2015-01-14  Vladimir Makarov <vmakarov@redhat.com>
>
>          PR rtl-optimization/64110
>          * stmt.c (parse_output_constraint): Process '^' and '$'.
>          (parse_input_constraint): Ditto.
>          * lra-constraints.c (process_alt_operands): Process the new
>          constraints.
>          * ira-costs.c (record_reg_classes): Process the new constraint
>          '^'.
>          * genoutput.c (indep_constraints): Add '^' and '$'.
>          * config/i386/sse.md (*vec_dup<mode>): Use '$' instead of '!'.
>          * doc/md.texi: Add description of the new constraints.
>
> 2015-01-14  Vladimir Makarov <vmakarov@redhat.com>
>
>          PR rtl-optimization/64110
>          * gcc.target/i386/pr64110.c: Add scan-assembler.
>
>
> pr64110-3.patch
>
>
> Index: config/i386/sse.md
> ===================================================================
> --- config/i386/sse.md	(revision 219262)
> +++ config/i386/sse.md	(working copy)
> @@ -16713,7 +16713,7 @@
>   (define_insn "*vec_dup<mode>"
>     [(set (match_operand:AVX2_VEC_DUP_MODE 0 "register_operand" "=x,x,x")
>   	(vec_duplicate:AVX2_VEC_DUP_MODE
> -	  (match_operand:<ssescalarmode> 1 "nonimmediate_operand" "m,x,!r")))]
> +	  (match_operand:<ssescalarmode> 1 "nonimmediate_operand" "m,x,$r")))]
>     "TARGET_AVX2"
>     "@
>      v<sseintprefix>broadcast<bcstscalarsuff>\t{%1, %0|%0, %1}
> Index: doc/md.texi
> ===================================================================
> --- doc/md.texi	(revision 219262)
> +++ doc/md.texi	(working copy)
> @@ -1503,6 +1503,18 @@ in it.
>   Disparage severely the alternative that the @samp{!} appears in.
>   This alternative can still be used if it fits without reloading,
>   but if reloading is needed, some other alternative will be used.
> +
> +@cindex @samp{^} in constraint
> +@cindex caret
> +@item ^
> +This constraint is analogous to @samp{?} but it disparages slightly
> +the alternative only unless the corresponding operand applies exactly.
> +
> +@cindex @samp{$} in constraint
> +@cindex dollar sign
> +@item $
> +This constraint is analogous to @samp{!} but it disparages severely
> +the alternative only unless the corresponding operand applies exactly.
>   @end table
I found these hard to parse.

This disparages severely the alternative if the operand with the 
@samp{$} needs a reload.

Seems clearer to me.

With the doc update this is good for the trunk.

Jeff
Richard Sandiford Jan. 24, 2015, 11:29 a.m. UTC | #2
Jeff Law <law@redhat.com> writes:
> On 01/14/15 16:52, Vladimir Makarov wrote:
>>    The problem of unexpected code generation is discussed on
>>
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64110
>>
>>    The following patch introduces 2 new constraints '^' and '$'  which
>> are analogous to '?' and '!' but disfavor given alternative when *the
>> operand with the new constraint* needs a reload ('?' and '!' disfavor
>> the alternative if *any* operand needs a reload).  I hope the new
>> constraints will be useful for other insns and targets.
> Right.  This gives us finer grained control over when to disparage an 
> alternative.
>
> Reloading some of the operands in an alternative may not be a big deal, 
> but there may be other operands in the alternative that if a reload was 
> needed for that operand would be so bad that we'd want to reject the 
> entire alternative.
>
> The example I had in mind when I read Vlad's analysis in the BZ were the 
> old movb and addb patterns on the PA.  Basically we have some side 
> effect like addition/subtraction/register copy along with a conditional 
> jump.
>
> (define_insn ""
>    [(set (pc)
>          (if_then_else
>            (match_operator 2 "movb_comparison_operator"
>             [(match_operand:SI 1 "register_operand" "r,r,r,r") 
> (const_int 0)])
>            (label_ref (match_operand 3 "" ""))
>            (pc)))
>     (set (match_operand:SI 0 "reg_before_reload_operand" "=!r,!*f,*Q,!*q")
>          (match_dup 1))]
>
> Needing a reload for operand 1 really isn't a big deal here, but 
> reloading operand 0 is a disaster.  This would be a good place to use 
> the new constraint modifiers.
>
> I can distinctly recall running into similar issues on other ports 
> through the years.  I wouldn't be at all surprised if a notable 
> percentage of the "!" and "?"s that appear in our machine descriptions 
> would be better off as "^" and "$".

Yeah.  I expect in practice most people who used "?" and "!" attached
them to a particular operand for a reason.  From a quick scan through
386.exp it looked like almost all uses would either want this behaviour
or wouldn't care.  An interesting exception is:

(define_insn "extendsidi2_1"
  [(set (match_operand:DI 0 "nonimmediate_operand" "=*A,r,?r,?*o")
	(sign_extend:DI (match_operand:SI 1 "register_operand" "0,0,r,r")))
   (clobber (reg:CC FLAGS_REG))
   (clobber (match_scratch:SI 2 "=X,X,X,&r"))]
  "!TARGET_64BIT"
  "#")

I don't know how effective the third alternative is with LRA.  Surely
a "r<-0" alternative is by definition a case where "r<-r" is possible
only with a "?"-cost reload?  Seems to me we could just delete it.
But assuming it does some good, I suppose the "?" really does apply to
the alternative as a whole.  If we had to reload operand 1 or operand 0,
there's an extra cost if it can't use the same register as the other
operand.

Wouldn't it be better to make "?" and "!" behave the new way and only
add new constraints if it turns out that the old behaviour really is
useful in some cases?

Maybe stage 4 isn't the time to be making that kind of change.
Still, it'd be great if someone who's set up do x86_64 benchmarking
could measure the effect of making "?" and "!" behave like the
new constraints.

Thanks,
Richard
H.J. Lu Jan. 24, 2015, 1:42 p.m. UTC | #3
On Sat, Jan 24, 2015 at 3:29 AM, Richard Sandiford
<rdsandiford@googlemail.com> wrote:
> Jeff Law <law@redhat.com> writes:
>> On 01/14/15 16:52, Vladimir Makarov wrote:
>>>    The problem of unexpected code generation is discussed on
>>>
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64110
>>>
>>>    The following patch introduces 2 new constraints '^' and '$'  which
>>> are analogous to '?' and '!' but disfavor given alternative when *the
>>> operand with the new constraint* needs a reload ('?' and '!' disfavor
>>> the alternative if *any* operand needs a reload).  I hope the new
>>> constraints will be useful for other insns and targets.
>> Right.  This gives us finer grained control over when to disparage an
>> alternative.
>>
>> Reloading some of the operands in an alternative may not be a big deal,
>> but there may be other operands in the alternative that if a reload was
>> needed for that operand would be so bad that we'd want to reject the
>> entire alternative.
>>
>> The example I had in mind when I read Vlad's analysis in the BZ were the
>> old movb and addb patterns on the PA.  Basically we have some side
>> effect like addition/subtraction/register copy along with a conditional
>> jump.
>>
>> (define_insn ""
>>    [(set (pc)
>>          (if_then_else
>>            (match_operator 2 "movb_comparison_operator"
>>             [(match_operand:SI 1 "register_operand" "r,r,r,r")
>> (const_int 0)])
>>            (label_ref (match_operand 3 "" ""))
>>            (pc)))
>>     (set (match_operand:SI 0 "reg_before_reload_operand" "=!r,!*f,*Q,!*q")
>>          (match_dup 1))]
>>
>> Needing a reload for operand 1 really isn't a big deal here, but
>> reloading operand 0 is a disaster.  This would be a good place to use
>> the new constraint modifiers.
>>
>> I can distinctly recall running into similar issues on other ports
>> through the years.  I wouldn't be at all surprised if a notable
>> percentage of the "!" and "?"s that appear in our machine descriptions
>> would be better off as "^" and "$".
>
> Yeah.  I expect in practice most people who used "?" and "!" attached
> them to a particular operand for a reason.  From a quick scan through
> 386.exp it looked like almost all uses would either want this behaviour
> or wouldn't care.  An interesting exception is:
>
> (define_insn "extendsidi2_1"
>   [(set (match_operand:DI 0 "nonimmediate_operand" "=*A,r,?r,?*o")
>         (sign_extend:DI (match_operand:SI 1 "register_operand" "0,0,r,r")))
>    (clobber (reg:CC FLAGS_REG))
>    (clobber (match_scratch:SI 2 "=X,X,X,&r"))]
>   "!TARGET_64BIT"
>   "#")
>
> I don't know how effective the third alternative is with LRA.  Surely
> a "r<-0" alternative is by definition a case where "r<-r" is possible
> only with a "?"-cost reload?  Seems to me we could just delete it.
> But assuming it does some good, I suppose the "?" really does apply to
> the alternative as a whole.  If we had to reload operand 1 or operand 0,
> there's an extra cost if it can't use the same register as the other
> operand.
>
> Wouldn't it be better to make "?" and "!" behave the new way and only
> add new constraints if it turns out that the old behaviour really is
> useful in some cases?
>
> Maybe stage 4 isn't the time to be making that kind of change.
> Still, it'd be great if someone who's set up do x86_64 benchmarking
> could measure the effect of making "?" and "!" behave like the
> new constraints.

Areg, can we run some benchmarks?
Jeff Law Jan. 26, 2015, 7:52 p.m. UTC | #4
On 01/24/15 04:29, Richard Sandiford wrote:
>
> Yeah.  I expect in practice most people who used "?" and "!" attached
> them to a particular operand for a reason.  From a quick scan through
> 386.exp it looked like almost all uses would either want this behaviour
> or wouldn't care.  An interesting exception is:
>
> (define_insn "extendsidi2_1"
>    [(set (match_operand:DI 0 "nonimmediate_operand" "=*A,r,?r,?*o")
> 	(sign_extend:DI (match_operand:SI 1 "register_operand" "0,0,r,r")))
>     (clobber (reg:CC FLAGS_REG))
>     (clobber (match_scratch:SI 2 "=X,X,X,&r"))]
>    "!TARGET_64BIT"
>    "#")
>
> I don't know how effective the third alternative is with LRA.  Surely
> a "r<-0" alternative is by definition a case where "r<-r" is possible
> only with a "?"-cost reload?  Seems to me we could just delete it.
> But assuming it does some good, I suppose the "?" really does apply to
> the alternative as a whole.  If we had to reload operand 1 or operand 0,
> there's an extra cost if it can't use the same register as the other
> operand.
>
> Wouldn't it be better to make "?" and "!" behave the new way and only
> add new constraints if it turns out that the old behaviour really is
> useful in some cases?
>
> Maybe stage 4 isn't the time to be making that kind of change.
> Still, it'd be great if someone who's set up do x86_64 benchmarking
> could measure the effect of making "?" and "!" behave like the
> new constraints.
My worry isn't the x86_64 port, but all the others that folks don't test 
as regularly.

I'd rather go the other direction, have folks familiar with the port go 
through it changing the constraints where it makes sense.  That just 
seems a hell of a lot safer.

A port maintainer could certainly hack something together for testing 
purposes to guide them as to whether or not there's something to be 
gained by converting many/most of the ?! to the new constraints.

jeff
Richard Sandiford Jan. 27, 2015, 2:08 p.m. UTC | #5
Jeff Law <law@redhat.com> writes:
> On 01/24/15 04:29, Richard Sandiford wrote:
>>
>> Yeah.  I expect in practice most people who used "?" and "!" attached
>> them to a particular operand for a reason.  From a quick scan through
>> 386.exp it looked like almost all uses would either want this behaviour
>> or wouldn't care.  An interesting exception is:
>>
>> (define_insn "extendsidi2_1"
>>    [(set (match_operand:DI 0 "nonimmediate_operand" "=*A,r,?r,?*o")
>> 	(sign_extend:DI (match_operand:SI 1 "register_operand" "0,0,r,r")))
>>     (clobber (reg:CC FLAGS_REG))
>>     (clobber (match_scratch:SI 2 "=X,X,X,&r"))]
>>    "!TARGET_64BIT"
>>    "#")
>>
>> I don't know how effective the third alternative is with LRA.  Surely
>> a "r<-0" alternative is by definition a case where "r<-r" is possible
>> only with a "?"-cost reload?  Seems to me we could just delete it.
>> But assuming it does some good, I suppose the "?" really does apply to
>> the alternative as a whole.  If we had to reload operand 1 or operand 0,
>> there's an extra cost if it can't use the same register as the other
>> operand.
>>
>> Wouldn't it be better to make "?" and "!" behave the new way and only
>> add new constraints if it turns out that the old behaviour really is
>> useful in some cases?
>>
>> Maybe stage 4 isn't the time to be making that kind of change.
>> Still, it'd be great if someone who's set up do x86_64 benchmarking
>> could measure the effect of making "?" and "!" behave like the
>> new constraints.
> My worry isn't the x86_64 port, but all the others that folks don't test 
> as regularly.
>
> I'd rather go the other direction, have folks familiar with the port go 
> through it changing the constraints where it makes sense.  That just 
> seems a hell of a lot safer.
>
> A port maintainer could certainly hack something together for testing 
> purposes to guide them as to whether or not there's something to be 
> gained by converting many/most of the ?! to the new constraints.

Yeah, but in practice that's only ever going to be a partial transition.
Many port maintainers won't look at this, so we'll have to support both
versions indefinitely, even if the new behaviour turns out to be the
best for all cases.

I just think we're going to regret having two sets of constraints with
such subtly different meanings.

Looking back at the original PR, Jakub said:

  The ! has been added by me for PR63594, so it isn't there from the era
  when i?86 backend was using reload.  If there is a better way to
  express that RA should prefer to use memory or xmm register and only
  use r constraint if it already is in a r register and doesn't need to
  be reloaded, I can use that.  Whether it is ?, ??? or something else.
  ! description in gcc docs just fitted most what I wanted...

In some ways this seems to match the intention of "*".  Originally I think
it was just an RA-only thing and was ignored by reload, but LRA does take it
into account too (which sounds like progress to me).

If I revert the patch locally and change the *vec_dup<mode> pattern to
use "*", it passes both the test for PR64110 and the tests for PR63594.
Would that be OK as an alternative?

Thanks,
Richard
Jeff Law Jan. 27, 2015, 2:41 p.m. UTC | #6
On 01/27/15 07:08, Richard Sandiford wrote:
>
> Yeah, but in practice that's only ever going to be a partial transition.
> Many port maintainers won't look at this, so we'll have to support both
> versions indefinitely, even if the new behaviour turns out to be the
> best for all cases.
Yes, most likely.   I find myself pondering the related question of how 
we get ports to transition to LRA and if we could tie these together. 
Maintainers are going to need to transition to LRA if we're ever going 
to start removing blobs of reload.  As a part of that transition they're 
presumably going to be looking closely at their backend and could make 
the constraint transition.

In an ideal world, we'd declare release X.Y has a cut-off point.  Ports 
that haven't transitioned to LRA get deprecated at that point.  Those 
ports are the ones most likely not to make the constraint transition as 
well.  I think we would have to consider any uses of ?! that remain 
after that point as intentional.

>
> I just think we're going to regret having two sets of constraints with
> such subtly different meanings.
But isn't that inevitable?  While I suspect that most instances of ?! 
should be converted, there may be some that should not.  If that's the 
case then we're going to have both forever.


>
> Looking back at the original PR, Jakub said:
>
>    The ! has been added by me for PR63594, so it isn't there from the era
>    when i?86 backend was using reload.  If there is a better way to
>    express that RA should prefer to use memory or xmm register and only
>    use r constraint if it already is in a r register and doesn't need to
>    be reloaded, I can use that.  Whether it is ?, ??? or something else.
>    ! description in gcc docs just fitted most what I wanted...
>
> In some ways this seems to match the intention of "*".  Originally I think
> it was just an RA-only thing and was ignored by reload, but LRA does take it
> into account too (which sounds like progress to me).
>
> If I revert the patch locally and change the *vec_dup<mode> pattern to
> use "*", it passes both the test for PR64110 and the tests for PR63594.
> Would that be OK as an alternative?
I think that's up to Uros and Jakub to sort out.

Jeff
Vladimir Makarov Jan. 27, 2015, 4:46 p.m. UTC | #7
On 01/27/2015 09:08 AM, Richard Sandiford wrote:
>
> Yeah, but in practice that's only ever going to be a partial transition.
> Many port maintainers won't look at this, so we'll have to support both
> versions indefinitely, even if the new behaviour turns out to be the
> best for all cases.
>
> I just think we're going to regret having two sets of constraints with
> such subtly different meanings.
>
> Looking back at the original PR, Jakub said:
>
>   The ! has been added by me for PR63594, so it isn't there from the era
>   when i?86 backend was using reload.  If there is a better way to
>   express that RA should prefer to use memory or xmm register and only
>   use r constraint if it already is in a r register and doesn't need to
>   be reloaded, I can use that.  Whether it is ?, ??? or something else.
>   ! description in gcc docs just fitted most what I wanted...
>
> In some ways this seems to match the intention of "*".  Originally I think
> it was just an RA-only thing and was ignored by reload, but LRA does take it
> into account too (which sounds like progress to me).
  I guess we don't need '*' in many cases.  It is overused.  Imho, IRA
should decide what class is better based on costs of alternatives and
the explicit exclusion of register class by using '*' is a bad practice.

  Saying that I believe we should do register class preferrencing
algorithm more alternative oriented.  The algorithm should choose first
an alternative (of may be subset of alternatives) and then register
classes.  I think it is more logical.  It would permits us to rid off
all such constraints including '*' and use only one like '?' which
increases the alternative cost.

  In perspective it is even better to rid of '?' too and have some hook
(or attribute) to get insn alternative costs which can be depended on
sub-target or other run-time characteristics.  Otherwise we need to
duplicate insn descriptions and put different insn guards.  I am going
to work on this.  But it is hard to say will it work well (may be I have
some performance issues with this).  This hook somehow (min or average
of the values for all alternatives) can be used in combiner and other
algorithms need an insn cost. That is how I see the solution of the
problem in a long perspective.
> If I revert the patch locally and change the *vec_dup<mode> pattern to
> use "*", it passes both the test for PR64110 and the tests for PR63594.
> Would that be OK as an alternative?
>
  I don't think it will work in general case.  It probably works because
a different class is chosen in IRA.  If IRA for some reasons choose the
same class, we might see the same problem in LRA.  I also don't like
when register classes are excluded by '*' for IRA (see my thoughts above).
Richard Sandiford Jan. 27, 2015, 5:11 p.m. UTC | #8
Vladimir Makarov <vmakarov@redhat.com> writes:
> On 01/27/2015 09:08 AM, Richard Sandiford wrote:
>> Yeah, but in practice that's only ever going to be a partial transition.
>> Many port maintainers won't look at this, so we'll have to support both
>> versions indefinitely, even if the new behaviour turns out to be the
>> best for all cases.
>>
>> I just think we're going to regret having two sets of constraints with
>> such subtly different meanings.
>>
>> Looking back at the original PR, Jakub said:
>>
>>   The ! has been added by me for PR63594, so it isn't there from the era
>>   when i?86 backend was using reload.  If there is a better way to
>>   express that RA should prefer to use memory or xmm register and only
>>   use r constraint if it already is in a r register and doesn't need to
>>   be reloaded, I can use that.  Whether it is ?, ??? or something else.
>>   ! description in gcc docs just fitted most what I wanted...
>>
>> In some ways this seems to match the intention of "*".  Originally I think
>> it was just an RA-only thing and was ignored by reload, but LRA does take it
>> into account too (which sounds like progress to me).
>   I guess we don't need '*' in many cases.  It is overused.  Imho, IRA
> should decide what class is better based on costs of alternatives and
> the explicit exclusion of register class by using '*' is a bad practice.
>
>   Saying that I believe we should do register class preferrencing
> algorithm more alternative oriented.  The algorithm should choose first
> an alternative (of may be subset of alternatives) and then register
> classes.  I think it is more logical.  It would permits us to rid off
> all such constraints including '*' and use only one like '?' which
> increases the alternative cost.
>
>   In perspective it is even better to rid of '?' too and have some hook
> (or attribute) to get insn alternative costs which can be depended on
> sub-target or other run-time characteristics.  Otherwise we need to
> duplicate insn descriptions and put different insn guards.  I am going
> to work on this.  But it is hard to say will it work well (may be I have
> some performance issues with this).  This hook somehow (min or average
> of the values for all alternatives) can be used in combiner and other
> algorithms need an insn cost. That is how I see the solution of the
> problem in a long perspective.

Definitely agree that it'd be better to remove these constraints
in favour of a new attribute.  preferred_for_size and preferred_for_speed
give something similar, though they're much more stringent than what
we need here.

>> If I revert the patch locally and change the *vec_dup<mode> pattern to
>> use "*", it passes both the test for PR64110 and the tests for PR63594.
>> Would that be OK as an alternative?
>>
>   I don't think it will work in general case.  It probably works because
> a different class is chosen in IRA.  If IRA for some reasons choose the
> same class, we might see the same problem in LRA.

But isn't that the point of '*'?  It should stop IRA from using the 'r'
alternative as an indication that 'r' is a good choice for this instruction.
If IRA chooses 'r' anyway, it must be because other instructions that
use the same allocno strongly prefer 'r'.

And in those some circumstances -- i.e. if IRA does choose 'r' despite
the constraints in this instruction -- then I think we do want to use the
'r' alternative.  And AIUI that's also what the new constraint is designed
to do.  If IRA chooses 'r' anyway, the new constraint causes LRA to prefer
the 'r' alternative _even if_ another operand (the destination) has to
be reloaded, which is the fundamental difference between the new constraint
and '!'.

So I'm still not sure why '*' wouldn't do what we want.

>  I also don't like when register classes are excluded by '*' for IRA
> (see my thoughts above).

Understood, and I agree it would be good to move to attributes.
But in a way, I think that's an even better reason to try to avoid
adding these new constraints.  It sounds like we're hoping to get rid
of them as soon as we've added them :-)

Thanks,
Richard
Vladimir Makarov Jan. 27, 2015, 10:22 p.m. UTC | #9
On 01/27/2015 12:11 PM, Richard Sandiford wrote:
> Vladimir Makarov <vmakarov@redhat.com> writes:
>> On 01/27/2015 09:08 AM, Richard Sandiford wrote:
>>> Yeah, but in practice that's only ever going to be a partial transition.
>>> Many port maintainers won't look at this, so we'll have to support both
>>> versions indefinitely, even if the new behaviour turns out to be the
>>> best for all cases.
>>>
>>> I just think we're going to regret having two sets of constraints with
>>> such subtly different meanings.
>>>
>>> Looking back at the original PR, Jakub said:
>>>
>>>   The ! has been added by me for PR63594, so it isn't there from the era
>>>   when i?86 backend was using reload.  If there is a better way to
>>>   express that RA should prefer to use memory or xmm register and only
>>>   use r constraint if it already is in a r register and doesn't need to
>>>   be reloaded, I can use that.  Whether it is ?, ??? or something else.
>>>   ! description in gcc docs just fitted most what I wanted...
>>>
>>> In some ways this seems to match the intention of "*".  Originally I think
>>> it was just an RA-only thing and was ignored by reload, but LRA does take it
>>> into account too (which sounds like progress to me).
>>   I guess we don't need '*' in many cases.  It is overused.  Imho, IRA
>> should decide what class is better based on costs of alternatives and
>> the explicit exclusion of register class by using '*' is a bad practice.
>>
>>   Saying that I believe we should do register class preferrencing
>> algorithm more alternative oriented.  The algorithm should choose first
>> an alternative (of may be subset of alternatives) and then register
>> classes.  I think it is more logical.  It would permits us to rid off
>> all such constraints including '*' and use only one like '?' which
>> increases the alternative cost.
>>
>>   In perspective it is even better to rid of '?' too and have some hook
>> (or attribute) to get insn alternative costs which can be depended on
>> sub-target or other run-time characteristics.  Otherwise we need to
>> duplicate insn descriptions and put different insn guards.  I am going
>> to work on this.  But it is hard to say will it work well (may be I have
>> some performance issues with this).  This hook somehow (min or average
>> of the values for all alternatives) can be used in combiner and other
>> algorithms need an insn cost. That is how I see the solution of the
>> problem in a long perspective.
> Definitely agree that it'd be better to remove these constraints
> in favour of a new attribute.  preferred_for_size and preferred_for_speed
> give something similar, though they're much more stringent than what
> we need here.
>
>>> If I revert the patch locally and change the *vec_dup<mode> pattern to
>>> use "*", it passes both the test for PR64110 and the tests for PR63594.
>>> Would that be OK as an alternative?
>>>
>>   I don't think it will work in general case.  It probably works because
>> a different class is chosen in IRA.  If IRA for some reasons choose the
>> same class, we might see the same problem in LRA.
> But isn't that the point of '*'?  It should stop IRA from using the 'r'
> alternative as an indication that 'r' is a good choice for this instruction.
> If IRA chooses 'r' anyway, it must be because other instructions that
> use the same allocno strongly prefer 'r'.
>
> And in those some circumstances -- i.e. if IRA does choose 'r' despite
> the constraints in this instruction -- then I think we do want to use the
> 'r' alternative.  And AIUI that's also what the new constraint is designed
> to do.  If IRA chooses 'r' anyway, the new constraint causes LRA to prefer
> the 'r' alternative _even if_ another operand (the destination) has to
> be reloaded, which is the fundamental difference between the new constraint
> and '!'.
>
> So I'm still not sure why '*' wouldn't do what we want.
Frequently use of '*' (and sometimes '!' for reload) means that we need
splitting for this alternative probably into 2/3 insns.  Instead of '*'
use we would need to set up costs of all these insns.  I believe just
ignoring the class with '*' is wrong.  There are some cases where we
need '*' to avoid definitely this reg class, e.g. mmx when we use other
classes for fp values.  But I guess this solution is not reliable and
without the constraints we could set the alternative cost very high to
have a reliable right solution.

>>  I also don't like when register classes are excluded by '*' for IRA
>> (see my thoughts above).
> Understood, and I agree it would be good to move to attributes.
> But in a way, I think that's an even better reason to try to avoid
> adding these new constraints.  It sounds like we're hoping to get rid
> of them as soon as we've added them :-)
>
>
Sometimes to get rid off, you should add more :)

But to be serious, what I wrote can not be implemented for GCC-5.0 (and
the generated code performance is still unknown for the proposed
approach).  I believe the current solution is more reliable than using
'*'.   Ridding off the new constraints will be much much smaller problem
than ridding of other constraints.
diff mbox

Patch

Index: config/i386/sse.md
===================================================================
--- config/i386/sse.md	(revision 219262)
+++ config/i386/sse.md	(working copy)
@@ -16713,7 +16713,7 @@ 
 (define_insn "*vec_dup<mode>"
   [(set (match_operand:AVX2_VEC_DUP_MODE 0 "register_operand" "=x,x,x")
 	(vec_duplicate:AVX2_VEC_DUP_MODE
-	  (match_operand:<ssescalarmode> 1 "nonimmediate_operand" "m,x,!r")))]
+	  (match_operand:<ssescalarmode> 1 "nonimmediate_operand" "m,x,$r")))]
   "TARGET_AVX2"
   "@
    v<sseintprefix>broadcast<bcstscalarsuff>\t{%1, %0|%0, %1}
Index: doc/md.texi
===================================================================
--- doc/md.texi	(revision 219262)
+++ doc/md.texi	(working copy)
@@ -1503,6 +1503,18 @@  in it.
 Disparage severely the alternative that the @samp{!} appears in.
 This alternative can still be used if it fits without reloading,
 but if reloading is needed, some other alternative will be used.
+
+@cindex @samp{^} in constraint
+@cindex caret
+@item ^
+This constraint is analogous to @samp{?} but it disparages slightly
+the alternative only unless the corresponding operand applies exactly.
+
+@cindex @samp{$} in constraint
+@cindex dollar sign
+@item $
+This constraint is analogous to @samp{!} but it disparages severely
+the alternative only unless the corresponding operand applies exactly.
 @end table
 
 @ifset INTERNALS
Index: genoutput.c
===================================================================
--- genoutput.c	(revision 219262)
+++ genoutput.c	(working copy)
@@ -209,7 +209,7 @@  struct constraint_data
 
 /* All machine-independent constraint characters (except digits) that
    are handled outside the define*_constraint mechanism.  */
-static const char indep_constraints[] = ",=+%*?!#&g";
+static const char indep_constraints[] = ",=+%*?!^$#&g";
 
 static struct constraint_data *
 constraints_by_letter_table[1 << CHAR_BIT];
Index: ira-costs.c
===================================================================
--- ira-costs.c	(revision 219262)
+++ ira-costs.c	(working copy)
@@ -762,6 +762,10 @@  record_reg_classes (int n_alts, int n_op
 		  c = *++p;
 		  break;
 
+		case '^':
+		  alt_cost += 2;
+		  break;
+
 		case '?':
 		  alt_cost += 2;
 		  break;
Index: lra-constraints.c
===================================================================
--- lra-constraints.c	(revision 219262)
+++ lra-constraints.c	(working copy)
@@ -1640,6 +1640,7 @@  process_alt_operands (int only_alternati
      then REJECT is ignored, but otherwise it gets this much counted
      against it in addition to the reloading needed.  */
   int reject;
+  int op_reject;
   /* The number of elements in the following array.  */
   int early_clobbered_regs_num;
   /* Numbers of operands which are early clobber registers.  */
@@ -1789,6 +1790,7 @@  process_alt_operands (int only_alternati
 	     track.  */
 	  lra_assert (*p != 0 && *p != ',');
 
+	  op_reject = 0;
 	  /* Scan this alternative's specs for this operand; set WIN
 	     if the operand fits any letter in this alternative.
 	     Otherwise, clear BADOP if this operand could fit some
@@ -1811,6 +1813,13 @@  process_alt_operands (int only_alternati
 		  early_clobber_p = true;
 		  break;
 
+		case '$':
+		  op_reject += LRA_MAX_REJECT;
+		  break;
+		case '^':
+		  op_reject += LRA_LOSER_COST_FACTOR;
+		  break;
+
 		case '#':
 		  /* Ignore rest of this alternative.  */
 		  c = '\0';
@@ -2097,6 +2106,7 @@  process_alt_operands (int only_alternati
 	      int const_to_mem = 0;
 	      bool no_regs_p;
 
+	      reject += op_reject;
 	      /* Never do output reload of stack pointer.  It makes
 		 impossible to do elimination when SP is changed in
 		 RTL.  */
Index: stmt.c
===================================================================
--- stmt.c	(revision 219262)
+++ stmt.c	(working copy)
@@ -292,6 +292,7 @@  parse_output_constraint (const char **co
 	break;
 
       case '?':  case '!':  case '*':  case '&':  case '#':
+      case '$':  case '^':
       case 'E':  case 'F':  case 'G':  case 'H':
       case 's':  case 'i':  case 'n':
       case 'I':  case 'J':  case 'K':  case 'L':  case 'M':
@@ -382,6 +383,7 @@  parse_input_constraint (const char **con
 
       case '<':  case '>':
       case '?':  case '!':  case '*':  case '#':
+      case '$':  case '^':
       case 'E':  case 'F':  case 'G':  case 'H':
       case 's':  case 'i':  case 'n':
       case 'I':  case 'J':  case 'K':  case 'L':  case 'M':
Index: testsuite/gcc.target/i386/pr64110.c
===================================================================
--- testsuite/gcc.target/i386/pr64110.c	(revision 219262)
+++ testsuite/gcc.target/i386/pr64110.c	(working copy)
@@ -1,5 +1,6 @@ 
 /* { dg-do compile } */
 /* { dg-options "-O3 -march=core-avx2" } */
+/* { dg-final { scan-assembler "vmovd\[\\t \]" } } */
 
 int foo (void);
 int a;