Message ID | AM5PR0802MB2610DDF3A7F7AA8DE2E5446E83110@AM5PR0802MB2610.eurprd08.prod.outlook.com |
---|---|
State | New |
Headers | show |
On 26/04/17 13:39, Wilco Dijkstra wrote: > Float to int moves currently generate inefficient code due to > hacks used in the movsi and movdi patterns. The 'r = w' variant > uses '*' which explicitly tells the register allocator to ignore it. > As a result float to int moves typically spill to the stack, which is > extremely inefficient. For example: > > static inline unsigned asuint (float f) > { > union { float f; unsigned i; } u = {f}; > return u.i; > } > > float foo (float x) > { > unsigned i = asuint (x); > if (__builtin_expect (i > 42, 0)) > return x*x; > return i; > } > > generates: > > sub sp, sp, #16 > str s0, [sp, 12] > ldr w0, [sp, 12] > cmp w0, 42 > bhi .L7 > scvtf s0, w0 > add sp, sp, 16 > ret > .L7: > fmul s0, s0, s0 > add sp, sp, 16 > ret > > Removing '*' from the variant generates: > > fmov w0, s0 > cmp w0, 42 > bhi .L6 > scvtf s0, w0 > ret > .L6: > fmul s0, s0, s0 > ret > > Passes regress & bootstrap, OK for commit? > > ChangeLog: > 2017-04-26 Wilco Dijkstra <wdijkstr@arm.com> > > * config/aarch64/aarch64.md (movsi_aarch64): Remove '*' from r=w. > (movdi_aarch64): Likewise. > OK. While on the subject, why is the w->w operation also hidden? R. > -- > diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md > index 51368e29f2d1fd12f48a972bd81a08589a720e07..d656e92e1ff02bdc90c824227ec3b2e1ccfe665a 100644 > --- a/gcc/config/aarch64/aarch64.md > +++ b/gcc/config/aarch64/aarch64.md > @@ -1026,8 +1026,8 @@ (define_expand "mov<mode>" > ) > > (define_insn_and_split "*movsi_aarch64" > - [(set (match_operand:SI 0 "nonimmediate_operand" "=r,k,r,r,r,r,*w,m, m,r,r ,*w, r,*w") > - (match_operand:SI 1 "aarch64_mov_operand" " r,r,k,M,n,m, m,rZ,*w,S,Ush,rZ,*w,*w"))] > + [(set (match_operand:SI 0 "nonimmediate_operand" "=r,k,r,r,r,r,*w,m, m,r,r ,*w,r,*w") > + (match_operand:SI 1 "aarch64_mov_operand" " r,r,k,M,n,m, m,rZ,*w,S,Ush,rZ,w,*w"))] > "(register_operand (operands[0], SImode) > || aarch64_reg_or_zero (operands[1], SImode))" > "@ > @@ -1058,8 +1058,8 @@ (define_insn_and_split "*movsi_aarch64" > ) > > (define_insn_and_split "*movdi_aarch64" > - [(set (match_operand:DI 0 "nonimmediate_operand" "=r,k,r,r,r,r,*w,m, m,r,r, *w, r,*w,w") > - (match_operand:DI 1 "aarch64_mov_operand" " r,r,k,N,n,m, m,rZ,*w,S,Ush,rZ,*w,*w,Dd"))] > + [(set (match_operand:DI 0 "nonimmediate_operand" "=r,k,r,r,r,r,*w,m, m,r,r, *w,r,*w,w") > + (match_operand:DI 1 "aarch64_mov_operand" " r,r,k,N,n,m, m,rZ,*w,S,Ush,rZ,w,*w,Dd"))] > "(register_operand (operands[0], DImode) > || aarch64_reg_or_zero (operands[1], DImode))" > "@ >
Richard Earnshaw (lists) wrote:
> While on the subject, why is the w->w operation also hidden?
No idea, this just fixes one case where it is obvious the use of '*' is incorrect.
However I think all uses of '*' in md files are incorrect and the feature should
be removed. '?' already exists for cases where the alternative may be expensive.
Wilco
On 05/05/17 17:10, Wilco Dijkstra wrote: > Richard Earnshaw (lists) wrote: > >> While on the subject, why is the w->w operation also hidden? > > No idea, this just fixes one case where it is obvious the use of '*' is > incorrect. > > However I think all uses of '*' in md files are incorrect and the > feature should > be removed. '?' already exists for cases where the alternative may be > expensive. > It's not quite as simple as that. It may be, however, that we should only use it for restricting subclasses (eg generally avoiding high registers on Thumb1). However, things have changed somewhat since the move to LRA and what was once true might be quite different now. R. > Wilco
Richard Earnshaw (lists) wrote: > On 05/05/17 17:10, Wilco Dijkstra wrote: > > However I think all uses of '*' in md files are incorrect and the > > feature should > > be removed. '?' already exists for cases where the alternative may be > > expensive. > > > > It's not quite as simple as that. It may be, however, that we should > only use it for restricting subclasses (eg generally avoiding high > registers on Thumb1). We needed to add TARGET_IRA_CHANGE_PSEUDO_ALLOCNO_CLASS to ensure register preferencing works correctly, so this can be used to choose low/high registers on Thumb-1. > However, things have changed somewhat since the move to LRA and what was > once true might be quite different now. It's possible that existing uses of '*' were trying to hack around the preferencing issues, but with this callback I don't think they are needed. We will still require '?' and '!' to set the relative cost of alternatives - if you have a negate and one of the operands is a SIMD register (so you need one int<->FP move either way), and SIMD negate has a higher latency, you need '?' to say that it is more expensive than the integer negate. Wilco
diff --git a/gcc/config/aarch64/aarch64.md b/gcc/config/aarch64/aarch64.md index 51368e29f2d1fd12f48a972bd81a08589a720e07..d656e92e1ff02bdc90c824227ec3b2e1ccfe665a 100644 --- a/gcc/config/aarch64/aarch64.md +++ b/gcc/config/aarch64/aarch64.md @@ -1026,8 +1026,8 @@ (define_expand "mov<mode>" ) (define_insn_and_split "*movsi_aarch64" - [(set (match_operand:SI 0 "nonimmediate_operand" "=r,k,r,r,r,r,*w,m, m,r,r ,*w, r,*w") - (match_operand:SI 1 "aarch64_mov_operand" " r,r,k,M,n,m, m,rZ,*w,S,Ush,rZ,*w,*w"))] + [(set (match_operand:SI 0 "nonimmediate_operand" "=r,k,r,r,r,r,*w,m, m,r,r ,*w,r,*w") + (match_operand:SI 1 "aarch64_mov_operand" " r,r,k,M,n,m, m,rZ,*w,S,Ush,rZ,w,*w"))] "(register_operand (operands[0], SImode) || aarch64_reg_or_zero (operands[1], SImode))" "@ @@ -1058,8 +1058,8 @@ (define_insn_and_split "*movsi_aarch64" ) (define_insn_and_split "*movdi_aarch64" - [(set (match_operand:DI 0 "nonimmediate_operand" "=r,k,r,r,r,r,*w,m, m,r,r, *w, r,*w,w") - (match_operand:DI 1 "aarch64_mov_operand" " r,r,k,N,n,m, m,rZ,*w,S,Ush,rZ,*w,*w,Dd"))] + [(set (match_operand:DI 0 "nonimmediate_operand" "=r,k,r,r,r,r,*w,m, m,r,r, *w,r,*w,w") + (match_operand:DI 1 "aarch64_mov_operand" " r,r,k,N,n,m, m,rZ,*w,S,Ush,rZ,w,*w,Dd"))] "(register_operand (operands[0], DImode) || aarch64_reg_or_zero (operands[1], DImode))" "@