Message ID | alpine.DEB.2.02.1209021730070.12824@stedding.saclay.inria.fr |
---|---|
State | New |
Headers | show |
Hello, any advice? http://gcc.gnu.org/ml/gcc-patches/2012-09/msg00044.html On Sun, 2 Sep 2012, Marc Glisse wrote: > Hello, > > this patch passes bootstrap+testsuite. It is probably wrong in many ways, but > I don't know enough to do more without some advice. > > The goal is to recognize that v[0]+v[1] can be computed with haddpd. With the > patch, v[0]-v[1] becomes hsubpd and v[1]+v[0] becomes haddpd. Also, thanks to > it, {v[0]-v[1], w[0]-w[1]} is now recognized as a single hsubpd. > > 1) Is a define_insn the right tool? > 2) {v[0]-v[1], v[0]-v[1]} is not recognized as a hsubpd because vec_duplicate > doesn't match vec_concat. Do we really need to duplicate (no pun intended) > the pattern? > 3) v[0]+v[1] is not recognized. Some pass changed their order, and nothing > tries the reverse order. I can see 3 ways: canonicalize the order at some > point, let combine try both orders for commutative operators or make the > patterns more flexible (I don't know how many would need changing). > 4) I don't understand the set_attr part. I copied it from the haddpd > define_insn, and removed (set_attr "type" "sseadd") because it crashed the > compiler. isa and prefix make sense and they match the alternatives, but I am > not sure about "mode" (removing it still works IIRC). > > > 2012-09-02 Marc Glisse <marc.glisse@inria.fr> > > gcc/ > * config/i386/sse.md (*sse3_h<plusminus_insn>v2df3_low): New. > > gcc/testsuite/ > * gcc.target/i386/pr54400.c: New testcase.
Adding an x86 maintainer in Cc: On Tue, 11 Sep 2012, Marc Glisse wrote: > Hello, > > any advice? > http://gcc.gnu.org/ml/gcc-patches/2012-09/msg00044.html > > > On Sun, 2 Sep 2012, Marc Glisse wrote: > >> Hello, >> >> this patch passes bootstrap+testsuite. It is probably wrong in many ways, >> but I don't know enough to do more without some advice. >> >> The goal is to recognize that v[0]+v[1] can be computed with haddpd. With >> the patch, v[0]-v[1] becomes hsubpd and v[1]+v[0] becomes haddpd. Also, >> thanks to it, {v[0]-v[1], w[0]-w[1]} is now recognized as a single hsubpd. >> >> 1) Is a define_insn the right tool? >> 2) {v[0]-v[1], v[0]-v[1]} is not recognized as a hsubpd because >> vec_duplicate doesn't match vec_concat. Do we really need to duplicate (no >> pun intended) the pattern? >> 3) v[0]+v[1] is not recognized. Some pass changed their order, and nothing >> tries the reverse order. I can see 3 ways: canonicalize the order at some >> point, let combine try both orders for commutative operators or make the >> patterns more flexible (I don't know how many would need changing). >> 4) I don't understand the set_attr part. I copied it from the haddpd >> define_insn, and removed (set_attr "type" "sseadd") because it crashed the >> compiler. isa and prefix make sense and they match the alternatives, but I >> am not sure about "mode" (removing it still works IIRC). >> >> >> 2012-09-02 Marc Glisse <marc.glisse@inria.fr> >> >> gcc/ >> * config/i386/sse.md (*sse3_h<plusminus_insn>v2df3_low): New. >> >> gcc/testsuite/ >> * gcc.target/i386/pr54400.c: New testcase.
On Wed, Sep 26, 2012 at 5:16 PM, Marc Glisse <marc.glisse@inria.fr> wrote: > Adding an x86 maintainer in Cc: >>> this patch passes bootstrap+testsuite. It is probably wrong in many ways, >>> but I don't know enough to do more without some advice. >>> >>> The goal is to recognize that v[0]+v[1] can be computed with haddpd. With >>> the patch, v[0]-v[1] becomes hsubpd and v[1]+v[0] becomes haddpd. Also, >>> thanks to it, {v[0]-v[1], w[0]-w[1]} is now recognized as a single hsubpd. >>> >>> 1) Is a define_insn the right tool? Yes, combine pass looks at insn patterns to determine which insn is correct. >>> 2) {v[0]-v[1], v[0]-v[1]} is not recognized as a hsubpd because >>> vec_duplicate doesn't match vec_concat. Do we really need to duplicate (no >>> pun intended) the pattern? You can add this transformation to simplify-rtx.c. Probably vec_concat with two equal operands can be canonicalized as vec_duplicate. >>> 3) v[0]+v[1] is not recognized. Some pass changed their order, and >>> nothing tries the reverse order. I can see 3 ways: canonicalize the order at >>> some point, let combine try both orders for commutative operators or make >>> the patterns more flexible (I don't know how many would need changing). In your case, split the macroized pattern to plus and minus. Plus pattern should use "const_0_to_1_operand" predicate, and in the insn constraint, you just check that operands are different. >>> 4) I don't understand the set_attr part. I copied it from the haddpd >>> define_insn, and removed (set_attr "type" "sseadd") because it crashed the >>> compiler. isa and prefix make sense and they match the alternatives, but I >>> am not sure about "mode" (removing it still works IIRC). Somewhere deep inside attribute calculations, it is assumed that sseadd has two operands. Your pattern doesn't satisfy this assumption. You have to add sseadd1, please see for example sseiadd/sseiadd1 pair. Uros.
Index: testsuite/gcc.target/i386/pr54400.c =================================================================== --- testsuite/gcc.target/i386/pr54400.c (revision 0) +++ testsuite/gcc.target/i386/pr54400.c (revision 0) @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -msse3 -mfpmath=sse" } */ + +#include <x86intrin.h> + +double f (__m128d p) +{ + return p[0] - p[1]; +} + +/* { dg-final { scan-assembler "hsubpd" } } */ Property changes on: testsuite/gcc.target/i386/pr54400.c ___________________________________________________________________ Added: svn:keywords + Author Date Id Revision URL Added: svn:eol-style + native Index: config/i386/sse.md =================================================================== --- config/i386/sse.md (revision 190861) +++ config/i386/sse.md (working copy) @@ -1231,20 +1231,37 @@ (vec_select:DF (match_dup 2) (parallel [(const_int 1)])))))] "TARGET_SSE3" "@ h<plusminus_mnemonic>pd\t{%2, %0|%0, %2} vh<plusminus_mnemonic>pd\t{%2, %1, %0|%0, %1, %2}" [(set_attr "isa" "noavx,avx") (set_attr "type" "sseadd") (set_attr "prefix" "orig,vex") (set_attr "mode" "V2DF")]) +(define_insn "*sse3_h<plusminus_insn>v2df3_low" + [(set (match_operand:DF 0 "register_operand" "=x,x") + (plusminus:DF + (vec_select:DF + (match_operand:V2DF 1 "register_operand" "0,x") + (parallel [(const_int 0)])) + (vec_select:DF + (match_dup 1) + (parallel [(const_int 1)]))))] + "TARGET_SSE3" + "@ + h<plusminus_mnemonic>pd\t{%0, %0|%0, %0} + vh<plusminus_mnemonic>pd\t{%1, %1, %0|%0, %1, %1}" + [(set_attr "isa" "noavx,avx") + (set_attr "prefix" "orig,vex") + (set_attr "mode" "V2DF")]) + (define_insn "avx_h<plusminus_insn>v8sf3" [(set (match_operand:V8SF 0 "register_operand" "=x") (vec_concat:V8SF (vec_concat:V4SF (vec_concat:V2SF (plusminus:SF (vec_select:SF (match_operand:V8SF 1 "register_operand" "x") (parallel [(const_int 0)])) (vec_select:SF (match_dup 1) (parallel [(const_int 1)])))