diff mbox

[i386] recognize haddpd

Message ID alpine.DEB.2.02.1209021730070.12824@stedding.saclay.inria.fr
State New
Headers show

Commit Message

Marc Glisse Sept. 2, 2012, 4:20 p.m. UTC
Hello,

this patch passes bootstrap+testsuite. It is probably wrong in many ways, 
but I don't know enough to do more without some advice.

The goal is to recognize that v[0]+v[1] can be computed with haddpd. With 
the patch, v[0]-v[1] becomes hsubpd and v[1]+v[0] becomes haddpd. Also, 
thanks to it, {v[0]-v[1], w[0]-w[1]} is now recognized as a single hsubpd.

1) Is a define_insn the right tool?
2) {v[0]-v[1], v[0]-v[1]} is not recognized as a hsubpd because 
vec_duplicate doesn't match vec_concat. Do we really need to duplicate 
(no pun intended) the pattern?
3) v[0]+v[1] is not recognized. Some pass changed their order, and nothing 
tries the reverse order. I can see 3 ways: canonicalize the order 
at some point, let combine try both orders for commutative operators or 
make the patterns more flexible (I don't know how many would need changing).
4) I don't understand the set_attr part. I copied it from the haddpd 
define_insn, and removed (set_attr "type" "sseadd") because it crashed the 
compiler. isa and prefix make sense and they match the alternatives, but I 
am not sure about "mode" (removing it still works IIRC).


2012-09-02  Marc Glisse  <marc.glisse@inria.fr>

gcc/
 	* config/i386/sse.md (*sse3_h<plusminus_insn>v2df3_low): New.

gcc/testsuite/
 	* gcc.target/i386/pr54400.c: New testcase.

Comments

Marc Glisse Sept. 11, 2012, 11:29 a.m. UTC | #1
Hello,

any advice?
http://gcc.gnu.org/ml/gcc-patches/2012-09/msg00044.html


On Sun, 2 Sep 2012, Marc Glisse wrote:

> Hello,
>
> this patch passes bootstrap+testsuite. It is probably wrong in many ways, but 
> I don't know enough to do more without some advice.
>
> The goal is to recognize that v[0]+v[1] can be computed with haddpd. With the 
> patch, v[0]-v[1] becomes hsubpd and v[1]+v[0] becomes haddpd. Also, thanks to 
> it, {v[0]-v[1], w[0]-w[1]} is now recognized as a single hsubpd.
>
> 1) Is a define_insn the right tool?
> 2) {v[0]-v[1], v[0]-v[1]} is not recognized as a hsubpd because vec_duplicate 
> doesn't match vec_concat. Do we really need to duplicate (no pun intended) 
> the pattern?
> 3) v[0]+v[1] is not recognized. Some pass changed their order, and nothing 
> tries the reverse order. I can see 3 ways: canonicalize the order at some 
> point, let combine try both orders for commutative operators or make the 
> patterns more flexible (I don't know how many would need changing).
> 4) I don't understand the set_attr part. I copied it from the haddpd 
> define_insn, and removed (set_attr "type" "sseadd") because it crashed the 
> compiler. isa and prefix make sense and they match the alternatives, but I am 
> not sure about "mode" (removing it still works IIRC).
>
>
> 2012-09-02  Marc Glisse  <marc.glisse@inria.fr>
>
> gcc/
> 	* config/i386/sse.md (*sse3_h<plusminus_insn>v2df3_low): New.
>
> gcc/testsuite/
> 	* gcc.target/i386/pr54400.c: New testcase.
Marc Glisse Sept. 26, 2012, 3:16 p.m. UTC | #2
Adding an x86 maintainer in Cc:

On Tue, 11 Sep 2012, Marc Glisse wrote:

> Hello,
>
> any advice?
> http://gcc.gnu.org/ml/gcc-patches/2012-09/msg00044.html
>
>
> On Sun, 2 Sep 2012, Marc Glisse wrote:
>
>> Hello,
>> 
>> this patch passes bootstrap+testsuite. It is probably wrong in many ways, 
>> but I don't know enough to do more without some advice.
>> 
>> The goal is to recognize that v[0]+v[1] can be computed with haddpd. With 
>> the patch, v[0]-v[1] becomes hsubpd and v[1]+v[0] becomes haddpd. Also, 
>> thanks to it, {v[0]-v[1], w[0]-w[1]} is now recognized as a single hsubpd.
>> 
>> 1) Is a define_insn the right tool?
>> 2) {v[0]-v[1], v[0]-v[1]} is not recognized as a hsubpd because 
>> vec_duplicate doesn't match vec_concat. Do we really need to duplicate (no 
>> pun intended) the pattern?
>> 3) v[0]+v[1] is not recognized. Some pass changed their order, and nothing 
>> tries the reverse order. I can see 3 ways: canonicalize the order at some 
>> point, let combine try both orders for commutative operators or make the 
>> patterns more flexible (I don't know how many would need changing).
>> 4) I don't understand the set_attr part. I copied it from the haddpd 
>> define_insn, and removed (set_attr "type" "sseadd") because it crashed the 
>> compiler. isa and prefix make sense and they match the alternatives, but I 
>> am not sure about "mode" (removing it still works IIRC).
>> 
>> 
>> 2012-09-02  Marc Glisse  <marc.glisse@inria.fr>
>> 
>> gcc/
>> 	* config/i386/sse.md (*sse3_h<plusminus_insn>v2df3_low): New.
>> 
>> gcc/testsuite/
>> 	* gcc.target/i386/pr54400.c: New testcase.
Uros Bizjak Sept. 28, 2012, 8:09 a.m. UTC | #3
On Wed, Sep 26, 2012 at 5:16 PM, Marc Glisse <marc.glisse@inria.fr> wrote:
> Adding an x86 maintainer in Cc:

>>> this patch passes bootstrap+testsuite. It is probably wrong in many ways,
>>> but I don't know enough to do more without some advice.
>>>
>>> The goal is to recognize that v[0]+v[1] can be computed with haddpd. With
>>> the patch, v[0]-v[1] becomes hsubpd and v[1]+v[0] becomes haddpd. Also,
>>> thanks to it, {v[0]-v[1], w[0]-w[1]} is now recognized as a single hsubpd.
>>>
>>> 1) Is a define_insn the right tool?

Yes, combine pass looks at insn patterns to determine which insn is correct.

>>> 2) {v[0]-v[1], v[0]-v[1]} is not recognized as a hsubpd because
>>> vec_duplicate doesn't match vec_concat. Do we really need to duplicate (no
>>> pun intended) the pattern?

You can add this transformation to simplify-rtx.c. Probably vec_concat
with two equal operands can be canonicalized as vec_duplicate.

>>> 3) v[0]+v[1] is not recognized. Some pass changed their order, and
>>> nothing tries the reverse order. I can see 3 ways: canonicalize the order at
>>> some point, let combine try both orders for commutative operators or make
>>> the patterns more flexible (I don't know how many would need changing).

In your case, split the macroized pattern to plus and minus. Plus
pattern should use "const_0_to_1_operand" predicate, and in the insn
constraint, you just check that operands are different.

>>> 4) I don't understand the set_attr part. I copied it from the haddpd
>>> define_insn, and removed (set_attr "type" "sseadd") because it crashed the
>>> compiler. isa and prefix make sense and they match the alternatives, but I
>>> am not sure about "mode" (removing it still works IIRC).

Somewhere deep inside attribute calculations, it is assumed that
sseadd has two operands. Your pattern doesn't satisfy this assumption.
You have to add sseadd1, please see for example sseiadd/sseiadd1 pair.

Uros.
diff mbox

Patch

Index: testsuite/gcc.target/i386/pr54400.c
===================================================================
--- testsuite/gcc.target/i386/pr54400.c	(revision 0)
+++ testsuite/gcc.target/i386/pr54400.c	(revision 0)
@@ -0,0 +1,11 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse3 -mfpmath=sse" } */
+
+#include <x86intrin.h>
+
+double f (__m128d p)
+{
+  return p[0] - p[1];
+}
+
+/* { dg-final { scan-assembler "hsubpd" } } */

Property changes on: testsuite/gcc.target/i386/pr54400.c
___________________________________________________________________
Added: svn:keywords
   + Author Date Id Revision URL
Added: svn:eol-style
   + native

Index: config/i386/sse.md
===================================================================
--- config/i386/sse.md	(revision 190861)
+++ config/i386/sse.md	(working copy)
@@ -1231,20 +1231,37 @@ 
 	    (vec_select:DF (match_dup 2) (parallel [(const_int 1)])))))]
   "TARGET_SSE3"
   "@
    h<plusminus_mnemonic>pd\t{%2, %0|%0, %2}
    vh<plusminus_mnemonic>pd\t{%2, %1, %0|%0, %1, %2}"
   [(set_attr "isa" "noavx,avx")
    (set_attr "type" "sseadd")
    (set_attr "prefix" "orig,vex")
    (set_attr "mode" "V2DF")])
 
+(define_insn "*sse3_h<plusminus_insn>v2df3_low"
+  [(set (match_operand:DF 0 "register_operand" "=x,x")
+	(plusminus:DF
+	  (vec_select:DF
+	    (match_operand:V2DF 1 "register_operand" "0,x")
+	    (parallel [(const_int 0)]))
+	  (vec_select:DF
+	    (match_dup 1)
+	    (parallel [(const_int 1)]))))]
+  "TARGET_SSE3"
+  "@
+   h<plusminus_mnemonic>pd\t{%0, %0|%0, %0}
+   vh<plusminus_mnemonic>pd\t{%1, %1, %0|%0, %1, %1}"
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "prefix" "orig,vex")
+   (set_attr "mode" "V2DF")])
+
 (define_insn "avx_h<plusminus_insn>v8sf3"
   [(set (match_operand:V8SF 0 "register_operand" "=x")
 	(vec_concat:V8SF
 	  (vec_concat:V4SF
 	    (vec_concat:V2SF
 	      (plusminus:SF
 		(vec_select:SF
 		  (match_operand:V8SF 1 "register_operand" "x")
 		  (parallel [(const_int 0)]))
 		(vec_select:SF (match_dup 1) (parallel [(const_int 1)])))