[committed] amdgcn: Add 64-bit vector not

Message ID a7baea35-ea9d-5f11-520e-009c8da3735d@codesourcery.com
State New

Commit Message

Andrew Stubbs April 4, 2023, 9:57 a.m. UTC
I've committed this patch to add a missing vector operator on amdgcn.

The architecture doesn't have a 64-bit vector not instruction, so we
didn't have an insn for it. The vectorizer coped badly with the missing
pattern, and the v64df_pow function ended up with a 2MB stack frame.
This is a problem when you typically have over 3000 threads and only
want to allocate 32k of stack space each!
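
To put rough numbers on that, here is a minimal C sketch of the arithmetic. It assumes exactly 3000 threads (the text only says "over 3000"), binary units (1k = 1024 bytes), and that every thread would need room for the full 2MB frame. The names are hypothetical and are not taken from GCC or the GCN runtime.

```c
#include <stdint.h>

/* Assumed figures, rounded from the description above.  */
#define THREADS        UINT64_C (3000)
#define FRAME_BYTES    (UINT64_C (2) * 1024 * 1024)  /* 2MB frame */
#define BUDGET_BYTES   (UINT64_C (32) * 1024)        /* 32k per thread */

/* Total stack needed if every thread gets the same allocation.  */
static uint64_t
total_stack_bytes (uint64_t threads, uint64_t bytes_per_thread)
{
  return threads * bytes_per_thread;
}

/* How many times larger the frame is than the per-thread budget.  */
static uint64_t
frame_to_budget_ratio (void)
{
  return FRAME_BYTES / BUDGET_BYTES;
}
```

Under these assumptions the 32k budget adds up to roughly 94MB in total, while a 2MB frame per thread would need close to 6GB, a factor of 64 more.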

Andrew
amdgcn: Add 64-bit vector not

gcc/ChangeLog:

	* config/gcn/gcn-valu.md (one_cmpl<mode>2<exec>): New.

Patch

diff --git a/gcc/config/gcn/gcn-valu.md b/gcc/config/gcn/gcn-valu.md
index 44d107145db..c0b43fcfb64 100644
--- a/gcc/config/gcn/gcn-valu.md
+++ b/gcc/config/gcn/gcn-valu.md
@@ -2791,6 +2791,23 @@  (define_expand "neg<mode>2"
     DONE;
   })
 
+(define_insn_and_split "one_cmpl<mode>2<exec>"
+  [(set (match_operand:V_DI 0 "register_operand"  "=   v")
+        (not:V_DI
+          (match_operand:V_DI 1 "gcn_alu_operand" "vSvDB")))]
+  ""
+  "#"
+  "reload_completed"
+  [(set (match_dup 3) (not:<VnSI> (match_dup 5)))
+   (set (match_dup 4) (not:<VnSI> (match_dup 6)))]
+  {
+    operands[3] = gcn_operand_part (<VnDI>mode, operands[0], 0);
+    operands[4] = gcn_operand_part (<VnDI>mode, operands[0], 1);
+    operands[5] = gcn_operand_part (<VnDI>mode, operands[1], 0);
+    operands[6] = gcn_operand_part (<VnDI>mode, operands[1], 1);
+  }
+  [(set_attr "type" "mult")])
+
 ;; }}}
 ;; {{{ FP binops - special cases
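
For readers unfamiliar with the split in the pattern above, its effect on a single vector lane can be modelled in plain C. This is only an illustrative sketch, not GCC code: the function names are hypothetical, and it assumes part 0 of an operand is the low 32 bits and part 1 the high 32 bits. For a bitwise not the order of the halves does not affect the result.

```c
#include <stdint.h>

/* Hypothetical model of one lane: a 64-bit not carried out as two
   independent 32-bit nots, one per operand part.  */
static uint64_t
not64_via_two_not32 (uint64_t x)
{
  uint32_t lo = (uint32_t) x;          /* part 0 (assumed low half) */
  uint32_t hi = (uint32_t) (x >> 32);  /* part 1 (assumed high half) */

  lo = ~lo;  /* first 32-bit not */
  hi = ~hi;  /* second 32-bit not */

  /* Recombine the two halves into the 64-bit result.  */
  return ((uint64_t) hi << 32) | lo;
}

/* Apply the same operation to each lane of a vector.  */
static void
vec_not64 (uint64_t *dst, const uint64_t *src, int lanes)
{
  for (int i = 0; i < lanes; i++)
    dst[i] = not64_via_two_not32 (src[i]);
}
```

Because the two halves never interact, no carry or cross-half fixup is needed, which is why two 32-bit operations are enough.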