tree-optimization/116791 - Elementwise SLP vectorization

Message ID	20240923095228.4F53D3858C39@sourceware.org
State	New
Headers	Return-Path: <gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org>
	DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 640863858D20
	Date: Mon, 23 Sep 2024 11:52:04 +0200 (CEST)
	From: Richard Biener <rguenther@suse.de>
	To: gcc-patches@gcc.gnu.org
	Subject: [PATCH] tree-optimization/116791 - Elementwise SLP vectorization
	MIME-Version: 1.0
	Content-Type: text/plain; charset=US-ASCII
	Precedence: list
	Errors-To: gcc-patches-bounces~incoming=patchwork.ozlabs.org@gcc.gnu.org
	Message-Id: <20240923095228.4F53D3858C39@sourceware.org>
Series	tree-optimization/116791 - Elementwise SLP vectorization

Commit Message

Richard Biener Sept. 23, 2024, 9:52 a.m. UTC

The following restricts elementwise SLP vectorization to the
single-lane case, which is the case I enabled it for in order to avoid
regressions with non-SLP.  The PR shows that multi-lane SLP loads with
elementwise accesses require work; I'll open a new bug to track this
for the future.
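
As a rough illustration of the decision described above, here is a
minimal standalone sketch in C.  It is not the GCC code: choose_access,
enum access_kind and the plain unsigned parameters are hypothetical
stand-ins for get_group_load_store_type, the VMAT_* kinds and
SLP_TREE_LANES / TYPE_VECTOR_SUBPARTS, and it models only the two
conditions visible in the hunk below (single_element_p and the group
being larger than the vector), not the rest of the enclosing condition.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the vectorizer's memory access kinds.
   Only the elementwise kind corresponds to a name used in the patch
   (VMAT_ELEMENTWISE).  */
enum access_kind
{
  ACCESS_UNDECIDED,
  ACCESS_ELEMENTWISE
};

/* Toy model of the patched branch.  LANES plays the role of
   SLP_TREE_LANES (slp_node) and NUNITS the role of
   TYPE_VECTOR_SUBPARTS (vectype).  Returns false for the hard
   failure, true otherwise.  *KIND is only written when the
   elementwise fallback is chosen.  */
static bool
choose_access (unsigned lanes, unsigned group_size, unsigned nunits,
               bool single_element_p, enum access_kind *kind)
{
  if (single_element_p && group_size > nunits)
    {
      if (lanes == 1)
        {
          /* Single-lane SLP: keep the elementwise fallback.  */
          *kind = ACCESS_ELEMENTWISE;
          return true;
        }
      /* Multi-lane SLP: restore the hard failure.  */
      return false;
    }

  /* All other cases are outside the scope of this sketch.  */
  return true;
}
```

With illustrative numbers, choose_access (1, 33, 8, true, &kind)
selects the elementwise fallback, while choose_access (2, 33, 8, true,
&kind) fails and leaves kind untouched.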

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

Richard.

	PR tree-optimization/116791
	* tree-vect-stmts.cc (get_group_load_store_type): Only
	fall back to elementwise access for single-lane SLP, restore
	hard failure mode for other cases.

	* gcc.dg/vect/pr116791.c: New testcase.
---
 gcc/testsuite/gcc.dg/vect/pr116791.c | 20 ++++++++++++++++++++
 gcc/tree-vect-stmts.cc               | 23 +++++++++++++++++------
 2 files changed, 37 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/pr116791.c

diff --git a/gcc/testsuite/gcc.dg/vect/pr116791.c b/gcc/testsuite/gcc.dg/vect/pr116791.c
new file mode 100644
index 00000000000..d9700a88fcc
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr116791.c
@@ -0,0 +1,20 @@ 
+/* { dg-do compile } */
+/* { dg-additional-options "-mavx2" { target avx2 } } */
+
+struct nine_context {
+  unsigned tex_stage[8][33];
+};
+struct fvec4 {
+  float x[2];
+};
+void f(struct fvec4 *dst, struct nine_context *context)
+{
+  unsigned s;
+  for (s = 0; s < 8; ++s)
+    {
+      float *rgba = &dst[s].x[0];
+      unsigned color = context->tex_stage[s][0];
+      rgba[0] = (float)((color >> 16) & 0xFF) / 0xFF;
+      rgba[1] = (float)((color >> 8) & 0xFF) / 0xFF;
+    }
+}
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index a7032c21d66..2e85b683789 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -2192,12 +2192,23 @@  get_group_load_store_type (vec_info *vinfo, stmt_vec_info stmt_info,
 	      && single_element_p
 	      && maybe_gt (group_size, TYPE_VECTOR_SUBPARTS (vectype)))
 	    {
-	      *memory_access_type = VMAT_ELEMENTWISE;
-	      if (dump_enabled_p ())
-		dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
-				 "single-element interleaving not supported "
-				 "for not adjacent vector loads, using "
-				 "elementwise access\n");
+	      if (SLP_TREE_LANES (slp_node) == 1)
+		{
+		  *memory_access_type = VMAT_ELEMENTWISE;
+		  if (dump_enabled_p ())
+		    dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+				     "single-element interleaving not supported "
+				     "for not adjacent vector loads, using "
+				     "elementwise access\n");
+		}
+	      else
+		{
+		  if (dump_enabled_p ())
+		    dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+				     "single-element interleaving not supported "
+				     "for not adjacent vector loads\n");
+		  return false;
+		}
 	    }
 	}
     }
