@@ -3078,10 +3078,23 @@ start_over:
if (direct_internal_fn_supported_p (IFN_SELECT_VL, iv_type,
OPTIMIZE_FOR_SPEED)
&& LOOP_VINFO_LENS (loop_vinfo).length () == 1
- && LOOP_VINFO_LENS (loop_vinfo)[0].factor == 1 && !slp
+ && LOOP_VINFO_LENS (loop_vinfo)[0].factor == 1
&& (!LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
|| !LOOP_VINFO_VECT_FACTOR (loop_vinfo).is_constant ()))
LOOP_VINFO_USING_SELECT_VL_P (loop_vinfo) = true;
+
+ /* If any of the SLP instances cover more than a single lane
+ we cannot use .SELECT_VL at the moment, even if the number
+ of lanes is uniform throughout the SLP graph. */
+ if (LOOP_VINFO_USING_SELECT_VL_P (loop_vinfo))
+ for (slp_instance inst : LOOP_VINFO_SLP_INSTANCES (loop_vinfo))
+ if (SLP_TREE_LANES (SLP_INSTANCE_TREE (inst)) != 1
+ && !(SLP_INSTANCE_KIND (inst) == slp_inst_kind_store
+ && SLP_INSTANCE_TREE (inst)->ldst_lanes))
+ {
+ LOOP_VINFO_USING_SELECT_VL_P (loop_vinfo) = false;
+ break;
+ }
}

/* Decide whether this loop_vinfo should use partial vectors or peeling,
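For context, here is a minimal, self-contained C sketch of the strip-mining
that .SELECT_VL enables; select_vl below is a hypothetical stand-in for the
internal function (a real target such as RVV may return any value in [1, vf]
for a full group, not necessarily the minimum), and none of the names are GCC
API.  It illustrates why the check added above bails out for multi-lane SLP
instances: each scalar iteration of such an instance occupies SLP_TREE_LANES
vector lanes, so the controlling length would have to stay a multiple of the
lane count, which .SELECT_VL does not guarantee.  Store instances using
load/store-lanes are exempt since their lanes are emitted as a single unit.

#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for .SELECT_VL: the target chooses how many
   scalar iterations the next vector iteration covers, anywhere in
   [1, vf] when more than vf elements remain.  */
static size_t
select_vl (size_t remaining, size_t vf)
{
  return remaining < vf ? remaining : vf;
}

int
main (void)
{
  int b[10] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 }, a[10];
  const size_t n = 10, vf = 4;

  for (size_t i = 0; i < n;)
    {
      size_t len = select_vl (n - i, vf);
      /* Models one length-controlled vector statement; with a
         single-lane SLP graph any len in [1, vf] is fine, because one
         vector lane corresponds to exactly one scalar iteration.  */
      for (size_t j = 0; j < len; j++)
        a[i + j] = 2 * b[i + j];
      i += len;  /* the data-pointer bump is likewise len-scaled */
    }

  for (size_t i = 0; i < n; i++)
    printf ("%d ", a[i]);
  printf ("\n");
}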
@@ -8744,8 +8744,9 @@ vectorizable_store (vec_info *vinfo,
aggr_type = build_array_type_nelts (elem_type, group_size * nunits);
else
aggr_type = vectype;
- bump = vect_get_data_ptr_increment (vinfo, gsi, dr_info, aggr_type,
- memory_access_type, loop_lens);
+ if (!costing_p)
+ bump = vect_get_data_ptr_increment (vinfo, gsi, dr_info, aggr_type,
+ memory_access_type, loop_lens);
}

if (mask && !costing_p)
@@ -10820,8 +10821,9 @@ vectorizable_load (vec_info *vinfo,
aggr_type = build_array_type_nelts (elem_type, group_size * nunits);
else
aggr_type = vectype;
- bump = vect_get_data_ptr_increment (vinfo, gsi, dr_info, aggr_type,
- memory_access_type, loop_lens);
+ if (!costing_p)
+ bump = vect_get_data_ptr_increment (vinfo, gsi, dr_info, aggr_type,
+ memory_access_type, loop_lens);
}

auto_vec<tree> vec_offsets;
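The last two hunks add the same guard to the store and the load path.
vectorizable_store and vectorizable_load run once during analysis for costing,
where gsi is null and nothing may be emitted, and again during the transform;
on the .SELECT_VL path vect_get_data_ptr_increment needs gsi to insert the
statement computing the length-scaled bump, so the call has to be deferred to
the transform.  Below is a minimal sketch of that analyze/transform split,
with hypothetical names rather than the GCC API:

#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

struct stmt_iter { int pos; };  /* stand-in for gimple_stmt_iterator */

/* Stand-in for the variable-length case of vect_get_data_ptr_increment:
   it emits a statement computing bump = len * step, so it requires a
   valid insertion point.  */
static int
get_bump (struct stmt_iter *gsi, int step)
{
  assert (gsi != NULL);  /* would trip if reached while costing */
  printf ("emit at %d: bump = len * %d\n", gsi->pos, step);
  return step;
}

static void
handle_store (bool costing_p, struct stmt_iter *gsi)
{
  int bump = 0;
  if (!costing_p)  /* the guard the patch adds */
    bump = get_bump (gsi, 4);
  (void) bump;     /* code generation would consume bump here */
}

int
main (void)
{
  handle_store (true, NULL);    /* analysis pass: cost only, emit nothing */
  struct stmt_iter it = { 7 };
  handle_store (false, &it);    /* transform pass: emit the bump statement */
}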