diff mbox series

[v2] tree-optimization/115868 - ICE with .MASK_CALL in simdclone

Message ID 20240712065607.22D1C13686@imap1.dmz-prg2.suse.org
State New
Headers show
Series [v2] tree-optimization/115868 - ICE with .MASK_CALL in simdclone | expand

Commit Message

Richard Biener July 12, 2024, 6:56 a.m. UTC
The following adjusts mask recording which didn't take into account
that we can merge call arguments from two vectors like

  _50 = {vect_d_1.253_41, vect_d_1.254_43};
  _51 = VIEW_CONVERT_EXPR<unsigned char>(mask__19.257_49);
  _52 = (unsigned int) _51;
  _53 = _Z3bazd.simdclone.7 (_50, _52);
  _54 = BIT_FIELD_REF <_53, 256, 0>;
  _55 = BIT_FIELD_REF <_53, 256, 256>;

The testcase g++.dg/vect/pr68762-2.cc exercises this on x86_64 with
partial vector usage enabled and AVX512 support.

Bootstrap & regtest running on x86_64-unknown-linux-gnu.

	PR tree-optimization/115868
	* tree-vect-stmts.cc (vectorizable_simd_clone_call): Correctly
	compute the number of mask copies required for vect_record_loop_mask.
---
 gcc/tree-vect-stmts.cc | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)
diff mbox series

Patch

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 2e4d500d1f2..8530a98e6d6 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -4349,9 +4349,14 @@  vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info,
 	    case SIMD_CLONE_ARG_TYPE_MASK:
 	      if (loop_vinfo
 		  && LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo))
-		vect_record_loop_mask (loop_vinfo,
-				       &LOOP_VINFO_MASKS (loop_vinfo),
-				       ncopies, vectype, op);
+		{
+		  unsigned nmasks
+		    = exact_div (ncopies * bestn->simdclone->simdlen,
+				 TYPE_VECTOR_SUBPARTS (vectype)).to_constant ();
+		  vect_record_loop_mask (loop_vinfo,
+					 &LOOP_VINFO_MASKS (loop_vinfo),
+					 nmasks, vectype, op);
+		}
 
 	      break;
 	    }