diff mbox series

[2/4] middle-end: Fix VEC_PERM_EXPR lowering since relaxation of vector sizes

Message ID Zwz4u2JYLRxlR8hk@arm.com
State New
Headers show
Series [1/4] middle-end: support multi-step zero-extends using VEC_PERM_EXPR | expand

Commit Message

Tamar Christina Oct. 14, 2024, 10:55 a.m. UTC
Hi All,

In GCC 14 VEC_PERM_EXPR was relaxed to be able to permute to a 2x larger vector
than the size of the input vectors.  However various passes and transformations
were not updated to account for this.

I have patches in these area that I will be upstreaming with individual patches
that expose them.

This one is that vectlower tries to lower based on the size of the input vectors
rather than the size of the output.  As a consequence it creates an invalid
vector of half the size.

Luckily we ICE because the resulting nunits doesn't match the vector size.

Tests in the AArch64 patch test for this behaviour.

Bootstrapped Regtested on aarch64-none-linux-gnu, arm-none-linux-gnueabihf,
x86_64-pc-linux-gnu -m32, -m64 and no issues.

Ok for master?

Thanks,
Tamar

gcc/ChangeLog:

	* tree-vect-generic.cc (lower_vec_perm): Use output vector size instead
	of input vector when determining output nunits.

---




--

Comments

Richard Biener Oct. 15, 2024, 12:21 p.m. UTC | #1
On Mon, 14 Oct 2024, Tamar Christina wrote:

> Hi All,
> 
> In GCC 14 VEC_PERM_EXPR was relaxed to be able to permute to a 2x larger vector
> than the size of the input vectors.  However various passes and transformations
> were not updated to account for this.
> 
> I have patches in these area that I will be upstreaming with individual patches
> that expose them.
> 
> This one is that vectlower tries to lower based on the size of the input vectors
> rather than the size of the output.  As a consequence it creates an invalid
> vector of half the size.
> 
> Luckily we ICE because the resulting nunits doesn't match the vector size.
> 
> Tests in the AArch64 patch test for this behaviour.
> 
> Bootstrapped Regtested on aarch64-none-linux-gnu, arm-none-linux-gnueabihf,
> x86_64-pc-linux-gnu -m32, -m64 and no issues.
> 
> Ok for master?

OK.

Do you have a testcase btw?

> Thanks,
> Tamar
> 
> gcc/ChangeLog:
> 
> 	* tree-vect-generic.cc (lower_vec_perm): Use output vector size instead
> 	of input vector when determining output nunits.
> 
> ---
> diff --git a/gcc/tree-vect-generic.cc b/gcc/tree-vect-generic.cc
> index 3041fb8fcf235ba86f37ef73aa089330a2fd0b77..f86f7eabb255fde50b30fa3b85db367df930f321 100644
> --- a/gcc/tree-vect-generic.cc
> +++ b/gcc/tree-vect-generic.cc
> @@ -1500,6 +1500,7 @@ lower_vec_perm (gimple_stmt_iterator *gsi)
>    tree mask = gimple_assign_rhs3 (stmt);
>    tree vec0 = gimple_assign_rhs1 (stmt);
>    tree vec1 = gimple_assign_rhs2 (stmt);
> +  tree res_vect_type = TREE_TYPE (gimple_assign_lhs (stmt));
>    tree vect_type = TREE_TYPE (vec0);
>    tree mask_type = TREE_TYPE (mask);
>    tree vect_elt_type = TREE_TYPE (vect_type);
> @@ -1512,7 +1513,7 @@ lower_vec_perm (gimple_stmt_iterator *gsi)
>    location_t loc = gimple_location (gsi_stmt (*gsi));
>    unsigned i;
>  
> -  if (!TYPE_VECTOR_SUBPARTS (vect_type).is_constant (&elements))
> +  if (!TYPE_VECTOR_SUBPARTS (res_vect_type).is_constant (&elements))
>      return;
>  
>    if (TREE_CODE (mask) == SSA_NAME)
> @@ -1672,9 +1673,9 @@ lower_vec_perm (gimple_stmt_iterator *gsi)
>      }
>  
>    if (constant_p)
> -    constr = build_vector_from_ctor (vect_type, v);
> +    constr = build_vector_from_ctor (res_vect_type, v);
>    else
> -    constr = build_constructor (vect_type, v);
> +    constr = build_constructor (res_vect_type, v);
>    gimple_assign_set_rhs_from_tree (gsi, constr);
>    update_stmt (gsi_stmt (*gsi));
>  }
> 
> 
> 
> 
>
Tamar Christina Oct. 15, 2024, 12:45 p.m. UTC | #2
> -----Original Message-----
> From: Richard Biener <rguenther@suse.de>
> Sent: Tuesday, October 15, 2024 1:22 PM
> To: Tamar Christina <Tamar.Christina@arm.com>
> Cc: gcc-patches@gcc.gnu.org; nd <nd@arm.com>
> Subject: Re: [PATCH 2/4]middle-end: Fix VEC_PERM_EXPR lowering since
> relaxation of vector sizes
> 
> On Mon, 14 Oct 2024, Tamar Christina wrote:
> 
> > Hi All,
> >
> > In GCC 14 VEC_PERM_EXPR was relaxed to be able to permute to a 2x larger
> vector
> > than the size of the input vectors.  However various passes and transformations
> > were not updated to account for this.
> >
> > I have patches in these area that I will be upstreaming with individual patches
> > that expose them.
> >
> > This one is that vectlower tries to lower based on the size of the input vectors
> > rather than the size of the output.  As a consequence it creates an invalid
> > vector of half the size.
> >
> > Luckily we ICE because the resulting nunits doesn't match the vector size.
> >
> > Tests in the AArch64 patch test for this behaviour.
> >
> > Bootstrapped Regtested on aarch64-none-linux-gnu, arm-none-linux-gnueabihf,
> > x86_64-pc-linux-gnu -m32, -m64 and no issues.
> >
> > Ok for master?
> 
> OK.
> 
> Do you have a testcase btw?

It was relying on the auto-vect of the zero extends as TBLs.
But I'll create one using the gimple front-end.

Today __shufflevector zero pads permutes so you can't generate
this yet.. (in my next patch series).

I'll write a gimple one and commit with this then.

Thanks,
Tamar
> 
> > Thanks,
> > Tamar
> >
> > gcc/ChangeLog:
> >
> > 	* tree-vect-generic.cc (lower_vec_perm): Use output vector size instead
> > 	of input vector when determining output nunits.
> >
> > ---
> > diff --git a/gcc/tree-vect-generic.cc b/gcc/tree-vect-generic.cc
> > index
> 3041fb8fcf235ba86f37ef73aa089330a2fd0b77..f86f7eabb255fde50b30fa3b85
> db367df930f321 100644
> > --- a/gcc/tree-vect-generic.cc
> > +++ b/gcc/tree-vect-generic.cc
> > @@ -1500,6 +1500,7 @@ lower_vec_perm (gimple_stmt_iterator *gsi)
> >    tree mask = gimple_assign_rhs3 (stmt);
> >    tree vec0 = gimple_assign_rhs1 (stmt);
> >    tree vec1 = gimple_assign_rhs2 (stmt);
> > +  tree res_vect_type = TREE_TYPE (gimple_assign_lhs (stmt));
> >    tree vect_type = TREE_TYPE (vec0);
> >    tree mask_type = TREE_TYPE (mask);
> >    tree vect_elt_type = TREE_TYPE (vect_type);
> > @@ -1512,7 +1513,7 @@ lower_vec_perm (gimple_stmt_iterator *gsi)
> >    location_t loc = gimple_location (gsi_stmt (*gsi));
> >    unsigned i;
> >
> > -  if (!TYPE_VECTOR_SUBPARTS (vect_type).is_constant (&elements))
> > +  if (!TYPE_VECTOR_SUBPARTS (res_vect_type).is_constant (&elements))
> >      return;
> >
> >    if (TREE_CODE (mask) == SSA_NAME)
> > @@ -1672,9 +1673,9 @@ lower_vec_perm (gimple_stmt_iterator *gsi)
> >      }
> >
> >    if (constant_p)
> > -    constr = build_vector_from_ctor (vect_type, v);
> > +    constr = build_vector_from_ctor (res_vect_type, v);
> >    else
> > -    constr = build_constructor (vect_type, v);
> > +    constr = build_constructor (res_vect_type, v);
> >    gimple_assign_set_rhs_from_tree (gsi, constr);
> >    update_stmt (gsi_stmt (*gsi));
> >  }
> >
> >
> >
> >
> >
> 
> --
> Richard Biener <rguenther@suse.de>
> SUSE Software Solutions Germany GmbH,
> Frankenstrasse 146, 90461 Nuernberg, Germany;
> GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
diff mbox series

Patch

diff --git a/gcc/tree-vect-generic.cc b/gcc/tree-vect-generic.cc
index 3041fb8fcf235ba86f37ef73aa089330a2fd0b77..f86f7eabb255fde50b30fa3b85db367df930f321 100644
--- a/gcc/tree-vect-generic.cc
+++ b/gcc/tree-vect-generic.cc
@@ -1500,6 +1500,7 @@  lower_vec_perm (gimple_stmt_iterator *gsi)
   tree mask = gimple_assign_rhs3 (stmt);
   tree vec0 = gimple_assign_rhs1 (stmt);
   tree vec1 = gimple_assign_rhs2 (stmt);
+  tree res_vect_type = TREE_TYPE (gimple_assign_lhs (stmt));
   tree vect_type = TREE_TYPE (vec0);
   tree mask_type = TREE_TYPE (mask);
   tree vect_elt_type = TREE_TYPE (vect_type);
@@ -1512,7 +1513,7 @@  lower_vec_perm (gimple_stmt_iterator *gsi)
   location_t loc = gimple_location (gsi_stmt (*gsi));
   unsigned i;
 
-  if (!TYPE_VECTOR_SUBPARTS (vect_type).is_constant (&elements))
+  if (!TYPE_VECTOR_SUBPARTS (res_vect_type).is_constant (&elements))
     return;
 
   if (TREE_CODE (mask) == SSA_NAME)
@@ -1672,9 +1673,9 @@  lower_vec_perm (gimple_stmt_iterator *gsi)
     }
 
   if (constant_p)
-    constr = build_vector_from_ctor (vect_type, v);
+    constr = build_vector_from_ctor (res_vect_type, v);
   else
-    constr = build_constructor (vect_type, v);
+    constr = build_constructor (res_vect_type, v);
   gimple_assign_set_rhs_from_tree (gsi, constr);
   update_stmt (gsi_stmt (*gsi));
 }