vect: Try smaller vector size when SLP split fails

Message ID aa1b1f0d-95c9-ce18-352d-582799302e2d@codesourcery.com
State New

Commit Message

Andrew Stubbs Aug. 5, 2020, 1:29 p.m. UTC
This patch improves SLP performance in combination with some patches I 
have in development to add multiple vector sizes to amdgcn.

The problem is that amdgcn's preferred vector size has 64 lanes, and SLP
does not support lane masking.  My patches will add smaller vector sizes
(32, 16, 8, 4, 2), which make the lane masking implicit, but SLP still
doesn't use them; it simply rejects the first size it sees and gives up.

This patch detects the rejection early and looks to see if there is a 
smaller, more suitable vector size.  The result is many more successful 
SLP testcases.
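
As a rough standalone illustration of that fallback (not the GCC code
itself), the sketch below models the decision on plain lane counts.  The
names pick_lanes and choose_nunits are hypothetical, and the "largest
supported size that fits" rule is an assumed simplification of what
get_vectype_for_scalar_type does when given a group size.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: largest supported lane count that does not exceed
// LIMIT, or 0 if the (assumed) target has no such vector size.
static unsigned pick_lanes(const std::vector<unsigned>& supported, unsigned limit)
{
    unsigned best = 0;
    for (unsigned lanes : supported)
        if (lanes <= limit && lanes > best)
            best = lanes;
    return best;
}

// Model of the patched check: MATCHES[i] says whether scalar statement i
// matched the SLP pattern; PREFERRED is the target's default lane count.
// Returns the lane count to continue the analysis with.
static unsigned choose_nunits(const std::vector<bool>& matches,
                              unsigned preferred,
                              const std::vector<unsigned>& supported)
{
    assert(preferred > 0);
    const unsigned group_size = static_cast<unsigned>(matches.size());

    // Find the first statement that failed to match.
    unsigned i = 0;
    while (i < group_size && matches[i])
        ++i;

    unsigned nunits = preferred;
    // Only a partial match that is shorter than one full vector triggers
    // the retry with a smaller size.
    if (i > 1 && i < group_size && i < nunits) {
        const unsigned smaller = pick_lanes(supported, i);
        if (smaller != 0)
            nunits = smaller;
    }
    return nunits;
}
```

With a 64-lane preferred size and a group whose first 8 statements match,
this model would retry with 8 lanes, so the existing split at the vector
boundary can then apply.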

OK to commit? (I have an x86_64 bootstrap and test in progress.)

Andrew

Comments

Richard Biener Aug. 5, 2020, 2:22 p.m. UTC | #1
On Wed, Aug 5, 2020 at 3:30 PM Andrew Stubbs <ams@codesourcery.com> wrote:
>
> This patch improves SLP performance in combination with some patches I
> have in development to add multiple vector sizes to amdgcn.
>
> The problem is that amdgcn's preferred vector size has 64 lanes, and SLP
> does not support lane masking.  My patches will add smaller vector sizes
> (32, 16, 8, 4, 2), which make the lane masking implicit, but SLP still
> doesn't use them; it simply rejects the first size it sees and gives up.
>
> This patch detects the rejection early and looks to see if there is a
> smaller, more suitable vector size.  The result is many more successful
> SLP testcases.
>
> OK to commit? (I have an x86_64 bootstrap and test in progress.)

Is this about basic-block SLP?  There it should eventually split groups.
For loop-based SLP, did you specify the autovectorize_vector_modes
hook?  Otherwise the vectorizer only tries a single size.

Richard.

> Andrew

Patch

vect: Try smaller vector size when SLP split fails

If the preferred vector size is larger than can be used then try again with
a smaller size.  This allows SLP to work more effectively on targets with very
large vectors.

gcc/ChangeLog:

	* tree-vect-slp.c (vect_analyze_slp_instance): Reduce vector size if
	the default mode is too large.

diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 72192b5a813..95518a263c7 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -2367,6 +2367,16 @@  vect_analyze_slp_instance (vec_info *vinfo,
       for (i = 0; i < group_size; i++)
 	if (!matches[i]) break;
 
+      if (i > 1 && i < group_size && i < const_nunits && scalar_type)
+	{
+	  tree vec = get_vectype_for_scalar_type (vinfo, scalar_type, i);
+	  if (vec)
+	    {
+	      nunits = TYPE_VECTOR_SUBPARTS (vec);
+	      const_nunits = nunits.to_constant ();
+	    }
+	}
+
       if (i >= const_nunits && i < group_size)
 	{
 	  /* Split into two groups at the first vector boundary before i.  */