diff mbox series

[6/8] vect: Add vector_mode paramater to simd_clone_usable

Message ID 4eda2924-2fe1-63ed-d6c5-2bdea8fd34d3@arm.com
State New
Headers show
Series [1/8] parloops: Copy target and optimizations when creating a function clone | expand

Commit Message

Andre Vieira (lists) Aug. 30, 2023, 9:14 a.m. UTC
This patch adds a machine_mode parameter to the TARGET_SIMD_CLONE_USABLE 
hook to enable rejecting SVE modes when the target architecture does not 
support SVE.

gcc/ChangeLog:

	* config/aarch64/aarch64.cc (aarch64_simd_clone_usable): Add mode
	parameter and use to to reject SVE modes when target architecture does
	not support SVE.
	* config/gcn/gcn.cc (gcn_simd_clone_usable): Add unused mode parameter.
	* config/i386/i386.cc (ix86_simd_clone_usable): Likewise.
	* doc/tm.texi (TARGET_SIMD_CLONE_USABLE): Document new parameter.
	* target.def (usable): Add new parameter.
	* tree-vect-stmts.cc (vectorizable_simd_clone_call): Pass vector mode
	to TARGET_SIMD_CLONE_CALL hook.

Comments

Andre Vieira (lists) Aug. 30, 2023, 9:17 a.m. UTC | #1
Forgot to CC this one to maintainers...

On 30/08/2023 10:14, Andre Vieira (lists) via Gcc-patches wrote:
> This patch adds a machine_mode parameter to the TARGET_SIMD_CLONE_USABLE 
> hook to enable rejecting SVE modes when the target architecture does not 
> support SVE.
> 
> gcc/ChangeLog:
> 
>      * config/aarch64/aarch64.cc (aarch64_simd_clone_usable): Add mode
>      parameter and use to to reject SVE modes when target architecture does
>      not support SVE.
>      * config/gcn/gcn.cc (gcn_simd_clone_usable): Add unused mode 
> parameter.
>      * config/i386/i386.cc (ix86_simd_clone_usable): Likewise.
>      * doc/tm.texi (TARGET_SIMD_CLONE_USABLE): Document new parameter.
>      * target.def (usable): Add new parameter.
>      * tree-vect-stmts.cc (vectorizable_simd_clone_call): Pass vector mode
>      to TARGET_SIMD_CLONE_CALL hook.
Richard Biener Aug. 30, 2023, 1:01 p.m. UTC | #2
On Wed, Aug 30, 2023 at 11:15 AM Andre Vieira (lists) via Gcc-patches
<gcc-patches@gcc.gnu.org> wrote:
>
> This patch adds a machine_mode parameter to the TARGET_SIMD_CLONE_USABLE
> hook to enable rejecting SVE modes when the target architecture does not
> support SVE.

How does the graph node of the SIMD clone lack this information?  That is, it
should have information on the types (and thus modes) for all formal arguments
and return values already, no?  At least the target would know how to
instantiate
it if it's not readily available at the point of use.

> gcc/ChangeLog:
>
>         * config/aarch64/aarch64.cc (aarch64_simd_clone_usable): Add mode
>         parameter and use to to reject SVE modes when target architecture does
>         not support SVE.
>         * config/gcn/gcn.cc (gcn_simd_clone_usable): Add unused mode parameter.
>         * config/i386/i386.cc (ix86_simd_clone_usable): Likewise.
>         * doc/tm.texi (TARGET_SIMD_CLONE_USABLE): Document new parameter.
>         * target.def (usable): Add new parameter.
>         * tree-vect-stmts.cc (vectorizable_simd_clone_call): Pass vector mode
>         to TARGET_SIMD_CLONE_CALL hook.
Andre Vieira (lists) Aug. 30, 2023, 3:02 p.m. UTC | #3
On 30/08/2023 14:01, Richard Biener wrote:
> On Wed, Aug 30, 2023 at 11:15 AM Andre Vieira (lists) via Gcc-patches
> <gcc-patches@gcc.gnu.org> wrote:
>>
>> This patch adds a machine_mode parameter to the TARGET_SIMD_CLONE_USABLE
>> hook to enable rejecting SVE modes when the target architecture does not
>> support SVE.
> 
> How does the graph node of the SIMD clone lack this information?  That is, it
> should have information on the types (and thus modes) for all formal arguments
> and return values already, no?  At least the target would know how to
> instantiate
> it if it's not readily available at the point of use.
> 

Yes it does, but that's the modes the simd clone itself uses, it does 
not know what vector_mode we are currently vectorizing for. Which is 
exactly why we need the vinfo's vector_mode to make sure the simd clone 
and its types are compatible with the vector mode.

In practice, to make sure that a SVE simd clones are only used in loops 
being vectorized for SVE modes. Having said that... I just realized that 
the simdlen check already takes care of that currently...

by simdlen check I mean the one that writes off simdclones that match:
         if (!constant_multiple_p (vf, n->simdclone->simdlen, &num_calls)

However, when using -msve-vector-bits this will become an issue, as the 
VF will be constant and we will match NEON simdclones.  This requires 
some further attention though given that we now also reject the use of 
SVE simdclones when using -msve-vector-bits, and I'm not entirely sure 
we should...

I'm going on holidays for 2 weeks now though, so I'll have a look at 
that scenario when I get back. Same with other feedback, didn't expect 
feedback this quickly ;) Thank you!!

Kind regards,
Andre
Richard Biener Aug. 31, 2023, 6:39 a.m. UTC | #4
On Wed, Aug 30, 2023 at 5:02 PM Andre Vieira (lists)
<andre.simoesdiasvieira@arm.com> wrote:
>
>
>
> On 30/08/2023 14:01, Richard Biener wrote:
> > On Wed, Aug 30, 2023 at 11:15 AM Andre Vieira (lists) via Gcc-patches
> > <gcc-patches@gcc.gnu.org> wrote:
> >>
> >> This patch adds a machine_mode parameter to the TARGET_SIMD_CLONE_USABLE
> >> hook to enable rejecting SVE modes when the target architecture does not
> >> support SVE.
> >
> > How does the graph node of the SIMD clone lack this information?  That is, it
> > should have information on the types (and thus modes) for all formal arguments
> > and return values already, no?  At least the target would know how to
> > instantiate
> > it if it's not readily available at the point of use.
> >
>
> Yes it does, but that's the modes the simd clone itself uses, it does
> not know what vector_mode we are currently vectorizing for. Which is
> exactly why we need the vinfo's vector_mode to make sure the simd clone
> and its types are compatible with the vector mode.
>
> In practice, to make sure that a SVE simd clones are only used in loops
> being vectorized for SVE modes. Having said that... I just realized that
> the simdlen check already takes care of that currently...
>
> by simdlen check I mean the one that writes off simdclones that match:
>          if (!constant_multiple_p (vf, n->simdclone->simdlen, &num_calls)
>
> However, when using -msve-vector-bits this will become an issue, as the
> VF will be constant and we will match NEON simdclones.  This requires
> some further attention though given that we now also reject the use of
> SVE simdclones when using -msve-vector-bits, and I'm not entirely sure
> we should...

Hmm, but vectorizable_simdclone should check for compatible types here
and if they are compatible why should we reject them?  Are -msve-vector-bits
"SVE" modes different from "NEON" modes?  I suppose not, because otherwise
the type compatibility check would say incompatible.

> I'm going on holidays for 2 weeks now though, so I'll have a look at
> that scenario when I get back. Same with other feedback, didn't expect
> feedback this quickly ;) Thank you!!
>
> Kind regards,
> Andre
>
Andre Vieira (lists) Sept. 28, 2023, 3:57 p.m. UTC | #5
On 31/08/2023 07:39, Richard Biener wrote:
> On Wed, Aug 30, 2023 at 5:02 PM Andre Vieira (lists)
> <andre.simoesdiasvieira@arm.com> wrote:
>>
>>
>>
>> On 30/08/2023 14:01, Richard Biener wrote:
>>> On Wed, Aug 30, 2023 at 11:15 AM Andre Vieira (lists) via Gcc-patches
>>> <gcc-patches@gcc.gnu.org> wrote:
>>>>
>>>> This patch adds a machine_mode parameter to the TARGET_SIMD_CLONE_USABLE
>>>> hook to enable rejecting SVE modes when the target architecture does not
>>>> support SVE.
>>>
>>> How does the graph node of the SIMD clone lack this information?  That is, it
>>> should have information on the types (and thus modes) for all formal arguments
>>> and return values already, no?  At least the target would know how to
>>> instantiate
>>> it if it's not readily available at the point of use.
>>>
>>
>> Yes it does, but that's the modes the simd clone itself uses, it does
>> not know what vector_mode we are currently vectorizing for. Which is
>> exactly why we need the vinfo's vector_mode to make sure the simd clone
>> and its types are compatible with the vector mode.
>>
>> In practice, to make sure that a SVE simd clones are only used in loops
>> being vectorized for SVE modes. Having said that... I just realized that
>> the simdlen check already takes care of that currently...
>>
>> by simdlen check I mean the one that writes off simdclones that match:
>>           if (!constant_multiple_p (vf, n->simdclone->simdlen, &num_calls)
>>
>> However, when using -msve-vector-bits this will become an issue, as the
>> VF will be constant and we will match NEON simdclones.  This requires
>> some further attention though given that we now also reject the use of
>> SVE simdclones when using -msve-vector-bits, and I'm not entirely sure
>> we should...
> 
> Hmm, but vectorizable_simdclone should check for compatible types here
> and if they are compatible why should we reject them?  Are -msve-vector-bits
> "SVE" modes different from "NEON" modes?  I suppose not, because otherwise
> the type compatibility check would say incompatible.
> 
Prior to transformation we do all checks on the original scalar values, 
not the vector types. But I do believe you are right in that we don't 
need to pass the vector_mode. The simdlen check should be enough and if 
the length is the same or a multiple of the rest of the could should be 
able to deal with that and any conversions when dealing with things like 
SVE types that require the attribute.

I'll update the patch series soon and after that I'll look at how this 
reacts to -msve-vector-bits in more detail.

Thanks,
Andre
diff mbox series

Patch

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 5fb4c863d875871d6de865e72ce360506a3694d2..a13d3fba05f9f9d2989b36c681bc77d71e943e0d 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -27498,12 +27498,18 @@  aarch64_simd_clone_adjust (struct cgraph_node *node)
 /* Implement TARGET_SIMD_CLONE_USABLE.  */
 
 static int
-aarch64_simd_clone_usable (struct cgraph_node *node)
+aarch64_simd_clone_usable (struct cgraph_node *node, machine_mode vector_mode)
 {
   switch (node->simdclone->vecsize_mangle)
     {
     case 'n':
-      if (!TARGET_SIMD)
+      if (!TARGET_SIMD
+	  || aarch64_sve_mode_p (vector_mode))
+	return -1;
+      return 0;
+    case 's':
+      if (!TARGET_SVE
+	  || !aarch64_sve_mode_p (vector_mode))
 	return -1;
       return 0;
     default:
diff --git a/gcc/config/gcn/gcn.cc b/gcc/config/gcn/gcn.cc
index 02f4dedec4214b1eea9e6f5057ed57d7e0db316a..252676273f06500c99df6ae251f0406c618df891 100644
--- a/gcc/config/gcn/gcn.cc
+++ b/gcc/config/gcn/gcn.cc
@@ -5599,7 +5599,8 @@  gcn_simd_clone_adjust (struct cgraph_node *ARG_UNUSED (node))
 /* Implement TARGET_SIMD_CLONE_USABLE.  */
 
 static int
-gcn_simd_clone_usable (struct cgraph_node *ARG_UNUSED (node))
+gcn_simd_clone_usable (struct cgraph_node *ARG_UNUSED (node),
+		       machine_mode ARG_UNUSED (mode))
 {
   /* We don't need to do anything here because
      gcn_simd_clone_compute_vecsize_and_simdlen currently only returns one
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 5d57726e22cea8bcaa8ac8b1b25ac420193f39bb..84f0d5a7cb679e6be92001f59802276635506e97 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -24379,7 +24379,8 @@  ix86_simd_clone_compute_vecsize_and_simdlen (struct cgraph_node *node,
    slightly less desirable, etc.).  */
 
 static int
-ix86_simd_clone_usable (struct cgraph_node *node)
+ix86_simd_clone_usable (struct cgraph_node *node,
+			machine_mode mode ATTRIBUTE_UNUSED)
 {
   switch (node->simdclone->vecsize_mangle)
     {
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 95ba56e05ae4a0f11639cc4a21d6736c53ad5ef1..bde22e562ebb9069122eb3b142ab8f4a4ae56a3a 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -6336,11 +6336,13 @@  This hook should add implicit @code{attribute(target("..."))} attribute
 to SIMD clone @var{node} if needed.
 @end deftypefn
 
-@deftypefn {Target Hook} int TARGET_SIMD_CLONE_USABLE (struct cgraph_node *@var{})
+@deftypefn {Target Hook} int TARGET_SIMD_CLONE_USABLE (struct cgraph_node *@var{}, @var{machine_mode})
 This hook should return -1 if SIMD clone @var{node} shouldn't be used
-in vectorized loops in current function, or non-negative number if it is
-usable.  In that case, the smaller the number is, the more desirable it is
-to use it.
+in vectorized loops being vectorized with mode @var{m} in current function, or
+non-negative number if it is usable.  In that case, the smaller the number is,
+the more desirable it is to use it.
+@end deftypefn
+
 @end deftypefn
 
 @deftypefn {Target Hook} int TARGET_SIMT_VF (void)
diff --git a/gcc/target.def b/gcc/target.def
index 7d684296c17897b4ceecb31c5de1ae8665a8228e..6a0cbc454526ee29011451b570354bf234a4eabd 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -1645,10 +1645,11 @@  void, (struct cgraph_node *), NULL)
 DEFHOOK
 (usable,
 "This hook should return -1 if SIMD clone @var{node} shouldn't be used\n\
-in vectorized loops in current function, or non-negative number if it is\n\
-usable.  In that case, the smaller the number is, the more desirable it is\n\
-to use it.",
-int, (struct cgraph_node *), NULL)
+in vectorized loops being vectorized with mode @var{m} in current function, or\n\
+non-negative number if it is usable.  In that case, the smaller the number is,\n\
+the more desirable it is to use it.",
+int, (struct cgraph_node *, machine_mode), NULL)
+
 
 HOOK_VECTOR_END (simd_clone)
 
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 7217f36a250d549b955c874d7c7644d94982b0b5..dc2fc20ef9fe777132308c9e33f7731d62717466 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -4195,7 +4195,7 @@  vectorizable_simd_clone_call (vec_info *vinfo, stmt_vec_info stmt_info,
 	  this_badness += exact_log2 (num_calls) * 4096;
 	if (n->simdclone->inbranch)
 	  this_badness += 8192;
-	int target_badness = targetm.simd_clone.usable (n);
+	int target_badness = targetm.simd_clone.usable (n, vinfo->vector_mode);
 	if (target_badness < 0)
 	  continue;
 	this_badness += target_badness * 512;