diff mbox series

Try fixing RISC-V .SELECT_VL with SLP

Message ID 20240913062733.1A65A13999@imap1.dmz-prg2.suse.org
State New
Headers show
Series Try fixing RISC-V .SELECT_VL with SLP | expand

Commit Message

Richard Biener Sept. 13, 2024, 6:27 a.m. UTC
The following simply removes a seemingly bogus guard.

	* tree-vect-loop.cc (vect_analyze_loop_1): Remove SLP guard
	from .SELECT_VL disabling.
---
 gcc/tree-vect-loop.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Robin Dapp Sept. 14, 2024, 9:37 a.m. UTC | #1
> The following simply removes a seemingly bogus guard.
>
> 	* tree-vect-loop.cc (vect_analyze_loop_1): Remove SLP guard
> 	from .SELECT_VL disabling.
> ---
>  gcc/tree-vect-loop.cc | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
> index cc15492f6a0..378e7c560bd 100644
> --- a/gcc/tree-vect-loop.cc
> +++ b/gcc/tree-vect-loop.cc
> @@ -3078,7 +3078,7 @@ start_over:
>        if (direct_internal_fn_supported_p (IFN_SELECT_VL, iv_type,
>  					  OPTIMIZE_FOR_SPEED)
>  	  && LOOP_VINFO_LENS (loop_vinfo).length () == 1
> -	  && LOOP_VINFO_LENS (loop_vinfo)[0].factor == 1 && !slp
> +	  && LOOP_VINFO_LENS (loop_vinfo)[0].factor == 1

I don't think it will just work like that.  The problem is that we adjust data
pointer increments according to the current vector length in
vect_get_loop_variant_data_ptr_increment.

To this end we use the SELECT_VL result and multiply it with the current
dataref's step.  Without having looked into it closer I suppose using step is
not suffcient or maybe even wrong in an SLP setting and I guess this
complication lead to SLP being disabled initially.
Richard Biener Sept. 14, 2024, 6:12 p.m. UTC | #2
> Am 14.09.2024 um 11:38 schrieb Robin Dapp <rdapp.gcc@gmail.com>:
> 
> 
>> 
>> The following simply removes a seemingly bogus guard.
>> 
>>    * tree-vect-loop.cc (vect_analyze_loop_1): Remove SLP guard
>>    from .SELECT_VL disabling.
>> ---
>> gcc/tree-vect-loop.cc | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
>> index cc15492f6a0..378e7c560bd 100644
>> --- a/gcc/tree-vect-loop.cc
>> +++ b/gcc/tree-vect-loop.cc
>> @@ -3078,7 +3078,7 @@ start_over:
>>       if (direct_internal_fn_supported_p (IFN_SELECT_VL, iv_type,
>>                      OPTIMIZE_FOR_SPEED)
>>      && LOOP_VINFO_LENS (loop_vinfo).length () == 1
>> -      && LOOP_VINFO_LENS (loop_vinfo)[0].factor == 1 && !slp
>> +      && LOOP_VINFO_LENS (loop_vinfo)[0].factor == 1
> 
> I don't think it will just work like that.  The problem is that we adjust data
> pointer increments according to the current vector length in
> vect_get_loop_variant_data_ptr_increment.
> 
> To this end we use the SELECT_VL result and multiply it with the current
> dataref's step.  Without having looked into it closer I suppose using step is
> not suffcient or maybe even wrong in an SLP setting and I guess this
> complication lead to SLP being disabled initially.

Yes, that’s one complication.  Another is that the length would need to be scaled by the SLP group size - but maybe one could use lmul for that…  (assuming an uniform loop)

Restricting SLP to the case of all single-lane nodes should work though.

Richard.

> --
> Regards
> Robin
>
diff mbox series

Patch

diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index cc15492f6a0..378e7c560bd 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -3078,7 +3078,7 @@  start_over:
       if (direct_internal_fn_supported_p (IFN_SELECT_VL, iv_type,
 					  OPTIMIZE_FOR_SPEED)
 	  && LOOP_VINFO_LENS (loop_vinfo).length () == 1
-	  && LOOP_VINFO_LENS (loop_vinfo)[0].factor == 1 && !slp
+	  && LOOP_VINFO_LENS (loop_vinfo)[0].factor == 1
 	  && (!LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
 	      || !LOOP_VINFO_VECT_FACTOR (loop_vinfo).is_constant ()))
 	LOOP_VINFO_USING_SELECT_VL_P (loop_vinfo) = true;