Message ID | 20240913062733.1A65A13999@imap1.dmz-prg2.suse.org |
---|---|
State | New |
Headers | show |
Series | Try fixing RISC-V .SELECT_VL with SLP | expand |
> The following simply removes a seemingly bogus guard. > > * tree-vect-loop.cc (vect_analyze_loop_1): Remove SLP guard > from .SELECT_VL disabling. > --- > gcc/tree-vect-loop.cc | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc > index cc15492f6a0..378e7c560bd 100644 > --- a/gcc/tree-vect-loop.cc > +++ b/gcc/tree-vect-loop.cc > @@ -3078,7 +3078,7 @@ start_over: > if (direct_internal_fn_supported_p (IFN_SELECT_VL, iv_type, > OPTIMIZE_FOR_SPEED) > && LOOP_VINFO_LENS (loop_vinfo).length () == 1 > - && LOOP_VINFO_LENS (loop_vinfo)[0].factor == 1 && !slp > + && LOOP_VINFO_LENS (loop_vinfo)[0].factor == 1 I don't think it will just work like that. The problem is that we adjust data pointer increments according to the current vector length in vect_get_loop_variant_data_ptr_increment. To this end we use the SELECT_VL result and multiply it with the current dataref's step. Without having looked into it closer I suppose using step is not suffcient or maybe even wrong in an SLP setting and I guess this complication lead to SLP being disabled initially.
> Am 14.09.2024 um 11:38 schrieb Robin Dapp <rdapp.gcc@gmail.com>: > > >> >> The following simply removes a seemingly bogus guard. >> >> * tree-vect-loop.cc (vect_analyze_loop_1): Remove SLP guard >> from .SELECT_VL disabling. >> --- >> gcc/tree-vect-loop.cc | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc >> index cc15492f6a0..378e7c560bd 100644 >> --- a/gcc/tree-vect-loop.cc >> +++ b/gcc/tree-vect-loop.cc >> @@ -3078,7 +3078,7 @@ start_over: >> if (direct_internal_fn_supported_p (IFN_SELECT_VL, iv_type, >> OPTIMIZE_FOR_SPEED) >> && LOOP_VINFO_LENS (loop_vinfo).length () == 1 >> - && LOOP_VINFO_LENS (loop_vinfo)[0].factor == 1 && !slp >> + && LOOP_VINFO_LENS (loop_vinfo)[0].factor == 1 > > I don't think it will just work like that. The problem is that we adjust data > pointer increments according to the current vector length in > vect_get_loop_variant_data_ptr_increment. > > To this end we use the SELECT_VL result and multiply it with the current > dataref's step. Without having looked into it closer I suppose using step is > not suffcient or maybe even wrong in an SLP setting and I guess this > complication lead to SLP being disabled initially. Yes, that’s one complication. Another is that the length would need to be scaled by the SLP group size - but maybe one could use lmul for that… (assuming an uniform loop) Restricting SLP to the case of all single-lane nodes should work though. Richard. > -- > Regards > Robin >
diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc index cc15492f6a0..378e7c560bd 100644 --- a/gcc/tree-vect-loop.cc +++ b/gcc/tree-vect-loop.cc @@ -3078,7 +3078,7 @@ start_over: if (direct_internal_fn_supported_p (IFN_SELECT_VL, iv_type, OPTIMIZE_FOR_SPEED) && LOOP_VINFO_LENS (loop_vinfo).length () == 1 - && LOOP_VINFO_LENS (loop_vinfo)[0].factor == 1 && !slp + && LOOP_VINFO_LENS (loop_vinfo)[0].factor == 1 && (!LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo) || !LOOP_VINFO_VECT_FACTOR (loop_vinfo).is_constant ())) LOOP_VINFO_USING_SELECT_VL_P (loop_vinfo) = true;