Message ID | 20240904132650.2720446-1-christophe.lyon@linaro.org |
---|---|
Series | arm: [MVE intrinsics] Re-implement more intrinsics |
ping?

On Wed, 4 Sept 2024 at 15:27, Christophe Lyon <christophe.lyon@linaro.org> wrote:
>
> Hi,
>
> This is v2 of the patch series I sent in
> https://gcc.gnu.org/pipermail/gcc-patches/2024-July/657065.html.
>
> I have taken into account the feedback I received, and added more
> patches to the series, converting more MVE intrinsics to the new
> framework.
>
> Changes v1-v2:
>
> - I kept patch #1 as-is (so, no change): the comment is a bit verbose,
>   but I don't think this causes any harm.
> - patch #3: use conditionals in a few more places, making the code
>   more compact and hopefully easier to read.
> - patch #5: use "su64" instead of "ss8" for the immediate parameter.
> - patch #6: restore alphabetical order in arm-mve-builtins-base.def
> - patch #7: remove trailing ')' in comments in mve.md
> - patch #10: remove trailing ')' in comments in mve.md
> - patch #11: remove unused parameter names to avoid warnings.
> - patch #13: fix a comment.
> - patch #15: fix a comment.
>
> Patches 16-36 are new:
> - #16: rework vctp
> - #17-21: rework v[id]dup + cleanups
> - #22: fix checks of immediate arguments, noticed after the discussion
>   on patch #5
> - #23-27: rework v[id]wdup + cleanups
> - #28-30: rework vshlc
> - #31-35: rework vadc/vadci/vsbc/vsbci
> - #36: introduce long_type_suffix and half_type_suffix helpers to
>   avoid some code duplication.
>
> Tested on arm-eabi with
> --target_board=arm-qemu{-mthumb/-mfloat-abi=hard/-march=armv8.1-m.main+mve.fp+fp.dp}
>
> Christophe Lyon (36):
>   arm: [MVE intrinsics] improve comment for orrq shape
>   arm: [MVE intrinsics] remove useless resolve from create shape
>   arm: [MVE intrinsics] Cleanup arm-mve-builtins-functions.h
>   arm: [MVE intrinsics] factorize vcvtq
>   arm: [MVE intrinsics] add vcvt shape
>   arm: [MVE intrinsics] rework vcvtq
>   arm: [MVE intrinsics] factorize vcvtbq vcvttq
>   arm: [MVE intrinsics] add vcvt_f16_f32 and vcvt_f32_f16 shapes
>   arm: [MVE intrinsics] rework vcvtbq_f16_f32 vcvttq_f16_f32
>     vcvtbq_f32_f16 vcvttq_f32_f16
>   arm: [MVE intrinsics] factorize vcvtaq vcvtmq vcvtnq vcvtpq
>   arm: [MVE intrinsics] add vcvtx shape
>   arm: [MVE intrinsics] rework vcvtaq vcvtmq vcvtnq vcvtpq
>   arm: [MVE intrinsics] rework vbicq
>   arm: [MVE intrinsics] factorize vorn
>   arm: [MVE intrinsics] rework vorn
>   arm: [MVE intrinsics] rework vctp
>   arm: [MVE intrinsics] factorize vddup vidup
>   arm: [MVE intrinsics] add viddup shape
>   arm: [MVE intrinsics] rework vddup vidup
>   arm: [MVE intrinsics] update v[id]dup tests
>   arm: [MVE intrinsics] remove v[id]dup expanders
>   arm: [MVE intrinsics] fix checks of immediate arguments
>   arm: [MVE intrinsics] factorize vdwdup viwdup
>   arm: [MVE intrinsics] add vidwdup shape
>   arm: [MVE intrinsics] rework vdwdup viwdup
>   arm: [MVE intrinsics] update v[id]wdup tests
>   arm: [MVE intrinsics] remove useless v[id]wdup expanders
>   arm: [MVE intrinsics] add vshlc shape
>   arm: [MVE intrinsics] rework vshlcq
>   arm: [MVE intrinsics] remove vshlcq useless expanders
>   arm: [MVE intrinsics] add vadc_vsbc shape
>   arm: [MVE intrinsics] factorize vadc vadci vsbc vsbci
>   arm: [MVE intrinsics] rework vadciq
>   arm: [MVE intrinsics] rework vadcq
>   arm: [MVE intrinsics] rework vsbcq vsbciq
>   arm: [MVE intrinsics] use long_type_suffix / half_type_suffix helpers
>
>  gcc/config/arm/arm-builtins.cc | 20 -
>  gcc/config/arm/arm-mve-builtins-base.cc | 593 ++
>  gcc/config/arm/arm-mve-builtins-base.def | 44 +-
>  gcc/config/arm/arm-mve-builtins-base.h | 22 +
>  gcc/config/arm/arm-mve-builtins-functions.h | 815 +--
>  gcc/config/arm/arm-mve-builtins-shapes.cc | 645 +-
>  gcc/config/arm/arm-mve-builtins-shapes.h | 9 +
>  gcc/config/arm/arm-mve-builtins.cc | 95 +-
>  gcc/config/arm/arm-mve-builtins.def | 1 +
>  gcc/config/arm/arm-mve-builtins.h | 12 +-
>  gcc/config/arm/arm_mve.h | 6353 +++--------------
>  gcc/config/arm/arm_mve_builtins.def | 20 -
>  gcc/config/arm/iterators.md | 68 +-
>  gcc/config/arm/mve.md | 832 +--
>  .../arm/mve/intrinsics/vddupq_m_wb_u16.c | 18 +-
>  .../arm/mve/intrinsics/vddupq_m_wb_u32.c | 18 +-
>  .../arm/mve/intrinsics/vddupq_m_wb_u8.c | 18 +-
>  .../arm/mve/intrinsics/vddupq_wb_u16.c | 14 +-
>  .../arm/mve/intrinsics/vddupq_wb_u32.c | 14 +-
>  .../arm/mve/intrinsics/vddupq_wb_u8.c | 14 +-
>  .../arm/mve/intrinsics/vddupq_x_wb_u16.c | 18 +-
>  .../arm/mve/intrinsics/vddupq_x_wb_u32.c | 18 +-
>  .../arm/mve/intrinsics/vddupq_x_wb_u8.c | 18 +-
>  .../arm/mve/intrinsics/vdwdupq_m_wb_u16.c | 6 +-
>  .../arm/mve/intrinsics/vdwdupq_m_wb_u32.c | 6 +-
>  .../arm/mve/intrinsics/vdwdupq_m_wb_u8.c | 6 +-
>  .../arm/mve/intrinsics/vdwdupq_wb_u16.c | 6 +-
>  .../arm/mve/intrinsics/vdwdupq_wb_u32.c | 6 +-
>  .../arm/mve/intrinsics/vdwdupq_wb_u8.c | 6 +-
>  .../arm/mve/intrinsics/vdwdupq_x_wb_u16.c | 6 +-
>  .../arm/mve/intrinsics/vdwdupq_x_wb_u32.c | 6 +-
>  .../arm/mve/intrinsics/vdwdupq_x_wb_u8.c | 6 +-
>  .../arm/mve/intrinsics/vidupq_m_wb_u16.c | 18 +-
>  .../arm/mve/intrinsics/vidupq_m_wb_u32.c | 18 +-
>  .../arm/mve/intrinsics/vidupq_m_wb_u8.c | 18 +-
>  .../arm/mve/intrinsics/vidupq_wb_u16.c | 14 +-
>  .../arm/mve/intrinsics/vidupq_wb_u32.c | 14 +-
>  .../arm/mve/intrinsics/vidupq_wb_u8.c | 14 +-
>  .../arm/mve/intrinsics/vidupq_x_wb_u16.c | 18 +-
>  .../arm/mve/intrinsics/vidupq_x_wb_u32.c | 18 +-
>  .../arm/mve/intrinsics/vidupq_x_wb_u8.c | 18 +-
>  .../arm/mve/intrinsics/viwdupq_m_wb_u16.c | 6 +-
>  .../arm/mve/intrinsics/viwdupq_m_wb_u32.c | 6 +-
>  .../arm/mve/intrinsics/viwdupq_m_wb_u8.c | 6 +-
>  .../arm/mve/intrinsics/viwdupq_wb_u16.c | 6 +-
>  .../arm/mve/intrinsics/viwdupq_wb_u32.c | 6 +-
>  .../arm/mve/intrinsics/viwdupq_wb_u8.c | 6 +-
>  .../arm/mve/intrinsics/viwdupq_x_wb_u16.c | 6 +-
>  .../arm/mve/intrinsics/viwdupq_x_wb_u32.c | 6 +-
>  .../arm/mve/intrinsics/viwdupq_x_wb_u8.c | 6 +-
>  50 files changed, 2975 insertions(+), 6962 deletions(-)
>
> --
> 2.34.1
>

On 04/09/2024 14:26, Christophe Lyon wrote:
> [...]

There are a couple of (minor) nits on patches 2 and 6, and a question I think needs resolving on patch 34 (re generation of the carry flag), but otherwise the rest of this series is fine.

Thanks!

R.

On Mon, 14 Oct 2024 at 20:30, Richard Earnshaw (lists) <Richard.Earnshaw@arm.com> wrote:
>
> On 04/09/2024 14:26, Christophe Lyon wrote:
> > [...]
>
> There are a couple of (minor) nits on patches 2 and 6, and a question I think needs resolving on patch 34 (re generation of the carry flag), but otherwise the rest of this series is fine.
>

Thanks, I've just pushed the series with the small changes requested on patches 2 and 6.

Christophe

> Thanks!
> R.