[V3,00/10] optabs: Make all `*dot_prod_optab's modeled as conversions

Message ID 20240815084425.2519197-1-victor.donascimento@arm.com

Victor Do Nascimento Aug. 15, 2024, 8:44 a.m. UTC
Changes in this revision:

* [PATCH 2/10] - Make use of overloaded `directly_supported_p' in
`vect_supportable_conv_optab_p' to avoid code duplication.

-----

The GCC internals manual defines the {u|s}dot_prod<m> standard name
as taking "two signed elements of the same mode, adding them to a
third operand of wider mode".  This leaves the relationship between
the mode of the first two operands and that of the third ambiguous.

This vagueness means that, in theory, different modes may be
supportable in the third argument.  This flexibility would allow a
given backend to add a different number of vectorized products to the
accumulator, e.g. a backend may provide instructions for both:

  accum += a[0] * b[0]

and

  accum += a[0] * b[0] + a[1] * b[1],

as is now seen in the SVE2.1 extension to AArch64.  In spite of the
aforementioned flexibility, modeling the dot-product operation as a
direct optab means that we have no way to encode both the input and
the accumulator data modes into the backend pattern name, which
prevents us from harnessing this flexibility.

The purpose of this patch series is therefore to remedy this
shortcoming by moving `dot_prod' from its current implementation as a
direct optab to an implementation as a conversion optab, which lets us
differentiate between dot products that take the same input mode but
produce a different output mode.
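
As an illustration only (these loops are not taken from the series,
and the function names are hypothetical), the two reductions below
read the same 16-bit inputs but accumulate into sums of different
widths.  For a fixed vector width, a 32-bit accumulator lane can
absorb two 16-bit products while a 64-bit lane can absorb four, which
is the kind of distinction a pattern name keyed on a single mode
cannot express.  Whether either loop is actually vectorized as a dot
product depends on the target and the compiler options.

```c
#include <stddef.h>
#include <stdint.h>

/* 16-bit inputs, 32-bit accumulator: a candidate for a "two-way"
   dot product (two products folded into each accumulator lane).  */
int32_t
dot_s16_to_s32 (const int16_t *a, const int16_t *b, size_t n)
{
  int32_t accum = 0;
  for (size_t i = 0; i < n; i++)
    accum += (int32_t) a[i] * b[i];
  return accum;
}

/* Same 16-bit inputs, 64-bit accumulator: a candidate for a
   "four-way" dot product (four products per accumulator lane).  */
int64_t
dot_s16_to_s64 (const int16_t *a, const int16_t *b, size_t n)
{
  int64_t accum = 0;
  for (size_t i = 0; i < n; i++)
    accum += (int64_t) a[i] * b[i];
  return accum;
}
```

Both functions compute the same mathematical result for small inputs;
only the accumulator width, and hence the number of products that can
be folded into each accumulator lane, differs.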

Regression-tested on x86_64, aarch64 and armhf.  I'd appreciate help
running relevant tests on the remaining architectures, i.e. arc, mips,
altivec and c6x, to ensure I've not inadvertently broken anything for
those backends.

Victor Do Nascimento (10):
  optabs: Make all `*dot_prod_optab's modeled as conversions
  autovectorizer: Add basic support for convert optabs
  aarch64: Fix aarch64 backend-use of (u|s|us)dot_prod patterns
  arm: Fix arm backend-use of (u|s|us)dot_prod patterns
  i386: Fix dot_prod backend patterns for mmx and sse targets
  arc: Adjust dot-product backend patterns
  mips:  Adjust dot-product backend patterns
  rs6000: Adjust altivec dot-product backend patterns
  c6x:  Adjust dot-product backend patterns
  autovectorizer: Test autovectorization of different dot-prod modes.

 gcc/config/aarch64/aarch64-builtins.cc        |  7 ++
 gcc/config/aarch64/aarch64-simd-builtins.def  |  6 +-
 gcc/config/aarch64/aarch64-simd.md            |  9 +-
 .../aarch64/aarch64-sve-builtins-base.cc      | 13 +--
 gcc/config/aarch64/aarch64-sve-builtins.cc    | 17 ++++
 gcc/config/aarch64/aarch64-sve-builtins.h     |  3 +
 gcc/config/aarch64/aarch64-sve.md             |  6 +-
 gcc/config/aarch64/aarch64-sve2.md            |  2 +-
 gcc/config/arc/simdext.md                     |  8 +-
 gcc/config/arm/arm-builtins.cc                | 95 +++++++++++++++++++
 gcc/config/arm/arm-protos.h                   |  3 +
 gcc/config/arm/arm.cc                         |  1 +
 gcc/config/arm/arm_neon_builtins.def          |  3 -
 gcc/config/arm/neon.md                        |  6 +-
 gcc/config/c6x/c6x.md                         |  2 +-
 gcc/config/i386/mmx.md                        | 30 +++---
 gcc/config/i386/sse.md                        | 38 ++++----
 gcc/config/mips/loongson-mmi.md               |  2 +-
 gcc/config/rs6000/altivec.md                  |  4 +-
 gcc/doc/md.texi                               | 46 ++++-----
 gcc/gimple-match-exports.cc                   | 23 +++++
 gcc/gimple-match.h                            |  2 +
 gcc/optabs.cc                                 |  3 +-
 gcc/optabs.def                                |  6 +-
 .../gcc.dg/vect/vect-dotprod-twoway.c         | 39 ++++++++
 .../aarch64/sme/vect-dotprod-twoway.c         | 25 +++++
 .../gcc.target/aarch64/vect-dotprod-twoway.c  | 65 +++++++++++++
 gcc/testsuite/lib/target-supports.exp         |  8 ++
 gcc/tree-vect-loop.cc                         |  1 +
 gcc/tree-vect-patterns.cc                     | 33 ++++++-
 30 files changed, 410 insertions(+), 96 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/vect-dotprod-twoway.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sme/vect-dotprod-twoway.c
 create mode 100644 gcc/testsuite/gcc.target/aarch64/vect-dotprod-twoway.c