[v1] Match: Support forms 7 and 8 for the unsigned .SAT_ADD

Message ID: 20240617014108.2831124-1-pan2.li@intel.com
State: New
Headers:
  Return-Path: <gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org>
  DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 6D1D93858D29
  From: pan2.li@intel.com
  To: gcc-patches@gcc.gnu.org
  Cc: juzhe.zhong@rivai.ai, kito.cheng@gmail.com, richard.guenther@gmail.com,
   jeffreyalaw@gmail.com, rdapp.gcc@gmail.com, Pan Li <pan2.li@intel.com>
  Subject: [PATCH v1] Match: Support forms 7 and 8 for the unsigned .SAT_ADD
  Date: Mon, 17 Jun 2024 09:41:08 +0800
  Message-Id: <20240617014108.2831124-1-pan2.li@intel.com>
  MIME-Version: 1.0
  Content-Transfer-Encoding: 8bit
  Precedence: list
  Errors-To: gcc-patches-bounces+incoming=patchwork.ozlabs.org@gcc.gnu.org
Series: [v1] Match: Support forms 7 and 8 for the unsigned .SAT_ADD

Commit Message

Li, Pan2 June 17, 2024, 1:41 a.m. UTC

From: Pan Li <pan2.li@intel.com>

While investigating the vectorization of .SAT_ADD, we noticed two
additional forms of .SAT_ADD, namely form 7 and form 8.

Form 7:
  #define DEF_SAT_U_ADD_FMT_7(T)          \
  T __attribute__((noinline))             \
  sat_u_add_##T##_fmt_7 (T x, T y)        \
  {                                       \
    return x > (T)(x + y) ? -1 : (x + y); \
  }

Form 8:
  #define DEF_SAT_U_ADD_FMT_8(T)           \
  T __attribute__((noinline))              \
  sat_u_add_##T##_fmt_8 (T x, T y)         \
  {                                        \
    return x <= (T)(x + y) ? (x + y) : -1; \
  }

Thus, add the above two forms to the match gimple_unsigned_integer_sat_add
so that the vectorizer can try to recognize patterns like form 7 and
form 8.
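
As a sanity check on the two forms above, the following minimal sketch
instantiates both macros for uint8_t and compares them exhaustively
against a widen-and-clamp reference.  The reference function
ref_sat_add_u8 and the checker count_mismatches_u8 are hypothetical
names introduced for illustration only; they are not part of the patch
or of the GCC testsuite.

```c
#include <assert.h>
#include <stdint.h>

/* Forms 7 and 8 as given in the commit message.  */
#define DEF_SAT_U_ADD_FMT_7(T)          \
  T __attribute__((noinline))             \
  sat_u_add_##T##_fmt_7 (T x, T y)        \
  {                                       \
    return x > (T)(x + y) ? -1 : (x + y); \
  }

#define DEF_SAT_U_ADD_FMT_8(T)           \
  T __attribute__((noinline))              \
  sat_u_add_##T##_fmt_8 (T x, T y)         \
  {                                        \
    return x <= (T)(x + y) ? (x + y) : -1; \
  }

DEF_SAT_U_ADD_FMT_7 (uint8_t)
DEF_SAT_U_ADD_FMT_8 (uint8_t)

/* Hypothetical reference: add in a wider type, then clamp.  */
static uint8_t
ref_sat_add_u8 (uint8_t x, uint8_t y)
{
  unsigned sum = (unsigned) x + y;
  return sum > UINT8_MAX ? UINT8_MAX : (uint8_t) sum;
}

/* Count the inputs on which either form disagrees with the reference.  */
int
count_mismatches_u8 (void)
{
  int bad = 0;
  for (unsigned x = 0; x <= UINT8_MAX; x++)
    for (unsigned y = 0; y <= UINT8_MAX; y++)
      {
        uint8_t r = ref_sat_add_u8 ((uint8_t) x, (uint8_t) y);
        if (sat_u_add_uint8_t_fmt_7 ((uint8_t) x, (uint8_t) y) != r)
          bad++;
        if (sat_u_add_uint8_t_fmt_8 ((uint8_t) x, (uint8_t) y) != r)
          bad++;
      }
  return bad;
}
```

Calling count_mismatches_u8 () from a test driver and asserting that it
returns 0 covers all 65536 input pairs for the 8-bit case; wider types
would need sampled inputs rather than an exhaustive loop.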

The following test suites passed for this patch:
1. The rv64gcv full regression test with newlib.
2. The rv64gcv build with glibc.
3. The x86 bootstrap test.
4. The x86 full regression test.

gcc/ChangeLog:

	* match.pd: Add form 7 and 8 for the unsigned .SAT_ADD match.

Signed-off-by: Pan Li <pan2.li@intel.com>
---
 gcc/match.pd | 10 ++++++++++
 1 file changed, 10 insertions(+)

Comments

Richard Biener June 18, 2024, 11:03 a.m. UTC | #1

On Mon, Jun 17, 2024 at 3:41 AM <pan2.li@intel.com> wrote:
>
> From: Pan Li <pan2.li@intel.com>
>
> When investigate the vectorization of .SAT_ADD,  we notice there
> are additional 2 forms,  aka form 7 and 8 for .SAT_ADD.
>
> Form 7:
>   #define DEF_SAT_U_ADD_FMT_7(T)          \
>   T __attribute__((noinline))             \
>   sat_u_add_##T##_fmt_7 (T x, T y)        \
>   {                                       \
>     return x > (T)(x + y) ? -1 : (x + y); \
>   }
>
> Form 8:
>   #define DEF_SAT_U_ADD_FMT_8(T)           \
>   T __attribute__((noinline))              \
>   sat_u_add_##T##_fmt_8 (T x, T y)         \
>   {                                        \
>     return x <= (T)(x + y) ? (x + y) : -1; \
>   }
>
> Thus,  add above 2 forms to the match gimple_unsigned_integer_sat_add,
> and then the vectorizer can try to recog the pattern like form 7 and
> form 8.
>
> The below test suites are passed for this patch:
> 1. The rv64gcv fully regression test with newlib.
> 2. The rv64gcv build with glibc.
> 3. The x86 bootstrap test.
> 4. The x86 fully regression test.

OK.

Note that fold-const.cc has canonicalization for the minus one to be put last:

      /* If the second operand is simpler than the third, swap them
         since that produces better jump optimization results.  */
      if (truth_value_p (TREE_CODE (arg0))
          && tree_swap_operands_p (op1, op2))
        {
          location_t loc0 = expr_location_or (arg0, loc);
          /* See if this can be inverted.  If it can't, possibly because
             it was a floating-point inequality comparison, don't do
             anything.  */
          tem = fold_invert_truthvalue (loc0, arg0);
          if (tem)
            return fold_build3_loc (loc, code, type, tem, op2, op1);

We might want to consider such a transform in match.pd; in this case it
would allow us to elide one of the patterns.
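
To illustrate the idea at source level only, here is a minimal sketch
showing that inverting the comparison (gt becomes le) and swapping the
two arms, so that the -1 constant comes last, leaves the result
unchanged.  It does not reproduce the fold-const.cc code or any actual
match.pd rule; the function names are hypothetical.  For integer
comparisons the inversion is always possible, which is why one of the
two patterns becomes redundant once the canonical form is produced.

```c
#include <assert.h>
#include <stdint.h>

/* The -1 constant sits in the second operand (the "then" arm).  */
uint8_t
sat_add_minus_one_first (uint8_t x, uint8_t y)
{
  uint8_t sum = (uint8_t) (x + y);
  return x > sum ? (uint8_t) -1 : sum;
}

/* Same computation after inverting the comparison and swapping the
   arms, so that the -1 constant is the last operand.  */
uint8_t
sat_add_minus_one_last (uint8_t x, uint8_t y)
{
  uint8_t sum = (uint8_t) (x + y);
  return x <= sum ? sum : (uint8_t) -1;
}

/* Return 1 if both spellings agree on every 8-bit input pair, else 0.  */
int
forms_agree_u8 (void)
{
  for (unsigned x = 0; x <= UINT8_MAX; x++)
    for (unsigned y = 0; y <= UINT8_MAX; y++)
      if (sat_add_minus_one_first ((uint8_t) x, (uint8_t) y)
          != sat_add_minus_one_last ((uint8_t) x, (uint8_t) y))
        return 0;
  return 1;
}
```

With such a canonicalization in place, only the spelling with the
constant last would need a match pattern.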

Richard.

> gcc/ChangeLog:
>
>         * match.pd: Add form 7 and 8 for the unsigned .SAT_ADD match.
>
> Signed-off-by: Pan Li <pan2.li@intel.com>
> ---
>  gcc/match.pd | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 99968d316ed..aae6d30a5e4 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -3144,6 +3144,16 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>   (cond^ (ne (imagpart (IFN_ADD_OVERFLOW:c @0 @1)) integer_zerop)
>    integer_minus_onep (usadd_left_part_2 @0 @1)))
>
> +/* Unsigned saturation add, case 7 (branch with le):
> +   SAT_ADD = x <= (X + Y) ? (X + Y) : -1.  */
> +(match (unsigned_integer_sat_add @0 @1)
> + (cond^ (le @0 (usadd_left_part_1@2 @0 @1)) @2 integer_minus_onep))
> +
> +/* Unsigned saturation add, case 8 (branch with gt):
> +   SAT_ADD = x > (X + Y) ? -1 : (X + Y).  */
> +(match (unsigned_integer_sat_add @0 @1)
> + (cond^ (gt @0 (usadd_left_part_1@2 @0 @1)) integer_minus_onep @2))
> +
>  /* Unsigned saturation sub, case 1 (branch with gt):
>     SAT_U_SUB = X > Y ? X - Y : 0  */
>  (match (unsigned_integer_sat_sub @0 @1)
> --
> 2.34.1
>

Li, Pan2 June 18, 2024, 1:34 p.m. UTC | #2

Thanks Richard for comments.

> we might want to consider such transform in match.pd, in this case this
> would allow to elide one of the patterns.

That makes much more sense to me; it is not a good idea to have many patterns for SAT_ADD.
I will commit this first and try that approach in a separate patch.

Pan

diff --git a/gcc/match.pd b/gcc/match.pd
index 99968d316ed..aae6d30a5e4 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -3144,6 +3144,16 @@  DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
  (cond^ (ne (imagpart (IFN_ADD_OVERFLOW:c @0 @1)) integer_zerop)
   integer_minus_onep (usadd_left_part_2 @0 @1)))
 
+/* Unsigned saturation add, case 7 (branch with le):
+   SAT_ADD = x <= (X + Y) ? (X + Y) : -1.  */
+(match (unsigned_integer_sat_add @0 @1)
+ (cond^ (le @0 (usadd_left_part_1@2 @0 @1)) @2 integer_minus_onep))
+
+/* Unsigned saturation add, case 8 (branch with gt):
+   SAT_ADD = x > (X + Y) ? -1 : (X + Y).  */
+(match (unsigned_integer_sat_add @0 @1)
+ (cond^ (gt @0 (usadd_left_part_1@2 @0 @1)) integer_minus_onep @2))
+
 /* Unsigned saturation sub, case 1 (branch with gt):
    SAT_U_SUB = X > Y ? X - Y : 0  */
 (match (unsigned_integer_sat_sub @0 @1)
