
[1/2] Simplify (AND (ASHIFTRT A imm) mask) to (LSHIFTRT A imm) for vector mode.

Message ID 20240521051220.8653-1-hongtao.liu@intel.com
State New

Commit Message

liuhongt May 21, 2024, 5:12 a.m. UTC
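When the mask is ((1 << (prec - imm)) - 1), i.e. it clears exactly the
upper imm bits that the arithmetic shift filled with copies of the sign
bit, the AND of (ASHIFTRT A imm) with that mask can be simplified to
LSHIFTRT.

For example, simplify
(and:v8hi
  (ashiftrt:v8hi A 8)
  (const_vector 0xff x8))
to
(lshiftrt:v8hi A 8)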
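
The identity behind this simplification can be checked outside the compiler. The sketch below is illustrative only and not part of the patch: it models a single 16-bit lane in plain C, and the helper names `ashiftrt16`, `lshiftrt16`, and `check_identity` are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Logical right shift of one 16-bit lane: zeros enter from the top.  */
uint16_t
lshiftrt16 (uint16_t a, unsigned imm)
{
  return (uint16_t) (a >> imm);
}

/* Arithmetic right shift of one 16-bit lane, spelled out by hand so the
   result does not depend on how the host shifts negative values.  */
uint16_t
ashiftrt16 (uint16_t a, unsigned imm)
{
  uint16_t logical = (uint16_t) (a >> imm);
  /* When the sign bit is set, the top imm bits become ones.  */
  uint16_t sign_fill = (a & 0x8000u) ? (uint16_t) ~(0xffffu >> imm) : 0;
  return (uint16_t) (logical | sign_fill);
}

/* Exhaustively check that masking the arithmetic shift with
   (1 << (16 - imm)) - 1 gives the logical shift.  */
void
check_identity (void)
{
  for (unsigned imm = 0; imm < 16; imm++)
    {
      uint16_t mask = (uint16_t) ((1u << (16 - imm)) - 1);
      for (uint32_t a = 0; a <= 0xffffu; a++)
        assert ((ashiftrt16 ((uint16_t) a, imm) & mask)
                == lshiftrt16 ((uint16_t) a, imm));
    }
}
```

Calling `check_identity ()` walks every 16-bit value for every shift count from 0 to 15. The mask removes exactly the bits in which the two shifts can differ, which is why the AND plus ASHIFTRT pair collapses to a single LSHIFTRT.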

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?

gcc/ChangeLog:

	PR target/114428
	* simplify-rtx.cc
	(simplify_context::simplify_binary_operation_1):
	Simplify (AND (ASHIFTRT A imm) mask) to (LSHIFTRT A imm) for
	specific mask.
---
 gcc/simplify-rtx.cc | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

Comments

Hongtao Liu May 24, 2024, 2:25 a.m. UTC | #1
CC for review.

On Tue, May 21, 2024 at 1:12 PM liuhongt <hongtao.liu@intel.com> wrote:
>
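> When the mask is ((1 << (prec - imm)) - 1), i.e. it clears exactly the
> upper imm bits that the arithmetic shift filled with copies of the sign
> bit, the AND of (ASHIFTRT A imm) with that mask can be simplified to
> LSHIFTRT.
>
> For example, simplify
> (and:v8hi
>   (ashiftrt:v8hi A 8)
>   (const_vector 0xff x8))
> to
> (lshiftrt:v8hi A 8)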
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
>
> gcc/ChangeLog:
>
>         PR target/114428
>         * simplify-rtx.cc
>         (simplify_context::simplify_binary_operation_1):
>         Simplify (AND (ASHIFTRT A imm) mask) to (LSHIFTRT A imm) for
>         specific mask.
> ---
>  gcc/simplify-rtx.cc | 25 +++++++++++++++++++++++++
>  1 file changed, 25 insertions(+)
>
> diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
> index 53f54d1d392..6c91409200e 100644
> --- a/gcc/simplify-rtx.cc
> +++ b/gcc/simplify-rtx.cc
> @@ -4021,6 +4021,31 @@ simplify_context::simplify_binary_operation_1 (rtx_code code,
>             return tem;
>         }
>
> +      /* (and:v4si
> +          (ashiftrt:v4si A 16)
> +          (const_vector: 0xffff x4))
> +        is just (lshiftrt:v4si A 16).  */
> +      if (VECTOR_MODE_P (mode) && GET_CODE (op0) == ASHIFTRT
> +         && (CONST_INT_P (XEXP (op0, 1))
> +             || (GET_CODE (XEXP (op0, 1)) == CONST_VECTOR
> +                 && CONST_VECTOR_DUPLICATE_P (XEXP (op0, 1))))
> +         && GET_CODE (op1) == CONST_VECTOR
> +         && CONST_VECTOR_DUPLICATE_P (op1))
> +       {
> +         unsigned HOST_WIDE_INT shift_count
> +           = (CONST_INT_P (XEXP (op0, 1))
> +              ? UINTVAL (XEXP (op0, 1))
> +              : UINTVAL (XVECEXP (XEXP (op0, 1), 0, 0)));
> +         unsigned HOST_WIDE_INT inner_prec
> +           = GET_MODE_PRECISION (GET_MODE_INNER (mode));
> +
> +         /* Avoid an undefined shift count.  */
> +         if (shift_count < inner_prec
> +             && (UINTVAL (XVECEXP (op1, 0, 0))
> +                 == (HOST_WIDE_INT_1U << (inner_prec - shift_count)) - 1))
> +           return simplify_gen_binary (LSHIFTRT, mode, XEXP (op0, 0), XEXP (op0, 1));
> +       }
> +
>        tem = simplify_byte_swapping_operation (code, mode, op0, op1);
>        if (tem)
>         return tem;
> --
> 2.31.1
>
Jeff Law June 4, 2024, 1:50 p.m. UTC | #2
On 5/23/24 8:25 PM, Hongtao Liu wrote:
> CC for review.
> 
> On Tue, May 21, 2024 at 1:12 PM liuhongt <hongtao.liu@intel.com> wrote:
>>
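>> When the mask is ((1 << (prec - imm)) - 1), i.e. it clears exactly the
>> upper imm bits that the arithmetic shift filled with copies of the sign
>> bit, the AND of (ASHIFTRT A imm) with that mask can be simplified to
>> LSHIFTRT.
>>
>> For example, simplify
>> (and:v8hi
>>    (ashiftrt:v8hi A 8)
>>    (const_vector 0xff x8))
>> to
>> (lshiftrt:v8hi A 8)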
>>
>> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
>> Ok for trunk?
>>
>> gcc/ChangeLog:
>>
>>          PR target/114428
>>          * simplify-rtx.cc
>>          (simplify_context::simplify_binary_operation_1):
>>          Simplify (AND (ASHIFTRT A imm) mask) to (LSHIFTRT A imm) for
>>          specific mask.

Can you add a testcase for this?  I don't mind if it's x86 specific and 
does a bit of asm scanning.

Also note that the context for this patch has changed, so it won't 
automatically apply.  So be extra careful when updating so that it goes 
into the right place (all the more reason to have a testcase validating 
that the optimization works correctly).


I think the patch itself is fine.  So further review is just for the 
testcase and should be easy.

jeff

ps.  It seems to help RISC-V as well :-)

Patch

diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
index 53f54d1d392..6c91409200e 100644
--- a/gcc/simplify-rtx.cc
+++ b/gcc/simplify-rtx.cc
@@ -4021,6 +4021,31 @@  simplify_context::simplify_binary_operation_1 (rtx_code code,
 	    return tem;
 	}
 
+      /* (and:v4si
+	   (ashiftrt:v4si A 16)
+	   (const_vector: 0xffff x4))
+	 is just (lshiftrt:v4si A 16).  */
+      if (VECTOR_MODE_P (mode) && GET_CODE (op0) == ASHIFTRT
+	  && (CONST_INT_P (XEXP (op0, 1))
+	      || (GET_CODE (XEXP (op0, 1)) == CONST_VECTOR
+		  && CONST_VECTOR_DUPLICATE_P (XEXP (op0, 1))))
+	  && GET_CODE (op1) == CONST_VECTOR
+	  && CONST_VECTOR_DUPLICATE_P (op1))
+	{
+	  unsigned HOST_WIDE_INT shift_count
+	    = (CONST_INT_P (XEXP (op0, 1))
+	       ? UINTVAL (XEXP (op0, 1))
+	       : UINTVAL (XVECEXP (XEXP (op0, 1), 0, 0)));
+	  unsigned HOST_WIDE_INT inner_prec
+	    = GET_MODE_PRECISION (GET_MODE_INNER (mode));
+
+	  /* Avoid an undefined shift count.  */
+	  if (shift_count < inner_prec
+	      && (UINTVAL (XVECEXP (op1, 0, 0))
+		  == (HOST_WIDE_INT_1U << (inner_prec - shift_count)) - 1))
+	    return simplify_gen_binary (LSHIFTRT, mode, XEXP (op0, 0), XEXP (op0, 1));
+	}
+
       tem = simplify_byte_swapping_operation (code, mode, op0, op1);
       if (tem)
 	return tem;
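
The guard in the hunk above can be modelled on plain integers to see which masks qualify. This is a minimal sketch and not GCC code: `mask_selects_lshiftrt` and `check_examples` are hypothetical names, the rtx accessors are replaced by integer parameters, and the model additionally rejects a zero shift count (an assumption of this sketch, made only to keep the host-side shift in range for 64-bit elements).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the patch's condition: MASK_ELT is the
   duplicated mask element, SHIFT_COUNT the ASHIFTRT amount, and
   INNER_PREC the precision of one vector element in bits.  */
bool
mask_selects_lshiftrt (uint64_t mask_elt, uint64_t shift_count,
                       unsigned inner_prec)
{
  /* Model-only restriction: a zero count or a count outside the
     element width is rejected so the shift below stays in 1..63.  */
  if (inner_prec > 64 || shift_count == 0 || shift_count >= inner_prec)
    return false;

  /* The mask must keep exactly the low (inner_prec - shift_count) bits.  */
  return mask_elt == (UINT64_C (1) << (inner_prec - shift_count)) - 1;
}

/* The two examples from the thread, plus one mask that must not match.  */
void
check_examples (void)
{
  assert (mask_selects_lshiftrt (0xff, 8, 16));     /* v8hi, shift 8 */
  assert (mask_selects_lshiftrt (0xffff, 16, 32));  /* v4si, shift 16 */
  assert (!mask_selects_lshiftrt (0x7f, 8, 16));    /* mask too narrow */
}
```

A mask narrower than the shifted-in field, such as 0x7f here, also clears a data bit, so the AND has to stay and the model returns false for it.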