Message ID | 20240521051220.8653-1-hongtao.liu@intel.com |
---|---|
State | New |
Series | [1/2] Simplify (AND (ASHIFTRT A imm) mask) to (LSHIFTRT A imm) for vector mode. |
CC for review.

On Tue, May 21, 2024 at 1:12 PM liuhongt <hongtao.liu@intel.com> wrote:
>
> When mask is (1 << (prec - imm)) - 1, which is used to clear the upper bits
> of A, then it can be simplified to LSHIFTRT.
>
> i.e. Simplify
> (and:v8hi
>   (ashiftrt:v8hi A 8)
>   (const_vector 0xff x8))
> to
> (lshiftrt:v8hi A 8)
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
>
> gcc/ChangeLog:
>
>         PR target/114428
>         * simplify-rtx.cc
>         (simplify_context::simplify_binary_operation_1):
>         Simplify (AND (ASHIFTRT A imm) mask) to (LSHIFTRT A imm) for
>         specific mask.
> ---
>  gcc/simplify-rtx.cc | 25 +++++++++++++++++++++++++
>  1 file changed, 25 insertions(+)
>
> diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
> index 53f54d1d392..6c91409200e 100644
> --- a/gcc/simplify-rtx.cc
> +++ b/gcc/simplify-rtx.cc
> @@ -4021,6 +4021,31 @@ simplify_context::simplify_binary_operation_1 (rtx_code code,
>             return tem;
>         }
>
> +      /* (and:v4si
> +          (ashiftrt:v4si A 16)
> +          (const_vector: 0xffff x4))
> +        is just (lshiftrt:v4si A 16).  */
> +      if (VECTOR_MODE_P (mode) && GET_CODE (op0) == ASHIFTRT
> +         && (CONST_INT_P (XEXP (op0, 1))
> +             || (GET_CODE (XEXP (op0, 1)) == CONST_VECTOR
> +                 && CONST_VECTOR_DUPLICATE_P (XEXP (op0, 1))))
> +         && GET_CODE (op1) == CONST_VECTOR
> +         && CONST_VECTOR_DUPLICATE_P (op1))
> +       {
> +         unsigned HOST_WIDE_INT shift_count
> +           = (CONST_INT_P (XEXP (op0, 1))
> +              ? UINTVAL (XEXP (op0, 1))
> +              : UINTVAL (XVECEXP (XEXP (op0, 1), 0, 0)));
> +         unsigned HOST_WIDE_INT inner_prec
> +           = GET_MODE_PRECISION (GET_MODE_INNER (mode));
> +
> +         /* Avoid UD shift count.  */
> +         if (shift_count < inner_prec
> +             && (UINTVAL (XVECEXP (op1, 0, 0))
> +                 == (HOST_WIDE_INT_1U << (inner_prec - shift_count)) - 1))
> +           return simplify_gen_binary (LSHIFTRT, mode, XEXP (op0, 0), XEXP (op0, 1));
> +       }
> +
>        tem = simplify_byte_swapping_operation (code, mode, op0, op1);
>        if (tem)
>         return tem;
> --
> 2.31.1
>
On 5/23/24 8:25 PM, Hongtao Liu wrote:
> CC for review.
>
> On Tue, May 21, 2024 at 1:12 PM liuhongt <hongtao.liu@intel.com> wrote:
>>
>> When mask is (1 << (prec - imm)) - 1, which is used to clear the upper bits
>> of A, then it can be simplified to LSHIFTRT.
>>
>> i.e. Simplify
>> (and:v8hi
>>   (ashiftrt:v8hi A 8)
>>   (const_vector 0xff x8))
>> to
>> (lshiftrt:v8hi A 8)
>>
>> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
>> Ok for trunk?
>>
>> gcc/ChangeLog:
>>
>>         PR target/114428
>>         * simplify-rtx.cc
>>         (simplify_context::simplify_binary_operation_1):
>>         Simplify (AND (ASHIFTRT A imm) mask) to (LSHIFTRT A imm) for
>>         specific mask.

Can you add a testcase for this? I don't mind if it's x86 specific and does a bit of asm scanning.

Also note that the context for this patch has changed, so it won't automatically apply. So be extra careful when updating so that it goes into the right place (all the more reason to have a testcase validating that the optimization works correctly).

I think the patch itself is fine. So further review is just for the testcase and should be easy.

jeff

ps. It seems to help RISC-V as well :-)
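For illustration only, here is a rough sketch of the source shape such a testcase might exercise, written with the GNU C vector extensions. It is not the test that was submitted: the type names `v8hi`/`v8uhi` and all function names are hypothetical, and the dg directives and asm scan patterns are left out because the thread does not give them.

```c
#include <assert.h>
#include <stdint.h>

/* GNU C vector extension: eight 16-bit lanes, matching the v8hi mode
   used in the commit message.  Type names are this sketch's own.  */
typedef int16_t v8hi __attribute__ ((vector_size (16)));
typedef uint16_t v8uhi __attribute__ ((vector_size (16)));

/* The shape the patch targets: an arithmetic right shift followed by
   a mask that clears the copies of the sign bit.  */
v8hi
ashiftrt_and_mask (v8hi a)
{
  const v8hi mask = { 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff, 0xff };
  return (a >> 8) & mask;
}

/* Reference form: a logical right shift, done on unsigned lanes.  */
v8hi
lshiftrt_ref (v8hi a)
{
  return (v8hi) ((v8uhi) a >> 8);
}

/* Return 1 if both forms agree on every lane of A, 0 otherwise.  */
int
forms_agree (v8hi a)
{
  v8hi x = ashiftrt_and_mask (a);
  v8hi y = lshiftrt_ref (a);
  for (int i = 0; i < 8; i++)
    if (x[i] != y[i])
      return 0;
  return 1;
}
```

A runtime check like `forms_agree` only confirms that the two forms compute the same values. Whether the compiler actually emits a single logical shift would still need the asm scan mentioned above.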
diff --git a/gcc/simplify-rtx.cc b/gcc/simplify-rtx.cc
index 53f54d1d392..6c91409200e 100644
--- a/gcc/simplify-rtx.cc
+++ b/gcc/simplify-rtx.cc
@@ -4021,6 +4021,31 @@ simplify_context::simplify_binary_operation_1 (rtx_code code,
            return tem;
        }
 
+      /* (and:v4si
+          (ashiftrt:v4si A 16)
+          (const_vector: 0xffff x4))
+        is just (lshiftrt:v4si A 16).  */
+      if (VECTOR_MODE_P (mode) && GET_CODE (op0) == ASHIFTRT
+         && (CONST_INT_P (XEXP (op0, 1))
+             || (GET_CODE (XEXP (op0, 1)) == CONST_VECTOR
+                 && CONST_VECTOR_DUPLICATE_P (XEXP (op0, 1))))
+         && GET_CODE (op1) == CONST_VECTOR
+         && CONST_VECTOR_DUPLICATE_P (op1))
+       {
+         unsigned HOST_WIDE_INT shift_count
+           = (CONST_INT_P (XEXP (op0, 1))
+              ? UINTVAL (XEXP (op0, 1))
+              : UINTVAL (XVECEXP (XEXP (op0, 1), 0, 0)));
+         unsigned HOST_WIDE_INT inner_prec
+           = GET_MODE_PRECISION (GET_MODE_INNER (mode));
+
+         /* Avoid UD shift count.  */
+         if (shift_count < inner_prec
+             && (UINTVAL (XVECEXP (op1, 0, 0))
+                 == (HOST_WIDE_INT_1U << (inner_prec - shift_count)) - 1))
+           return simplify_gen_binary (LSHIFTRT, mode, XEXP (op0, 0), XEXP (op0, 1));
+       }
+
       tem = simplify_byte_swapping_operation (code, mode, op0, op1);
       if (tem)
        return tem;
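The condition the patch tests can be modelled per lane in plain C, which may make the identity easier to check by hand. The following is a minimal sketch and not GCC code: `mask_clears_sign_copies` and `equivalent_for_hi` are hypothetical names, and the `prec - shift < 64` bound is an addition of this sketch that keeps its own 64-bit shift well defined.

```c
#include <assert.h>
#include <stdint.h>

/* Per-lane model of the patch's guard: the shift count must be in
   range and the mask must keep exactly the low (prec - shift) bits.
   The "prec - shift < 64" test is specific to this sketch.  */
int
mask_clears_sign_copies (unsigned prec, unsigned shift, uint64_t mask)
{
  return shift < prec
	 && prec - shift < 64
	 && mask == (UINT64_C (1) << (prec - shift)) - 1;
}

/* Compare both forms over every 16-bit value for a given SHIFT
   (0 <= SHIFT < 16).  Return 1 if they always agree, 0 otherwise.  */
int
equivalent_for_hi (unsigned shift)
{
  uint16_t mask = (uint16_t) ((1u << (16 - shift)) - 1);
  for (uint32_t v = 0; v <= 0xffff; v++)
    {
      int16_t s = (int16_t) v;
      /* Signed >> is arithmetic under GCC; ISO C leaves the result
	 for negative values implementation-defined.  */
      uint16_t via_and = (uint16_t) ((s >> shift) & mask);
      uint16_t via_lshr = (uint16_t) ((uint16_t) v >> shift);
      if (via_and != via_lshr)
	return 0;
    }
  return 1;
}
```

The arithmetic shift fills the top `shift` bits with copies of the sign bit, and the mask clears exactly those bits, which is why the result matches a logical shift whenever the guard holds.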