Message ID | 232a38b1-76c2-476d-1be0-a1958e5624bb@linux.ibm.com |
---|---|
State | New |
Headers | show |
Series | Check rrotate optab first when transforming lrotate | expand |
On Mon, Jul 15, 2019 at 04:50:13PM +0800, Kewen.Lin wrote: > In match.pd and expmed.c, we have some codes to transform lrotate to > rrotate if rotation count is const. But they don't consider the target > whether supports the rrotate. It leads to some suboptimal generated > code since some optimization can't get expected result by querying > target optab. One typical case is that we miss some rotation > vectorization opportunity on Power. This will not do the right thing if neither lrotate nor rrotate is supported, you want to canonicalize in that case IMHO. The code formatting is wrong (|| at the end of line, overly long lines). Finally, what is the reason why Power doesn't support one of the rotates? Especially for rotates by constant, you can transform those in the define_expand. Canonicalizing code is highly desirable during GIMPLE, it means if you have say left rotate by 23 in one spot and right rotate by bitsize - 23 in another spot, e.g. SCCVN can merge that code. So, IMNSHO, just improve the backend. Jakub
On Mon, Jul 15, 2019 at 10:59 AM Jakub Jelinek <jakub@redhat.com> wrote: > > On Mon, Jul 15, 2019 at 04:50:13PM +0800, Kewen.Lin wrote: > > In match.pd and expmed.c, we have some codes to transform lrotate to > > rrotate if rotation count is const. But they don't consider the target > > whether supports the rrotate. It leads to some suboptimal generated > > code since some optimization can't get expected result by querying > > target optab. One typical case is that we miss some rotation > > vectorization opportunity on Power. > > This will not do the right thing if neither lrotate nor rrotate is > supported, you want to canonicalize in that case IMHO. > The code formatting is wrong (|| at the end of line, overly long lines). > > Finally, what is the reason why Power doesn't support one of the rotates? > Especially for rotates by constant, you can transform those in the > define_expand. > > Canonicalizing code is highly desirable during GIMPLE, it means if you have > say left rotate by 23 in one spot and right rotate by bitsize - 23 in > another spot, e.g. SCCVN can merge that code. > > So, IMNSHO, just improve the backend. Agreed. Or alternatively improve the generic expander. Richard. > Jakub
Jakub Jelinek <jakub@redhat.com> writes: > On Mon, Jul 15, 2019 at 04:50:13PM +0800, Kewen.Lin wrote: >> In match.pd and expmed.c, we have some codes to transform lrotate to >> rrotate if rotation count is const. But they don't consider the target >> whether supports the rrotate. It leads to some suboptimal generated >> code since some optimization can't get expected result by querying >> target optab. One typical case is that we miss some rotation >> vectorization opportunity on Power. > > This will not do the right thing if neither lrotate nor rrotate is > supported, you want to canonicalize in that case IMHO. > The code formatting is wrong (|| at the end of line, overly long lines). > > Finally, what is the reason why Power doesn't support one of the rotates? > Especially for rotates by constant, you can transform those in the > define_expand. > > Canonicalizing code is highly desirable during GIMPLE, it means if you have > say left rotate by 23 in one spot and right rotate by bitsize - 23 in > another spot, e.g. SCCVN can merge that code. > > So, IMNSHO, just improve the backend. Agreed FWIW, if the target supports rotates by a constant. If it only supports rotates by a variable then I think expand should try to use the rrotate optab for lrotates. The current code looks odd though. The comment above the expmed.c hunk is: /* Canonicalize rotates by constant amount. If op1 is bitsize / 2, prefer left rotation, if op1 is from bitsize / 2 + 1 to bitsize - 1, use other direction of rotate with 1 .. bitsize / 2 - 1 amount instead. */ whereas rtl.texi says: @item (rotate:@var{m} @var{x} @var{c}) @itemx (rotatert:@var{m} @var{x} @var{c}) Similar but represent left and right rotate. If @var{c} is a constant, use @code{rotate}. simplify-rtx.c seems to stick to the rtl.texi version and use ROTATE rather than ROTATERT for constant rotates. If a target prefers the expmed.c version then I think it should handle that in the asm template. Thanks, Richard
Hi Jakub, on 2019/7/15 下午4:59, Jakub Jelinek wrote: > On Mon, Jul 15, 2019 at 04:50:13PM +0800, Kewen.Lin wrote: >> In match.pd and expmed.c, we have some codes to transform lrotate to >> rrotate if rotation count is const. But they don't consider the target >> whether supports the rrotate. It leads to some suboptimal generated >> code since some optimization can't get expected result by querying >> target optab. One typical case is that we miss some rotation >> vectorization opportunity on Power. > > This will not do the right thing if neither lrotate nor rrotate is > supported, you want to canonicalize in that case IMHO. > The code formatting is wrong (|| at the end of line, overly long lines). > > Finally, what is the reason why Power doesn't support one of the rotates? > Especially for rotates by constant, you can transform those in the > define_expand. > > Canonicalizing code is highly desirable during GIMPLE, it means if you have > say left rotate by 23 in one spot and right rotate by bitsize - 23 in > another spot, e.g. SCCVN can merge that code. > > So, IMNSHO, just improve the backend. > OK, I see. Thanks for the explanation. I'll try to fix it in the backend. Thanks, Kewen
On Mon, Jul 15, 2019 at 10:59:29AM +0200, Jakub Jelinek wrote: > On Mon, Jul 15, 2019 at 04:50:13PM +0800, Kewen.Lin wrote: > > In match.pd and expmed.c, we have some codes to transform lrotate to > > rrotate if rotation count is const. But they don't consider the target > > whether supports the rrotate. It leads to some suboptimal generated > > code since some optimization can't get expected result by querying > > target optab. One typical case is that we miss some rotation > > vectorization opportunity on Power. > > This will not do the right thing if neither lrotate nor rrotate is > supported, you want to canonicalize in that case IMHO. > The code formatting is wrong (|| at the end of line, overly long lines). > > Finally, what is the reason why Power doesn't support one of the rotates? Because it is pointless duplication, and we have some dozen rotate patterns already. Canonicalising this representation is highly desirable. Everything in RTL already does work fine, fwiw (and did since before rotatert existed). Segher
diff --git a/gcc/expmed.c b/gcc/expmed.c index d7f8e9a5d76..6cd3341a0e4 100644 --- a/gcc/expmed.c +++ b/gcc/expmed.c @@ -2491,7 +2491,9 @@ expand_shift_1 (enum tree_code code, machine_mode mode, rtx shifted, if (rotate && CONST_INT_P (op1) && IN_RANGE (INTVAL (op1), GET_MODE_BITSIZE (scalar_mode) / 2 + left, - GET_MODE_BITSIZE (scalar_mode) - 1)) + GET_MODE_BITSIZE (scalar_mode) - 1) + && ((left && optab_handler (rrotate_optab, mode) != CODE_FOR_nothing) || + (!left && optab_handler (lrotate_optab, mode) != CODE_FOR_nothing))) { op1 = gen_int_shift_amount (mode, (GET_MODE_BITSIZE (scalar_mode) - INTVAL (op1))); diff --git a/gcc/match.pd b/gcc/match.pd index 88dae4231d8..a63cd15e129 100644 --- a/gcc/match.pd +++ b/gcc/match.pd @@ -2418,11 +2418,14 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT) /* Rewrite an LROTATE_EXPR by a constant into an RROTATE_EXPR by a new constant. */ + (simplify (lrotate @0 INTEGER_CST@1) + (if ((VECTOR_TYPE_P (type) && target_supports_op_p (type, RROTATE_EXPR, optab_vector)) + || (!VECTOR_TYPE_P (type) && target_supports_op_p (type, RROTATE_EXPR, optab_scalar))) (rrotate @0 { const_binop (MINUS_EXPR, TREE_TYPE (@1), build_int_cst (TREE_TYPE (@1), - element_precision (type)), @1); })) + element_precision (type)), @1); }))) /* Turn (a OP c1) OP c2 into a OP (c1+c2). */ (for op (lrotate rrotate rshift lshift) diff --git a/gcc/testsuite/gcc.target/powerpc/vec_rotate.c b/gcc/testsuite/gcc.target/powerpc/vec_rotate.c new file mode 100644 index 00000000000..16d1f297d2d --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/vec_rotate.c @@ -0,0 +1,47 @@ +/* { dg-options "-O3" } */ +/* { dg-require-effective-target powerpc_vsx_ok } */ + +/* Check LROTATE to RROTATE transformation on const rotation count is disabled + on Power by checking optab, it helps vectorizer to exploit vector rotation + instructions. */ + +#define N 256 +unsigned long long sud[N], rud[N]; +unsigned int suw[N], ruw[N]; +unsigned short suh[N], ruh[N]; +unsigned char sub[N], rub[N]; + +void +testULL () +{ + for (int i = 0; i < 256; ++i) + rud[i] = (sud[i] >> 8) | (sud[i] << (sizeof (sud[0]) * 8 - 8)); +} + +void +testUW () +{ + for (int i = 0; i < 256; ++i) + ruw[i] = (suw[i] >> 8) | (suw[i] << (sizeof (suw[0]) * 8 - 8)); +} + +void +testUH () +{ + for (int i = 0; i < 256; ++i) + ruh[i] = (unsigned short) (suh[i] >> 9) + | (unsigned short) (suh[i] << (sizeof (suh[0]) * 8 - 9)); +} + +void +testUB () +{ + for (int i = 0; i < 256; ++i) + rub[i] = (unsigned char) (sub[i] >> 5) + | (unsigned char) (sub[i] << (sizeof (sub[0]) * 8 - 5)); +} + +/* { dg-final { scan-assembler {\mvrld\M} } } */ +/* { dg-final { scan-assembler {\mvrlw\M} } } */ +/* { dg-final { scan-assembler {\mvrlh\M} } } */ +/* { dg-final { scan-assembler {\mvrlb\M} } } */