Message ID | alpine.DEB.2.02.1209091258520.5537@stedding.saclay.inria.fr |
---|---|
State | New |
Headers | show |
On Sun, 9 Sep 2012, Marc Glisse wrote: > Hello, > > this patch lets the compiler try to rewrite: > > (vec_concat (vec_select x [a]) (vec_select x [b])) > > as: > > vec_select x [a b] > > or even just "x" if appropriate. > > In a first iteration I was restricting it to b-a==1, but it seemed better not > to: it helps for {v[1],v[0]} and doesn't change anything for unknown > patterns. > > Note that I am planning to do a similar optimization at tree level, but it > shouldn't make this one useless because such patterns can be created during > rtl passes. The testcase may need an additional -fno-tree-xxx to still be > useful at that point though. Since the tree-ssa patch was reviewed faster, assume there is a -fno-tree-forwprop in dg-options for the testcase. > bootstrap+testsuite on x86_64-linux-gnu. > > 2012-09-09 Marc Glisse <marc.glisse@inria.fr> > > gcc/ > * simplify-rtx.c (simplify_binary_operation_1): Handle vec_concat > of vec_selects from the same vector. > > gcc/testsuite/ > * gcc.target/i386/vect-rebuild.c: New testcase. http://gcc.gnu.org/ml/gcc-patches/2012-09/msg00540.html
Hello, RTL optimizers looks like a promising category, and Eric like a nice maintainer for it, adding in Cc: ;-) (I bothered Richard Sandiford last time) On Tue, 11 Sep 2012, Marc Glisse wrote: > On Sun, 9 Sep 2012, Marc Glisse wrote: > >> Hello, >> >> this patch lets the compiler try to rewrite: >> >> (vec_concat (vec_select x [a]) (vec_select x [b])) >> >> as: >> >> vec_select x [a b] >> >> or even just "x" if appropriate. >> >> In a first iteration I was restricting it to b-a==1, but it seemed better >> not to: it helps for {v[1],v[0]} and doesn't change anything for unknown >> patterns. >> >> Note that I am planning to do a similar optimization at tree level, but it >> shouldn't make this one useless because such patterns can be created during >> rtl passes. The testcase may need an additional -fno-tree-xxx to still be >> useful at that point though. > > Since the tree-ssa patch was reviewed faster, assume there is a > -fno-tree-forwprop in dg-options for the testcase. > >> bootstrap+testsuite on x86_64-linux-gnu. >> >> 2012-09-09 Marc Glisse <marc.glisse@inria.fr> >> >> gcc/ >> * simplify-rtx.c (simplify_binary_operation_1): Handle vec_concat >> of vec_selects from the same vector. >> >> gcc/testsuite/ >> * gcc.target/i386/vect-rebuild.c: New testcase. > > http://gcc.gnu.org/ml/gcc-patches/2012-09/msg00540.html
> this patch lets the compiler try to rewrite: > > (vec_concat (vec_select x [a]) (vec_select x [b])) > > as: > > vec_select x [a b] > > or even just "x" if appropriate. > > In a first iteration I was restricting it to b-a==1, but it seemed better > not to: it helps for {v[1],v[0]} and doesn't change anything for unknown > patterns. > > Note that I am planning to do a similar optimization at tree level, but it > shouldn't make this one useless because such patterns can be created > during rtl passes. The testcase may need an additional -fno-tree-xxx to > still be useful at that point though. > > > bootstrap+testsuite on x86_64-linux-gnu. > > 2012-09-09 Marc Glisse <marc.glisse@inria.fr> > > gcc/ > * simplify-rtx.c (simplify_binary_operation_1): Handle vec_concat > of vec_selects from the same vector. * simplify-rtx.c (simplify_binary_operation_1) <VEC_CONCAT>: Handle VEC_SELECTs from the same vector. > gcc/testsuite/ > * gcc.target/i386/vect-rebuild.c: New testcase. OK, but: 1) Always add a comment describing the simplification when you add one, 2) The condition: > + if (GET_MODE (XEXP (trueop0, 0)) == mode > + && INTVAL (XVECEXP (XEXP (trueop1, 1), 0, 0)) > + - INTVAL (XVECEXP (XEXP (trueop0, 1), 0, 0)) == 1) > + return XEXP (trueop0, 0); can be simplified: if GET_MODE (XEXP (trueop0, 0)) == mode, then XEXP (trueop0, 0) is a 2-element vector so the only possible case is (0,1). That would probably even be more correct since you don't test CONST_INT_P for the indices, while the test is done in the VEC_SELECT case. Why not generalizing to all kinds of VEC_SELECTs instead of just scalar ones?
Index: simplify-rtx.c =================================================================== --- simplify-rtx.c (revision 191106) +++ simplify-rtx.c (working copy) @@ -3357,20 +3357,38 @@ simplify_binary_operation_1 (enum rtx_co if (!VECTOR_MODE_P (op1_mode)) RTVEC_ELT (v, i) = trueop1; else RTVEC_ELT (v, i) = CONST_VECTOR_ELT (trueop1, i - in_n_elts); } } return gen_rtx_CONST_VECTOR (mode, v); } + + if (GET_CODE (trueop0) == VEC_SELECT + && GET_CODE (trueop1) == VEC_SELECT + && !VECTOR_MODE_P (op0_mode) + && !VECTOR_MODE_P (op1_mode) + && rtx_equal_p (XEXP (trueop0, 0), XEXP (trueop1, 0))) + { + if (GET_MODE (XEXP (trueop0, 0)) == mode + && INTVAL (XVECEXP (XEXP (trueop1, 1), 0, 0)) + - INTVAL (XVECEXP (XEXP (trueop0, 1), 0, 0)) == 1) + return XEXP (trueop0, 0); + + rtvec vec = rtvec_alloc (2); + RTVEC_ELT (vec, 0) = XVECEXP (XEXP (trueop0, 1), 0, 0); + RTVEC_ELT (vec, 1) = XVECEXP (XEXP (trueop1, 1), 0, 0); + return simplify_gen_binary (VEC_SELECT, mode, XEXP (trueop0, 0), + gen_rtx_PARALLEL (VOIDmode, vec)); + } } return 0; default: gcc_unreachable (); } return 0; } Index: testsuite/gcc.target/i386/vect-rebuild.c =================================================================== --- testsuite/gcc.target/i386/vect-rebuild.c (revision 0) +++ testsuite/gcc.target/i386/vect-rebuild.c (revision 0) @@ -0,0 +1,27 @@ +/* { dg-do compile } */ +/* { dg-options "-O -mavx" } */ + +typedef double v2df __attribute__ ((__vector_size__ (16))); +typedef double v4df __attribute__ ((__vector_size__ (32))); + +v2df f (v2df x) +{ + v2df xx = { x[0], x[1] }; + return xx; +} + +v2df g (v2df x) +{ + v2df xx = { x[1], x[0] }; + return xx; +} + +v2df h (v4df x) +{ + v2df xx = { x[2], x[3] }; + return xx; +} + +/* { dg-final { scan-assembler-not "unpck" } } */ +/* { dg-final { scan-assembler-times "\tv?permilpd\[ \t\]" 1 } } */ +/* { dg-final { scan-assembler-times "\tv?extractf128\[ \t\]" 1 } } */