Message ID | CABYV9SW_krtcEK_V21uaN7YgfRwU-Td+fGh7Q_-7Ha0VgHArSA@mail.gmail.com |
---|---|
State | New |
Ping.

Richard, the patch in the attachment should be submitted asap. The
other problem could wait for a while.

Thanks,
Artem.

On Tue, Oct 4, 2011 at 12:04 AM, Artem Shinkarov <artyom.shinkaroff@gmail.com> wrote:
> On Mon, Oct 3, 2011 at 6:12 PM, Richard Henderson <rth@redhat.com> wrote:
>> On 10/03/2011 09:43 AM, Artem Shinkarov wrote:
>>> Hi, Richard
>>>
>>> There is a problem with the testcases of the patch you have committed
>>> for me. The code in every test-case is doubled. Could you please,
>>> apply the following patch, otherwise it would fail all the tests from
>>> the vector-shuffle-patch would fail.
>>
>> Huh. Dunno what happened there. Fixed.
>>
>>> Also, if it is possible, could you change my name from in the
>>> ChangeLog from "Artem Shinkarov" to "Artjoms Sinkarovs". The last
>>> version is the way I am spelled in the passport, and the name I use in
>>> the ChangeLog.
>>
>> Fixed.
>>
>>
>> r~
>>
>
> Richard, there was a problem causing segfault in ix86_expand_vshuffle
> which I have fixed with the patch attached.
>
> Another thing I cannot figure out is the following case:
> #define vector(elcount, type) \
> __attribute__((vector_size((elcount)*sizeof(type)))) type
>
> vector (8, short) __attribute__ ((noinline))
> f (vector (8, short) x, vector (8, short) y, vector (8, short) mask) {
>   return __builtin_shuffle (x, y, mask);
> }
>
> int main (int argc, char *argv[]) {
>   vector (8, short) v0 = {argc, 1,2,3,4,5,6,7};
>   vector (8, short) v1 = {argc, 1,argc,3,4,5,argc,7};
>   vector (8, short) mask0 = {0,2,3,1,4,5,6,7};
>   vector (8, short) v2;
>   int i;
>
>   v2 = f (v0, v1, mask0);
>   /* v2 = __builtin_shuffle (v0, v1, mask0); */
>   for (i = 0; i < 8; i ++)
>     __builtin_printf ("%i, ", v2[i]);
>
>   return 0;
> }
>
> I am compiling with support of ssse3, in my case it is ./xgcc -B. b.c
> -O3 -mtune=core2 -march=core2
>
> And I get 1, 1, 1, 3, 4, 5, 1, 7, on the output, which is wrong.
>
> But if I will call __builtin_shuffle directly, then the answer is correct.
>
> Any ideas?
>
>
> Thanks,
> Artem.
>
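For reference when checking the test case quoted above, here is a minimal scalar sketch of what the two-operand `__builtin_shuffle (x, y, mask)` is documented to compute: each mask element is taken modulo 2*N, indices below N select from the first vector and the rest from the second. The function name `ref_shuffle2` and the fixed `N` are hypothetical, chosen only for this illustration; this is not code from the patch or from GCC.

```c
#include <assert.h>
#include <stddef.h>

#define N 8  /* elements per vector, matching vector (8, short) */

/* Hypothetical scalar model of the two-operand shuffle.
   Each mask element is reduced modulo 2*N; indices 0..N-1 select
   from x, indices N..2*N-1 select from y.  */
void
ref_shuffle2 (const short x[N], const short y[N],
              const short mask[N], short out[N])
{
  size_t i;

  assert (x && y && mask && out);
  for (i = 0; i < N; i++)
    {
      /* Reduce via unsigned so that negative mask values wrap
         the same way as masking off the low bits.  */
      unsigned idx = (unsigned short) mask[i] % (2 * N);
      out[i] = idx < N ? x[idx] : y[idx - N];
    }
}
```

Assuming the program is run without arguments (so `argc` is 1), this model yields 1, 2, 3, 1, 4, 5, 6, 7 for `v0`, `v1` and `mask0` from the test case. The reported output 1, 1, 1, 3, 4, 5, 1, 7 is the contents of `v1` in that situation.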
On 10/04/2011 08:18 AM, Artem Shinkarov wrote:
> Ping.
>
> Richard, the patch in the attachment should be submitted asap. The
> other problem could wait for a while.

The patch in the attachment is wrong too.

I've re-written the x86 backend support, adding TARGET_XOP in the process.
I've also re-written the test cases so that they actually test what we wanted.

Patch to follow once testing is complete.


r~
Index: gcc/config/i386/i386.c
===================================================================
--- gcc/config/i386/i386.c  (revision 179464)
+++ gcc/config/i386/i386.c  (working copy)
@@ -19312,14 +19312,17 @@ ix86_expand_vshuffle (rtx operands[])
       xops[1] = operands[1];
       xops[2] = operands[2];
       xops[3] = gen_rtx_EQ (mode, mask, w_vector);
-      xops[4] = t1;
-      xops[5] = t2;
+      xops[4] = t2;
+      xops[5] = t1;
 
       return ix86_expand_int_vcond (xops);
     }
 
-  /* mask = mask * {w, w, ...} */
-  new_mask = expand_simple_binop (maskmode, MULT, new_mask, w_vector,
+  /* mask = mask * {16/w, 16/w, ...} */
+  for (i = 0; i < w; i++)
+    vec[i] = GEN_INT (16/w);
+  vt = gen_rtx_CONST_VECTOR (maskmode, gen_rtvec_v (w, vec));
+  new_mask = expand_simple_binop (maskmode, MULT, new_mask, vt,
                                   NULL_RTX, 0, OPTAB_DIRECT);
 
   /* Convert mask to vector of chars. */
@@ -19332,7 +19335,7 @@ ix86_expand_vshuffle (rtx operands[])
      ... */
   for (i = 0; i < w; i++)
     for (j = 0; j < 16/w; j++)
-      vec[i*w+j] = GEN_INT (i*16/w);
+      vec[i*(16/w)+j] = GEN_INT (i*16/w);
 
   vt = gen_rtx_CONST_VECTOR (V16QImode, gen_rtvec_v (16, vec));
   vt = force_reg (V16QImode, vt);
@@ -19344,7 +19347,7 @@ ix86_expand_vshuffle (rtx operands[])
      new_mask = new_mask + {0,1,..,16/w, 0,1,..,16/w, ...} */
   for (i = 0; i < w; i++)
     for (j = 0; j < 16/w; j++)
-      vec[i*w+j] = GEN_INT (j);
+      vec[i*(16/w)+j] = GEN_INT (j);
 
   vt = gen_rtx_CONST_VECTOR (V16QImode, gen_rtvec_v (16, vec));
   new_mask = expand_simple_binop (V16QImode, PLUS, new_mask, vt,
@@ -19386,8 +19389,8 @@ ix86_expand_vshuffle (rtx operands[])
       xops[1] = operands[1];
       xops[2] = operands[2];
       xops[3] = gen_rtx_EQ (mode, mask, w_vector);
-      xops[4] = t1;
-      xops[5] = t2;
+      xops[4] = t2;
+      xops[5] = t1;
 
       return ix86_expand_int_vcond (xops);
     }
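The two index changes in the middle hunks can be checked in isolation. Below is a minimal sketch, assuming `w` is the number of vector elements and a plain `int vec[16]` stands in for the 16-entry array that the backend fills through `GEN_INT`. The helper names `fill_element_base`, `fill_byte_offset` and `old_max_index` are hypothetical and exist only for this illustration.

```c
#include <assert.h>

/* For each of the w elements, write the element's starting byte
   offset (i * 16/w) into every one of its 16/w byte slots.  */
void
fill_element_base (int w, int vec[16])
{
  int i, j;

  assert (w > 0 && w <= 16 && 16 % w == 0);
  for (i = 0; i < w; i++)
    for (j = 0; j < 16 / w; j++)
      vec[i * (16 / w) + j] = i * 16 / w;
}

/* For each byte slot, write its offset j within its own element.  */
void
fill_byte_offset (int w, int vec[16])
{
  int i, j;

  assert (w > 0 && w <= 16 && 16 % w == 0);
  for (i = 0; i < w; i++)
    for (j = 0; j < 16 / w; j++)
      vec[i * (16 / w) + j] = j;
}

/* Largest index that the pre-patch expression i*w+j reaches
   over the same loop bounds.  */
int
old_max_index (int w)
{
  assert (w > 0 && w <= 16 && 16 % w == 0);
  return (w - 1) * w + (16 / w - 1);
}
```

With `w` = 8 the corrected index `i*(16/w)+j` covers exactly slots 0 through 15, while `old_max_index (8)` is 57, which is past the end of a 16-entry array. The two expressions coincide only for `w` = 4, where `w` equals `16/w`.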