Message ID | 1390520607.3449.92.camel@gnopaine |
---|---|
State | New |
Headers | show |
On Thu, Jan 23, 2014 at 6:43 PM, Bill Schmidt <wschmidt@linux.vnet.ibm.com> wrote: > Hi, > > While testing another patch, I hit a regression at -O2 for two of the > vector shuffle tests. This patch fixes the problem. > > The problem was introduced with the little endian fixes for > VEC_PERM_EXPR. The original change performed the necessary > transformation at expand time, but this is incorrect. This creates a > technically incorrect representation in the RTL, and RTL simplification > can subsequently fold the expression in an unexpected way. The correct > fix is to delay the transformation until final code generation. > > Bootstrapped and tested on powerpc64{,le}-unknown-linux-gnu with no > regressions. Is this ok for trunk? I will also plan to backport this > to ibm/gcc-4_8-branch after burn-in. > > Thanks, > Bill > > > 2014-01-23 Bill Schmidt <wschmidt@linux.vnet.ibm.com> > > * config/rs6000/rs6000.c (rs6000_expand_vec_perm_const_1): Remove > correction for little endian... > * config/rs6000/vsx.md (vsx_xxpermdi2_<mode>_1): ...and move it to > here. Okay. Would you please move the comment also? Thanks, David
On Thu, 2014-01-23 at 19:42 -0500, David Edelsohn wrote: > On Thu, Jan 23, 2014 at 6:43 PM, Bill Schmidt > <wschmidt@linux.vnet.ibm.com> wrote: > > Hi, > > > > While testing another patch, I hit a regression at -O2 for two of the > > vector shuffle tests. This patch fixes the problem. > > > > The problem was introduced with the little endian fixes for > > VEC_PERM_EXPR. The original change performed the necessary > > transformation at expand time, but this is incorrect. This creates a > > technically incorrect representation in the RTL, and RTL simplification > > can subsequently fold the expression in an unexpected way. The correct > > fix is to delay the transformation until final code generation. > > > > Bootstrapped and tested on powerpc64{,le}-unknown-linux-gnu with no > > regressions. Is this ok for trunk? I will also plan to backport this > > to ibm/gcc-4_8-branch after burn-in. > > > > Thanks, > > Bill > > > > > > 2014-01-23 Bill Schmidt <wschmidt@linux.vnet.ibm.com> > > > > * config/rs6000/rs6000.c (rs6000_expand_vec_perm_const_1): Remove > > correction for little endian... > > * config/rs6000/vsx.md (vsx_xxpermdi2_<mode>_1): ...and move it to > > here. > > Okay. Would you please move the comment also? Oh, sure. Will do. Bill > > Thanks, David >
Index: gcc/config/rs6000/rs6000.c =================================================================== --- gcc/config/rs6000/rs6000.c (revision 206889) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -30115,22 +30121,6 @@ rs6000_expand_vec_perm_const_1 (rtx target, rtx op vmode = GET_MODE (target); gcc_assert (GET_MODE_NUNITS (vmode) == 2); dmode = mode_for_vector (GET_MODE_INNER (vmode), 4); - - /* For little endian, swap operands and invert/swap selectors - to get the correct xxpermdi. The operand swap sets up the - inputs as a little endian array. The selectors are swapped - because they are defined to use big endian ordering. The - selectors are inverted to get the correct doublewords for - little endian ordering. */ - if (!BYTES_BIG_ENDIAN) - { - int n; - perm0 = 3 - perm0; - perm1 = 3 - perm1; - n = perm0, perm0 = perm1, perm1 = n; - x = op0, op0 = op1, op1 = x; - } - x = gen_rtx_VEC_CONCAT (dmode, op0, op1); v = gen_rtvec (2, GEN_INT (perm0), GEN_INT (perm1)); x = gen_rtx_VEC_SELECT (vmode, x, gen_rtx_PARALLEL (VOIDmode, v)); Index: gcc/config/rs6000/vsx.md =================================================================== --- gcc/config/rs6000/vsx.md (revision 206889) +++ gcc/config/rs6000/vsx.md (working copy) @@ -1634,9 +1634,26 @@ (match_operand 4 "const_2_to_3_operand" "")])))] "VECTOR_MEM_VSX_P (<MODE>mode)" { - int mask = (INTVAL (operands[3]) << 1) | (INTVAL (operands[4]) - 2); + int op3, op4, mask; + + if (BYTES_BIG_ENDIAN) + { + op3 = INTVAL (operands[3]); + op4 = INTVAL (operands[4]); + } + else + { + op3 = 3 - INTVAL (operands[4]); + op4 = 3 - INTVAL (operands[3]); + } + + mask = (op3 << 1) | (op4 - 2); operands[3] = GEN_INT (mask); - return "xxpermdi %x0,%x1,%x2,%3"; + + if (BYTES_BIG_ENDIAN) + return "xxpermdi %x0,%x1,%x2,%3"; + else + return "xxpermdi %x0,%x2,%x1,%3"; } [(set_attr "type" "vecperm")])