diff mbox

[rs6000] Be careful with special permute masks for little endian

Message ID 1382359742.6275.162.camel@gnopaine
State New
Headers show

Commit Message

Bill Schmidt Oct. 21, 2013, 12:49 p.m. UTC
Hi,

In altivec_expand_vec_perm_const, we look for special masks that match
the behavior of specific instructions, so we can use those instructions
rather than load a constant control vector and perform a permute.  Some
of the masks must be treated differently for little endian mode.

The masks that represent merge-high and merge-low operations have
reversed meanings in little-endian, because of the reversed ordering of
the vector elements.  

The masks that represent vector-pack operations remain correct when the
mode of the input operands matches the natural mode of the instruction,
but not otherwise.  This is because the pack instructions always select
the rightmost, low-order bits of the vector element.  There are cases
where we use this, for example, with a V8SI vector matching a vpkuwum
mask in order to select the odd-numbered elements of the vector.  In
little endian mode, this instruction will get us the even-numbered
elements instead.  There is no alternative instruction with the desired
behavior, so I've just disabled use of those masks for little endian
when the mode isn't natural.

These changes fix 32 failures in the test suite for little endian mode.
Bootstrapped and tested on powerpc64{,le}-unknown-linux-gnu with no new
failures.  Is this ok for trunk?

Thanks,
Bill


2013-10-21  Bill Schmidt  <wschmidt@vnet.ibm.com>

	* config/rs6000/rs6000.c (altivec_expand_vec_perm_const): Reverse
	meaning of merge-high and merge-low masks for little endian; avoid
	use of vector-pack masks for little endian for mismatched modes.

Comments

David Edelsohn Oct. 21, 2013, 2:02 p.m. UTC | #1
On Mon, Oct 21, 2013 at 8:49 AM, Bill Schmidt
<wschmidt@linux.vnet.ibm.com> wrote:
> Hi,
>
> In altivec_expand_vec_perm_const, we look for special masks that match
> the behavior of specific instructions, so we can use those instructions
> rather than load a constant control vector and perform a permute.  Some
> of the masks must be treated differently for little endian mode.
>
> The masks that represent merge-high and merge-low operations have
> reversed meanings in little-endian, because of the reversed ordering of
> the vector elements.
>
> The masks that represent vector-pack operations remain correct when the
> mode of the input operands matches the natural mode of the instruction,
> but not otherwise.  This is because the pack instructions always select
> the rightmost, low-order bits of the vector element.  There are cases
> where we use this, for example, with a V8SI vector matching a vpkuwum
> mask in order to select the odd-numbered elements of the vector.  In
> little endian mode, this instruction will get us the even-numbered
> elements instead.  There is no alternative instruction with the desired
> behavior, so I've just disabled use of those masks for little endian
> when the mode isn't natural.
>
> These changes fix 32 failures in the test suite for little endian mode.
> Bootstrapped and tested on powerpc64{,le}-unknown-linux-gnu with no new
> failures.  Is this ok for trunk?
>
> Thanks,
> Bill
>
>
> 2013-10-21  Bill Schmidt  <wschmidt@vnet.ibm.com>
>
>         * config/rs6000/rs6000.c (altivec_expand_vec_perm_const): Reverse
>         meaning of merge-high and merge-low masks for little endian; avoid
>         use of vector-pack masks for little endian for mismatched modes.

Okay.

Thanks, David
diff mbox

Patch

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 203792)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -28837,17 +28838,23 @@  altivec_expand_vec_perm_const (rtx operands[4])
       {  1,  3,  5,  7,  9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31 } },
     { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vpkuwum,
       {  2,  3,  6,  7, 10, 11, 14, 15, 18, 19, 22, 23, 26, 27, 30, 31 } },
-    { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrghb,
+    { OPTION_MASK_ALTIVEC, 
+      BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghb : CODE_FOR_altivec_vmrglb,
       {  0, 16,  1, 17,  2, 18,  3, 19,  4, 20,  5, 21,  6, 22,  7, 23 } },
-    { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrghh,
+    { OPTION_MASK_ALTIVEC,
+      BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghh : CODE_FOR_altivec_vmrglh,
       {  0,  1, 16, 17,  2,  3, 18, 19,  4,  5, 20, 21,  6,  7, 22, 23 } },
-    { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrghw,
+    { OPTION_MASK_ALTIVEC,
+      BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghw : CODE_FOR_altivec_vmrglw,
       {  0,  1,  2,  3, 16, 17, 18, 19,  4,  5,  6,  7, 20, 21, 22, 23 } },
-    { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrglb,
+    { OPTION_MASK_ALTIVEC,
+      BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglb : CODE_FOR_altivec_vmrghb,
       {  8, 24,  9, 25, 10, 26, 11, 27, 12, 28, 13, 29, 14, 30, 15, 31 } },
-    { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrglh,
+    { OPTION_MASK_ALTIVEC,
+      BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglh : CODE_FOR_altivec_vmrghh,
       {  8,  9, 24, 25, 10, 11, 26, 27, 12, 13, 28, 29, 14, 15, 30, 31 } },
-    { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrglw,
+    { OPTION_MASK_ALTIVEC,
+      BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglw : CODE_FOR_altivec_vmrghw,
       {  8,  9, 10, 11, 24, 25, 26, 27, 12, 13, 14, 15, 28, 29, 30, 31 } },
     { OPTION_MASK_P8_VECTOR, CODE_FOR_p8_vmrgew,
       {  0,  1,  2,  3, 16, 17, 18, 19,  8,  9, 10, 11, 24, 25, 26, 27 } },
@@ -28980,6 +28987,22 @@  altivec_expand_vec_perm_const (rtx operands[4])
 	  enum machine_mode omode = insn_data[icode].operand[0].mode;
 	  enum machine_mode imode = insn_data[icode].operand[1].mode;
 
+	  /* For little-endian, don't use vpkuwum and vpkuhum if the
+	     underlying vector type is not V4SI and V8HI, respectively.
+	     For example, using vpkuwum with a V8HI picks up the even
+	     halfwords (BE numbering) when the even halfwords (LE
+	     numbering) are what we need.  */
+	  if (!BYTES_BIG_ENDIAN
+	      && icode == CODE_FOR_altivec_vpkuwum
+	      && GET_CODE (op0) == SUBREG
+	      && GET_MODE (XEXP (op0, 0)) != V4SImode)
+	    continue;
+	  if (!BYTES_BIG_ENDIAN
+	      && icode == CODE_FOR_altivec_vpkuhum
+	      && GET_CODE (op0) == SUBREG
+	      && GET_MODE (XEXP (op0, 0)) != V8HImode)
+	    continue;
+
           /* For little-endian, the two input operands must be swapped
              (or swapped back) to ensure proper right-to-left numbering
              from 0 to 2N-1.  */