diff mbox

[5.1,rs6000] Fix PR65787

Message ID 1429277282.20720.8.camel@gnopaine
State New
Headers show

Commit Message

Bill Schmidt April 17, 2015, 1:28 p.m. UTC
On Fri, 2015-04-17 at 07:27 -0500, Bill Schmidt wrote:
> Note that Jakub requested a small change in the bugzilla commentary,
> which I've implemented.  I'm doing a regstrap now.
> 
> Bill
> 

Here's the revised and tested patch.  OK for trunk and gcc-5-branch?

Thanks,
Bill


[gcc]

2015-04-16  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	PR target/65787
	* config/rs6000/rs6000.c (rtx_is_swappable_p): Handle case where
	vec_extract operation is wrapped in a PARALLEL with a CLOBBER.
	(adjust_extract): Likewise.

[gcc/testsuite]

2015-04-16  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>

	PR target/65787
	* gcc.target/powerpc/pr65787.c: New.

Comments

David Edelsohn April 17, 2015, 2:42 p.m. UTC | #1
On Fri, Apr 17, 2015 at 9:28 AM, Bill Schmidt
<wschmidt@linux.vnet.ibm.com> wrote:
> On Fri, 2015-04-17 at 07:27 -0500, Bill Schmidt wrote:
>> Note that Jakub requested a small change in the bugzilla commentary,
>> which I've implemented.  I'm doing a regstrap now.
>>
>> Bill
>>
>
> Here's the revised and tested patch.  OK for trunk and gcc-5-branch?
>
> Thanks,
> Bill
>
>
> [gcc]
>
> 2015-04-16  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>
>
>         PR target/65787
>         * config/rs6000/rs6000.c (rtx_is_swappable_p): Handle case where
>         vec_extract operation is wrapped in a PARALLEL with a CLOBBER.
>         (adjust_extract): Likewise.
>
> [gcc/testsuite]
>
> 2015-04-16  Bill Schmidt  <wschmidt@linux.vnet.ibm.com>
>
>         PR target/65787
>         * gcc.target/powerpc/pr65787.c: New.

The revised patch is okay.

Thanks, David
Jakub Jelinek April 17, 2015, 2:49 p.m. UTC | #2
On Fri, Apr 17, 2015 at 08:28:02AM -0500, Bill Schmidt wrote:
> On Fri, 2015-04-17 at 07:27 -0500, Bill Schmidt wrote:
> > Note that Jakub requested a small change in the bugzilla commentary,
> > which I've implemented.  I'm doing a regstrap now.
> > 
> > Bill
> > 
> 
> Here's the revised and tested patch.  OK for trunk and gcc-5-branch?

You have actually mailed the original patch again, not the revised one.

That said, PARALLEL seems to be already handled by rtx_is_swappable_p,
so if it isn't handled correctly, perhaps there is a bug in that function.

  for (i = 0; i < GET_RTX_LENGTH (code); ++i)
    if (fmt[i] == 'e' || fmt[i] == 'u')
      {
        unsigned int special_op = SH_NONE;
        ok &= rtx_is_swappable_p (XEXP (op, i), &special_op);
        /* Ensure we never have two kinds of special handling
           for the same insn.  */
        if (*special != SH_NONE && special_op != SH_NONE
            && *special != special_op)
          return 0;
        *special = special_op;
      }
    else if (fmt[i] == 'E')
      for (j = 0; j < XVECLEN (op, i); ++j)
        {
          unsigned int special_op = SH_NONE;
          ok &= rtx_is_swappable_p (XVECEXP (op, i, j), &special_op);
          /* Ensure we never have two kinds of special handling
             for the same insn.  */
          if (*special != SH_NONE && special_op != SH_NONE
              && *special != special_op)
            return 0;
          *special = special_op;
        }

If the comments are correct, then supposedly if say on the PARALLEL with
a SET that is SH_EXTRACT and a CLOBBER that is SH_NONE, both returning 1,
the outcome should by return 1 (that is the case), but *special should be
SH_EXTRACT, rather than the last case wins right now (SH_NONE).
If so, then perhaps after each of the ok &= rtx_is_swappable_p ...
line should be
  if (special_op == SH_NONE)
    continue;
so that we never store SH_NONE (leave that just to the initial store), and
then just
  if (*special != SH_NONE && *special != special_op)
    return 0;

	Jakub
Bill Schmidt April 17, 2015, 3:02 p.m. UTC | #3
On Fri, 2015-04-17 at 16:49 +0200, Jakub Jelinek wrote:
> On Fri, Apr 17, 2015 at 08:28:02AM -0500, Bill Schmidt wrote:
> > On Fri, 2015-04-17 at 07:27 -0500, Bill Schmidt wrote:
> > > Note that Jakub requested a small change in the bugzilla commentary,
> > > which I've implemented.  I'm doing a regstrap now.
> > > 
> > > Bill
> > > 
> > 
> > Here's the revised and tested patch.  OK for trunk and gcc-5-branch?
> 
> You have actually mailed the original patch again, not the revised one.

Yes, sorry, I had just noticed that.  I forgot to download the revised
patch to the system I send my mail from.  This is what I get for
multitasking during a meeting...

> 
> That said, PARALLEL seems to be already handled by rtx_is_swappable_p,
> so if it isn't handled correctly, perhaps there is a bug in that function.
> 
>   for (i = 0; i < GET_RTX_LENGTH (code); ++i)
>     if (fmt[i] == 'e' || fmt[i] == 'u')
>       {
>         unsigned int special_op = SH_NONE;
>         ok &= rtx_is_swappable_p (XEXP (op, i), &special_op);
>         /* Ensure we never have two kinds of special handling
>            for the same insn.  */
>         if (*special != SH_NONE && special_op != SH_NONE
>             && *special != special_op)
>           return 0;
>         *special = special_op;
>       }
>     else if (fmt[i] == 'E')
>       for (j = 0; j < XVECLEN (op, i); ++j)
>         {
>           unsigned int special_op = SH_NONE;
>           ok &= rtx_is_swappable_p (XVECEXP (op, i, j), &special_op);
>           /* Ensure we never have two kinds of special handling
>              for the same insn.  */
>           if (*special != SH_NONE && special_op != SH_NONE
>               && *special != special_op)
>             return 0;
>           *special = special_op;
>         }
> 
> If the comments are correct, then supposedly if say on the PARALLEL with
> a SET that is SH_EXTRACT and a CLOBBER that is SH_NONE, both returning 1,
> the outcome should by return 1 (that is the case), but *special should be
> SH_EXTRACT, rather than the last case wins right now (SH_NONE).
> If so, then perhaps after each of the ok &= rtx_is_swappable_p ...
> line should be
>   if (special_op == SH_NONE)
>     continue;
> so that we never store SH_NONE (leave that just to the initial store), and
> then just
>   if (*special != SH_NONE && *special != special_op)
>     return 0;

Hm, yes, there is definitely a problem here.  Let me look at reworking
this.  Thanks!

Bill

> 
> 	Jakub
>
diff mbox

Patch

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 222158)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -34204,6 +34204,20 @@  rtx_is_swappable_p (rtx op, unsigned int *special)
       else
 	return 0;
 
+    case PARALLEL: {
+      /* A vec_extract operation may be wrapped in a PARALLEL with a
+	 clobber, so account for that possibility.  */
+      unsigned int len = XVECLEN (op, 0);
+
+      if (len != 2)
+	return 0;
+
+      if (GET_CODE (XVECEXP (op, 0, 1)) != CLOBBER)
+	return 0;
+
+      return rtx_is_swappable_p (XVECEXP (op, 0, 0), special);
+    }
+
     case UNSPEC:
       {
 	/* Various operations are unsafe for this optimization, at least
@@ -34603,7 +34617,10 @@  permute_store (rtx_insn *insn)
 static void
 adjust_extract (rtx_insn *insn)
 {
-  rtx src = SET_SRC (PATTERN (insn));
+  rtx pattern = PATTERN (insn);
+  if (GET_CODE (pattern) == PARALLEL)
+    pattern = XVECEXP (pattern, 0, 0);
+  rtx src = SET_SRC (pattern);
   /* The vec_select may be wrapped in a vec_duplicate for a splat, so
      account for that.  */
   rtx sel = GET_CODE (src) == VEC_DUPLICATE ? XEXP (src, 0) : src;
Index: gcc/testsuite/gcc.target/powerpc/pr65787.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/pr65787.c	(revision 0)
+++ gcc/testsuite/gcc.target/powerpc/pr65787.c	(working copy)
@@ -0,0 +1,21 @@ 
+/* { dg-do compile { target { powerpc64le-*-* } } } */
+/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power8" } } */
+/* { dg-options "-mcpu=power8 -O3" } */
+/* { dg-final { scan-assembler "xxsldwi \[0-9\]*,\[0-9\]*,\[0-9\]*,3" } } */
+/* { dg-final { scan-assembler-not "xxpermdi" } } */
+
+/* This test verifies that a vector extract operand properly has its
+   lane changed by the swap optimization.  Element 2 of LE corresponds
+   to element 1 of BE.  When doublewords are swapped, this becomes
+   element 3 of BE, so we need to shift the vector left by 3 words
+   to be able to extract the correct value from BE element zero.  */
+
+typedef float  v4f32 __attribute__ ((__vector_size__ (16)));
+
+void foo (float);
+extern v4f32 x, y;
+
+int main() {
+  v4f32 z = x + y;
+  foo (z[2]);
+}