diff mbox

[06/15] rs6000: Use rldiwi in constant construction

Message ID 1439341904-9345-7-git-send-email-rth@redhat.com
State New
Headers show

Commit Message

Richard Henderson Aug. 12, 2015, 1:11 a.m. UTC
Save 2 insns for constants like 0x5555555555555555 where
the low 32-bits can be replicated to the high 32-bits.

Cc: David Edelsohn <dje.gcc@gmail.com>
---
	* config/rs6000/rs6000.c (genimm_ppc::exam_search): Check for
	two equal 32-bit pieces.
	(genimm_ppc::generate): Handle VEC_DUPLICATE.
---
 gcc/config/rs6000/rs6000.c | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

Comments

Segher Boessenkool Aug. 12, 2015, 2:02 p.m. UTC | #1
On Tue, Aug 11, 2015 at 06:11:35PM -0700, Richard Henderson wrote:
> @@ -8173,6 +8173,13 @@ genimm_ppc::exam_search (HOST_WIDE_INT c, int budget)
>    if (exam_mask (-1, c, sub_budget))
>      return true;
>  
> +  /* If the two halves are equal, use an insert.  */
> +  if (c >> 32 == test && exam_sub (test, sub_budget))
> +    {
> +      opN (VEC_DUPLICATE, 0xffffffffu); /* RLDIMI */
> +      return true;
> +    }

Does this work for c with the high bit set?  I think you need
to cast it to unsigned HOST_WIDE_INT first?


Segher
Richard Henderson Aug. 12, 2015, 3:55 p.m. UTC | #2
On 08/12/2015 07:02 AM, Segher Boessenkool wrote:
> On Tue, Aug 11, 2015 at 06:11:35PM -0700, Richard Henderson wrote:
>> @@ -8173,6 +8173,13 @@ genimm_ppc::exam_search (HOST_WIDE_INT c, int budget)
>>    if (exam_mask (-1, c, sub_budget))
>>      return true;
>>  
>> +  /* If the two halves are equal, use an insert.  */
>> +  if (c >> 32 == test && exam_sub (test, sub_budget))
>> +    {
>> +      opN (VEC_DUPLICATE, 0xffffffffu); /* RLDIMI */
>> +      return true;
>> +    }
> 
> Does this work for c with the high bit set?  I think you need
> to cast it to unsigned HOST_WIDE_INT first?

Indeed, a sign-extension works better.  It means the base constant will use
LIS+ORIS without trying to create an unsigned version.

If you're talking about ubsan sort of restrictions on shifting signed
constants...  I choose to totally ignore that.  Certainly no where else in gcc
has been audited for that, beginning with hwint.h itself.


r~
Segher Boessenkool Aug. 13, 2015, 2:43 a.m. UTC | #3
On Wed, Aug 12, 2015 at 08:55:51AM -0700, Richard Henderson wrote:
> On 08/12/2015 07:02 AM, Segher Boessenkool wrote:
> > On Tue, Aug 11, 2015 at 06:11:35PM -0700, Richard Henderson wrote:
> >> @@ -8173,6 +8173,13 @@ genimm_ppc::exam_search (HOST_WIDE_INT c, int budget)
> >>    if (exam_mask (-1, c, sub_budget))
> >>      return true;
> >>  
> >> +  /* If the two halves are equal, use an insert.  */
> >> +  if (c >> 32 == test && exam_sub (test, sub_budget))
> >> +    {
> >> +      opN (VEC_DUPLICATE, 0xffffffffu); /* RLDIMI */
> >> +      return true;
> >> +    }
> > 
> > Does this work for c with the high bit set?  I think you need
> > to cast it to unsigned HOST_WIDE_INT first?
> 
> Indeed, a sign-extension works better.  It means the base constant will use
> LIS+ORIS without trying to create an unsigned version.

Patch 8/15 changes this so that "test" is assigned the sign-extended low
32 bits right before this code; that should work just fine.

> If you're talking about ubsan sort of restrictions on shifting signed
> constants...  I choose to totally ignore that.

Good plan.  We rely on arithmetic shifts rounding towards negative
infinity, and so does the rest of the world.

> Certainly no where else in gcc
> has been audited for that, beginning with hwint.h itself.

Yes.  And there are much worse problems, like many things not working
right if your HOST_WIDE_INT would happen to be more than 64 bits; we
cannot really shake those out because there is no actual system to
test that on -- but it also doesn't actually matter, because there is
no system to run it on :-)


Segher
Mike Stump Aug. 13, 2015, 6:58 p.m. UTC | #4
On Aug 12, 2015, at 7:43 PM, Segher Boessenkool <segher@kernel.crashing.org> wrote:
> Yes.  And there are much worse problems, like many things not working
> right if your HOST_WIDE_INT would happen to be more than 64 bits; we
> cannot really shake those out because there is no actual system to
> test that on -- but it also doesn't actually matter, because there is
> no system to run it on :-)

Lots of systems support 128 bit types just fine, and one could use TImode for HOST_WIDE_INT, if one really, really wanted to.  x86_64 I think is one of those systems.
Joseph Myers Aug. 13, 2015, 8:23 p.m. UTC | #5
On Thu, 13 Aug 2015, Mike Stump wrote:

> On Aug 12, 2015, at 7:43 PM, Segher Boessenkool <segher@kernel.crashing.org> wrote:
> > Yes.  And there are much worse problems, like many things not working
> > right if your HOST_WIDE_INT would happen to be more than 64 bits; we
> > cannot really shake those out because there is no actual system to
> > test that on -- but it also doesn't actually matter, because there is
> > no system to run it on :-)
> 
> Lots of systems support 128 bit types just fine, and one could use 
> TImode for HOST_WIDE_INT, if one really, really wanted to.  x86_64 I 
> think is one of those systems.

There's no printf support (or other standard library support); you'd run 
into practical problems there.
diff mbox

Patch

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 6af5cf3..59c5014 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -8173,6 +8173,13 @@  genimm_ppc::exam_search (HOST_WIDE_INT c, int budget)
   if (exam_mask (-1, c, sub_budget))
     return true;
 
+  /* If the two halves are equal, use an insert.  */
+  if (c >> 32 == test && exam_sub (test, sub_budget))
+    {
+      opN (VEC_DUPLICATE, 0xffffffffu); /* RLDIMI */
+      return true;
+    }
+
   /* Shift the constant left.  */
   test = HOST_WIDE_INT_UC (0xffffffff00000000);
   if ((c & test) == c && exam_sub (c >> 32, sub_budget))
@@ -8230,6 +8237,14 @@  genimm_ppc::generate (rtx dest, machine_mode mode) const
 	case ASHIFT:
 	  x = gen_rtx_fmt_ee (r, mode, op1, op2);
 	  break;
+	case VEC_DUPLICATE:
+	  /* Abusing the rtx code to indicate RLDIMI.
+	     This should match *rotl<mode>3_insert_3.  */
+	  x = GEN_INT (exact_log2 (op[i] + 1));
+	  x = gen_rtx_IOR (mode,
+			   gen_rtx_AND (mode, op1, op2),
+			   gen_rtx_ASHIFT (mode, op1, x));
+	  break;
 	default:
 	  gcc_unreachable ();
 	}