
[5/6] Do constant folding for shift operations.

Message ID 97bbf40c055a9949f5fbf185764792679fb8273a.1305889001.git.batuzovk@ispras.ru
State New

Commit Message

Kirill Batuzov May 20, 2011, 12:39 p.m. UTC
Perform constant folding for SHR, SHL, SAR, ROTR, ROTL operations.

Signed-off-by: Kirill Batuzov <batuzovk@ispras.ru>
---
 tcg/optimize.c |   87 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 87 insertions(+), 0 deletions(-)

Comments

Richard Henderson May 20, 2011, 6:37 p.m. UTC | #1
On 05/20/2011 05:39 AM, Kirill Batuzov wrote:
> +    case INDEX_op_sar_i32:
> +#if TCG_TARGET_REG_BITS == 64
> +        x &= 0xffffffff;
> +        y &= 0xffffffff;
> +#endif
> +        r = x & 0x80000000;
> +        x &= ~0x80000000;
> +        x >>= y;
> +        r |= r - (r >> y);
> +        x |= r;
> +        return x;
> +

Any reason you're emulating the 32-bit shift by
hand, rather than letting the compiler do it?  I.e.

  x = (int32_t)x >> (int32_t)y;


r~
Kirill Batuzov May 26, 2011, 12:36 p.m. UTC | #2
On Fri, 20 May 2011, Richard Henderson wrote:

> 
> On 05/20/2011 05:39 AM, Kirill Batuzov wrote:
> > +    case INDEX_op_sar_i32:
> > +#if TCG_TARGET_REG_BITS == 64
> > +        x &= 0xffffffff;
> > +        y &= 0xffffffff;
> > +#endif
> > +        r = x & 0x80000000;
> > +        x &= ~0x80000000;
> > +        x >>= y;
> > +        r |= r - (r >> y);
> > +        x |= r;
> > +        return x;
> > +
> 
> Any reason you're emulating the 32-bit shift by
> hand, rather than letting the compiler do it?  I.e.
> 
>   x = (int32_t)x >> (int32_t)y;
>
This expression has implementation-defined behavior according to
C99 6.5.7, so we decided to emulate signed shifts by hand.
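
Condensed into a standalone helper (the function name is mine), the
by-hand emulation in the patch amounts to:

```c
#include <stdint.h>

/* The patch's sign-extension trick as a standalone helper: split off
   the sign bit, shift the low 31 bits with an unsigned (logical)
   shift, then rebuild the vacated high bits.  Only unsigned shifts
   are used, so nothing implementation-defined remains. */
static uint32_t sar32_by_hand(uint32_t x, uint32_t y)
{
    uint32_t r = x & 0x80000000u;   /* isolated sign bit */
    x &= ~0x80000000u;
    x >>= y;
    r |= r - (r >> y);              /* top y+1 bits set iff sign was set */
    return x | r;
}
```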

----
  Kirill.
Richard Henderson May 26, 2011, 1:56 p.m. UTC | #3
On 05/26/2011 05:36 AM, Kirill Batuzov wrote:
>>   x = (int32_t)x >> (int32_t)y;
>>
> This expression has implementation-defined behavior according to
> C99 6.5.7, so we decided to emulate signed shifts by hand.

Technically, yes.  In practice, no.  GCC, ICC, LLVM, MSVC all know
what the user wants here and will implement it "properly".


r~
Blue Swirl May 26, 2011, 7:14 p.m. UTC | #4
On Thu, May 26, 2011 at 4:56 PM, Richard Henderson <rth@twiddle.net> wrote:
> On 05/26/2011 05:36 AM, Kirill Batuzov wrote:
>>>   x = (int32_t)x >> (int32_t)y;
>>>
>> This expression has implementation-defined behavior according to
>> C99 6.5.7, so we decided to emulate signed shifts by hand.
>
> Technically, yes.  In practice, no.  GCC, ICC, LLVM, MSVC all know
> what the user wants here and will implement it "properly".

Can't this be probed by configure? Then a wrapper could be introduced
for signed shifts.
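
A minimal sketch of such a probe (hypothetical, not part of the
patch): configure could compile and run a check along these lines and
define the wrapper only when it fails.

```c
#include <stdint.h>

/* Hypothetical configure-time probe: returns nonzero iff right
   shift of a negative signed value is arithmetic with the compiler
   building QEMU.  volatile keeps the test from being folded away. */
static int signed_shift_is_arithmetic(void)
{
    volatile int32_t x = -3;
    return (x >> 1) == -2;
}
```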
Richard Henderson May 26, 2011, 8:10 p.m. UTC | #5
On 05/26/2011 12:14 PM, Blue Swirl wrote:
> On Thu, May 26, 2011 at 4:56 PM, Richard Henderson <rth@twiddle.net> wrote:
>> On 05/26/2011 05:36 AM, Kirill Batuzov wrote:
>>>>   x = (int32_t)x >> (int32_t)y;
>>>>
>>> This expression has implementation-defined behavior according to
>>> C99 6.5.7, so we decided to emulate signed shifts by hand.
>>
>> Technically, yes.  In practice, no.  GCC, ICC, LLVM, MSVC all know
>> what the user wants here and will implement it "properly".
> 
> Can't this be probed by configure? Then a wrapper could be introduced
> for signed shifts.

I don't see the point.  The C99 implementation defined escape hatch
exists for weird cpus.  Which we won't be supporting as a QEMU host.


r~
Blue Swirl May 26, 2011, 8:25 p.m. UTC | #6
On Thu, May 26, 2011 at 11:10 PM, Richard Henderson <rth@twiddle.net> wrote:
> On 05/26/2011 12:14 PM, Blue Swirl wrote:
>> On Thu, May 26, 2011 at 4:56 PM, Richard Henderson <rth@twiddle.net> wrote:
>>> On 05/26/2011 05:36 AM, Kirill Batuzov wrote:
>>>>>   x = (int32_t)x >> (int32_t)y;
>>>>>
>>>> This expression has implementation-defined behavior according to
>>>> C99 6.5.7, so we decided to emulate signed shifts by hand.
>>>
>>> Technically, yes.  In practice, no.  GCC, ICC, LLVM, MSVC all know
>>> what the user wants here and will implement it "properly".
>>
>> Can't this be probed by configure? Then a wrapper could be introduced
>> for signed shifts.
>
> I don't see the point.  The C99 implementation defined escape hatch
> exists for weird cpus.  Which we won't be supporting as a QEMU host.

Maybe not, but a compiler with this property could arrive. For
example, GCC developers could decide that since this weirdness is
allowed by the standard, it may be implemented as well.
Richard Henderson May 26, 2011, 9:14 p.m. UTC | #7
On 05/26/2011 01:25 PM, Blue Swirl wrote:
>> I don't see the point.  The C99 implementation defined escape hatch
>> exists for weird cpus.  Which we won't be supporting as a QEMU host.
> 
> Maybe not, but a compiler with this property could arrive. For
> example, GCC developers could decide that since this weirdness is
> allowed by the standard, it may be implemented as well.

If you like, you can write a configure test for it.  But, honestly,
essentially every place in qemu that uses shifts on signed types
would have to be audited.  Really.

The C99 hook exists to efficiently support targets that don't have
arithmetic shift operations.  Honestly.


r~
Paolo Bonzini May 27, 2011, 7:09 a.m. UTC | #8
On 05/26/2011 09:14 PM, Blue Swirl wrote:
>>>> x = (int32_t)x >> (int32_t)y;
>>>>
>>> This expression has implementation-defined behavior according to
>>> C99 6.5.7, so we decided to emulate signed shifts by hand.
>>
>> Technically, yes.  In practice, no.  GCC, ICC, LLVM, MSVC all know
>> what the user wants here and will implement it "properly".
>
> Can't this be probed by configure? Then a wrapper could be introduced
> for signed shifts.

The reason for implementation-defined behavior is basically to allow for 
non-two's-complement machines, which aren't really practical to support.

Paolo
Jamie Lokier May 27, 2011, 3:41 p.m. UTC | #9
Richard Henderson wrote:
> On 05/26/2011 01:25 PM, Blue Swirl wrote:
> >> I don't see the point.  The C99 implementation defined escape hatch
> >> exists for weird cpus.  Which we won't be supporting as a QEMU host.
> > 
> > Maybe not, but a compiler with this property could arrive. For
> > example, GCC developers could decide that since this weirdness is
> > allowed by the standard, it may be implemented as well.
> 
> If you like, you can write a configure test for it.  But, honestly,
> essentially every place in qemu that uses shifts on signed types
> would have to be audited.  Really.

I agree, the chance of qemu ever working, or needing to work, on a
non-two's-complement machine is pretty remote!

> The C99 hook exists to efficiently support targets that don't have
> arithmetic shift operations.  Honestly.

If you care, this should be portable without a configure test, as
constant folding should have the same behaviour:

    ((int32_t)-3 >> 1 == (int32_t)-2
     ? (int32_t)x >> (int32_t)y
     : long_winded_portable_shift_right(x, y))
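
A sketch of what that portable fallback could look like (the function
name is Jamie's; the body below is my assumption, built from unsigned
operations only so no implementation-defined behavior is left):

```c
#include <stdint.h>

/* Hypothetical fallback: arithmetic right shift built from unsigned
   shifts.  Shift the low 31 bits logically, then fill the vacated
   high bits when the input was negative. */
static inline int32_t long_winded_portable_shift_right(int32_t x, int32_t y)
{
    uint32_t ux = (uint32_t)x;
    uint32_t low = (ux & 0x7fffffffu) >> y;   /* logical shift of low bits */
    if (ux & 0x80000000u) {
        low |= ~(0x7fffffffu >> y);           /* replicate the sign bit */
    }
    return (int32_t)low;
}
```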

-- Jamie
Blue Swirl May 27, 2011, 5:07 p.m. UTC | #10
On Fri, May 27, 2011 at 12:14 AM, Richard Henderson <rth@twiddle.net> wrote:
> On 05/26/2011 01:25 PM, Blue Swirl wrote:
>>> I don't see the point.  The C99 implementation defined escape hatch
>>> exists for weird cpus.  Which we won't be supporting as a QEMU host.
>>
>> Maybe not, but a compiler with this property could arrive. For
>> example, GCC developers could decide that since this weirdness is
>> allowed by the standard, it may be implemented as well.
>
> If you like, you can write a configure test for it.  But, honestly,
> essentially every place in qemu that uses shifts on signed types
> would have to be audited.  Really.

OK.

> The C99 hook exists to efficiently support targets that don't have
> arithmetic shift operations.  Honestly.

So it would be impossible for a compiler developer to change the logic
for shifts for some supported two's-complement logic CPUs (like x86)
just because it's legal?
Richard Henderson May 27, 2011, 7:54 p.m. UTC | #11
On 05/27/2011 10:07 AM, Blue Swirl wrote:
>> The C99 hook exists to efficiently support targets that don't have
>> arithmetic shift operations.  Honestly.
> 
> So it would be impossible for a compiler developer to change the logic
> for shifts for some supported two's-complement logic CPUs (like x86)
> just because it's legal?

Not without being lynched, no.


r~

Patch

diff --git a/tcg/optimize.c b/tcg/optimize.c
index a02d5c1..b6b0dc4 100644
--- a/tcg/optimize.c
+++ b/tcg/optimize.c
@@ -99,6 +99,11 @@  static int op_bits(int op)
     case INDEX_op_and_i32:
     case INDEX_op_or_i32:
     case INDEX_op_xor_i32:
+    case INDEX_op_shl_i32:
+    case INDEX_op_shr_i32:
+    case INDEX_op_sar_i32:
+    case INDEX_op_rotl_i32:
+    case INDEX_op_rotr_i32:
         return 32;
 #if TCG_TARGET_REG_BITS == 64
     case INDEX_op_mov_i64:
@@ -108,6 +113,11 @@  static int op_bits(int op)
     case INDEX_op_and_i64:
     case INDEX_op_or_i64:
     case INDEX_op_xor_i64:
+    case INDEX_op_shl_i64:
+    case INDEX_op_shr_i64:
+    case INDEX_op_sar_i64:
+    case INDEX_op_rotl_i64:
+    case INDEX_op_rotr_i64:
         return 64;
 #endif
     default:
@@ -131,6 +141,7 @@  static int op_to_movi(int op)
 
 static TCGArg do_constant_folding_2(int op, TCGArg x, TCGArg y)
 {
+    TCGArg r;
     switch (op) {
     case INDEX_op_add_i32:
 #if TCG_TARGET_REG_BITS == 64
@@ -168,6 +179,72 @@  static TCGArg do_constant_folding_2(int op, TCGArg x, TCGArg y)
 #endif
         return x ^ y;
 
+    case INDEX_op_shl_i32:
+#if TCG_TARGET_REG_BITS == 64
+        y &= 0xffffffff;
+    case INDEX_op_shl_i64:
+#endif
+        return x << y;
+
+    case INDEX_op_shr_i32:
+#if TCG_TARGET_REG_BITS == 64
+        x &= 0xffffffff;
+        y &= 0xffffffff;
+    case INDEX_op_shr_i64:
+#endif
+        /* Assuming TCGArg to be unsigned */
+        return x >> y;
+
+    case INDEX_op_sar_i32:
+#if TCG_TARGET_REG_BITS == 64
+        x &= 0xffffffff;
+        y &= 0xffffffff;
+#endif
+        r = x & 0x80000000;
+        x &= ~0x80000000;
+        x >>= y;
+        r |= r - (r >> y);
+        x |= r;
+        return x;
+
+#if TCG_TARGET_REG_BITS == 64
+    case INDEX_op_sar_i64:
+        r = x & 0x8000000000000000ULL;
+        x &= ~0x8000000000000000ULL;
+        x >>= y;
+        r |= r - (r >> y);
+        x |= r;
+        return x;
+#endif
+
+    case INDEX_op_rotr_i32:
+#if TCG_TARGET_REG_BITS == 64
+        x &= 0xffffffff;
+        y &= 0xffffffff;
+#endif
+        x = (x << (32 - y)) | (x >> y);
+        return x;
+
+#if TCG_TARGET_REG_BITS == 64
+    case INDEX_op_rotr_i64:
+        x = (x << (64 - y)) | (x >> y);
+        return x;
+#endif
+
+    case INDEX_op_rotl_i32:
+#if TCG_TARGET_REG_BITS == 64
+        x &= 0xffffffff;
+        y &= 0xffffffff;
+#endif
+        x = (x << y) | (x >> (32 - y));
+        return x;
+
+#if TCG_TARGET_REG_BITS == 64
+    case INDEX_op_rotl_i64:
+        x = (x << y) | (x >> (64 - y));
+        return x;
+#endif
+
     default:
         fprintf(stderr,
                 "Unrecognized operation %d in do_constant_folding.\n", op);
@@ -297,11 +374,21 @@  static TCGArg *tcg_constant_folding(TCGContext *s, uint16_t *tcg_opc_ptr,
         case INDEX_op_add_i32:
         case INDEX_op_sub_i32:
         case INDEX_op_mul_i32:
+        case INDEX_op_shl_i32:
+        case INDEX_op_shr_i32:
+        case INDEX_op_sar_i32:
+        case INDEX_op_rotl_i32:
+        case INDEX_op_rotr_i32:
 #if TCG_TARGET_REG_BITS == 64
         case INDEX_op_xor_i64:
         case INDEX_op_add_i64:
         case INDEX_op_sub_i64:
         case INDEX_op_mul_i64:
+        case INDEX_op_shl_i64:
+        case INDEX_op_shr_i64:
+        case INDEX_op_sar_i64:
+        case INDEX_op_rotl_i64:
+        case INDEX_op_rotr_i64:
 #endif
             if (state[args[1]] == TCG_TEMP_CONST
                 && state[args[2]] == TCG_TEMP_CONST) {