diff mbox

[5/5] target-sh4: improve shad instruction

Message ID 1437736471-26124-6-git-send-email-aurelien@aurel32.net
State New
Headers show

Commit Message

Aurelien Jarno July 24, 2015, 11:14 a.m. UTC
The SH4 shad instruction can shift in both direction, depending on the
sign of the shift. This is currently implemented using branches, which
is not really efficient and prevents the optimizer to do its job. In
practice it is often used with a constant loaded in a register just
before.

Simplify the implementation by computing both the value shifted to the
left and to the right, and then selecting the correct one with a
movcond. As with a negative value the shift amount can go up to 32 which
is undefined, we shift the value in two steps.

Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
---
 target-sh4/translate.c | 53 +++++++++++++++++++++-----------------------------
 1 file changed, 22 insertions(+), 31 deletions(-)

Comments

Aurelien Jarno July 24, 2015, 11:16 a.m. UTC | #1
On 2015-07-24 13:14, Aurelien Jarno wrote:
> The SH4 shad instruction can shift in both direction, depending on the
> sign of the shift. This is currently implemented using branches, which
> is not really efficient and prevents the optimizer to do its job. In
> practice it is often used with a constant loaded in a register just
> before.
> 
> Simplify the implementation by computing both the value shifted to the
> left and to the right, and then selecting the correct one with a
> movcond. As with a negative value the shift amount can go up to 32 which
> is undefined, we shift the value in two steps.
> 
> Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
> ---
>  target-sh4/translate.c | 53 +++++++++++++++++++++-----------------------------
>  1 file changed, 22 insertions(+), 31 deletions(-)

Despite the subject, this patch is of course also for 2.5.
diff mbox

Patch

diff --git a/target-sh4/translate.c b/target-sh4/translate.c
index b0b888c..d436879 100644
--- a/target-sh4/translate.c
+++ b/target-sh4/translate.c
@@ -832,37 +832,28 @@  static void _decode_opc(DisasContext * ctx)
 	return;
     case 0x400c:		/* shad Rm,Rn */
 	{
-            TCGLabel *label1 = gen_new_label();
-            TCGLabel *label2 = gen_new_label();
-            TCGLabel *label3 = gen_new_label();
-            TCGLabel *label4 = gen_new_label();
-	    TCGv shift;
-	    tcg_gen_brcondi_i32(TCG_COND_LT, REG(B7_4), 0, label1);
-	    /* Rm positive, shift to the left */
-            shift = tcg_temp_new();
-	    tcg_gen_andi_i32(shift, REG(B7_4), 0x1f);
-	    tcg_gen_shl_i32(REG(B11_8), REG(B11_8), shift);
-	    tcg_temp_free(shift);
-	    tcg_gen_br(label4);
-	    /* Rm negative, shift to the right */
-	    gen_set_label(label1);
-            shift = tcg_temp_new();
-	    tcg_gen_andi_i32(shift, REG(B7_4), 0x1f);
-	    tcg_gen_brcondi_i32(TCG_COND_EQ, shift, 0, label2);
-	    tcg_gen_not_i32(shift, REG(B7_4));
-	    tcg_gen_andi_i32(shift, shift, 0x1f);
-	    tcg_gen_addi_i32(shift, shift, 1);
-	    tcg_gen_sar_i32(REG(B11_8), REG(B11_8), shift);
-	    tcg_temp_free(shift);
-	    tcg_gen_br(label4);
-	    /* Rm = -32 */
-	    gen_set_label(label2);
-	    tcg_gen_brcondi_i32(TCG_COND_LT, REG(B11_8), 0, label3);
-	    tcg_gen_movi_i32(REG(B11_8), 0);
-	    tcg_gen_br(label4);
-	    gen_set_label(label3);
-	    tcg_gen_movi_i32(REG(B11_8), 0xffffffff);
-	    gen_set_label(label4);
+            TCGv t0 = tcg_temp_new();
+            TCGv t1 = tcg_temp_new();
+            TCGv t2 = tcg_temp_new();
+
+            tcg_gen_andi_i32(t0, REG(B7_4), 0x1f);
+
+            /* positive case: shift to the left */
+            tcg_gen_shl_i32(t1, REG(B11_8), t0);
+
+            /* negative case: shift to the right in two steps to
+               correctly handle the -32 case */
+            tcg_gen_xori_i32(t0, t0, 0x1f);
+            tcg_gen_sar_i32(t2, REG(B11_8), t0);
+            tcg_gen_sari_i32(t2, t2, 1);
+
+            /* select between the two cases */
+            tcg_gen_movi_i32(t0, 0);
+            tcg_gen_movcond_i32(TCG_COND_GE, REG(B11_8), REG(B7_4), t0, t1, t2);
+
+            tcg_temp_free(t0);
+            tcg_temp_free(t1);
+            tcg_temp_free(t2);
 	}
 	return;
     case 0x400d:		/* shld Rm,Rn */