Message ID | 20090917120406.J54069@stanley.csl.cornell.edu |
---|---|
State | Superseded |
On Thu, Sep 17, 2009 at 6:07 PM, Vince Weaver <vince@csl.cornell.edu> wrote:
>
> What would be nice is a tcg
> subtract-from instruction, which I know some architectures have.  Maybe
> tcg does have it and I should look harder.

There is tcg_gen_subfi.

Laurent

Vince Weaver <vince@csl.cornell.edu> writes:

>              tcg_gen_andi_i64(tmp1, cpu_ir[rb], 7);
>              tcg_gen_shli_i64(tmp1, tmp1, 3);
> -            tmp2 = tcg_const_i64(64);
> -            tcg_gen_sub_i64(tmp1, tmp2, tmp1);
> -            tcg_temp_free(tmp2);
> +            tcg_gen_andi_i64(tmp1, tmp1, 0x3f);
> +            tcg_gen_neg_i64(tmp1, tmp1);
> +            tcg_gen_addi_i64(tmp1, tmp1, 64);

This wastes an operation.  If you switch andi and neg you don't need to
add 64.

Andreas.

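The remark above can be checked with ordinary integer arithmetic. The following is only a sketch in plain C, not TCG code: `count_posted` and `count_swapped` are hypothetical helper names that model the shift count produced by the posted sequence (andi, neg, addi 64) and by the swapped sequence (neg, andi) for each of the eight byte offsets in rb.

```c
#include <stdint.h>

/* Hypothetical model of the posted hunk: andi 0x3f, neg, addi 64. */
static uint64_t count_posted(uint64_t rb)
{
    uint64_t t = (rb & 7) << 3;   /* byte offset times 8: 0, 8, ..., 56 */
    t &= 0x3f;                    /* no effect here, t is already < 64 */
    t = 0 - t;                    /* two's complement negate */
    t += 64;                      /* 64 - t, which is 64 when t == 0 */
    return t;
}

/* Hypothetical model with andi and neg swapped and the add dropped. */
static uint64_t count_swapped(uint64_t rb)
{
    uint64_t t = (rb & 7) << 3;   /* 0, 8, ..., 56 */
    t = 0 - t;                    /* negate first */
    t &= 0x3f;                    /* (-t) mod 64, which is 0 when t == 0 */
    return t;
}
```

Under these assumptions the two counts agree modulo 64 for every offset. They differ only at offset 0, where the posted order yields 64 and the swapped order yields 0.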
On Thu, Sep 17, 2009 at 12:07:23PM -0400, Vince Weaver wrote:
> On Wed, 16 Sep 2009, Aurelien Jarno wrote:
>
> > In case tmp1 = 0, it becomes 64, and then 0 again after the and, so
> > rc=ra<<0.
>
> Ah, I see.  I completely missed that optimization.
>
> How does this updated patch look?  I removed one of the TCGv variables
> too.  Does that help performance?  What would be nice is a tcg

Yes it looks ok. Removing one normal TCGv variable doesn't really help.
What helps is removing a tcg temp_local variable or a branch.

> subtract-from instruction, which I know some architectures have.  Maybe
> tcg does have it and I should look harder.

You can use tcg_gen_subfi_i64 (tcg result, immediate, tcg arg).

> diff --git a/target-alpha/translate.c b/target-alpha/translate.c
> index 9d2bc45..af2a43c 100644
> --- a/target-alpha/translate.c
> +++ b/target-alpha/translate.c
> @@ -524,14 +524,16 @@ static inline void gen_ext_h(void(*tcg_gen_ext_i64)(TCGv t0, TCGv t1),
>              else
>                  tcg_gen_mov_i64(cpu_ir[rc], cpu_ir[ra]);
>          } else {
> -            TCGv tmp1, tmp2;
> +            TCGv tmp1;
>              tmp1 = tcg_temp_new();
> +
>              tcg_gen_andi_i64(tmp1, cpu_ir[rb], 7);
>              tcg_gen_shli_i64(tmp1, tmp1, 3);
> -            tmp2 = tcg_const_i64(64);
> -            tcg_gen_sub_i64(tmp1, tmp2, tmp1);
> -            tcg_temp_free(tmp2);
> +            tcg_gen_andi_i64(tmp1, tmp1, 0x3f);
> +            tcg_gen_neg_i64(tmp1, tmp1);
> +            tcg_gen_addi_i64(tmp1, tmp1, 64);
>              tcg_gen_shl_i64(cpu_ir[rc], cpu_ir[ra], tmp1);
> +
>              tcg_temp_free(tmp1);
>          }
>          if (tcg_gen_ext_i64)
> @@ -1316,7 +1318,7 @@ static inline int translate_one(DisasContext *ctx, uint32_t insn)
>              break;
>          case 0x6A:
>              /* EXTLH */
> -            gen_ext_h(&tcg_gen_ext16u_i64, ra, rb, rc, islit, lit);
> +            gen_ext_h(&tcg_gen_ext32u_i64, ra, rb, rc, islit, lit);
>              break;
>          case 0x72:
>              /* MSKQH */
>

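To make the tmp1 = 0 case concrete, here is a rough plain C model of the left shift with a subtract-from-immediate step followed by the 0x3f mask. It is a sketch under assumptions, not the translate.c code: `subfi_model` merely stands in for the (result, immediate, arg) semantics described above, and `ext_h_shift_model` is a hypothetical name that ignores the final zero-extension done through the tcg_gen_ext_i64 callback.

```c
#include <stdint.h>

/* Stand-in for a subtract-from-immediate op: result = imm - arg. */
static uint64_t subfi_model(uint64_t imm, uint64_t arg)
{
    return imm - arg;
}

/* Hypothetical model of the register-operand shift in gen_ext_h. */
static uint64_t ext_h_shift_model(uint64_t ra, uint64_t rb)
{
    uint64_t count = (rb & 7) << 3;    /* byte offset times 8: 0..56 */
    count = subfi_model(64, count);    /* 64, 56, ..., 8 */
    count &= 0x3f;                     /* 64 wraps back to 0 */
    return ra << count;                /* count is always in 0..63 */
}
```

With the mask applied after the subtraction, a byte offset of 0 gives a shift count of 0, so the model returns ra unchanged, which matches the rc=ra<<0 case quoted above.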
diff --git a/target-alpha/translate.c b/target-alpha/translate.c
index 9d2bc45..af2a43c 100644
--- a/target-alpha/translate.c
+++ b/target-alpha/translate.c
@@ -524,14 +524,16 @@ static inline void gen_ext_h(void(*tcg_gen_ext_i64)(TCGv t0, TCGv t1),
             else
                 tcg_gen_mov_i64(cpu_ir[rc], cpu_ir[ra]);
         } else {
-            TCGv tmp1, tmp2;
+            TCGv tmp1;
             tmp1 = tcg_temp_new();
+
             tcg_gen_andi_i64(tmp1, cpu_ir[rb], 7);
             tcg_gen_shli_i64(tmp1, tmp1, 3);
-            tmp2 = tcg_const_i64(64);
-            tcg_gen_sub_i64(tmp1, tmp2, tmp1);
-            tcg_temp_free(tmp2);
+            tcg_gen_andi_i64(tmp1, tmp1, 0x3f);
+            tcg_gen_neg_i64(tmp1, tmp1);
+            tcg_gen_addi_i64(tmp1, tmp1, 64);
             tcg_gen_shl_i64(cpu_ir[rc], cpu_ir[ra], tmp1);
+
             tcg_temp_free(tmp1);
         }
         if (tcg_gen_ext_i64)
@@ -1316,7 +1318,7 @@ static inline int translate_one(DisasContext *ctx, uint32_t insn)
             break;
         case 0x6A:
             /* EXTLH */
-            gen_ext_h(&tcg_gen_ext16u_i64, ra, rb, rc, islit, lit);
+            gen_ext_h(&tcg_gen_ext32u_i64, ra, rb, rc, islit, lit);
             break;
         case 0x72:
             /* MSKQH */