Message ID | 20240423063234.76282-7-rathc@linux.ibm.com |
---|---|
State | New |
Headers | show |
Series | target/ppc: Move fixed-point insns to | expand |
On Tue Apr 23, 2024 at 4:32 PM AEST, Chinmay Rath wrote: > Moving the below instructions to decodetree specification : > > divd[u, e, eu][o][.] : XO-form > mod{sd, ud} : X-form > > With this patch, all the fixed-point arithmetic instructions have been > moved to decodetree. > The changes were verified by validating that the tcg ops generated by those > instructions remain the same, which were captured using the '-d in_asm,op' flag. > Also, remaned do_divwe method in fixedpoint-impl.c.inc to do_dive because it is > now used to divide doubleword operands as well, and not just words. > > Signed-off-by: Chinmay Rath <rathc@linux.ibm.com> > Reviewed-by: Richard Henderson <richard.henderson@linaro.org> [...] > +static bool do_divd(DisasContext *ctx, arg_XO *a, bool sign) > +{ > + gen_op_arith_divd(ctx, cpu_gpr[a->rt], cpu_gpr[a->ra], cpu_gpr[a->rb], > + sign, a->oe, a->rc); > + return true; > +} > + > +static bool do_modd(DisasContext *ctx, arg_X *a, bool sign) > +{ > + REQUIRE_INSNS_FLAGS2(ctx, ISA300); > + gen_op_arith_modd(ctx, cpu_gpr[a->rt], cpu_gpr[a->ra], cpu_gpr[a->rb], > + sign); > + return true; > +} > + > +TRANS64(DIVD, do_divd, true); > +TRANS64(DIVDU, do_divd, false); > +TRANS64(DIVDE, do_dive, gen_helper_DIVDE); > +TRANS64(DIVDEU, do_dive, gen_helper_DIVDEU); > + > +TRANS64(MODSD, do_modd, true); > +TRANS64(MODUD, do_modd, false); Sigh. I'm having to fix a bunch of these for 32-bit builds. Just doing the #ifdef TARGET_PPC64 ... #else qemu_build_not_reached(); thing. Which is quite ugly and actually prevents using some of these macros and requires open coding (e.g., because DIVDE helper is not declared for 32-bit in this case). Maybe we should move 64-bit only instructions into their own .decode file and not build them for 32-bit, so we don't have to add all these dummy translate functions for them. For now I'll try to squash in the fixes. Thanks, Nick
On 5/17/24 14:48, Nicholas Piggin wrote: > On Tue Apr 23, 2024 at 4:32 PM AEST, Chinmay Rath wrote: >> Moving the below instructions to decodetree specification : >> >> divd[u, e, eu][o][.] : XO-form >> mod{sd, ud} : X-form >> >> With this patch, all the fixed-point arithmetic instructions have been >> moved to decodetree. >> The changes were verified by validating that the tcg ops generated by those >> instructions remain the same, which were captured using the '-d in_asm,op' flag. >> Also, remaned do_divwe method in fixedpoint-impl.c.inc to do_dive because it is >> now used to divide doubleword operands as well, and not just words. >> >> Signed-off-by: Chinmay Rath <rathc@linux.ibm.com> >> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> > > [...] > >> +static bool do_divd(DisasContext *ctx, arg_XO *a, bool sign) >> +{ >> + gen_op_arith_divd(ctx, cpu_gpr[a->rt], cpu_gpr[a->ra], cpu_gpr[a->rb], >> + sign, a->oe, a->rc); >> + return true; >> +} >> + >> +static bool do_modd(DisasContext *ctx, arg_X *a, bool sign) >> +{ >> + REQUIRE_INSNS_FLAGS2(ctx, ISA300); >> + gen_op_arith_modd(ctx, cpu_gpr[a->rt], cpu_gpr[a->ra], cpu_gpr[a->rb], >> + sign); >> + return true; >> +} >> + >> +TRANS64(DIVD, do_divd, true); >> +TRANS64(DIVDU, do_divd, false); >> +TRANS64(DIVDE, do_dive, gen_helper_DIVDE); >> +TRANS64(DIVDEU, do_dive, gen_helper_DIVDEU); >> + >> +TRANS64(MODSD, do_modd, true); >> +TRANS64(MODUD, do_modd, false); > > Sigh. I'm having to fix a bunch of these for 32-bit builds. Just > doing the #ifdef TARGET_PPC64 ... #else qemu_build_not_reached(); > thing. > > Which is quite ugly and actually prevents using some of these > macros and requires open coding (e.g., because DIVDE helper is > not declared for 32-bit in this case). Compare sparc: # define gen_helper_pdist ({ qemu_build_not_reached(); NULL; }) etc. > Maybe we should move 64-bit only instructions into their own > .decode file and not build them for 32-bit, so we don't have > to add all these dummy translate functions for them. That's another option, yes. The decodetree script will take multiple input files to produce one output, so you could separate the insns by base vs 64-bit. r~
On Sat May 18, 2024 at 8:48 PM AEST, Richard Henderson wrote: > On 5/17/24 14:48, Nicholas Piggin wrote: > > On Tue Apr 23, 2024 at 4:32 PM AEST, Chinmay Rath wrote: > >> Moving the below instructions to decodetree specification : > >> > >> divd[u, e, eu][o][.] : XO-form > >> mod{sd, ud} : X-form > >> > >> With this patch, all the fixed-point arithmetic instructions have been > >> moved to decodetree. > >> The changes were verified by validating that the tcg ops generated by those > >> instructions remain the same, which were captured using the '-d in_asm,op' flag. > >> Also, remaned do_divwe method in fixedpoint-impl.c.inc to do_dive because it is > >> now used to divide doubleword operands as well, and not just words. > >> > >> Signed-off-by: Chinmay Rath <rathc@linux.ibm.com> > >> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> > > > > [...] > > > >> +static bool do_divd(DisasContext *ctx, arg_XO *a, bool sign) > >> +{ > >> + gen_op_arith_divd(ctx, cpu_gpr[a->rt], cpu_gpr[a->ra], cpu_gpr[a->rb], > >> + sign, a->oe, a->rc); > >> + return true; > >> +} > >> + > >> +static bool do_modd(DisasContext *ctx, arg_X *a, bool sign) > >> +{ > >> + REQUIRE_INSNS_FLAGS2(ctx, ISA300); > >> + gen_op_arith_modd(ctx, cpu_gpr[a->rt], cpu_gpr[a->ra], cpu_gpr[a->rb], > >> + sign); > >> + return true; > >> +} > >> + > >> +TRANS64(DIVD, do_divd, true); > >> +TRANS64(DIVDU, do_divd, false); > >> +TRANS64(DIVDE, do_dive, gen_helper_DIVDE); > >> +TRANS64(DIVDEU, do_dive, gen_helper_DIVDEU); > >> + > >> +TRANS64(MODSD, do_modd, true); > >> +TRANS64(MODUD, do_modd, false); > > > > Sigh. I'm having to fix a bunch of these for 32-bit builds. Just > > doing the #ifdef TARGET_PPC64 ... #else qemu_build_not_reached(); > > thing. > > > > Which is quite ugly and actually prevents using some of these > > macros and requires open coding (e.g., because DIVDE helper is > > not declared for 32-bit in this case). > > Compare sparc: > > # define gen_helper_pdist ({ qemu_build_not_reached(); NULL; }) > > etc. That would help indeed. > > > Maybe we should move 64-bit only instructions into their own > > .decode file and not build them for 32-bit, so we don't have > > to add all these dummy translate functions for them. > > That's another option, yes. The decodetree script will take multiple input files to > produce one output, so you could separate the insns by base vs 64-bit. Thinking about it a bit more, I guess the downside is that you would usually like to group instruction variants that operate on 64-bit data together with the others in the .decode file. Thanks, Nick
On Mon, 20 May 2024, 09:18 Nicholas Piggin, <npiggin@gmail.com> wrote: > > That's another option, yes. The decodetree script will take multiple > input files to > > produce one output, so you could separate the insns by base vs 64-bit. > > Thinking about it a bit more, I guess the downside is that you > would usually like to group instruction variants that operate on > 64-bit data together with the others in the .decode file. > A third option is not too care as much about minimising the ppc32 build and include all of the helpers all of the time. r~ >
diff --git a/target/ppc/helper.h b/target/ppc/helper.h index 09d0b0074b..e862bdceaf 100644 --- a/target/ppc/helper.h +++ b/target/ppc/helper.h @@ -52,8 +52,8 @@ DEF_HELPER_FLAGS_2(icbiep, TCG_CALL_NO_WG, void, env, tl) DEF_HELPER_5(lscbx, tl, env, tl, i32, i32, i32) #if defined(TARGET_PPC64) -DEF_HELPER_4(divdeu, i64, env, i64, i64, i32) -DEF_HELPER_4(divde, i64, env, i64, i64, i32) +DEF_HELPER_4(DIVDEU, i64, env, i64, i64, i32) +DEF_HELPER_4(DIVDE, i64, env, i64, i64, i32) #endif DEF_HELPER_4(DIVWEU, tl, env, tl, tl, i32) DEF_HELPER_4(DIVWE, tl, env, tl, tl, i32) diff --git a/target/ppc/insn32.decode b/target/ppc/insn32.decode index 61c59bbde0..509961023b 100644 --- a/target/ppc/insn32.decode +++ b/target/ppc/insn32.decode @@ -384,6 +384,14 @@ MADDLD 000100 ..... ..... ..... ..... 110011 @VA MADDHD 000100 ..... ..... ..... ..... 110000 @VA MADDHDU 000100 ..... ..... ..... ..... 110001 @VA +DIVD 011111 ..... ..... ..... . 111101001 . @XO +DIVDU 011111 ..... ..... ..... . 111001001 . @XO +DIVDE 011111 ..... ..... ..... . 110101001 . @XO +DIVDEU 011111 ..... ..... ..... . 110001001 . @XO + +MODSD 011111 ..... ..... ..... 1100001001 - @X +MODUD 011111 ..... ..... ..... 0100001001 - @X + ## Fixed-Point Logical Instructions CFUGED 011111 ..... ..... ..... 0011011100 - @X diff --git a/target/ppc/int_helper.c b/target/ppc/int_helper.c index bc25d5b062..585c2b65d3 100644 --- a/target/ppc/int_helper.c +++ b/target/ppc/int_helper.c @@ -101,7 +101,7 @@ target_ulong helper_DIVWE(CPUPPCState *env, target_ulong ra, target_ulong rb, #if defined(TARGET_PPC64) -uint64_t helper_divdeu(CPUPPCState *env, uint64_t ra, uint64_t rb, uint32_t oe) +uint64_t helper_DIVDEU(CPUPPCState *env, uint64_t ra, uint64_t rb, uint32_t oe) { uint64_t rt = 0; int overflow = 0; @@ -120,7 +120,7 @@ uint64_t helper_divdeu(CPUPPCState *env, uint64_t ra, uint64_t rb, uint32_t oe) return rt; } -uint64_t helper_divde(CPUPPCState *env, uint64_t rau, uint64_t rbu, uint32_t oe) +uint64_t helper_DIVDE(CPUPPCState *env, uint64_t rau, uint64_t rbu, uint32_t oe) { uint64_t rt = 0; int64_t ra = (int64_t)rau; diff --git a/target/ppc/translate.c b/target/ppc/translate.c index 8fa125d0ae..8900da85e5 100644 --- a/target/ppc/translate.c +++ b/target/ppc/translate.c @@ -1778,21 +1778,11 @@ static inline void gen_op_arith_divw(DisasContext *ctx, TCGv ret, gen_set_Rc0(ctx, ret); } } -/* div[wd]eu[o][.] */ -#define GEN_DIVE(name, hlpr, compute_ov) \ -static void gen_##name(DisasContext *ctx) \ -{ \ - TCGv_i32 t0 = tcg_constant_i32(compute_ov); \ - gen_helper_##hlpr(cpu_gpr[rD(ctx->opcode)], tcg_env, \ - cpu_gpr[rA(ctx->opcode)], cpu_gpr[rB(ctx->opcode)], t0); \ - if (unlikely(Rc(ctx->opcode) != 0)) { \ - gen_set_Rc0(ctx, cpu_gpr[rD(ctx->opcode)]); \ - } \ -} #if defined(TARGET_PPC64) -static inline void gen_op_arith_divd(DisasContext *ctx, TCGv ret, TCGv arg1, - TCGv arg2, int sign, int compute_ov) +static inline void gen_op_arith_divd(DisasContext *ctx, TCGv ret, + TCGv arg1, TCGv arg2, bool sign, + bool compute_ov, bool compute_rc0) { TCGv_i64 t0 = tcg_temp_new_i64(); TCGv_i64 t1 = tcg_temp_new_i64(); @@ -1824,29 +1814,10 @@ static inline void gen_op_arith_divd(DisasContext *ctx, TCGv ret, TCGv arg1, tcg_gen_or_tl(cpu_so, cpu_so, cpu_ov); } - if (unlikely(Rc(ctx->opcode) != 0)) { + if (unlikely(compute_rc0)) { gen_set_Rc0(ctx, ret); } } - -#define GEN_INT_ARITH_DIVD(name, opc3, sign, compute_ov) \ -static void glue(gen_, name)(DisasContext *ctx) \ -{ \ - gen_op_arith_divd(ctx, cpu_gpr[rD(ctx->opcode)], \ - cpu_gpr[rA(ctx->opcode)], cpu_gpr[rB(ctx->opcode)], \ - sign, compute_ov); \ -} -/* divdu divdu. divduo divduo. */ -GEN_INT_ARITH_DIVD(divdu, 0x0E, 0, 0); -GEN_INT_ARITH_DIVD(divduo, 0x1E, 0, 1); -/* divd divd. divdo divdo. */ -GEN_INT_ARITH_DIVD(divd, 0x0F, 1, 0); -GEN_INT_ARITH_DIVD(divdo, 0x1F, 1, 1); - -GEN_DIVE(divdeu, divdeu, 0); -GEN_DIVE(divdeuo, divdeu, 1); -GEN_DIVE(divde, divde, 0); -GEN_DIVE(divdeo, divde, 1); #endif static inline void gen_op_arith_modw(DisasContext *ctx, TCGv ret, TCGv arg1, @@ -1905,17 +1876,6 @@ static inline void gen_op_arith_modd(DisasContext *ctx, TCGv ret, TCGv arg1, tcg_gen_remu_i64(ret, t0, t1); } } - -#define GEN_INT_ARITH_MODD(name, opc3, sign) \ -static void glue(gen_, name)(DisasContext *ctx) \ -{ \ - gen_op_arith_modd(ctx, cpu_gpr[rD(ctx->opcode)], \ - cpu_gpr[rA(ctx->opcode)], cpu_gpr[rB(ctx->opcode)], \ - sign); \ -} - -GEN_INT_ARITH_MODD(modud, 0x08, 0); -GEN_INT_ARITH_MODD(modsd, 0x18, 1); #endif /* Common subf function */ @@ -6395,23 +6355,6 @@ GEN_HANDLER(lvsr, 0x1f, 0x06, 0x01, 0x00000001, PPC_ALTIVEC), GEN_HANDLER(mfvscr, 0x04, 0x2, 0x18, 0x001ff800, PPC_ALTIVEC), GEN_HANDLER(mtvscr, 0x04, 0x2, 0x19, 0x03ff0000, PPC_ALTIVEC), -#if defined(TARGET_PPC64) -#undef GEN_INT_ARITH_DIVD -#define GEN_INT_ARITH_DIVD(name, opc3, sign, compute_ov) \ -GEN_HANDLER(name, 0x1F, 0x09, opc3, 0x00000000, PPC_64B) -GEN_INT_ARITH_DIVD(divdu, 0x0E, 0, 0), -GEN_INT_ARITH_DIVD(divduo, 0x1E, 0, 1), -GEN_INT_ARITH_DIVD(divd, 0x0F, 1, 0), -GEN_INT_ARITH_DIVD(divdo, 0x1F, 1, 1), - -GEN_HANDLER_E(divdeu, 0x1F, 0x09, 0x0C, 0, PPC_NONE, PPC2_DIVE_ISA206), -GEN_HANDLER_E(divdeuo, 0x1F, 0x09, 0x1C, 0, PPC_NONE, PPC2_DIVE_ISA206), -GEN_HANDLER_E(divde, 0x1F, 0x09, 0x0D, 0, PPC_NONE, PPC2_DIVE_ISA206), -GEN_HANDLER_E(divdeo, 0x1F, 0x09, 0x1D, 0, PPC_NONE, PPC2_DIVE_ISA206), -GEN_HANDLER_E(modsd, 0x1F, 0x09, 0x18, 0x00000001, PPC_NONE, PPC2_ISA300), -GEN_HANDLER_E(modud, 0x1F, 0x09, 0x08, 0x00000001, PPC_NONE, PPC2_ISA300), -#endif - #undef GEN_LOGICAL1 #undef GEN_LOGICAL2 #define GEN_LOGICAL2(name, tcg_op, opc, type) \ diff --git a/target/ppc/translate/fixedpoint-impl.c.inc b/target/ppc/translate/fixedpoint-impl.c.inc index d9a5db83b3..f5dbf188bd 100644 --- a/target/ppc/translate/fixedpoint-impl.c.inc +++ b/target/ppc/translate/fixedpoint-impl.c.inc @@ -468,7 +468,7 @@ static bool do_divw(DisasContext *ctx, arg_XO *a, int sign) return true; } -static bool do_divwe(DisasContext *ctx, arg_XO *a, +static bool do_dive(DisasContext *ctx, arg_XO *a, void (*helper)(TCGv, TCGv_ptr, TCGv, TCGv, TCGv_i32)) { REQUIRE_INSNS_FLAGS2(ctx, DIVE_ISA206); @@ -482,8 +482,8 @@ static bool do_divwe(DisasContext *ctx, arg_XO *a, TRANS(DIVW, do_divw, 1); TRANS(DIVWU, do_divw, 0); -TRANS(DIVWE, do_divwe, gen_helper_DIVWE); -TRANS(DIVWEU, do_divwe, gen_helper_DIVWEU); +TRANS(DIVWE, do_dive, gen_helper_DIVWE); +TRANS(DIVWEU, do_dive, gen_helper_DIVWEU); static bool do_modw(DisasContext *ctx, arg_X *a, bool sign) { @@ -614,6 +614,29 @@ static bool trans_MADDHDU(DisasContext *ctx, arg_MADDHDU *a) return true; } +static bool do_divd(DisasContext *ctx, arg_XO *a, bool sign) +{ + gen_op_arith_divd(ctx, cpu_gpr[a->rt], cpu_gpr[a->ra], cpu_gpr[a->rb], + sign, a->oe, a->rc); + return true; +} + +static bool do_modd(DisasContext *ctx, arg_X *a, bool sign) +{ + REQUIRE_INSNS_FLAGS2(ctx, ISA300); + gen_op_arith_modd(ctx, cpu_gpr[a->rt], cpu_gpr[a->ra], cpu_gpr[a->rb], + sign); + return true; +} + +TRANS64(DIVD, do_divd, true); +TRANS64(DIVDU, do_divd, false); +TRANS64(DIVDE, do_dive, gen_helper_DIVDE); +TRANS64(DIVDEU, do_dive, gen_helper_DIVDEU); + +TRANS64(MODSD, do_modd, true); +TRANS64(MODUD, do_modd, false); + static bool trans_INVALID(DisasContext *ctx, arg_INVALID *a) { gen_invalid(ctx);