Message ID | 1636700049-24381-5-git-send-email-gaosong@loongson.cn |
---|---|
State | New |
Series | Add LoongArch linux-user emulation support |
On 11/12/21 7:53 AM, Song Gao wrote:
> +#
> +# Fields
> +#
> +%rd      0:5
> +%rj      5:5
> +%rk      10:5
> +%sa2     15:2
> +%si12    10:s12
> +%ui12    10:12
> +%si16    10:s16
> +%si20    5:s20

You should only create separate field definitions like this when they are
complex: e.g. the logical field is disjoint or there's a need for !function.

> +
> +#
> +# Argument sets
> +#
> +&fmt_rdrjrk         rd rj rk
> +&fmt_rdrjsi12       rd rj si12
> +&fmt_rdrjrksa2      rd rj rk sa2
> +&fmt_rdrjsi16       rd rj si16
> +&fmt_rdrjui12       rd rj ui12
> +&fmt_rdsi20         rd si20

Some of these should be combined.  The width of the immediate is a detail of
the format, not the decoded argument set.  Thus you should have

&fmt_rdimm      rd imm
&fmt_rdrjimm    rd rj imm
&fmt_rdrjrk     rd rj rk
&fmt_rdrjrksa   rd rj rk sa

> +alsl_w           0000 00000000 010 .. ..... ..... .....   @fmt_rdrjrksa2
> +alsl_wu          0000 00000000 011 .. ..... ..... .....   @fmt_rdrjrksa2
> +alsl_d           0000 00000010 110 .. ..... ..... .....   @fmt_rdrjrksa2

The encoding of these insns is that the shift is sa+1.

While you compensate for this in gen_alsl_*, we print the "wrong" number in
the disassembly.  I think it would be better to do

%sa2p1            15:2 !function=plus_1
@fmt_rdrjrksa2p1  .... ........ ... .. rk:5 rj:5 rd:5 \
                  &fmt_rdrjrksa sa=%sa2p1

r~

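For readers unfamiliar with `!function`, the suggestion above can be sketched in plain C. This is only an illustration: `field()` and `decode_sa2p1()` are hypothetical names, the opaque `DisasContext` is a stand-in for QEMU's real type, and the `(ctx, value)` helper shape is assumed from how decodetree helpers are commonly written rather than taken from this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Opaque stand-in for QEMU's DisasContext (assumption: the helper only
 * receives a pointer and never looks inside it). */
typedef struct DisasContext DisasContext;

/* Hypothetical helper: extract `len` unsigned bits starting at bit `pos`. */
static uint32_t field(uint32_t insn, int pos, int len)
{
    assert(pos >= 0 && len > 0 && len < 32 && pos + len <= 32);
    return (insn >> pos) & ((1u << len) - 1);
}

/* The function named by "!function=plus_1": applied to the raw field. */
static int plus_1(DisasContext *ctx, int x)
{
    (void)ctx; /* unused in this sketch */
    return x + 1;
}

/* Models "%sa2p1 15:2 !function=plus_1": the decoded value is the real
 * shift amount (1..4), so translation and disassembly see the same number. */
static int decode_sa2p1(DisasContext *ctx, uint32_t insn)
{
    return plus_1(ctx, (int)field(insn, 15, 2));
}
```

With the adjustment done once at decode time, neither the translator nor the disassembler needs its own "+ 1".
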
On 11/12/21 22:05, Richard Henderson wrote:
> On 11/12/21 7:53 AM, Song Gao wrote:
>> +#
>> +# Fields
>> +#
>> +%rd      0:5
>> +%rj      5:5
>> +%rk      10:5
>> +%sa2     15:2
>> +%si12    10:s12
>> +%ui12    10:12
>> +%si16    10:s16
>> +%si20    5:s20
>
> You should only create separate field definitions like this when they
> are complex: e.g. the logical field is disjoint or there's a need for
> !function.
>
>> +
>> +#
>> +# Argument sets
>> +#
>> +&fmt_rdrjrk         rd rj rk
>> +&fmt_rdrjsi12       rd rj si12
>> +&fmt_rdrjrksa2      rd rj rk sa2
>> +&fmt_rdrjsi16       rd rj si16
>> +&fmt_rdrjui12       rd rj ui12
>> +&fmt_rdsi20         rd si20
>
> Some of these should be combined.  The width of the immediate is a
> detail of the format, not the decoded argument set.  Thus you should have
>
> &fmt_rdimm      rd imm
> &fmt_rdrjimm    rd rj imm
> &fmt_rdrjrk     rd rj rk
> &fmt_rdrjrksa   rd rj rk sa

I'd like to add that the organization of the whole decodetree file closely
resembles that of the ISA manual, most likely on purpose (though this is not
stated anywhere in the patch). However, the manual itself is not without
errors or inconsistencies; for example, the 9 "base instruction formats"
classification is nowhere near accurate, and here we can see the author is
forced to create ad-hoc names (repeating the operand slots).

I suggest just generating the descriptions from the loongarch-opcodes
project [1]; no need to duplicate work. I'll happily help if you decide to
do that.

[1]: https://github.com/loongson-community/loongarch-opcodes

>
>> +alsl_w           0000 00000000 010 .. ..... ..... .....   @fmt_rdrjrksa2
>> +alsl_wu          0000 00000000 011 .. ..... ..... .....   @fmt_rdrjrksa2
>> +alsl_d           0000 00000010 110 .. ..... ..... .....   @fmt_rdrjrksa2
>
> The encoding of these insns is that the shift is sa+1.
>
> While you compensate for this in gen_alsl_*, we print the "wrong"
> number in the disassembly.  I think it would be better to do
>
> %sa2p1            15:2 !function=plus_1
> @fmt_rdrjrksa2p1  .... ........ ... .. rk:5 rj:5 rd:5 \
>                   &fmt_rdrjrksa sa=%sa2p1

Here again, the manual is inconsistent with the binutils implementation.
The manual says (for ALSL.W, which is SLADD in the loongarch-opcodes
project's revised mnemonics):

"ALSL.W logically left-shifts rj[31:0] by (sa2+1) bits, [snip]"

(translation mine, not copied from the official translation)

Clearly the "+1" part is not meant to show up in disassembly. Yet the
binutils implementation acts as if the operand written in assembly source
already has the 1 added, and disassembles and prints it as such. That is an
obvious mismatch.

I'd suggest fixing the disassembly code to remove this inconsistency. And
the "+1" "feature" is not used anywhere else AFAIK, so it wouldn't hurt to
just delete everything about it.

>
>
> r~
>

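To make the two conventions concrete, here is a small reference model in plain C of what the patch's `gen_alsl_w` / `gen_alsl_wu` appear to compute, written in terms of the already-decoded shift amount `sa` (raw field plus one). The function names and the "registers as 64-bit integers" modelling are assumptions for illustration; the behaviour is inferred from the `gen_alsl_*` code and the `EXT_SIGN` / `EXT_ZERO` destination extensions in the patch, not from the manual.

```c
#include <stdint.h>

/* Reference model of ALSL.W: shift rj left by sa (1..4), add rk, keep the
 * low 32 bits, then sign-extend to 64 bits (dst_ext == EXT_SIGN). */
static int64_t alsl_w_model(int64_t rj, int64_t rk, int sa)
{
    uint32_t sum = ((uint32_t)rj << sa) + (uint32_t)rk;
    return (int64_t)(int32_t)sum;
}

/* Reference model of ALSL.WU: same arithmetic, zero-extended result
 * (dst_ext == EXT_ZERO). */
static uint64_t alsl_wu_model(int64_t rj, int64_t rk, int sa)
{
    uint32_t sum = ((uint32_t)rj << sa) + (uint32_t)rk;
    return (uint64_t)sum;
}

/* The raw 2-bit encoding field that corresponds to a shift amount. */
static int alsl_raw_field(int sa)
{
    return sa - 1;
}
```

Whichever number the assembler syntax shows, the raw field and the effective shift differ by exactly one, so the decoder and the disassembler only have to agree on which of the two they expose.
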
Hi Richard,

On 2021/11/12 10:05 PM, Richard Henderson wrote:
> On 11/12/21 7:53 AM, Song Gao wrote:
>> +#
>> +# Fields
>> +#
>> +%rd      0:5
>> +%rj      5:5
>> +%rk      10:5
>> +%sa2     15:2
>> +%si12    10:s12
>> +%ui12    10:12
>> +%si16    10:s16
>> +%si20    5:s20
>
> You should only create separate field definitions like this when they
> are complex: e.g. the logical field is disjoint or there's a need for
> !function.
>
>> +
>> +#
>> +# Argument sets
>> +#
>> +&fmt_rdrjrk         rd rj rk
>> +&fmt_rdrjsi12       rd rj si12
>> +&fmt_rdrjrksa2      rd rj rk sa2
>> +&fmt_rdrjsi16       rd rj si16
>> +&fmt_rdrjui12       rd rj ui12
>> +&fmt_rdsi20         rd si20
>
> Some of these should be combined.  The width of the immediate is a
> detail of the format, not the decoded argument set.  Thus you should have
>
> &fmt_rdimm      rd imm
> &fmt_rdrjimm    rd rj imm
> &fmt_rdrjrk     rd rj rk
> &fmt_rdrjrksa   rd rj rk sa
>

'The width of the immediate is a detail of the format' means:

&fmt_rdrjimm     rd rj imm

@fmt_rdrjimm     .... ...... imm:12 rj:5 rd:5    &fmt_rdrjimm
@fmt_rdrjimm14   .... .... imm:14 rj:5 rd:5      &fmt_rdrjimm
@fmt_rdrjimm16   .... .. imm:16 rj:5 rd:5        &fmt_rdrjimm

and we print in the disassembly like this:

output_rdrjimm(DisasContext *ctx, arg_fmt_rdrjimm * a, const char *mnemonic)
{
    output(ctx, mnemonic, "%s, %s, 0x%x", regnames[a->rd], regnames[a->rj], a->imm);
}

Is that right?

>> +alsl_w           0000 00000000 010 .. ..... ..... .....   @fmt_rdrjrksa2
>> +alsl_wu          0000 00000000 011 .. ..... ..... .....   @fmt_rdrjrksa2
>> +alsl_d           0000 00000010 110 .. ..... ..... .....   @fmt_rdrjrksa2
>
> The encoding of these insns is that the shift is sa+1.
>
> While you compensate for this in gen_alsl_*, we print the "wrong"
> number in the disassembly.  I think it would be better to do
>
> %sa2p1            15:2 !function=plus_1
> @fmt_rdrjrksa2p1  .... ........ ... .. rk:5 rj:5 rd:5 \
>                   &fmt_rdrjrksa sa=%sa2p1
>

1. We print sa in the disassembly:

output_rdrjrksa(DisasContext *ctx, arg_fmt_rdrjrksa *a, const char *mnemonic)
{
    output(ctx, mnemonic, "%s, %s, %s, 0x%x", regnames[a->rd], regnames[a->rj],
           regnames[a->rk], a->sa);
}

2. We use sa in gen_alsl_*, not (sa2+1).

3. bytepick_w uses the same print function, but the Field/Argument/Format is:

%sa2             15:2
&fmt_rdrjrksa    rd rj rk sa
@fmt_rdrjrksa2   .... ........ ... sa:2 rk:5 rj:5 rd:5    &fmt_rdrjrksa

Is my understanding right?

Thanks.
Song Gao

On 11/15/21 4:59 AM, gaosong wrote:
> 'The width of the immediate is a detail of the format' means:
>
> &fmt_rdrjimm     rd rj imm
>
> @fmt_rdrjimm     .... ...... imm:12 rj:5 rd:5    &fmt_rdrjimm
> @fmt_rdrjimm14   .... .... imm:14 rj:5 rd:5      &fmt_rdrjimm
> @fmt_rdrjimm16   .... .. imm:16 rj:5 rd:5        &fmt_rdrjimm
>
> and we print in the disassembly like this:
>
> output_rdrjimm(DisasContext *ctx, arg_fmt_rdrjimm * a, const char *mnemonic)
> {
>     output(ctx, mnemonic, "%s, %s, 0x%x", regnames[a->rd], regnames[a->rj], a->imm);
> }
>
> Is that right?

Yes.

I'll note that regnames[] is defined in target/loongarch/cpu.c, which is not
available when we want to use this disassembler for tcg/loongarch64/.  I
think it would be easier to print this as

    "r%d", a->rd

so that you do not need to rely on the external strings.

I also think you should print signed numbers, "%d", because 0xfffffff8
(truncated to 32 bits) is not really the correct representation of -8 for a
64-bit operand.

> 1. We print sa in disassembly...
> 2. We use sa on gen_alsl_* not (sa2+1).
> 3. bytepick_w use the same print functions.
> Is my understanding right?

Yes, that is the issue I am describing.

r~

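The two printing suggestions above ("r%d" for registers, "%d" for immediates) can be tried out in isolation. The sketch below is standalone C, not the patch's `output()` helper: `sext()` and `format_rr_i()` are hypothetical names, and the exact output layout (mnemonic, then comma-separated operands) is an assumption made for illustration.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Sign-extend the low `len` bits of `value` (1 <= len <= 32). */
static int32_t sext(uint32_t value, int len)
{
    uint32_t sign = 1u << (len - 1);

    value &= (sign << 1) - 1;               /* keep only the low len bits */
    return (int32_t)((value ^ sign) - sign); /* flip-and-subtract trick */
}

/* Format "mnemonic rd, rj, imm" without any external register-name table.
 * Returns the number of characters snprintf() would have written. */
static int format_rr_i(char *buf, size_t size, const char *mnemonic,
                       int rd, int rj, int imm)
{
    return snprintf(buf, size, "%s r%d, r%d, %d", mnemonic, rd, rj, imm);
}
```

Printing the same immediate with "0x%x" would show `0xfffffff8`, which reads as a large positive number for a 64-bit operand; the signed form avoids that ambiguity.
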
Hi Richard,

On 2021/11/15 4:42 PM, Richard Henderson wrote:
> On 11/15/21 4:59 AM, gaosong wrote:
>> 'The width of the immediate is a detail of the format' means:
>>
>> &fmt_rdrjimm     rd rj imm
>>
>> @fmt_rdrjimm     .... ...... imm:12 rj:5 rd:5    &fmt_rdrjimm
>> @fmt_rdrjimm14   .... .... imm:14 rj:5 rd:5      &fmt_rdrjimm
>> @fmt_rdrjimm16   .... .. imm:16 rj:5 rd:5        &fmt_rdrjimm
>>
>> and we print in the disassembly like this:
>>
>> output_rdrjimm(DisasContext *ctx, arg_fmt_rdrjimm * a, const char *mnemonic)
>> {
>>     output(ctx, mnemonic, "%s, %s, 0x%x", regnames[a->rd], regnames[a->rj], a->imm);
>> }
>>
>> Is that right?
>
> Yes.
>
> I'll note that regnames[] is defined in target/loongarch/cpu.c, which
> is not available when we want to use this disassembler for
> tcg/loongarch64/.  I think it would be easier to print this as
>
>     "r%d", a->rd
>
> so that you do not need to rely on the external strings.
>
> I also think you should print signed numbers, "%d", because 0xfffffff8
> (truncated to 32 bits) is not really the correct representation of -8
> for a 64-bit operand.
>
>
>> 1. We print sa in disassembly...
>> 2. We use sa on gen_alsl_* not (sa2+1).
>> 3. bytepick_w use the same print functions.
>> Is my understanding right?
>
> Yes, that is the issue I am describing.
>

I see that the insns.decode format is not very consistent with other
architectures, such as ARM/RISCV. I'll correct it, like this:

#
# Fields
#
%sa2p1     15:2 !function=plus_1

#
# Argument sets
#
&r_i      rd imm
&rrr      rd rj rk
&rr_i     rd rj imm
&rrr_sa   rd rj rk sa

#
# Formats
#
@fmt_rrr         .... ........ ..... rk:5 rj:5 rd:5    &rrr
@fmt_r_i20       .... ... imm:s20 rd:5                 &r_i
@fmt_rr_i12      .... ...... imm:s12 rj:5 rd:5         &rr_i
@fmt_rr_ui12     .... ...... imm:12 rj:5 rd:5          &rr_i
@fmt_rr_i16      .... .. imm:s16 rj:5 rd:5             &rr_i
@fmt_rrr_sa2p1   .... ........ ... .. rk:5 rj:5 rd:5   &rrr_sa sa=%sa2p1

#
# Fixed point arithmetic operation instruction
#
add_w    0000 00000001 00000 ..... ..... .....    @fmt_rrr
add_d    0000 00000001 00001 ..... ..... .....    @fmt_rrr
sub_w    0000 00000001 00010 ..... ..... .....    @fmt_rrr
sub_d    0000 00000001 00011 ..... ..... .....    @fmt_rrr
slt      0000 00000001 00100 ..... ..... .....    @fmt_rrr
sltu     0000 00000001 00101 ..... ..... .....    @fmt_rrr
slti     0000 001000 ............ ..... .....     @fmt_rr_i12

and in trans_xxx.c.inc:

static bool gen_rrr(DisasContext *ctx, arg_rrr *a, ...) {}
static bool gen_rr_i12(DisasContext *ctx, arg_rr_i *a, ) {}
static bool gen_rrr_sa2p1(DisasContext *ctx, arg_rrr_sa *a, ...) {}
...

Richard, is that OK?

Thanks,
Song Gao

On 11/17/21 8:57 AM, gaosong wrote:
> I see that the insns.decode format is not very consistent with other
> architectures, such as ARM/RISCV.

No. I don't like how riscv has done it, though they have quite a few split
fields, so perhaps they thought it looked weird.

> #
> # Argument sets
> #
> &r_i      rd imm
> &rrr      rd rj rk
> &rr_i     rd rj imm
> &rrr_sa   rd rj rk sa
>
> #
> # Formats
> #
> @fmt_rrr         .... ........ ..... rk:5 rj:5 rd:5    &rrr
> @fmt_r_i20       .... ... imm:s20 rd:5                 &r_i
> @fmt_rr_i12      .... ...... imm:s12 rj:5 rd:5         &rr_i
> @fmt_rr_ui12     .... ...... imm:12 rj:5 rd:5          &rr_i
> @fmt_rr_i16      .... .. imm:s16 rj:5 rd:5             &rr_i
> @fmt_rrr_sa2p1   .... ........ ... .. rk:5 rj:5 rd:5   &rrr_sa sa=%sa2p1
>
> #
> # Fixed point arithmetic operation instruction
> #
> add_w    0000 00000001 00000 ..... ..... .....    @fmt_rrr
> add_d    0000 00000001 00001 ..... ..... .....    @fmt_rrr
> sub_w    0000 00000001 00010 ..... ..... .....    @fmt_rrr
> sub_d    0000 00000001 00011 ..... ..... .....    @fmt_rrr
> slt      0000 00000001 00100 ..... ..... .....    @fmt_rrr
> sltu     0000 00000001 00101 ..... ..... .....    @fmt_rrr
> slti     0000 001000 ............ ..... .....     @fmt_rr_i12
>
>
> and in trans_xxx.c.inc:
>
> static bool gen_rrr(DisasContext *ctx, arg_rrr *a, ...) {}
> static bool gen_rr_i12(DisasContext *ctx, arg_rr_i *a, ) {}

gen_rr_i ?

> static bool gen_rrr_sa2p1(DisasContext *ctx, arg_rrr_sa *a, ...) {}

gen_rrr_sa ?

> Richard, is that OK?

Other than those two nits, this looks very clean.  Thanks,

r~

Hi Richard,

On 2021/11/17 4:28 PM, Richard Henderson wrote:
> On 11/17/21 8:57 AM, gaosong wrote:
>> I see that the insns.decode format is not very consistent with other
>> architectures, such as ARM/RISCV.
>
> No. I don't like how riscv has done it, though they have quite a few
> split fields, so perhaps they thought it looked weird.
>
>
>> #
>> # Argument sets
>> #
>> &r_i      rd imm
>> &rrr      rd rj rk
>> &rr_i     rd rj imm
>> &rrr_sa   rd rj rk sa
>>
>> #
>> # Formats
>> #
>> @fmt_rrr         .... ........ ..... rk:5 rj:5 rd:5    &rrr
>> @fmt_r_i20       .... ... imm:s20 rd:5                 &r_i
>> @fmt_rr_i12      .... ...... imm:s12 rj:5 rd:5         &rr_i
>> @fmt_rr_ui12     .... ...... imm:12 rj:5 rd:5          &rr_i
>> @fmt_rr_i16      .... .. imm:s16 rj:5 rd:5             &rr_i
>> @fmt_rrr_sa2p1   .... ........ ... .. rk:5 rj:5 rd:5   &rrr_sa sa=%sa2p1
>>
>> #
>> # Fixed point arithmetic operation instruction
>> #
>> add_w    0000 00000001 00000 ..... ..... .....    @fmt_rrr
>> add_d    0000 00000001 00001 ..... ..... .....    @fmt_rrr
>> sub_w    0000 00000001 00010 ..... ..... .....    @fmt_rrr
>> sub_d    0000 00000001 00011 ..... ..... .....    @fmt_rrr
>> slt      0000 00000001 00100 ..... ..... .....    @fmt_rrr
>> sltu     0000 00000001 00101 ..... ..... .....    @fmt_rrr
>> slti     0000 001000 ............ ..... .....     @fmt_rr_i12
>>
>>
>> and in trans_xxx.c.inc:
>>
>> static bool gen_rrr(DisasContext *ctx, arg_rrr *a, ...) {}
>> static bool gen_rr_i12(DisasContext *ctx, arg_rr_i *a, ) {}
>
> gen_rr_i ?

The code was not written out completely. It is like this:

gen_rr_i12:

@fmt_rr_i12      .... ...... imm:s12 rj:5 rd:5         &rr_i
slti     0000 001000 ............ ..... .....     @fmt_rr_i12
sltui    0000 001001 ............ ..... .....     @fmt_rr_i12
...

gen_rr_ui12:

@fmt_rr_ui12     .... ...... imm:12 rj:5 rd:5          &rr_i
andi     0000 001101 ............ ..... .....     @fmt_rr_ui12
ori      0000 001110 ............ ..... .....     @fmt_rr_ui12
xori     0000 001111 ............ ..... .....     @fmt_rr_ui12
...

@fmt_rr_i12 and @fmt_rr_ui12 are two 'Formats', but they use the same
'Argument set' (rr_i).

>
>> static bool gen_rrr_sa2p1(DisasContext *ctx, arg_rrr_sa *a, ...) {}
>
> gen_rrr_sa ?
>

Likewise.

gen_rrr_sa2p1:

@fmt_rrr_sa2p1   .... ........ ... .. rk:5 rj:5 rd:5   &rrr_sa sa=%sa2p1
alsl_w   0000 00000000 010 .. ..... ..... .....   @fmt_rrr_sa2p1
alsl_wu  0000 00000000 011 .. ..... ..... .....   @fmt_rrr_sa2p1
alsl_d   0000 00000010 110 .. ..... ..... .....   @fmt_rrr_sa2p1
...

gen_rrr_sa2:

@fmt_rrr_sa2     .... ........ ... sa:2 rk:5 rj:5 rd:5 &rrr_sa
bytepick_w  0000 00000000 100 .. ..... ..... .....   @fmt_rrr_sa2
...

gen_rrr_sa3:

@fmt_rrr_sa3     .... ........ .. sa:3 rk:5 rj:5 rd:5  &rrr_sa
bytepick_d  0000 00000000 11 ... ..... ..... .....   @fmt_rrr_sa3
...

>> Richard, is that OK?
>
> Other than those two nits, this looks very clean.  Thanks,
>

OK, I'll correct it in v11.

Thanks.
Song Gao

On 11/17/21 10:29 AM, gaosong wrote:
>> gen_rr_i ?
>
> The code was not written out completely. It is like this:
>
> gen_rr_i12:
>
> @fmt_rr_i12      .... ...... imm:s12 rj:5 rd:5         &rr_i
> slti     0000 001000 ............ ..... .....     @fmt_rr_i12
> sltui    0000 001001 ............ ..... .....     @fmt_rr_i12
> ...
>
> gen_rr_ui12:
>
> @fmt_rr_ui12     .... ...... imm:12 rj:5 rd:5          &rr_i
> andi     0000 001101 ............ ..... .....     @fmt_rr_ui12
> ori      0000 001110 ............ ..... .....     @fmt_rr_ui12
> xori     0000 001111 ............ ..... .....     @fmt_rr_ui12
> ...
>
> @fmt_rr_i12 and @fmt_rr_ui12 are two 'Formats', but they use the same
> 'Argument set' (rr_i).

What I meant is that there would be a single gen_rr_i function handling the
argument set rr_i; no need for two gen_rr_i* functions.

> gen_rrr_sa2p1:
>
> @fmt_rrr_sa2p1   .... ........ ... .. rk:5 rj:5 rd:5   &rrr_sa sa=%sa2p1
> alsl_w   0000 00000000 010 .. ..... ..... .....   @fmt_rrr_sa2p1
> alsl_wu  0000 00000000 011 .. ..... ..... .....   @fmt_rrr_sa2p1
> alsl_d   0000 00000010 110 .. ..... ..... .....   @fmt_rrr_sa2p1
> ...
>
> gen_rrr_sa2:
> @fmt_rrr_sa2     .... ........ ... sa:2 rk:5 rj:5 rd:5 &rrr_sa
> bytepick_w  0000 00000000 100 .. ..... ..... .....   @fmt_rrr_sa2
> ...
>
> gen_rrr_sa3:
> @fmt_rrr_sa3     .... ........ .. sa:3 rk:5 rj:5 rd:5  &rrr_sa
> bytepick_d  0000 00000000 11 ... ..... ..... .....   @fmt_rrr_sa3
> ...

Likewise a single gen_rrr_sa function.

r~

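The point about one handler per argument set can be illustrated outside of TCG. The sketch below is a tiny interpreter-style model in plain C, not QEMU code: `arg_rr_i` here is a hand-written mirror of what decodetree would generate for `&rr_i`, and `bits_u`, `bits_s`, `decode_rr_i12`, `decode_rr_ui12`, `op_slt` and `op_and` are hypothetical names. The signedness of the immediate is resolved by the format-specific decode step, so a single `gen_rr_i` serves both formats.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hand-written mirror of the &rr_i argument set (assumption). */
typedef struct {
    int rd, rj, imm;
} arg_rr_i;

/* Unsigned bit-field extraction: `len` bits starting at bit `pos`. */
static uint32_t bits_u(uint32_t insn, int pos, int len)
{
    return (insn >> pos) & ((1u << len) - 1);
}

/* Signed bit-field extraction (sign-extends from bit pos+len-1). */
static int32_t bits_s(uint32_t insn, int pos, int len)
{
    uint32_t sign = 1u << (len - 1);
    return (int32_t)((bits_u(insn, pos, len) ^ sign) - sign);
}

/* Models @fmt_rr_i12: imm:s12 rj:5 rd:5 */
static void decode_rr_i12(arg_rr_i *a, uint32_t insn)
{
    a->rd = (int)bits_u(insn, 0, 5);
    a->rj = (int)bits_u(insn, 5, 5);
    a->imm = bits_s(insn, 10, 12);
}

/* Models @fmt_rr_ui12: imm:12 rj:5 rd:5 */
static void decode_rr_ui12(arg_rr_i *a, uint32_t insn)
{
    a->rd = (int)bits_u(insn, 0, 5);
    a->rj = (int)bits_u(insn, 5, 5);
    a->imm = (int)bits_u(insn, 10, 12);
}

typedef int64_t (*rr_i_op)(int64_t src, int64_t imm);

static int64_t op_slt(int64_t src, int64_t imm) { return src < imm; }
static int64_t op_and(int64_t src, int64_t imm) { return src & imm; }

/* One handler for the whole rr_i argument set: it never needs to know
 * how wide or how signed the immediate was in the encoding. */
static bool gen_rr_i(int64_t gpr[32], const arg_rr_i *a, rr_i_op op)
{
    int64_t src = a->rj ? gpr[a->rj] : 0;   /* r0 reads as zero */

    if (a->rd != 0) {                       /* writes to r0 are dropped */
        gpr[a->rd] = op(src, a->imm);
    }
    return true;
}
```

The same reasoning applies to a single gen_rrr_sa: whether `sa` came from a 2-bit field, a 3-bit field, or a field adjusted by plus_1 is a property of the format, not of the handler.
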
Hi Richard,

On 2021/11/17 5:55 PM, Richard Henderson wrote:
>>
>> @fmt_rr_i12 and @fmt_rr_ui12 are two 'Formats', but they use the
>> same 'Argument set' (rr_i).
>
> What I meant is that there would be a single gen_rr_i function handling
> the argument set rr_i; no need for two gen_rr_i* functions.

Got it.

Thanks.
Song Gao

diff --git a/target/loongarch/insn_trans/trans_arith.c.inc b/target/loongarch/insn_trans/trans_arith.c.inc
new file mode 100644
index 0000000..384a158
--- /dev/null
+++ b/target/loongarch/insn_trans/trans_arith.c.inc
@@ -0,0 +1,319 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Copyright (c) 2021 Loongson Technology Corporation Limited
+ */
+
+static bool gen_r3(DisasContext *ctx, arg_fmt_rdrjrk *a,
+                   DisasExtend src1_ext, DisasExtend src2_ext,
+                   DisasExtend dst_ext, void (*func)(TCGv, TCGv, TCGv))
+{
+    TCGv dest = gpr_dst(ctx, a->rd, dst_ext);
+    TCGv src1 = gpr_src(ctx, a->rj, src1_ext);
+    TCGv src2 = gpr_src(ctx, a->rk, src2_ext);
+
+    func(dest, src1, src2);
+
+    /* dst_ext is EXT_NONE and input is dest, We don't run gen_set_gpr. */
+    if (dst_ext) {
+        gen_set_gpr(a->rd, dest, dst_ext);
+    }
+    return true;
+}
+
+static bool gen_r2_si12(DisasContext *ctx, arg_fmt_rdrjsi12 *a,
+                        DisasExtend src_ext, DisasExtend dst_ext,
+                        void (*func)(TCGv, TCGv, TCGv))
+{
+    TCGv dest = gpr_dst(ctx, a->rd, dst_ext);
+    TCGv src1 = gpr_src(ctx, a->rj, src_ext);
+    TCGv src2 = tcg_constant_tl(a->si12);
+
+    func(dest, src1, src2);
+
+    if (dst_ext) {
+        gen_set_gpr(a->rd, dest, dst_ext);
+    }
+    return true;
+}
+
+static bool gen_r3_sa2(DisasContext *ctx, arg_fmt_rdrjrksa2 *a,
+                       DisasExtend src_ext, DisasExtend dst_ext,
+                       void (*func)(TCGv, TCGv, TCGv, TCGv, target_long))
+{
+    TCGv dest = gpr_dst(ctx, a->rd, dst_ext);
+    TCGv src1 = gpr_src(ctx, a->rj, src_ext);
+    TCGv src2 = gpr_src(ctx, a->rk, src_ext);
+    TCGv temp = tcg_temp_new();
+
+    func(dest, src1, src2, temp, a->sa2);
+
+    if (dst_ext) {
+        gen_set_gpr(a->rd, dest, dst_ext);
+    }
+    tcg_temp_free(temp);
+    return true;
+}
+
+static bool trans_lu12i_w(DisasContext *ctx, arg_lu12i_w *a)
+{
+    TCGv dest = gpr_dst(ctx, a->rd, EXT_NONE);
+
+    tcg_gen_movi_tl(dest, a->si20 << 12);
+    return true;
+}
+
+static bool gen_pc(DisasContext *ctx, arg_fmt_rdsi20 *a,
+                   target_ulong (*func)(target_ulong, int))
+{
+    TCGv dest = gpr_dst(ctx, a->rd, EXT_NONE);
+    target_ulong addr = func(ctx->base.pc_next, a->si20);
+
+    tcg_gen_movi_tl(dest, addr);
+    return true;
+}
+
+static bool gen_r2_ui12(DisasContext *ctx, arg_fmt_rdrjui12 *a,
+                        void (*func)(TCGv, TCGv, target_long))
+{
+    TCGv dest = gpr_dst(ctx, a->rd, EXT_NONE);
+    TCGv src1 = gpr_src(ctx, a->rj, EXT_NONE);
+
+    func(dest, src1, a->ui12);
+    return true;
+}
+
+static void gen_slt(TCGv dest, TCGv src1, TCGv src2)
+{
+    tcg_gen_setcond_tl(TCG_COND_LT, dest, src1, src2);
+}
+
+static void gen_sltu(TCGv dest, TCGv src1, TCGv src2)
+{
+    tcg_gen_setcond_tl(TCG_COND_LTU, dest, src1, src2);
+}
+
+static void gen_mulh_w(TCGv dest, TCGv src1, TCGv src2)
+{
+    tcg_gen_mul_i64(dest, src1, src2);
+    tcg_gen_sari_i64(dest, dest, 32);
+}
+
+static void gen_mulh_wu(TCGv dest, TCGv src1, TCGv src2)
+{
+    tcg_gen_mul_i64(dest, src1, src2);
+    tcg_gen_sari_i64(dest, dest, 32);
+}
+
+static void gen_mulh_d(TCGv dest, TCGv src1, TCGv src2)
+{
+    TCGv discard = tcg_temp_new();
+    tcg_gen_muls2_tl(discard, dest, src1, src2);
+    tcg_temp_free(discard);
+}
+
+static void gen_mulh_du(TCGv dest, TCGv src1, TCGv src2)
+{
+    TCGv discard = tcg_temp_new();
+    tcg_gen_mulu2_tl(discard, dest, src1, src2);
+    tcg_temp_free(discard);
+}
+
+static void prep_divisor_d(TCGv ret, TCGv src1, TCGv src2)
+{
+    TCGv t0 = tcg_temp_new();
+    TCGv t1 = tcg_temp_new();
+    TCGv zero = tcg_constant_tl(0);
+
+    /*
+     * If min / -1, set the divisor to 1.
+     * This avoids potential host overflow trap and produces min.
+     * If x / 0, set the divisor to 1.
+     * This avoids potential host overflow trap;
+     * the required result is undefined.
+     */
+    tcg_gen_setcondi_tl(TCG_COND_EQ, ret, src1, INT64_MIN);
+    tcg_gen_setcondi_tl(TCG_COND_EQ, t0, src2, -1);
+    tcg_gen_setcondi_tl(TCG_COND_EQ, t1, src2, 0);
+    tcg_gen_and_tl(ret, ret, t0);
+    tcg_gen_or_tl(ret, ret, t1);
+    tcg_gen_movcond_tl(TCG_COND_NE, ret, ret, zero, ret, src2);
+
+    tcg_temp_free(t0);
+    tcg_temp_free(t1);
+}
+
+static void prep_divisor_du(TCGv ret, TCGv src2)
+{
+    TCGv zero = tcg_constant_tl(0);
+    TCGv one = tcg_constant_tl(1);
+
+    /*
+     * If x / 0, set the divisor to 1.
+     * This avoids potential host overflow trap;
+     * the required result is undefined.
+     */
+    tcg_gen_movcond_tl(TCG_COND_EQ, ret, src2, zero, one, src2);
+}
+
+static void gen_div_d(TCGv dest, TCGv src1, TCGv src2)
+{
+    TCGv t0 = tcg_temp_new();
+    prep_divisor_d(t0, src1, src2);
+    tcg_gen_div_tl(dest, src1, t0);
+    tcg_temp_free(t0);
+}
+
+static void gen_rem_d(TCGv dest, TCGv src1, TCGv src2)
+{
+    TCGv t0 = tcg_temp_new();
+    prep_divisor_d(t0, src1, src2);
+    tcg_gen_rem_tl(dest, src1, t0);
+    tcg_temp_free(t0);
+}
+
+static void gen_div_du(TCGv dest, TCGv src1, TCGv src2)
+{
+    TCGv t0 = tcg_temp_new();
+    prep_divisor_du(t0, src2);
+    tcg_gen_divu_tl(dest, src1, t0);
+    tcg_temp_free(t0);
+}
+
+static void gen_rem_du(TCGv dest, TCGv src1, TCGv src2)
+{
+    TCGv t0 = tcg_temp_new();
+    prep_divisor_du(t0, src2);
+    tcg_gen_remu_tl(dest, src1, t0);
+    tcg_temp_free(t0);
+}
+
+static void gen_div_w(TCGv dest, TCGv src1, TCGv src2)
+{
+    TCGv t0 = tcg_temp_new();
+    /* We need not check for integer overflow for div_w. */
+    prep_divisor_du(t0, src2);
+    tcg_gen_div_tl(dest, src1, t0);
+    tcg_temp_free(t0);
+}
+
+static void gen_rem_w(TCGv dest, TCGv src1, TCGv src2)
+{
+    TCGv t0 = tcg_temp_new();
+    /* We need not check for integer overflow for rem_w. */
+    prep_divisor_du(t0, src2);
+    tcg_gen_rem_tl(dest, src1, t0);
+    tcg_temp_free(t0);
+}
+
+static void gen_alsl_w(TCGv dest, TCGv src1, TCGv src2,
+                       TCGv temp, target_long sa2)
+{
+    tcg_gen_shli_tl(temp, src1, sa2 + 1);
+    tcg_gen_add_tl(dest, temp, src2);
+}
+
+static void gen_alsl_wu(TCGv dest, TCGv src1, TCGv src2,
+                        TCGv temp, target_long sa2)
+{
+    tcg_gen_shli_tl(temp, src1, sa2 + 1);
+    tcg_gen_add_tl(dest, temp, src2);
+}
+
+static void gen_alsl_d(TCGv dest, TCGv src1, TCGv src2,
+                       TCGv temp, target_long sa2)
+{
+    tcg_gen_shli_tl(temp, src1, sa2 + 1);
+    tcg_gen_add_tl(dest, temp, src2);
+}
+
+static bool trans_lu32i_d(DisasContext *ctx, arg_lu32i_d *a)
+{
+    TCGv dest = gpr_dst(ctx, a->rd, EXT_NONE);
+    TCGv src1 = gpr_src(ctx, a->rd, EXT_NONE);
+    TCGv src2 = tcg_constant_tl(a->si20);
+
+    tcg_gen_deposit_tl(dest, src1, src2, 32, 32);
+    return true;
+}
+
+static bool trans_lu52i_d(DisasContext *ctx, arg_lu52i_d *a)
+{
+    TCGv dest = gpr_dst(ctx, a->rd, EXT_NONE);
+    TCGv src1 = gpr_src(ctx, a->rj, EXT_NONE);
+    TCGv src2 = tcg_constant_tl(a->si12);
+
+    tcg_gen_deposit_tl(dest, src1, src2, 52, 12);
+    return true;
+}
+
+static target_ulong gen_pcaddi(target_ulong pc, int si20)
+{
+    return pc + (si20 << 2);
+}
+
+static target_ulong gen_pcalau12i(target_ulong pc, int si20)
+{
+    return (pc + (si20 << 12)) & ~0xfff;
+}
+
+static target_ulong gen_pcaddu12i(target_ulong pc, int si20)
+{
+    return pc + (si20 << 12);
+}
+
+static target_ulong gen_pcaddu18i(target_ulong pc, int si20)
+{
+    return pc + ((target_ulong)(si20) << 18);
+}
+
+static bool trans_addu16i_d(DisasContext *ctx, arg_addu16i_d *a)
+{
+    TCGv dest = gpr_dst(ctx, a->rd, EXT_NONE);
+    TCGv src1 = gpr_src(ctx, a->rj, EXT_NONE);
+
+    tcg_gen_addi_tl(dest, src1, a->si16 << 16);
+    return true;
+}
+
+TRANS(add_w, gen_r3, EXT_NONE, EXT_NONE, EXT_SIGN, tcg_gen_add_tl)
+TRANS(add_d, gen_r3, EXT_NONE, EXT_NONE, EXT_NONE, tcg_gen_add_tl)
+TRANS(sub_w, gen_r3, EXT_NONE, EXT_NONE, EXT_SIGN, tcg_gen_sub_tl)
+TRANS(sub_d, gen_r3, EXT_NONE, EXT_NONE, EXT_NONE, tcg_gen_sub_tl)
+TRANS(and, gen_r3, EXT_NONE, EXT_NONE, EXT_NONE, tcg_gen_and_tl)
+TRANS(or, gen_r3, EXT_NONE, EXT_NONE, EXT_NONE, tcg_gen_or_tl)
+TRANS(xor, gen_r3, EXT_NONE, EXT_NONE, EXT_NONE, tcg_gen_xor_tl)
+TRANS(nor, gen_r3, EXT_NONE, EXT_NONE, EXT_NONE, tcg_gen_nor_tl)
+TRANS(andn, gen_r3, EXT_NONE, EXT_NONE, EXT_NONE, tcg_gen_andc_tl)
+TRANS(orn, gen_r3, EXT_NONE, EXT_NONE, EXT_NONE, tcg_gen_orc_tl)
+TRANS(slt, gen_r3, EXT_NONE, EXT_NONE, EXT_NONE, gen_slt)
+TRANS(sltu, gen_r3, EXT_NONE, EXT_NONE, EXT_NONE, gen_sltu)
+TRANS(mul_w, gen_r3, EXT_SIGN, EXT_SIGN, EXT_SIGN, tcg_gen_mul_tl)
+TRANS(mul_d, gen_r3, EXT_NONE, EXT_NONE, EXT_NONE, tcg_gen_mul_tl)
+TRANS(mulh_w, gen_r3, EXT_SIGN, EXT_SIGN, EXT_NONE, gen_mulh_w)
+TRANS(mulh_wu, gen_r3, EXT_ZERO, EXT_ZERO, EXT_NONE, gen_mulh_wu)
+TRANS(mulh_d, gen_r3, EXT_NONE, EXT_NONE, EXT_NONE, gen_mulh_d)
+TRANS(mulh_du, gen_r3, EXT_NONE, EXT_NONE, EXT_NONE, gen_mulh_du)
+TRANS(mulw_d_w, gen_r3, EXT_SIGN, EXT_SIGN, EXT_NONE, tcg_gen_mul_tl)
+TRANS(mulw_d_wu, gen_r3, EXT_ZERO, EXT_ZERO, EXT_NONE, tcg_gen_mul_tl)
+TRANS(div_w, gen_r3, EXT_SIGN, EXT_SIGN, EXT_SIGN, gen_div_w)
+TRANS(mod_w, gen_r3, EXT_SIGN, EXT_SIGN, EXT_SIGN, gen_rem_w)
+TRANS(div_wu, gen_r3, EXT_ZERO, EXT_ZERO, EXT_SIGN, gen_div_du)
+TRANS(mod_wu, gen_r3, EXT_ZERO, EXT_ZERO, EXT_SIGN, gen_rem_du)
+TRANS(div_d, gen_r3, EXT_NONE, EXT_NONE, EXT_NONE, gen_div_d)
+TRANS(mod_d, gen_r3, EXT_NONE, EXT_NONE, EXT_NONE, gen_rem_d)
+TRANS(div_du, gen_r3, EXT_NONE, EXT_NONE, EXT_NONE, gen_div_du)
+TRANS(mod_du, gen_r3, EXT_NONE, EXT_NONE, EXT_NONE, gen_rem_du)
+TRANS(slti, gen_r2_si12, EXT_NONE, EXT_NONE, gen_slt)
+TRANS(sltui, gen_r2_si12, EXT_NONE, EXT_NONE, gen_sltu)
+TRANS(addi_w, gen_r2_si12, EXT_NONE, EXT_SIGN, tcg_gen_add_tl)
+TRANS(addi_d, gen_r2_si12, EXT_NONE, EXT_NONE, tcg_gen_add_tl)
+TRANS(alsl_w, gen_r3_sa2, EXT_NONE, EXT_SIGN, gen_alsl_w)
+TRANS(alsl_wu, gen_r3_sa2, EXT_NONE, EXT_ZERO, gen_alsl_wu)
+TRANS(alsl_d, gen_r3_sa2, EXT_NONE, EXT_NONE, gen_alsl_d)
+TRANS(pcaddi, gen_pc, gen_pcaddi)
+TRANS(pcalau12i, gen_pc, gen_pcalau12i)
+TRANS(pcaddu12i, gen_pc, gen_pcaddu12i)
+TRANS(pcaddu18i, gen_pc, gen_pcaddu18i)
+TRANS(andi, gen_r2_ui12, tcg_gen_andi_tl)
+TRANS(ori, gen_r2_ui12, tcg_gen_ori_tl)
+TRANS(xori, gen_r2_ui12, tcg_gen_xori_tl)
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
new file mode 100644
index 0000000..3e6a051
--- /dev/null
+++ b/target/loongarch/insns.decode
@@ -0,0 +1,88 @@
+# SPDX-License-Identifier: GPL-2.0-or-later
+#
+# LoongArch instruction decode definitions.
+#
+# Copyright (c) 2021 Loongson Technology Corporation Limited
+#
+
+#
+# Fields
+#
+%rd      0:5
+%rj      5:5
+%rk      10:5
+%sa2     15:2
+%si12    10:s12
+%ui12    10:12
+%si16    10:s16
+%si20    5:s20
+
+#
+# Argument sets
+#
+&fmt_rdrjrk         rd rj rk
+&fmt_rdrjsi12       rd rj si12
+&fmt_rdrjrksa2      rd rj rk sa2
+&fmt_rdrjsi16       rd rj si16
+&fmt_rdrjui12       rd rj ui12
+&fmt_rdsi20         rd si20
+
+#
+# Formats
+#
+@fmt_rdrjrk          .... ........ ..... ..... ..... .....    &fmt_rdrjrk         %rd %rj %rk
+@fmt_rdrjsi12        .... ...... ............ ..... .....     &fmt_rdrjsi12       %rd %rj %si12
+@fmt_rdrjui12        .... ...... ............ ..... .....     &fmt_rdrjui12       %rd %rj %ui12
+@fmt_rdrjrksa2       .... ........ ... .. ..... ..... .....   &fmt_rdrjrksa2      %rd %rj %rk %sa2
+@fmt_rdrjsi16        .... .. ................ ..... .....     &fmt_rdrjsi16       %rd %rj %si16
+@fmt_rdsi20          .... ... .................... .....      &fmt_rdsi20         %rd %si20
+
+#
+# Fixed point arithmetic operation instruction
+#
+add_w            0000 00000001 00000 ..... ..... .....    @fmt_rdrjrk
+add_d            0000 00000001 00001 ..... ..... .....    @fmt_rdrjrk
+sub_w            0000 00000001 00010 ..... ..... .....    @fmt_rdrjrk
+sub_d            0000 00000001 00011 ..... ..... .....    @fmt_rdrjrk
+slt              0000 00000001 00100 ..... ..... .....    @fmt_rdrjrk
+sltu             0000 00000001 00101 ..... ..... .....    @fmt_rdrjrk
+slti             0000 001000 ............ ..... .....     @fmt_rdrjsi12
+sltui            0000 001001 ............ ..... .....     @fmt_rdrjsi12
+nor              0000 00000001 01000 ..... ..... .....    @fmt_rdrjrk
+and              0000 00000001 01001 ..... ..... .....    @fmt_rdrjrk
+or               0000 00000001 01010 ..... ..... .....    @fmt_rdrjrk
+xor              0000 00000001 01011 ..... ..... .....    @fmt_rdrjrk
+orn              0000 00000001 01100 ..... ..... .....    @fmt_rdrjrk
+andn             0000 00000001 01101 ..... ..... .....    @fmt_rdrjrk
+mul_w            0000 00000001 11000 ..... ..... .....    @fmt_rdrjrk
+mulh_w           0000 00000001 11001 ..... ..... .....    @fmt_rdrjrk
+mulh_wu          0000 00000001 11010 ..... ..... .....    @fmt_rdrjrk
+mul_d            0000 00000001 11011 ..... ..... .....    @fmt_rdrjrk
+mulh_d           0000 00000001 11100 ..... ..... .....    @fmt_rdrjrk
+mulh_du          0000 00000001 11101 ..... ..... .....    @fmt_rdrjrk
+mulw_d_w         0000 00000001 11110 ..... ..... .....    @fmt_rdrjrk
+mulw_d_wu        0000 00000001 11111 ..... ..... .....    @fmt_rdrjrk
+div_w            0000 00000010 00000 ..... ..... .....    @fmt_rdrjrk
+mod_w            0000 00000010 00001 ..... ..... .....    @fmt_rdrjrk
+div_wu           0000 00000010 00010 ..... ..... .....    @fmt_rdrjrk
+mod_wu           0000 00000010 00011 ..... ..... .....    @fmt_rdrjrk
+div_d            0000 00000010 00100 ..... ..... .....    @fmt_rdrjrk
+mod_d            0000 00000010 00101 ..... ..... .....    @fmt_rdrjrk
+div_du           0000 00000010 00110 ..... ..... .....    @fmt_rdrjrk
+mod_du           0000 00000010 00111 ..... ..... .....    @fmt_rdrjrk
+alsl_w           0000 00000000 010 .. ..... ..... .....   @fmt_rdrjrksa2
+alsl_wu          0000 00000000 011 .. ..... ..... .....   @fmt_rdrjrksa2
+alsl_d           0000 00000010 110 .. ..... ..... .....   @fmt_rdrjrksa2
+lu12i_w          0001 010 .................... .....      @fmt_rdsi20
+lu32i_d          0001 011 .................... .....      @fmt_rdsi20
+lu52i_d          0000 001100 ............ ..... .....     @fmt_rdrjsi12
+pcaddi           0001 100 .................... .....      @fmt_rdsi20
+pcalau12i        0001 101 .................... .....      @fmt_rdsi20
+pcaddu12i        0001 110 .................... .....      @fmt_rdsi20
+pcaddu18i        0001 111 .................... .....      @fmt_rdsi20
+addi_w           0000 001010 ............ ..... .....     @fmt_rdrjsi12
+addi_d           0000 001011 ............ ..... .....     @fmt_rdrjsi12
+addu16i_d        0001 00 ................ ..... .....     @fmt_rdrjsi16
+andi             0000 001101 ............ ..... .....     @fmt_rdrjui12
+ori              0000 001110 ............ ..... .....     @fmt_rdrjui12
+xori             0000 001111 ............ ..... .....     @fmt_rdrjui12
diff --git a/target/loongarch/translate.c b/target/loongarch/translate.c
index 048c895..d4e0bf3 100644
--- a/target/loongarch/translate.c
+++ b/target/loongarch/translate.c
@@ -57,6 +57,11 @@ static void loongarch_tr_init_disas_context(DisasContextBase *dcbase,
     /* Bound the number of insns to execute to those left on the page. */
     bound = -(ctx->base.pc_first | TARGET_PAGE_MASK) / 4;
     ctx->base.max_insns = MIN(ctx->base.max_insns, bound);
+
+    ctx->ntemp = 0;
+    memset(ctx->temp, 0, sizeof(ctx->temp));
+
+    ctx->zero = tcg_constant_tl(0);
 }
 
 static void loongarch_tr_tb_start(DisasContextBase *dcbase, CPUState *cs)
@@ -70,6 +75,73 @@ static void loongarch_tr_insn_start(DisasContextBase *dcbase, CPUState *cs)
     tcg_gen_insn_start(ctx->base.pc_next);
 }
 
+/*
+ * Wrappers for getting reg values.
+ *
+ * The $zero register does not have cpu_gpr[0] allocated -- we supply the
+ * constant zero as a source, and an uninitialized sink as destination.
+ *
+ * Further, we may provide an extension for word operations.
+ */
+static TCGv temp_new(DisasContext *ctx)
+{
+    assert(ctx->ntemp < ARRAY_SIZE(ctx->temp));
+    return ctx->temp[ctx->ntemp++] = tcg_temp_new();
+}
+
+static TCGv gpr_src(DisasContext *ctx, int reg_num, DisasExtend src_ext)
+{
+    TCGv t;
+
+    if (reg_num == 0) {
+        return ctx->zero;
+    }
+
+    switch (src_ext) {
+    case EXT_NONE:
+        return cpu_gpr[reg_num];
+    case EXT_SIGN:
+        t = temp_new(ctx);
+        tcg_gen_ext32s_tl(t, cpu_gpr[reg_num]);
+        return t;
+    case EXT_ZERO:
+        t = temp_new(ctx);
+        tcg_gen_ext32u_tl(t, cpu_gpr[reg_num]);
+        return t;
+    }
+    g_assert_not_reached();
+}
+
+static TCGv gpr_dst(DisasContext *ctx, int reg_num, DisasExtend dst_ext)
+{
+    if (reg_num == 0 || dst_ext) {
+        return temp_new(ctx);
+    }
+    return cpu_gpr[reg_num];
+}
+
+static void gen_set_gpr(int reg_num, TCGv t, DisasExtend dst_ext)
+{
+    if (reg_num != 0) {
+        switch (dst_ext) {
+        case EXT_NONE:
+            tcg_gen_mov_tl(cpu_gpr[reg_num], t);
+            break;
+        case EXT_SIGN:
+            tcg_gen_ext32s_tl(cpu_gpr[reg_num], t);
+            break;
+        case EXT_ZERO:
+            tcg_gen_ext32u_tl(cpu_gpr[reg_num], t);
+            break;
+        default:
+            g_assert_not_reached();
+        }
+    }
+}
+
+#include "decode-insns.c.inc"
+#include "insn_trans/trans_arith.c.inc"
+
 static void loongarch_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
 {
     CPULoongArchState *env = cs->env_ptr;
@@ -83,6 +155,12 @@ static void loongarch_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
         generate_exception(ctx, EXCP_INE);
     }
 
+    for (int i = ctx->ntemp - 1; i >= 0; --i) {
+        tcg_temp_free(ctx->temp[i]);
+        ctx->temp[i] = NULL;
+    }
+    ctx->ntemp = 0;
+
     ctx->base.pc_next += 4;
 }
 
diff --git a/target/loongarch/translate.h b/target/loongarch/translate.h
index 6cc7f1a..9cc1251 100644
--- a/target/loongarch/translate.h
+++ b/target/loongarch/translate.h
@@ -10,11 +10,30 @@
 
 #include "exec/translator.h"
 
+#define TRANS(NAME, FUNC, ...) \
+    static bool trans_##NAME(DisasContext *ctx, arg_##NAME * a) \
+    { return FUNC(ctx, a, __VA_ARGS__); }
+
+/*
+ * If an operation is being performed on less than TARGET_LONG_BITS,
+ * it may require the inputs to be sign- or zero-extended; which will
+ * depend on the exact operation being performed.
+ */
+typedef enum {
+    EXT_NONE,
+    EXT_SIGN,
+    EXT_ZERO,
+} DisasExtend;
+
 typedef struct DisasContext {
     DisasContextBase base;
     target_ulong page_start;
     uint32_t opcode;
     int mem_idx;
+    TCGv zero;
+    /* Space for 3 operands plus 1 extra for address computation. */
+    TCGv temp[4];
+    uint8_t ntemp;
 } DisasContext;
 
 void generate_exception(DisasContext *ctx, int excp);
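
The divisor handling in prep_divisor_d() is easier to follow in scalar form. The sketch below restates, in plain C, the selection logic that the TCG setcond/and/or/movcond sequence in the patch appears to implement; `safe_divisor_d`, `div_d_model` and `mod_d_model` are hypothetical names, and the results for division by zero are only "whatever falls out of dividing by 1", since the patch comment says the required result is undefined.

```c
#include <stdint.h>

/* Choose a divisor that can never trap on the host:
 *   INT64_MIN / -1  -> use 1 (quotient is INT64_MIN, as the comment says)
 *   x / 0           -> use 1 (architectural result is undefined)
 * Otherwise the original divisor is used unchanged. */
static int64_t safe_divisor_d(int64_t dividend, int64_t divisor)
{
    int overflow = (dividend == INT64_MIN) && (divisor == -1);
    int by_zero = (divisor == 0);

    return (overflow || by_zero) ? 1 : divisor;
}

/* Scalar models of gen_div_d() / gen_rem_d() built on the guard above. */
static int64_t div_d_model(int64_t dividend, int64_t divisor)
{
    return dividend / safe_divisor_d(dividend, divisor);
}

static int64_t mod_d_model(int64_t dividend, int64_t divisor)
{
    return dividend % safe_divisor_d(dividend, divisor);
}
```

The unsigned variant in the patch (prep_divisor_du) only needs the division-by-zero half of this guard, because unsigned division cannot overflow.
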