[v10,04/26] target/loongarch: Add fixed point arithmetic instruction translation

Message ID 1636700049-24381-5-git-send-email-gaosong@loongson.cn
State New
Series Add LoongArch linux-user emulation support

Commit Message

gaosong Nov. 12, 2021, 6:53 a.m. UTC
This includes:
- ADD.{W/D}, SUB.{W/D}
- ADDI.{W/D}, ADDU16I.D
- ALSL.{W[U]/D}
- LU12I.W, LU32I.D, LU52I.D
- SLT[U], SLT[U]I
- PCADDI, PCADDU12I, PCADDU18I, PCALAU12I
- AND, OR, NOR, XOR, ANDN, ORN
- MUL.{W/D}, MULH.{W[U]/D[U]}
- MULW.D.W[U]
- DIV.{W[U]/D[U]}, MOD.{W[U]/D[U]}
- ANDI, ORI, XORI

Signed-off-by: Song Gao <gaosong@loongson.cn>
Signed-off-by: Xiaojuan Yang <yangxiaojuan@loongson.cn>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
---
 target/loongarch/insn_trans/trans_arith.c.inc | 319 ++++++++++++++++++++++++++
 target/loongarch/insns.decode                 |  88 +++++++
 target/loongarch/translate.c                  |  78 +++++++
 target/loongarch/translate.h                  |  19 ++
 4 files changed, 504 insertions(+)
 create mode 100644 target/loongarch/insn_trans/trans_arith.c.inc
 create mode 100644 target/loongarch/insns.decode

Comments

Richard Henderson Nov. 12, 2021, 2:05 p.m. UTC | #1
On 11/12/21 7:53 AM, Song Gao wrote:
> +#
> +# Fields
> +#
> +%rd      0:5
> +%rj      5:5
> +%rk      10:5
> +%sa2     15:2
> +%si12    10:s12
> +%ui12    10:12
> +%si16    10:s16
> +%si20    5:s20

You should only create separate field definitions like this when they are complex: e.g. 
the logical field is disjoint or there's a need for !function.

> +
> +#
> +# Argument sets
> +#
> +&fmt_rdrjrk         rd rj rk
> +&fmt_rdrjsi12       rd rj si12
> +&fmt_rdrjrksa2      rd rj rk sa2
> +&fmt_rdrjsi16       rd rj si16
> +&fmt_rdrjui12       rd rj ui12
> +&fmt_rdsi20         rd si20

Some of these should be combined.  The width of the immediate is a detail of the format, 
not the decoded argument set.  Thus you should have

&fmt_rdimm     rd imm
&fmt_rdrjimm   rd rj imm
&fmt_rdrjrk    rd rj rk
&fmt_rdrjrksa  rd rj rk sa

> +alsl_w           0000 00000000 010 .. ..... ..... .....   @fmt_rdrjrksa2
> +alsl_wu          0000 00000000 011 .. ..... ..... .....   @fmt_rdrjrksa2
> +alsl_d           0000 00000010 110 .. ..... ..... .....   @fmt_rdrjrksa2

The encoding of these insns is that the shift is sa+1.

While you compensate for this in gen_alsl_*, we print the "wrong" number in the 
disassembly.  I think it would be better to do

%sa2p1     15:2 !function=plus_1
@fmt_rdrjrksa2p1  .... ........ ... .. rk:5 rj:5 rd:5 \
                   &fmt_rdrjrksa sa=%sa2p1
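
As a rough illustration of what the %sa2p1 field above would decode to, here is a minimal C sketch. The DisasContext stand-in and extract_sa2p1() are hypothetical and written by hand for this example; the real extraction code is generated by decodetree, and only the (context, value) shape of plus_1 is meant to mirror a !function helper.

```c
#include <stdint.h>

/* Stand-in for QEMU's DisasContext (assumption: contents irrelevant here). */
typedef struct DisasContext {
    int unused;
} DisasContext;

/* Helper named by "!function=plus_1": raw field value -> real shift amount. */
static int plus_1(DisasContext *ctx, int x)
{
    (void)ctx;
    return x + 1;
}

/* Hand-written equivalent of "%sa2p1 15:2 !function=plus_1" (hypothetical). */
static int extract_sa2p1(DisasContext *ctx, uint32_t insn)
{
    int raw = (int)((insn >> 15) & 0x3u);   /* 2-bit field starting at bit 15 */
    return plus_1(ctx, raw);
}
```

With this, a->sa already holds the value in 1..4, so the translator and the disassembler would both see the same number.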


r~
WANG Xuerui Nov. 13, 2021, 3:18 a.m. UTC | #2
On 11/12/21 22:05, Richard Henderson wrote:
> On 11/12/21 7:53 AM, Song Gao wrote:
>> +#
>> +# Fields
>> +#
>> +%rd      0:5
>> +%rj      5:5
>> +%rk      10:5
>> +%sa2     15:2
>> +%si12    10:s12
>> +%ui12    10:12
>> +%si16    10:s16
>> +%si20    5:s20
>
> You should only create separate field definitions like this when they 
> are complex: e.g. the logical field is disjoint or there's a need for 
> !function.
>
>> +
>> +#
>> +# Argument sets
>> +#
>> +&fmt_rdrjrk         rd rj rk
>> +&fmt_rdrjsi12       rd rj si12
>> +&fmt_rdrjrksa2      rd rj rk sa2
>> +&fmt_rdrjsi16       rd rj si16
>> +&fmt_rdrjui12       rd rj ui12
>> +&fmt_rdsi20         rd si20
>
> Some of these should be combined.  The width of the immediate is a 
> detail of the format, not the decoded argument set.  Thus you should have
>
> &fmt_rdimm     rd imm
> &fmt_rdrjimm   rd rj imm
> &fmt_rdrjrk    rd rj rk
> &fmt_rdrjrksa  rd rj rk sa

I'd like to add that the organization of the whole decodetree file 
closely resembles that of the ISA manual, most likely on purpose (though 
this is not stated anywhere in the patch). However, the manual itself is 
not without errors or inconsistencies; for example, its classification 
into 9 "base instruction formats" is nowhere near accurate, and here we 
can see the author being forced to create ad-hoc names (repeating the 
operand slots). I suggest just generating the descriptions from the 
loongarch-opcodes project [1]; there is no need to duplicate work. I'll 
happily help if you decide to do that.

[1]: https://github.com/loongson-community/loongarch-opcodes

>
>> +alsl_w           0000 00000000 010 .. ..... ..... .....   @fmt_rdrjrksa2
>> +alsl_wu          0000 00000000 011 .. ..... ..... .....   @fmt_rdrjrksa2
>> +alsl_d           0000 00000010 110 .. ..... ..... .....   @fmt_rdrjrksa2
>
> The encoding of these insns is that the shift is sa+1.
>
> While you compensate for this in gen_alsl_*, we print the "wrong" 
> number in the disassembly.  I think it would be better to do
>
> %sa2p1     15:2 !function=plus_1
> @fmt_rdrjrksa2p1  .... ........ ... .. rk:5 rj:5 rd:5 \
>                   &fmt_rdrjrksa sa=%sa2p1

Here again, the manual was inconsistent with the binutils 
implementation; the manual says (for ALSL.W, it's SLADD in 
loongarch-opcodes project's revised mnemonics):

"ALSL.W logically left-shifts rj[31:0] by (sa2+1) bits, [snip]" 
(translation mine, not copied from the official translation)

Clearly the "+1" part is not meant to show up in disassembly. Yet the 
binutils implementation acts as if 1 had already been added to the 
operand in source code, and disassembles and prints it as such; this is 
an obvious mismatch. I'd suggest fixing the disassembly code to remove 
this inconsistency. 
And the "+1" "feature" is not used anywhere else AFAIK, so it wouldn't 
hurt to just delete everything about it.
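
To make the arithmetic concrete, here is a small C model of the two 32-bit forms, assuming the semantics implied by the patch (shift rj left, add rk, then sign- or zero-extend the low 32 bits). The function names and the plain-integer register model are made up for this sketch; "sa" is the effective shift, i.e. the raw sa2 field plus 1.

```c
#include <stdint.h>

/* Portable sign extension of a 32-bit value to 64 bits. */
static int64_t sext32(uint32_t v)
{
    return v >= 0x80000000u ? (int64_t)v - 4294967296LL : (int64_t)v;
}

/* Model of ALSL.W: sa is the effective shift (raw sa2 + 1), so 1..4. */
static int64_t model_alsl_w(int64_t rj, int64_t rk, int sa)
{
    uint32_t sum = ((uint32_t)rj << sa) + (uint32_t)rk;   /* low 32 bits */
    return sext32(sum);
}

/* Model of ALSL.WU: same sum, zero-extended instead. */
static int64_t model_alsl_wu(int64_t rj, int64_t rk, int sa)
{
    uint32_t sum = ((uint32_t)rj << sa) + (uint32_t)rk;
    return (int64_t)sum;
}
```

Whichever number the assembler syntax shows, the encoded sa2 field and the shift applied always differ by one, which is the mismatch discussed here.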

>
>
> r~
>
gaosong Nov. 15, 2021, 3:59 a.m. UTC | #3
Hi Richard,

On 2021/11/12 10:05 PM, Richard Henderson wrote:
> On 11/12/21 7:53 AM, Song Gao wrote:
>> +#
>> +# Fields
>> +#
>> +%rd      0:5
>> +%rj      5:5
>> +%rk      10:5
>> +%sa2     15:2
>> +%si12    10:s12
>> +%ui12    10:12
>> +%si16    10:s16
>> +%si20    5:s20
>
> You should only create separate field definitions like this when they 
> are complex: e.g. the logical field is disjoint or there's a need for 
> !function.
>
>> +
>> +#
>> +# Argument sets
>> +#
>> +&fmt_rdrjrk         rd rj rk
>> +&fmt_rdrjsi12       rd rj si12
>> +&fmt_rdrjrksa2      rd rj rk sa2
>> +&fmt_rdrjsi16       rd rj si16
>> +&fmt_rdrjui12       rd rj ui12
>> +&fmt_rdsi20         rd si20
>
> Some of these should be combined.  The width of the immediate is a 
> detail of the format, not the decoded argument set.  Thus you should have
>
> &fmt_rdimm     rd imm
> &fmt_rdrjimm   rd rj imm
> &fmt_rdrjrk    rd rj rk
> &fmt_rdrjrksa  rd rj rk sa
>
'The width of the immediate is a detail of the format'  means:

&fmt_rdrjimm         rd  rj imm

@fmt_rdrjimm         .... ...... imm:12  rj:5 rd:5     &fmt_rdrjimm
@fmt_rdrjimm14         .... .... imm:14  rj:5 rd:5     &fmt_rdrjimm
@fmt_rdrjimm16           .... .. imm:16  rj:5 rd:5     &fmt_rdrjimm

and we print it in the disassembly like this:

output_rdrjimm(DisasContext *ctx, arg_fmt_rdrjimm * a,  const char *mnemonic)
{
     output(ctx, mnemonic, "%s, %s, 0x%x", regnames[a->rd], regnames[a->rj], a->imm);
}

is that right?
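
A minimal C sketch of that idea, written by hand rather than generated by decodetree: three formats with different immediate widths and signedness all fill the same argument struct. The names arg_rr_imm, field_unsigned(), field_signed() and decode_rr_*() are hypothetical.

```c
#include <stdint.h>

/* One decoded argument set, mirroring "&fmt_rdrjimm rd rj imm". */
typedef struct {
    int rd;
    int rj;
    int imm;
} arg_rr_imm;

/* Extract the len-bit field starting at bit pos (len < 31). */
static int field_unsigned(uint32_t insn, int pos, int len)
{
    return (int)((insn >> pos) & ((1u << len) - 1u));
}

/* Same field, interpreted as a two's complement signed value. */
static int field_signed(uint32_t insn, int pos, int len)
{
    int raw = field_unsigned(insn, pos, len);
    int sign = 1 << (len - 1);
    return (raw ^ sign) - sign;
}

/* rd and rj sit in the same place for every format below. */
static arg_rr_imm fill(uint32_t insn, int imm)
{
    arg_rr_imm a = { field_unsigned(insn, 0, 5), field_unsigned(insn, 5, 5), imm };
    return a;
}

/* imm:s12, imm:12 and imm:s16 differ only in how imm is extracted. */
static arg_rr_imm decode_rr_i12(uint32_t insn)
{
    return fill(insn, field_signed(insn, 10, 12));
}

static arg_rr_imm decode_rr_ui12(uint32_t insn)
{
    return fill(insn, field_unsigned(insn, 10, 12));
}

static arg_rr_imm decode_rr_i16(uint32_t insn)
{
    return fill(insn, field_signed(insn, 10, 16));
}
```

Code that consumes arg_rr_imm never needs to know which width produced imm.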

>> +alsl_w           0000 00000000 010 .. ..... ..... .....   @fmt_rdrjrksa2
>> +alsl_wu          0000 00000000 011 .. ..... ..... .....   @fmt_rdrjrksa2
>> +alsl_d           0000 00000010 110 .. ..... ..... .....   @fmt_rdrjrksa2
>
> The encoding of these insns is that the shift is sa+1.
>
> While you compensate for this in gen_alsl_*, we print the "wrong" 
> number in the disassembly.  I think it would be better to do
>
> %sa2p1     15:2 !function=plus_1
> @fmt_rdrjrksa2p1  .... ........ ... .. rk:5 rj:5 rd:5 \
>                   &fmt_rdrjrksa sa=%sa2p1
>
1. We print sa in the disassembly:

output_rdrjrksa(DisasContext *ctx, arg_fmt_rdrjrksa *a, const char *mnemonic)
{
     output(ctx, mnemonic, "%s, %s, %s, 0x%x", regnames[a->rd], regnames[a->rj],
            regnames[a->rk], a->sa);
}

2. We use sa in gen_alsl_*, not (sa2+1).

3. bytepick_w uses the same print function, but its Field/Argument/Format is:

%sa2                15:2
&fmt_rdrjrksa       rd rj rk sa
@fmt_rdrjrksa2      .... ........ ... sa:2 rk:5 rj:5 rd:5    &fmt_rdrjrksa

Is my understanding right?

Thanks.
Song Gao
Richard Henderson Nov. 15, 2021, 8:42 a.m. UTC | #4
On 11/15/21 4:59 AM, gaosong wrote:
> 'The width of the immediate is a detail of the format'  means:
> 
> &fmt_rdrjimm         rd  rj imm
> 
> @fmt_rdrjimm         .... ...... imm:12  rj:5 rd:5     &fmt_rdrjimm
> @fmt_rdrjimm14         .... .... imm:14  rj:5 rd:5     &fmt_rdrjimm
> @fmt_rdrjimm16           .... .. imm:16  rj:5 rd:5     &fmt_rdrjimm
> 
> and we print it in the disassembly like this:
> 
> output_rdrjimm(DisasContext *ctx, arg_fmt_rdrjimm * a,  const char *mnemonic)
> {
>      output(ctx, mnemonic, "%s, %s, 0x%x", regnames[a->rd], regnames[a->rj], a->imm);
> }
> 
> is that right?

Yes.

I'll note that regnames[] is defined in target/loongarch/cpu.c, which is not available 
when we want to use this disassembler for tcg/loongarch64/.  I think it would be easier to 
print this as

     "r%d", a->rd

so that you do not need to rely on the external strings.

I also think you should print signed numbers, "%d", because 0xfffffff8 (truncated to 32 
bits) is not really the correct representation of -8 for a 64-bit operand.
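
As an illustration only, the following C sketch contrasts the two styles. format_rr_i() and format_rr_i_hex() are hypothetical helpers that write into a buffer; they are not the disassembler's actual output() function, and the hex result assumes a 32-bit int.

```c
#include <stdio.h>

/* Suggested style: numeric register names and a signed decimal immediate. */
static int format_rr_i(char *buf, size_t size, const char *mnemonic,
                       int rd, int rj, int imm)
{
    return snprintf(buf, size, "%s r%d, r%d, %d", mnemonic, rd, rj, imm);
}

/* Hex style for comparison: a negative imm prints as a 32-bit pattern. */
static int format_rr_i_hex(char *buf, size_t size, const char *mnemonic,
                           int rd, int rj, int imm)
{
    return snprintf(buf, size, "%s r%d, r%d, 0x%x", mnemonic, rd, rj,
                    (unsigned)imm);
}
```

For imm = -8 the first form prints "-8", while the second prints "0xfffffff8", which reads as a large positive value when the operand is 64 bits wide.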


> 1. We print sa in disassembly...
> 2. We use sa on gen_alsl_* not (sa2+1).
> 3. bytepick_w use the same print functions.
> Is my understanding right?

Yes, that is the issue I am describing.


r~
gaosong Nov. 17, 2021, 7:57 a.m. UTC | #5
Hi Richard,

On 2021/11/15 4:42 PM, Richard Henderson wrote:
> On 11/15/21 4:59 AM, gaosong wrote:
>> 'The width of the immediate is a detail of the format'  means:
>>
>> &fmt_rdrjimm         rd  rj imm
>>
>> @fmt_rdrjimm         .... ...... imm:12  rj:5 rd:5 &fmt_rdrjimm
>> @fmt_rdrjimm14         .... .... imm:14  rj:5 rd:5 &fmt_rdrjimm
>> @fmt_rdrjimm16           .... .. imm:16  rj:5 rd:5 &fmt_rdrjimm
>>
>> and we print it in the disassembly like this:
>>
>> output_rdrjimm(DisasContext *ctx, arg_fmt_rdrjimm * a,  const char *mnemonic)
>> {
>>      output(ctx, mnemonic, "%s, %s, 0x%x", regnames[a->rd], regnames[a->rj], a->imm);
>> }
>>
>> is that right?
>
> Yes.
>
> I'll note that regnames[] is defined in target/loongarch/cpu.c, which 
> is not available when we want to use this disassembler for 
> tcg/loongarch64/.  I think it would be easier to print this as
>
>     "r%d", a->rd
>
> so that you do not need to rely on the external strings.
>
> I also think you should print signed numbers, "%d", because 0xfffffff8 
> (truncated to 32 bits) is not really the correct representation of -8 
> for a 64-bit operand.
>
>
>> 1. We print sa in disassembly...
>> 2. We use sa on gen_alsl_* not (sa2+1).
>> 3. bytepick_w use the same print functions.
>> Is my understanding right?
>
> Yes, that is the issue I am describing.
>
I see that the insns.decode format is not very consistent with other 
architectures, such as ARM/RISCV.

I'll correct it like this:

# Fields
#
%sa2p1     15:2         !function=plus_1

#
# Argument sets
#
&r_i          rd imm
&rrr          rd rj rk
&rr_i         rd rj imm
&rrr_sa     rd rj rk sa

#
# Formats
#
@fmt_rrr             .... ........ ..... rk:5 rj:5 rd:5 &rrr
@fmt_r_i20                        .... ... imm:s20 rd:5 &r_i
@fmt_rr_i12               .... ...... imm:s12 rj:5 rd:5 &rr_i
@fmt_rr_ui12               .... ...... imm:12 rj:5 rd:5 &rr_i
@fmt_rr_i16                   .... .. imm:s16 rj:5 rd:5 &rr_i
@fmt_rrr_sa2p1      .... ........ ... .. rk:5 rj:5 rd:5 &rrr_sa  sa=%sa2p1

#
# Fixed point arithmetic operation instruction
#
add_w            0000 00000001 00000 ..... ..... .....    @fmt_rrr
add_d            0000 00000001 00001 ..... ..... .....    @fmt_rrr
sub_w            0000 00000001 00010 ..... ..... .....    @fmt_rrr
sub_d            0000 00000001 00011 ..... ..... .....    @fmt_rrr
slt              0000 00000001 00100 ..... ..... .....    @fmt_rrr
sltu             0000 00000001 00101 ..... ..... .....    @fmt_rrr
slti             0000 001000 ............ ..... .....     @fmt_rr_i12


and trans_xxx.c.inc

static bool gen_rrr(DisasContext *ctx, arg_rrr *a, ...) {}
static bool gen_rr_i12(DisasContext *ctx, arg_rr_i *a, ...) {}
static bool gen_rrr_sa2p1(DisasContext *ctx, arg_rrr_sa *a, ...) {}
...

Richard, is that OK?

Thanks,
Song Gao
Richard Henderson Nov. 17, 2021, 8:28 a.m. UTC | #6
On 11/17/21 8:57 AM, gaosong wrote:
> I see that  insns.decode format is not very consistent with other architectures, such 
> ARM/RISCV

No.  I don't like how riscv has done it, though they have quite a few split fields, so 
perhaps they thought it looked weird.


> #
> # Argument sets
> #
> &r_i          rd imm
> &rrr          rd rj rk
> &rr_i         rd rj imm
> &rrr_sa     rd rj rk sa
> 
> #
> # Formats
> #
> @fmt_rrr             .... ........ ..... rk:5 rj:5 rd:5 &rrr
> @fmt_r_i20                        .... ... imm:s20 rd:5 &r_i
> @fmt_rr_i12               .... ...... imm:s12 rj:5 rd:5 &rr_i
> @fmt_rr_ui12               .... ...... imm:12 rj:5 rd:5 &rr_i
> @fmt_rr_i16                   .... .. imm:s16 rj:5 rd:5 &rr_i
> @fmt_rrr_sa2p1      .... ........ ... .. rk:5 rj:5 rd:5 &rrr_sa  sa=%sa2p1
> 
> #
> # Fixed point arithmetic operation instruction
> #
> add_w            0000 00000001 00000 ..... ..... .....    @fmt_rrr
> add_d            0000 00000001 00001 ..... ..... .....    @fmt_rrr
> sub_w            0000 00000001 00010 ..... ..... .....    @fmt_rrr
> sub_d            0000 00000001 00011 ..... ..... .....    @fmt_rrr
> slt              0000 00000001 00100 ..... ..... ..... @fmt_rrr
> sltu             0000 00000001 00101 ..... ..... ..... @fmt_rrr
> slti             0000 001000 ............ ..... .....               @fmt_rr_i12
> 
> 
> and trans_xxx.c.inc
> 
> static bool gen_rrr(DisasContext *ctx, arg_rrr *a, ...) {}
> static bool gen_rr_i12(DisasContext *ctx, arg_rr_i *a, ...) {}

gen_rr_i ?

> static bool gen_rrr_sa2p1(DisasContext *ctx, arg_rrr_sa *a, ...) {}

gen_rrr_sa ?

> Richard, is that OK?

Other than those two nits, this looks very clean.  Thanks,


r~
gaosong Nov. 17, 2021, 9:29 a.m. UTC | #7
Hi Richard,

On 2021/11/17 4:28 PM, Richard Henderson wrote:
> On 11/17/21 8:57 AM, gaosong wrote:
>> I see that  insns.decode format is not very consistent with other 
>> architectures, such ARM/RISCV
>
> No.  I don't like how riscv has done it, though they have quite a few 
> split fields, so perhaps they thought it looked weird.
>
>
>> #
>> # Argument sets
>> #
>> &r_i          rd imm
>> &rrr          rd rj rk
>> &rr_i         rd rj imm
>> &rrr_sa     rd rj rk sa
>>
>> #
>> # Formats
>> #
>> @fmt_rrr             .... ........ ..... rk:5 rj:5 rd:5 &rrr
>> @fmt_r_i20                        .... ... imm:s20 rd:5 &r_i
>> @fmt_rr_i12               .... ...... imm:s12 rj:5 rd:5 &rr_i
>> @fmt_rr_ui12               .... ...... imm:12 rj:5 rd:5 &rr_i
>> @fmt_rr_i16                   .... .. imm:s16 rj:5 rd:5 &rr_i
>> @fmt_rrr_sa2p1      .... ........ ... .. rk:5 rj:5 rd:5 &rrr_sa  sa=%sa2p1
>>
>> #
>> # Fixed point arithmetic operation instruction
>> #
>> add_w            0000 00000001 00000 ..... ..... ..... @fmt_rrr
>> add_d            0000 00000001 00001 ..... ..... ..... @fmt_rrr
>> sub_w            0000 00000001 00010 ..... ..... ..... @fmt_rrr
>> sub_d            0000 00000001 00011 ..... ..... ..... @fmt_rrr
>> slt              0000 00000001 00100 ..... ..... ..... @fmt_rrr
>> sltu             0000 00000001 00101 ..... ..... ..... @fmt_rrr
>> slti             0000 001000 ............ ..... .....     @fmt_rr_i12
>>
>>
>> and trans_xxx.c.inc
>>
>> static bool gen_rrr(DisasContext *ctx, arg_rrr *a, ...) {}
>> static bool gen_rr_i12(DisasContext *ctx, arg_rr_i *a, ...) {}
>
> gen_rr_i ?

The code I showed was incomplete. What I mean is this:

gen_rr_i12:

@fmt_rr_i12               .... ...... imm:s12 rj:5 rd:5 &rr_i
slti         0000 001000 ............ ..... .....     @fmt_rr_i12
sltui        0000 001001 ............ ..... .....     @fmt_rr_i12
...

gen_rr_ui12:

@fmt_rr_ui12               .... ...... imm:12 rj:5 rd:5 &rr_i
andi         0000 001101 ............ ..... .....     @fmt_rr_ui12
ori          0000 001110 ............ ..... .....     @fmt_rr_ui12
xori         0000 001111 ............ ..... .....     @fmt_rr_ui12
...

@fmt_rr_i12 and @fmt_rr_ui12 are two 'Formats', but they use the same 'Argument set' (rr_i).

>
>> static bool gen_rrr_sa2p1(DisasContext *ctx, arg_rrr_sa *a, ...) {}
>
> gen_rrr_sa ?
>
Likewise.

gen_rrr_sa2p1:

@fmt_rrr_sa2p1   .... ........ ... .. rk:5 rj:5 rd:5      &rrr_sa  sa=%sa2p1
alsl_w           0000 00000000 010 .. ..... ..... .....   @fmt_rrr_sa2p1
alsl_wu          0000 00000000 011 .. ..... ..... .....   @fmt_rrr_sa2p1
alsl_d           0000 00000010 110 .. ..... ..... .....   @fmt_rrr_sa2p1
...

gen_rrr_sa2:

@fmt_rrr_sa2     .... ........ ... sa:2 rk:5 rj:5 rd:5    &rrr_sa
bytepick_w       0000 00000000 100 .. ..... ..... .....   @fmt_rrr_sa2
...

gen_rrr_sa3:

@fmt_rrr_sa3     .... ........ .. sa:3 rk:5 rj:5 rd:5     &rrr_sa
bytepick_d       0000 00000000 11 ... ..... ..... .....   @fmt_rrr_sa3
...

>> Richard, is that OK?
>
> Other than those two nits, this looks very clean.  Thanks,
>
OK, I'll correct it on v11.

Thanks.
Song Gao
Richard Henderson Nov. 17, 2021, 9:55 a.m. UTC | #8
On 11/17/21 10:29 AM, gaosong wrote:
>> gen_rr_i ?
> 
> The code is not written completely,  like this:
> 
> gen_rr_i12:
> 
> @fmt_rr_i12               .... ...... imm:s12 rj:5 rd:5 &rr_i
> slti         0000 001000 ............ ..... .....     @fmt_rr_i12
> sltui        0000 001001 ............ ..... .....     @fmt_rr_i12
> ...
> 
> gen_rr_ui12:
> 
> @fmt_rr_ui12               .... ...... imm:12 rj:5 rd:5 &rr_i
> andi         0000 001101 ............ ..... .....     @fmt_rr_ui12
> ori          0000 001110 ............ ..... .....     @fmt_rr_ui12
> xori         0000 001111 ............ ..... .....     @fmt_rr_ui12
> ...
> 
> @fmt_rr_i12 and @fmt_rr_ui12 are two 'Formats',  but they use the same 'Argument sets'(rr_i).

What I meant is that there would be a single gen_rr_i function handling the argument set
rr_i; no need for two gen_rr_i* functions.
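
To sketch the shape of that, here is a toy C model, not TCG code: a single gen_rr_i() serves instructions from both the signed and the unsigned format, because each format has already turned its immediate into a plain int. ToyCPU, op_slt() and op_and() are hypothetical stand-ins; the real function would emit TCG ops instead of computing values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy register file; stands in for the translator's state (assumption). */
typedef struct {
    int64_t gpr[32];
} ToyCPU;

/* The shared argument set "&rr_i rd rj imm". */
typedef struct {
    int rd;
    int rj;
    int imm;
} arg_rr_i;

/* One handler for every rr_i instruction; func supplies the operation. */
static bool gen_rr_i(ToyCPU *cpu, const arg_rr_i *a,
                     int64_t (*func)(int64_t, int64_t))
{
    int64_t result = func(cpu->gpr[a->rj], (int64_t)a->imm);

    if (a->rd != 0) {            /* LoongArch r0 always reads as zero */
        cpu->gpr[a->rd] = result;
    }
    return true;
}

/* slti comes from @fmt_rr_i12 (signed imm). */
static int64_t op_slt(int64_t a, int64_t b)
{
    return a < b;
}

/* andi comes from @fmt_rr_ui12 (unsigned imm). */
static int64_t op_and(int64_t a, int64_t b)
{
    return a & b;
}
```

Both slti and andi would then expand to gen_rr_i with a different func argument, and the difference between i12 and ui12 stays entirely in insns.decode.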

> gen_rrr_sa2p1:
> 
> @fmt_rrr_sa2p1   .... ........ ... .. rk:5 rj:5 rd:5      &rrr_sa  sa=%sa2p1
> alsl_w           0000 00000000 010 .. ..... ..... .....   @fmt_rrr_sa2p1
> alsl_wu          0000 00000000 011 .. ..... ..... .....   @fmt_rrr_sa2p1
> alsl_d           0000 00000010 110 .. ..... ..... .....   @fmt_rrr_sa2p1
> ...
> 
> gen_rrr_sa2:
> @fmt_rrr_sa2     .... ........ ... sa:2 rk:5 rj:5 rd:5    &rrr_sa
> bytepick_w       0000 00000000 100 .. ..... ..... .....   @fmt_rrr_sa2
> ...
> 
> gen_rrr_sa3:
> @fmt_rrr_sa3     .... ........ .. sa:3 rk:5 rj:5 rd:5     &rrr_sa
> bytepick_d       0000 00000000 11 ... ..... ..... .....   @fmt_rrr_sa3
> ...

Likewise a single gen_rrr_sa function.


r~
gaosong Nov. 18, 2021, 1:24 a.m. UTC | #9
Hi Richard,

On 2021/11/17 5:55 PM, Richard Henderson wrote:
>>
>> @fmt_rr_i12 and @fmt_rr_ui12 are two 'Formats',  but they use the 
>> same 'Argument sets'(rr_i).
>
> What I meant is that there would be a single gen_rr_i function handling 
> the argument set rr_i; no need for two gen_rr_i* functions. 

Got it.

Thanks.
Song Gao
Patch

diff --git a/target/loongarch/insn_trans/trans_arith.c.inc b/target/loongarch/insn_trans/trans_arith.c.inc
new file mode 100644
index 0000000..384a158
--- /dev/null
+++ b/target/loongarch/insn_trans/trans_arith.c.inc
@@ -0,0 +1,319 @@ 
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Copyright (c) 2021 Loongson Technology Corporation Limited
+ */
+
+static bool gen_r3(DisasContext *ctx, arg_fmt_rdrjrk *a,
+                   DisasExtend src1_ext, DisasExtend src2_ext,
+                   DisasExtend dst_ext, void (*func)(TCGv, TCGv, TCGv))
+{
+    TCGv dest = gpr_dst(ctx, a->rd, dst_ext);
+    TCGv src1 = gpr_src(ctx, a->rj, src1_ext);
+    TCGv src2 = gpr_src(ctx, a->rk, src2_ext);
+
+    func(dest, src1, src2);
+
+    /* If dst_ext is EXT_NONE, dest needs no extension, so skip gen_set_gpr. */
+    if (dst_ext) {
+        gen_set_gpr(a->rd, dest, dst_ext);
+    }
+    return true;
+}
+
+static bool gen_r2_si12(DisasContext *ctx, arg_fmt_rdrjsi12 *a,
+                        DisasExtend src_ext, DisasExtend dst_ext,
+                        void (*func)(TCGv, TCGv, TCGv))
+{
+    TCGv dest = gpr_dst(ctx, a->rd, dst_ext);
+    TCGv src1 = gpr_src(ctx, a->rj, src_ext);
+    TCGv src2 = tcg_constant_tl(a->si12);
+
+    func(dest, src1, src2);
+
+    if (dst_ext) {
+        gen_set_gpr(a->rd, dest, dst_ext);
+    }
+    return true;
+}
+
+static bool gen_r3_sa2(DisasContext *ctx, arg_fmt_rdrjrksa2 *a,
+                       DisasExtend src_ext, DisasExtend dst_ext,
+                       void (*func)(TCGv, TCGv, TCGv, TCGv, target_long))
+{
+    TCGv dest = gpr_dst(ctx, a->rd, dst_ext);
+    TCGv src1 = gpr_src(ctx, a->rj, src_ext);
+    TCGv src2 = gpr_src(ctx, a->rk, src_ext);
+    TCGv temp = tcg_temp_new();
+
+    func(dest, src1, src2, temp, a->sa2);
+
+    if (dst_ext) {
+        gen_set_gpr(a->rd, dest, dst_ext);
+    }
+    tcg_temp_free(temp);
+    return true;
+}
+
+static bool trans_lu12i_w(DisasContext *ctx, arg_lu12i_w *a)
+{
+    TCGv dest = gpr_dst(ctx, a->rd, EXT_NONE);
+
+    tcg_gen_movi_tl(dest, a->si20 << 12);
+    return true;
+}
+
+static bool gen_pc(DisasContext *ctx, arg_fmt_rdsi20 *a,
+                   target_ulong (*func)(target_ulong, int))
+{
+    TCGv dest = gpr_dst(ctx, a->rd, EXT_NONE);
+    target_ulong addr = func(ctx->base.pc_next, a->si20);
+
+    tcg_gen_movi_tl(dest, addr);
+    return true;
+}
+
+static bool gen_r2_ui12(DisasContext *ctx, arg_fmt_rdrjui12 *a,
+                        void (*func)(TCGv, TCGv, target_long))
+{
+    TCGv dest = gpr_dst(ctx, a->rd, EXT_NONE);
+    TCGv src1 = gpr_src(ctx, a->rj, EXT_NONE);
+
+    func(dest, src1, a->ui12);
+    return true;
+}
+
+static void gen_slt(TCGv dest, TCGv src1, TCGv src2)
+{
+    tcg_gen_setcond_tl(TCG_COND_LT, dest, src1, src2);
+}
+
+static void gen_sltu(TCGv dest, TCGv src1, TCGv src2)
+{
+    tcg_gen_setcond_tl(TCG_COND_LTU, dest, src1, src2);
+}
+
+static void gen_mulh_w(TCGv dest, TCGv src1, TCGv src2)
+{
+    tcg_gen_mul_i64(dest, src1, src2);
+    tcg_gen_sari_i64(dest, dest, 32);
+}
+
+static void gen_mulh_wu(TCGv dest, TCGv src1, TCGv src2)
+{
+    tcg_gen_mul_i64(dest, src1, src2);
+    tcg_gen_sari_i64(dest, dest, 32);
+}
+
+static void gen_mulh_d(TCGv dest, TCGv src1, TCGv src2)
+{
+    TCGv discard = tcg_temp_new();
+    tcg_gen_muls2_tl(discard, dest, src1, src2);
+    tcg_temp_free(discard);
+}
+
+static void gen_mulh_du(TCGv dest, TCGv src1, TCGv src2)
+{
+    TCGv discard = tcg_temp_new();
+    tcg_gen_mulu2_tl(discard, dest, src1, src2);
+    tcg_temp_free(discard);
+}
+
+static void prep_divisor_d(TCGv ret, TCGv src1, TCGv src2)
+{
+    TCGv t0 = tcg_temp_new();
+    TCGv t1 = tcg_temp_new();
+    TCGv zero = tcg_constant_tl(0);
+
+    /*
+     * If min / -1, set the divisor to 1.
+     * This avoids potential host overflow trap and produces min.
+     * If x / 0, set the divisor to 1.
+     * This avoids potential host overflow trap;
+     * the required result is undefined.
+     */
+    tcg_gen_setcondi_tl(TCG_COND_EQ, ret, src1, INT64_MIN);
+    tcg_gen_setcondi_tl(TCG_COND_EQ, t0, src2, -1);
+    tcg_gen_setcondi_tl(TCG_COND_EQ, t1, src2, 0);
+    tcg_gen_and_tl(ret, ret, t0);
+    tcg_gen_or_tl(ret, ret, t1);
+    tcg_gen_movcond_tl(TCG_COND_NE, ret, ret, zero, ret, src2);
+
+    tcg_temp_free(t0);
+    tcg_temp_free(t1);
+}
+
+static void prep_divisor_du(TCGv ret, TCGv src2)
+{
+    TCGv zero = tcg_constant_tl(0);
+    TCGv one = tcg_constant_tl(1);
+
+    /*
+     * If x / 0, set the divisor to 1.
+     * This avoids potential host overflow trap;
+     * the required result is undefined.
+     */
+    tcg_gen_movcond_tl(TCG_COND_EQ, ret, src2, zero, one, src2);
+}
+
+static void gen_div_d(TCGv dest, TCGv src1, TCGv src2)
+{
+    TCGv t0 = tcg_temp_new();
+    prep_divisor_d(t0, src1, src2);
+    tcg_gen_div_tl(dest, src1, t0);
+    tcg_temp_free(t0);
+}
+
+static void gen_rem_d(TCGv dest, TCGv src1, TCGv src2)
+{
+    TCGv t0 = tcg_temp_new();
+    prep_divisor_d(t0, src1, src2);
+    tcg_gen_rem_tl(dest, src1, t0);
+    tcg_temp_free(t0);
+}
+
+static void gen_div_du(TCGv dest, TCGv src1, TCGv src2)
+{
+    TCGv t0 = tcg_temp_new();
+    prep_divisor_du(t0, src2);
+    tcg_gen_divu_tl(dest, src1, t0);
+    tcg_temp_free(t0);
+}
+
+static void gen_rem_du(TCGv dest, TCGv src1, TCGv src2)
+{
+    TCGv t0 = tcg_temp_new();
+    prep_divisor_du(t0, src2);
+    tcg_gen_remu_tl(dest, src1, t0);
+    tcg_temp_free(t0);
+}
+
+static void gen_div_w(TCGv dest, TCGv src1, TCGv src2)
+{
+    TCGv t0 = tcg_temp_new();
+    /* We need not check for integer overflow for div_w. */
+    prep_divisor_du(t0, src2);
+    tcg_gen_div_tl(dest, src1, t0);
+    tcg_temp_free(t0);
+}
+
+static void gen_rem_w(TCGv dest, TCGv src1, TCGv src2)
+{
+    TCGv t0 = tcg_temp_new();
+    /* We need not check for integer overflow for rem_w. */
+    prep_divisor_du(t0, src2);
+    tcg_gen_rem_tl(dest, src1, t0);
+    tcg_temp_free(t0);
+}
+
+static void gen_alsl_w(TCGv dest, TCGv src1, TCGv src2,
+                       TCGv temp, target_long sa2)
+{
+    tcg_gen_shli_tl(temp, src1, sa2 + 1);
+    tcg_gen_add_tl(dest, temp, src2);
+}
+
+static void gen_alsl_wu(TCGv dest, TCGv src1, TCGv src2,
+                        TCGv temp, target_long sa2)
+{
+    tcg_gen_shli_tl(temp, src1, sa2 + 1);
+    tcg_gen_add_tl(dest, temp, src2);
+}
+
+static void gen_alsl_d(TCGv dest, TCGv src1, TCGv src2,
+                       TCGv temp, target_long sa2)
+{
+    tcg_gen_shli_tl(temp, src1, sa2 + 1);
+    tcg_gen_add_tl(dest, temp, src2);
+}
+
+static bool trans_lu32i_d(DisasContext *ctx, arg_lu32i_d *a)
+{
+    TCGv dest = gpr_dst(ctx, a->rd, EXT_NONE);
+    TCGv src1 = gpr_src(ctx, a->rd, EXT_NONE);
+    TCGv src2 = tcg_constant_tl(a->si20);
+
+    tcg_gen_deposit_tl(dest, src1, src2, 32, 32);
+    return true;
+}
+
+static bool trans_lu52i_d(DisasContext *ctx, arg_lu52i_d *a)
+{
+    TCGv dest = gpr_dst(ctx, a->rd, EXT_NONE);
+    TCGv src1 = gpr_src(ctx, a->rj, EXT_NONE);
+    TCGv src2 = tcg_constant_tl(a->si12);
+
+    tcg_gen_deposit_tl(dest, src1, src2, 52, 12);
+    return true;
+}
+
+static target_ulong gen_pcaddi(target_ulong pc, int si20)
+{
+    return pc + (si20 << 2);
+}
+
+static target_ulong gen_pcalau12i(target_ulong pc, int si20)
+{
+    return (pc + (si20 << 12)) & ~0xfff;
+}
+
+static target_ulong gen_pcaddu12i(target_ulong pc, int si20)
+{
+    return pc + (si20 << 12);
+}
+
+static target_ulong gen_pcaddu18i(target_ulong pc, int si20)
+{
+    return pc + ((target_ulong)(si20) << 18);
+}
+
+static bool trans_addu16i_d(DisasContext *ctx, arg_addu16i_d *a)
+{
+    TCGv dest = gpr_dst(ctx, a->rd, EXT_NONE);
+    TCGv src1 = gpr_src(ctx, a->rj, EXT_NONE);
+
+    tcg_gen_addi_tl(dest, src1, a->si16 << 16);
+    return true;
+}
+
+TRANS(add_w, gen_r3, EXT_NONE, EXT_NONE, EXT_SIGN, tcg_gen_add_tl)
+TRANS(add_d, gen_r3, EXT_NONE, EXT_NONE, EXT_NONE, tcg_gen_add_tl)
+TRANS(sub_w, gen_r3, EXT_NONE, EXT_NONE, EXT_SIGN, tcg_gen_sub_tl)
+TRANS(sub_d, gen_r3, EXT_NONE, EXT_NONE, EXT_NONE, tcg_gen_sub_tl)
+TRANS(and, gen_r3, EXT_NONE, EXT_NONE, EXT_NONE, tcg_gen_and_tl)
+TRANS(or, gen_r3, EXT_NONE, EXT_NONE, EXT_NONE, tcg_gen_or_tl)
+TRANS(xor, gen_r3, EXT_NONE, EXT_NONE, EXT_NONE, tcg_gen_xor_tl)
+TRANS(nor, gen_r3, EXT_NONE, EXT_NONE, EXT_NONE, tcg_gen_nor_tl)
+TRANS(andn, gen_r3, EXT_NONE, EXT_NONE, EXT_NONE, tcg_gen_andc_tl)
+TRANS(orn, gen_r3, EXT_NONE, EXT_NONE, EXT_NONE, tcg_gen_orc_tl)
+TRANS(slt, gen_r3, EXT_NONE, EXT_NONE, EXT_NONE, gen_slt)
+TRANS(sltu, gen_r3, EXT_NONE, EXT_NONE, EXT_NONE, gen_sltu)
+TRANS(mul_w, gen_r3, EXT_SIGN, EXT_SIGN, EXT_SIGN, tcg_gen_mul_tl)
+TRANS(mul_d, gen_r3, EXT_NONE, EXT_NONE, EXT_NONE, tcg_gen_mul_tl)
+TRANS(mulh_w, gen_r3, EXT_SIGN, EXT_SIGN, EXT_NONE, gen_mulh_w)
+TRANS(mulh_wu, gen_r3, EXT_ZERO, EXT_ZERO, EXT_NONE, gen_mulh_wu)
+TRANS(mulh_d, gen_r3, EXT_NONE, EXT_NONE, EXT_NONE, gen_mulh_d)
+TRANS(mulh_du, gen_r3, EXT_NONE, EXT_NONE, EXT_NONE, gen_mulh_du)
+TRANS(mulw_d_w, gen_r3, EXT_SIGN, EXT_SIGN, EXT_NONE, tcg_gen_mul_tl)
+TRANS(mulw_d_wu, gen_r3, EXT_ZERO, EXT_ZERO, EXT_NONE, tcg_gen_mul_tl)
+TRANS(div_w, gen_r3, EXT_SIGN, EXT_SIGN, EXT_SIGN, gen_div_w)
+TRANS(mod_w, gen_r3, EXT_SIGN, EXT_SIGN, EXT_SIGN, gen_rem_w)
+TRANS(div_wu, gen_r3, EXT_ZERO, EXT_ZERO, EXT_SIGN, gen_div_du)
+TRANS(mod_wu, gen_r3, EXT_ZERO, EXT_ZERO, EXT_SIGN, gen_rem_du)
+TRANS(div_d, gen_r3, EXT_NONE, EXT_NONE, EXT_NONE, gen_div_d)
+TRANS(mod_d, gen_r3, EXT_NONE, EXT_NONE, EXT_NONE, gen_rem_d)
+TRANS(div_du, gen_r3, EXT_NONE, EXT_NONE, EXT_NONE, gen_div_du)
+TRANS(mod_du, gen_r3, EXT_NONE, EXT_NONE, EXT_NONE, gen_rem_du)
+TRANS(slti, gen_r2_si12, EXT_NONE, EXT_NONE, gen_slt)
+TRANS(sltui, gen_r2_si12, EXT_NONE, EXT_NONE, gen_sltu)
+TRANS(addi_w, gen_r2_si12, EXT_NONE, EXT_SIGN, tcg_gen_add_tl)
+TRANS(addi_d, gen_r2_si12, EXT_NONE, EXT_NONE, tcg_gen_add_tl)
+TRANS(alsl_w, gen_r3_sa2, EXT_NONE, EXT_SIGN, gen_alsl_w)
+TRANS(alsl_wu, gen_r3_sa2, EXT_NONE, EXT_ZERO, gen_alsl_wu)
+TRANS(alsl_d, gen_r3_sa2, EXT_NONE, EXT_NONE, gen_alsl_d)
+TRANS(pcaddi, gen_pc, gen_pcaddi)
+TRANS(pcalau12i, gen_pc, gen_pcalau12i)
+TRANS(pcaddu12i, gen_pc, gen_pcaddu12i)
+TRANS(pcaddu18i, gen_pc, gen_pcaddu18i)
+TRANS(andi, gen_r2_ui12, tcg_gen_andi_tl)
+TRANS(ori, gen_r2_ui12, tcg_gen_ori_tl)
+TRANS(xori, gen_r2_ui12, tcg_gen_xori_tl)
diff --git a/target/loongarch/insns.decode b/target/loongarch/insns.decode
new file mode 100644
index 0000000..3e6a051
--- /dev/null
+++ b/target/loongarch/insns.decode
@@ -0,0 +1,88 @@ 
+# SPDX-License-Identifier: GPL-2.0-or-later
+#
+# LoongArch instruction decode definitions.
+#
+# Copyright (c) 2021 Loongson Technology Corporation Limited
+#
+
+#
+# Fields
+#
+%rd      0:5
+%rj      5:5
+%rk      10:5
+%sa2     15:2
+%si12    10:s12
+%ui12    10:12
+%si16    10:s16
+%si20    5:s20
+
+#
+# Argument sets
+#
+&fmt_rdrjrk         rd rj rk
+&fmt_rdrjsi12       rd rj si12
+&fmt_rdrjrksa2      rd rj rk sa2
+&fmt_rdrjsi16       rd rj si16
+&fmt_rdrjui12       rd rj ui12
+&fmt_rdsi20         rd si20
+
+#
+# Formats
+#
+@fmt_rdrjrk          .... ........ ..... ..... ..... .....    &fmt_rdrjrk         %rd %rj %rk
+@fmt_rdrjsi12        .... ...... ............ ..... .....     &fmt_rdrjsi12       %rd %rj %si12
+@fmt_rdrjui12        .... ...... ............ ..... .....     &fmt_rdrjui12       %rd %rj %ui12
+@fmt_rdrjrksa2       .... ........ ... .. ..... ..... .....   &fmt_rdrjrksa2      %rd %rj %rk %sa2
+@fmt_rdrjsi16        .... .. ................ ..... .....     &fmt_rdrjsi16       %rd %rj %si16
+@fmt_rdsi20          .... ... .................... .....      &fmt_rdsi20         %rd %si20
+
+#
+# Fixed point arithmetic operation instruction
+#
+add_w            0000 00000001 00000 ..... ..... .....    @fmt_rdrjrk
+add_d            0000 00000001 00001 ..... ..... .....    @fmt_rdrjrk
+sub_w            0000 00000001 00010 ..... ..... .....    @fmt_rdrjrk
+sub_d            0000 00000001 00011 ..... ..... .....    @fmt_rdrjrk
+slt              0000 00000001 00100 ..... ..... .....    @fmt_rdrjrk
+sltu             0000 00000001 00101 ..... ..... .....    @fmt_rdrjrk
+slti             0000 001000 ............ ..... .....     @fmt_rdrjsi12
+sltui            0000 001001 ............ ..... .....     @fmt_rdrjsi12
+nor              0000 00000001 01000 ..... ..... .....    @fmt_rdrjrk
+and              0000 00000001 01001 ..... ..... .....    @fmt_rdrjrk
+or               0000 00000001 01010 ..... ..... .....    @fmt_rdrjrk
+xor              0000 00000001 01011 ..... ..... .....    @fmt_rdrjrk
+orn              0000 00000001 01100 ..... ..... .....    @fmt_rdrjrk
+andn             0000 00000001 01101 ..... ..... .....    @fmt_rdrjrk
+mul_w            0000 00000001 11000 ..... ..... .....    @fmt_rdrjrk
+mulh_w           0000 00000001 11001 ..... ..... .....    @fmt_rdrjrk
+mulh_wu          0000 00000001 11010 ..... ..... .....    @fmt_rdrjrk
+mul_d            0000 00000001 11011 ..... ..... .....    @fmt_rdrjrk
+mulh_d           0000 00000001 11100 ..... ..... .....    @fmt_rdrjrk
+mulh_du          0000 00000001 11101 ..... ..... .....    @fmt_rdrjrk
+mulw_d_w         0000 00000001 11110 ..... ..... .....    @fmt_rdrjrk
+mulw_d_wu        0000 00000001 11111 ..... ..... .....    @fmt_rdrjrk
+div_w            0000 00000010 00000 ..... ..... .....    @fmt_rdrjrk
+mod_w            0000 00000010 00001 ..... ..... .....    @fmt_rdrjrk
+div_wu           0000 00000010 00010 ..... ..... .....    @fmt_rdrjrk
+mod_wu           0000 00000010 00011 ..... ..... .....    @fmt_rdrjrk
+div_d            0000 00000010 00100 ..... ..... .....    @fmt_rdrjrk
+mod_d            0000 00000010 00101 ..... ..... .....    @fmt_rdrjrk
+div_du           0000 00000010 00110 ..... ..... .....    @fmt_rdrjrk
+mod_du           0000 00000010 00111 ..... ..... .....    @fmt_rdrjrk
+alsl_w           0000 00000000 010 .. ..... ..... .....   @fmt_rdrjrksa2
+alsl_wu          0000 00000000 011 .. ..... ..... .....   @fmt_rdrjrksa2
+alsl_d           0000 00000010 110 .. ..... ..... .....   @fmt_rdrjrksa2
+lu12i_w          0001 010 .................... .....      @fmt_rdsi20
+lu32i_d          0001 011 .................... .....      @fmt_rdsi20
+lu52i_d          0000 001100 ............ ..... .....     @fmt_rdrjsi12
+pcaddi           0001 100 .................... .....      @fmt_rdsi20
+pcalau12i        0001 101 .................... .....      @fmt_rdsi20
+pcaddu12i        0001 110 .................... .....      @fmt_rdsi20
+pcaddu18i        0001 111 .................... .....      @fmt_rdsi20
+addi_w           0000 001010 ............ ..... .....     @fmt_rdrjsi12
+addi_d           0000 001011 ............ ..... .....     @fmt_rdrjsi12
+addu16i_d        0001 00 ................ ..... .....     @fmt_rdrjsi16
+andi             0000 001101 ............ ..... .....     @fmt_rdrjui12
+ori              0000 001110 ............ ..... .....     @fmt_rdrjui12
+xori             0000 001111 ............ ..... .....     @fmt_rdrjui12
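As a reading aid for the field and format definitions in insns.decode, here is a minimal standalone sketch in plain C (not QEMU code) of what a decodetree field spec such as 10:s12 or 5:5 amounts to: take `len` bits starting at bit `pos`, and sign-extend when the length carries the `s` prefix. The names extract_u, extract_s, ArgRdRjSi12 and decode_rdrjsi12 are hypothetical and chosen only for illustration; the code that decodetree actually generates may look different.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: unsigned field of `len` bits starting at bit `pos`. */
static uint32_t extract_u(uint32_t insn, int pos, int len)
{
    assert(pos >= 0 && len > 0 && len < 32 && pos + len <= 32);
    return (insn >> pos) & ((UINT32_C(1) << len) - 1);
}

/* Hypothetical helper: the same field, sign-extended from bit len-1. */
static int32_t extract_s(uint32_t insn, int pos, int len)
{
    int32_t v = (int32_t)extract_u(insn, pos, len);
    int32_t sign = (int32_t)(UINT32_C(1) << (len - 1));
    return (v ^ sign) - sign;
}

/* Illustration-only mirror of the "rd rj si12" argument set. */
typedef struct {
    int rd;
    int rj;
    int si12;
} ArgRdRjSi12;

static ArgRdRjSi12 decode_rdrjsi12(uint32_t insn)
{
    ArgRdRjSi12 a;
    a.rd = (int)extract_u(insn, 0, 5);    /* %rd   0:5    */
    a.rj = (int)extract_u(insn, 5, 5);    /* %rj   5:5    */
    a.si12 = extract_s(insn, 10, 12);     /* %si12 10:s12 */
    return a;
}
```

With the addi_w pattern above, the word 0x02bffca4 (opcode bits 0000 001010, si12 = 0xfff, rj = 5, rd = 4) decodes to rd = 4, rj = 5, si12 = -1.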
diff --git a/target/loongarch/translate.c b/target/loongarch/translate.c
index 048c895..d4e0bf3 100644
--- a/target/loongarch/translate.c
+++ b/target/loongarch/translate.c
@@ -57,6 +57,11 @@  static void loongarch_tr_init_disas_context(DisasContextBase *dcbase,
     /* Bound the number of insns to execute to those left on the page.  */
     bound = -(ctx->base.pc_first | TARGET_PAGE_MASK) / 4;
     ctx->base.max_insns = MIN(ctx->base.max_insns, bound);
+
+    ctx->ntemp = 0;
+    memset(ctx->temp, 0, sizeof(ctx->temp));
+
+    ctx->zero = tcg_constant_tl(0);
 }
 
 static void loongarch_tr_tb_start(DisasContextBase *dcbase, CPUState *cs)
@@ -70,6 +75,73 @@  static void loongarch_tr_insn_start(DisasContextBase *dcbase, CPUState *cs)
     tcg_gen_insn_start(ctx->base.pc_next);
 }
 
+/*
+ * Wrappers for getting reg values.
+ *
+ * The $zero register does not have cpu_gpr[0] allocated -- we supply the
+ * constant zero as a source, and an uninitialized sink as destination.
+ *
+ * Further, we may provide an extension for word operations.
+ */
+static TCGv temp_new(DisasContext *ctx)
+{
+    assert(ctx->ntemp < ARRAY_SIZE(ctx->temp));
+    return ctx->temp[ctx->ntemp++] = tcg_temp_new();
+}
+
+static TCGv gpr_src(DisasContext *ctx, int reg_num, DisasExtend src_ext)
+{
+    TCGv t;
+
+    if (reg_num == 0) {
+        return ctx->zero;
+    }
+
+    switch (src_ext) {
+    case EXT_NONE:
+        return cpu_gpr[reg_num];
+    case EXT_SIGN:
+        t = temp_new(ctx);
+        tcg_gen_ext32s_tl(t, cpu_gpr[reg_num]);
+        return t;
+    case EXT_ZERO:
+        t = temp_new(ctx);
+        tcg_gen_ext32u_tl(t, cpu_gpr[reg_num]);
+        return t;
+    }
+    g_assert_not_reached();
+}
+
+static TCGv gpr_dst(DisasContext *ctx, int reg_num, DisasExtend dst_ext)
+{
+    if (reg_num == 0 || dst_ext) {
+        return temp_new(ctx);
+    }
+    return cpu_gpr[reg_num];
+}
+
+static void gen_set_gpr(int reg_num, TCGv t, DisasExtend dst_ext)
+{
+    if (reg_num != 0) {
+        switch (dst_ext) {
+        case EXT_NONE:
+            tcg_gen_mov_tl(cpu_gpr[reg_num], t);
+            break;
+        case EXT_SIGN:
+            tcg_gen_ext32s_tl(cpu_gpr[reg_num], t);
+            break;
+        case EXT_ZERO:
+            tcg_gen_ext32u_tl(cpu_gpr[reg_num], t);
+            break;
+        default:
+            g_assert_not_reached();
+        }
+    }
+}
+
+#include "decode-insns.c.inc"
+#include "insn_trans/trans_arith.c.inc"
+
 static void loongarch_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
 {
     CPULoongArchState *env = cs->env_ptr;
@@ -83,6 +155,12 @@  static void loongarch_tr_translate_insn(DisasContextBase *dcbase, CPUState *cs)
         generate_exception(ctx, EXCP_INE);
     }
 
+    for (int i = ctx->ntemp - 1; i >= 0; --i) {
+        tcg_temp_free(ctx->temp[i]);
+        ctx->temp[i] = NULL;
+    }
+    ctx->ntemp = 0;
+
     ctx->base.pc_next += 4;
 }
 
diff --git a/target/loongarch/translate.h b/target/loongarch/translate.h
index 6cc7f1a..9cc1251 100644
--- a/target/loongarch/translate.h
+++ b/target/loongarch/translate.h
@@ -10,11 +10,30 @@ 
 
 #include "exec/translator.h"
 
+#define TRANS(NAME, FUNC, ...) \
+    static bool trans_##NAME(DisasContext *ctx, arg_##NAME * a) \
+    { return FUNC(ctx, a, __VA_ARGS__); }
+
+/*
+ * If an operation is performed on fewer than TARGET_LONG_BITS bits,
+ * its inputs may need to be sign- or zero-extended; which of the two
+ * depends on the exact operation being performed.
+ */
+typedef enum {
+    EXT_NONE,
+    EXT_SIGN,
+    EXT_ZERO,
+} DisasExtend;
+
 typedef struct DisasContext {
     DisasContextBase base;
     target_ulong page_start;
     uint32_t opcode;
     int mem_idx;
+    TCGv zero;
+    /* Space for 3 operands plus 1 extra for address computation. */
+    TCGv temp[4];
+    uint8_t ntemp;
 } DisasContext;
 
 void generate_exception(DisasContext *ctx, int excp);
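To make the $zero handling and the DisasExtend cases concrete, here is a minimal sketch that models the same rules on plain 64-bit integers instead of TCG values. It assumes a 64-bit target. All model_* names and apply_ext are hypothetical, and the add.w pairing (EXT_NONE sources, EXT_SIGN destination) is an assumption about how trans_arith.c.inc uses these helpers, not something shown in the hunks above.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    EXT_NONE,
    EXT_SIGN,
    EXT_ZERO,
} DisasExtend;

/* Extend the low 32 bits of v to 64 bits according to ext. */
static uint64_t apply_ext(uint64_t v, DisasExtend ext)
{
    uint64_t lo = v & UINT64_C(0xffffffff);

    switch (ext) {
    case EXT_SIGN:
        /* xor/subtract trick: replicates bit 31 into bits 63..32 */
        return (lo ^ UINT64_C(0x80000000)) - UINT64_C(0x80000000);
    case EXT_ZERO:
        return lo;
    case EXT_NONE:
    default:
        return v;
    }
}

/* Reads of register 0 always yield zero, as with ctx->zero. */
static uint64_t model_gpr_src(const uint64_t *gpr, int reg_num,
                              DisasExtend ext)
{
    assert(reg_num >= 0 && reg_num < 32);
    if (reg_num == 0) {
        return 0;
    }
    return apply_ext(gpr[reg_num], ext);
}

/* Writes to register 0 are discarded, as in gen_set_gpr(). */
static void model_set_gpr(uint64_t *gpr, int reg_num, uint64_t v,
                          DisasExtend ext)
{
    assert(reg_num >= 0 && reg_num < 32);
    if (reg_num != 0) {
        gpr[reg_num] = apply_ext(v, ext);
    }
}

/* Assumed shape of add.w: 64-bit add, result sign-extended from 32 bits. */
static void model_add_w(uint64_t *gpr, int rd, int rj, int rk)
{
    uint64_t sum = model_gpr_src(gpr, rj, EXT_NONE)
                 + model_gpr_src(gpr, rk, EXT_NONE);
    model_set_gpr(gpr, rd, sum, EXT_SIGN);
}
```

In this model, adding 0x7fffffff and 1 with model_add_w leaves 0xffffffff80000000 in the destination, and the same operation with rd = 0 changes nothing.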