mbox series

[RFC,v3,00/11] Add RX archtecture support

Message ID 20190302062138.10713-1-ysato@users.sourceforge.jp
Headers show
Series Add RX archtecture support | expand

Message

Yoshinori Sato March 2, 2019, 6:21 a.m. UTC
Hello.
This patch series is added Renesas RX target emulation.

My git repository is bellow.
git://git.pf.osdn.net/gitroot/y/ys/ysato/qemu.git

Since my understanding is not enough,
I want many comments to make this a good one.

Thanks.

Changes v2
Rewrite translate. using decodetree.py

Yoshinori Sato (11):
  target/rx: TCG Translation
  target/rx: TCG helper
  target/rx: CPU definition
  target/rx: RX disassembler
  target/rx: miscellaneous functions
  RX62N interrupt contorol uint
  RX62N internal timer modules
  RX62N internal serial communication interface
  RX Target hardware definition
  Add rx-softmmu
  MAINTAINERS: Add RX entry.

 MAINTAINERS                    |   20 +
 arch_init.c                    |    2 +
 configure                      |    8 +
 default-configs/rx-softmmu.mak |    7 +
 hw/char/Makefile.objs          |    2 +-
 hw/char/renesas_sci.c          |  288 ++++++
 hw/intc/Makefile.objs          |    1 +
 hw/intc/rx_icu.c               |  323 ++++++
 hw/rx/Makefile.objs            |    1 +
 hw/rx/rx62n.c                  |  227 ++++
 hw/rx/rxqemu.c                 |  100 ++
 hw/timer/Makefile.objs         |    2 +
 hw/timer/renesas_cmt.c         |  235 +++++
 hw/timer/renesas_tmr.c         |  412 ++++++++
 include/disas/bfd.h            |    5 +
 include/hw/char/renesas_sci.h  |   42 +
 include/hw/intc/rx_icu.h       |   49 +
 include/hw/rx/rx.h             |    7 +
 include/hw/rx/rx62n.h          |   54 +
 include/hw/timer/renesas_cmt.h |   33 +
 include/hw/timer/renesas_tmr.h |   42 +
 include/sysemu/arch_init.h     |    1 +
 target/rx/Makefile.objs        |   11 +
 target/rx/cpu-qom.h            |   52 +
 target/rx/cpu.c                |  224 ++++
 target/rx/cpu.h                |  214 ++++
 target/rx/disas.c              | 1570 ++++++++++++++++++++++++++++
 target/rx/gdbstub.c            |  113 ++
 target/rx/helper.c             |  252 +++++
 target/rx/helper.h             |   39 +
 target/rx/insns.decode         |  336 ++++++
 target/rx/monitor.c            |   38 +
 target/rx/op_helper.c          |  602 +++++++++++
 target/rx/translate.c          | 2220 ++++++++++++++++++++++++++++++++++++++++
 34 files changed, 7531 insertions(+), 1 deletion(-)
 create mode 100644 default-configs/rx-softmmu.mak
 create mode 100644 hw/char/renesas_sci.c
 create mode 100644 hw/intc/rx_icu.c
 create mode 100644 hw/rx/Makefile.objs
 create mode 100644 hw/rx/rx62n.c
 create mode 100644 hw/rx/rxqemu.c
 create mode 100644 hw/timer/renesas_cmt.c
 create mode 100644 hw/timer/renesas_tmr.c
 create mode 100644 include/hw/char/renesas_sci.h
 create mode 100644 include/hw/intc/rx_icu.h
 create mode 100644 include/hw/rx/rx.h
 create mode 100644 include/hw/rx/rx62n.h
 create mode 100644 include/hw/timer/renesas_cmt.h
 create mode 100644 include/hw/timer/renesas_tmr.h
 create mode 100644 target/rx/Makefile.objs
 create mode 100644 target/rx/cpu-qom.h
 create mode 100644 target/rx/cpu.c
 create mode 100644 target/rx/cpu.h
 create mode 100644 target/rx/disas.c
 create mode 100644 target/rx/gdbstub.c
 create mode 100644 target/rx/helper.c
 create mode 100644 target/rx/helper.h
 create mode 100644 target/rx/insns.decode
 create mode 100644 target/rx/monitor.c
 create mode 100644 target/rx/op_helper.c
 create mode 100644 target/rx/translate.c

Comments

no-reply@patchew.org March 2, 2019, 6:37 a.m. UTC | #1
Patchew URL: https://patchew.org/QEMU/20190302062138.10713-1-ysato@users.sourceforge.jp/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20190302062138.10713-1-ysato@users.sourceforge.jp
Subject: [Qemu-devel] [PATCH RFC v3 00/11] Add RX archtecture support

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag]               patchew/20190302062138.10713-1-ysato@users.sourceforge.jp -> patchew/20190302062138.10713-1-ysato@users.sourceforge.jp
Switched to a new branch 'test'
e60fbb35e6 MAINTAINERS: Add RX entry.
3bdc16f591 Add rx-softmmu
38c0424e72 RX Target hardware definition
a6ec852164 RX62N internal serial communication interface
179739cb93 RX62N internal timer modules
52c62d9d30 RX62N interrupt contorol uint
cee296b1c3 target/rx: miscellaneous functions
d2050c0e82 target/rx: RX disassembler
823a75d919 target/rx: CPU definition
d5927fc378 target/rx: TCG helper
6c80d5cc68 target/rx: TCG Translation

=== OUTPUT BEGIN ===
1/11 Checking commit 6c80d5cc686a (target/rx: TCG Translation)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#15: 
new file mode 100644

ERROR: spaces required around that '*' (ctx:WxV)
#2089: FILE: target/rx/translate.c:1728:
+static bool trans_FCMP_ri(DisasContext *ctx, arg_FCMP_ri *a)
                                                          ^

ERROR: spaces required around that '*' (ctx:WxV)
#2111: FILE: target/rx/translate.c:1750:
+static bool trans_ITOF(DisasContext *ctx, arg_ITOF *a)
                                                    ^

ERROR: spaces required around that '*' (ctx:WxV)
#2232: FILE: target/rx/translate.c:1871:
+static bool trans_BNOT_lr(DisasContext *ctx, arg_BNOT_lr *a)
                                                          ^

total: 3 errors, 1 warnings, 2556 lines checked

Patch 1/11 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

2/11 Checking commit d5927fc378e5 (target/rx: TCG helper)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#11: 
new file mode 100644

total: 0 errors, 1 warnings, 893 lines checked

Patch 2/11 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
3/11 Checking commit 823a75d91989 (target/rx: CPU definition)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#11: 
new file mode 100644

total: 0 errors, 1 warnings, 490 lines checked

Patch 3/11 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
4/11 Checking commit d2050c0e8217 (target/rx: RX disassembler)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#34: 
new file mode 100644

total: 0 errors, 1 warnings, 1587 lines checked

Patch 4/11 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
5/11 Checking commit cee296b1c31c (target/rx: miscellaneous functions)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#11: 
new file mode 100644

total: 0 errors, 1 warnings, 162 lines checked

Patch 5/11 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
6/11 Checking commit 52c62d9d30d9 (RX62N interrupt contorol uint)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#24: 
new file mode 100644

total: 0 errors, 1 warnings, 376 lines checked

Patch 6/11 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
7/11 Checking commit 179739cb93ee (RX62N internal timer modules)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#30: 
new file mode 100644

total: 0 errors, 1 warnings, 730 lines checked

Patch 7/11 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
8/11 Checking commit a6ec852164ae (RX62N internal serial communication interface)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#28: 
new file mode 100644

total: 0 errors, 1 warnings, 338 lines checked

Patch 8/11 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
9/11 Checking commit 38c0424e72dd (RX Target hardware definition)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#14: 
new file mode 100644

total: 0 errors, 1 warnings, 389 lines checked

Patch 9/11 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
10/11 Checking commit 3bdc16f591b9 (Add rx-softmmu)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#50: 
new file mode 100644

total: 0 errors, 1 warnings, 42 lines checked

Patch 10/11 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
11/11 Checking commit e60fbb35e686 (MAINTAINERS: Add RX entry.)
=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20190302062138.10713-1-ysato@users.sourceforge.jp/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
Philippe Mathieu-Daudé March 2, 2019, 6:51 p.m. UTC | #2
Hi Yoshinori,

On 3/2/19 7:21 AM, Yoshinori Sato wrote:
> Hello.
> This patch series is added Renesas RX target emulation.
> 
> My git repository is bellow.
> git://git.pf.osdn.net/gitroot/y/ys/ysato/qemu.git
> 
> Since my understanding is not enough,
> I want many comments to make this a good one.

OK :)

Can you provide notes about how to test your port?
Such: links to toolchains, how to build, what firmware/OS we can run...

> 
> Thanks.
> 
> Changes v2
> Rewrite translate. using decodetree.py
> 
> Yoshinori Sato (11):
>   target/rx: TCG Translation
>   target/rx: TCG helper
>   target/rx: CPU definition
>   target/rx: RX disassembler
>   target/rx: miscellaneous functions
>   RX62N interrupt contorol uint
>   RX62N internal timer modules
>   RX62N internal serial communication interface
>   RX Target hardware definition
>   Add rx-softmmu
>   MAINTAINERS: Add RX entry.
> 
>  MAINTAINERS                    |   20 +
>  arch_init.c                    |    2 +
>  configure                      |    8 +
>  default-configs/rx-softmmu.mak |    7 +
>  hw/char/Makefile.objs          |    2 +-
>  hw/char/renesas_sci.c          |  288 ++++++
>  hw/intc/Makefile.objs          |    1 +
>  hw/intc/rx_icu.c               |  323 ++++++
>  hw/rx/Makefile.objs            |    1 +
>  hw/rx/rx62n.c                  |  227 ++++
>  hw/rx/rxqemu.c                 |  100 ++
>  hw/timer/Makefile.objs         |    2 +
>  hw/timer/renesas_cmt.c         |  235 +++++
>  hw/timer/renesas_tmr.c         |  412 ++++++++
>  include/disas/bfd.h            |    5 +
>  include/hw/char/renesas_sci.h  |   42 +
>  include/hw/intc/rx_icu.h       |   49 +
>  include/hw/rx/rx.h             |    7 +
>  include/hw/rx/rx62n.h          |   54 +
>  include/hw/timer/renesas_cmt.h |   33 +
>  include/hw/timer/renesas_tmr.h |   42 +
>  include/sysemu/arch_init.h     |    1 +
>  target/rx/Makefile.objs        |   11 +
>  target/rx/cpu-qom.h            |   52 +
>  target/rx/cpu.c                |  224 ++++
>  target/rx/cpu.h                |  214 ++++
>  target/rx/disas.c              | 1570 ++++++++++++++++++++++++++++
>  target/rx/gdbstub.c            |  113 ++
>  target/rx/helper.c             |  252 +++++
>  target/rx/helper.h             |   39 +
>  target/rx/insns.decode         |  336 ++++++
>  target/rx/monitor.c            |   38 +
>  target/rx/op_helper.c          |  602 +++++++++++
>  target/rx/translate.c          | 2220 ++++++++++++++++++++++++++++++++++++++++
>  34 files changed, 7531 insertions(+), 1 deletion(-)
>  create mode 100644 default-configs/rx-softmmu.mak
>  create mode 100644 hw/char/renesas_sci.c
>  create mode 100644 hw/intc/rx_icu.c
>  create mode 100644 hw/rx/Makefile.objs
>  create mode 100644 hw/rx/rx62n.c
>  create mode 100644 hw/rx/rxqemu.c
>  create mode 100644 hw/timer/renesas_cmt.c
>  create mode 100644 hw/timer/renesas_tmr.c
>  create mode 100644 include/hw/char/renesas_sci.h
>  create mode 100644 include/hw/intc/rx_icu.h
>  create mode 100644 include/hw/rx/rx.h
>  create mode 100644 include/hw/rx/rx62n.h
>  create mode 100644 include/hw/timer/renesas_cmt.h
>  create mode 100644 include/hw/timer/renesas_tmr.h
>  create mode 100644 target/rx/Makefile.objs
>  create mode 100644 target/rx/cpu-qom.h
>  create mode 100644 target/rx/cpu.c
>  create mode 100644 target/rx/cpu.h
>  create mode 100644 target/rx/disas.c
>  create mode 100644 target/rx/gdbstub.c
>  create mode 100644 target/rx/helper.c
>  create mode 100644 target/rx/helper.h
>  create mode 100644 target/rx/insns.decode
>  create mode 100644 target/rx/monitor.c
>  create mode 100644 target/rx/op_helper.c
>  create mode 100644 target/rx/translate.c
>
Yoshinori Sato March 3, 2019, 8:39 a.m. UTC | #3
On Sun, 03 Mar 2019 03:51:14 +0900,
Philippe Mathieu-Daudé wrote:
> 
> Hi Yoshinori,
> 
> On 3/2/19 7:21 AM, Yoshinori Sato wrote:
> > Hello.
> > This patch series is added Renesas RX target emulation.
> > 
> > My git repository is bellow.
> > git://git.pf.osdn.net/gitroot/y/ys/ysato/qemu.git
> > 
> > Since my understanding is not enough,
> > I want many comments to make this a good one.
> 
> OK :)
> 
> Can you provide notes about how to test your port?
> Such: links to toolchains, how to build, what firmware/OS we can run...

OK.
toolchain - rx-unknown-linux
binutils-2.32 include rx-unknown-linux support.
$ configure --target=rx-unknown-linux
$ make

gcc - please use my git repo.
git://git.pf.osdn.net/gitroot/y/ys/ysato/gcc.git rx-trunk
$ configure --target=rx-unknown-linux --enable-languages=c --disable-shared \
--disable-threads --with-uclibc --disable-libssp --disable-libquadmath \
--disable-libgomp --disable-libatomic
$ make

This toolchain can build u-boot / linux.

target program.
u-boot
git://git.pf.osdn.net/gitroot/y/ys/ysato/uboot.git rx
pre build binary in bellow.
https://osdn.net/users/ysato/pf/qemu/dl/u-boot.bin

linux
git://git.osdn.net/gitroot/uclinux-h8/linux.git rx
https://osdn.net/users/ysato/pf/qemu/dl/zImage

Since linux is still incomplete, it may be problematic.

> > 
> > Thanks.
> > 
> > Changes v2
> > Rewrite translate. using decodetree.py
> > 
> > Yoshinori Sato (11):
> >   target/rx: TCG Translation
> >   target/rx: TCG helper
> >   target/rx: CPU definition
> >   target/rx: RX disassembler
> >   target/rx: miscellaneous functions
> >   RX62N interrupt contorol uint
> >   RX62N internal timer modules
> >   RX62N internal serial communication interface
> >   RX Target hardware definition
> >   Add rx-softmmu
> >   MAINTAINERS: Add RX entry.
> > 
> >  MAINTAINERS                    |   20 +
> >  arch_init.c                    |    2 +
> >  configure                      |    8 +
> >  default-configs/rx-softmmu.mak |    7 +
> >  hw/char/Makefile.objs          |    2 +-
> >  hw/char/renesas_sci.c          |  288 ++++++
> >  hw/intc/Makefile.objs          |    1 +
> >  hw/intc/rx_icu.c               |  323 ++++++
> >  hw/rx/Makefile.objs            |    1 +
> >  hw/rx/rx62n.c                  |  227 ++++
> >  hw/rx/rxqemu.c                 |  100 ++
> >  hw/timer/Makefile.objs         |    2 +
> >  hw/timer/renesas_cmt.c         |  235 +++++
> >  hw/timer/renesas_tmr.c         |  412 ++++++++
> >  include/disas/bfd.h            |    5 +
> >  include/hw/char/renesas_sci.h  |   42 +
> >  include/hw/intc/rx_icu.h       |   49 +
> >  include/hw/rx/rx.h             |    7 +
> >  include/hw/rx/rx62n.h          |   54 +
> >  include/hw/timer/renesas_cmt.h |   33 +
> >  include/hw/timer/renesas_tmr.h |   42 +
> >  include/sysemu/arch_init.h     |    1 +
> >  target/rx/Makefile.objs        |   11 +
> >  target/rx/cpu-qom.h            |   52 +
> >  target/rx/cpu.c                |  224 ++++
> >  target/rx/cpu.h                |  214 ++++
> >  target/rx/disas.c              | 1570 ++++++++++++++++++++++++++++
> >  target/rx/gdbstub.c            |  113 ++
> >  target/rx/helper.c             |  252 +++++
> >  target/rx/helper.h             |   39 +
> >  target/rx/insns.decode         |  336 ++++++
> >  target/rx/monitor.c            |   38 +
> >  target/rx/op_helper.c          |  602 +++++++++++
> >  target/rx/translate.c          | 2220 ++++++++++++++++++++++++++++++++++++++++
> >  34 files changed, 7531 insertions(+), 1 deletion(-)
> >  create mode 100644 default-configs/rx-softmmu.mak
> >  create mode 100644 hw/char/renesas_sci.c
> >  create mode 100644 hw/intc/rx_icu.c
> >  create mode 100644 hw/rx/Makefile.objs
> >  create mode 100644 hw/rx/rx62n.c
> >  create mode 100644 hw/rx/rxqemu.c
> >  create mode 100644 hw/timer/renesas_cmt.c
> >  create mode 100644 hw/timer/renesas_tmr.c
> >  create mode 100644 include/hw/char/renesas_sci.h
> >  create mode 100644 include/hw/intc/rx_icu.h
> >  create mode 100644 include/hw/rx/rx.h
> >  create mode 100644 include/hw/rx/rx62n.h
> >  create mode 100644 include/hw/timer/renesas_cmt.h
> >  create mode 100644 include/hw/timer/renesas_tmr.h
> >  create mode 100644 target/rx/Makefile.objs
> >  create mode 100644 target/rx/cpu-qom.h
> >  create mode 100644 target/rx/cpu.c
> >  create mode 100644 target/rx/cpu.h
> >  create mode 100644 target/rx/disas.c
> >  create mode 100644 target/rx/gdbstub.c
> >  create mode 100644 target/rx/helper.c
> >  create mode 100644 target/rx/helper.h
> >  create mode 100644 target/rx/insns.decode
> >  create mode 100644 target/rx/monitor.c
> >  create mode 100644 target/rx/op_helper.c
> >  create mode 100644 target/rx/translate.c
> > 
>
Richard Henderson March 8, 2019, 1:24 a.m. UTC | #4
On 3/1/19 10:21 PM, Yoshinori Sato wrote:
> My git repository is bellow.
> git://git.pf.osdn.net/gitroot/y/ys/ysato/qemu.git

Somehow patch 1 did not arrive, so I am reviewing based on
rebasing this branch against master, and then looking at the diff.

> +struct CCop;

Unused?

> +static inline void cpu_get_tb_cpu_state(CPURXState *env, target_ulong *pc,
> +                                        target_ulong *cs_base, uint32_t *flags)
> +{
> +    *pc = env->pc;
> +    *cs_base = 0;
> +    *flags = 0;
> +}

*flags should contain PSW[PM], I think, so that knowledge of the
privilege level is given to the translator.

Looking forward I see that you're currently testing cpu_psw_pm dynamically for
instructions that require privs, so what you have is not wrong, but this is
exactly the sort of thing that TB flags are for.

> +#define RX_PSW_OP_NEG 4

Unused?

> +#define RX_BYTE 0
> +#define RX_WORD 1
> +#define RX_LONG 2

Redundant with TCGMemOps: MO_8, MO_16, MO_32?

> +++ b/target/rx/insns.decode

Should have a copyright + license notice.

> +BCnd_b         0010 cd:4 dsp:8                         &bcnd
...
> +#BRA_b         0010 1110 dsp:8         # overlaps BCnd_b

FYI, using pattern groups this can be written

{
  BRA_b        0010 1110 dsp:8
  Bcnd_b       0010 cd:4 dsp:8
}

I expect to have pattern groups merged this week.

> +static void rx_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
> +{
> +    DisasContext *ctx = container_of(dcbase, DisasContext, base);
> +
> +    ctx->src = tcg_temp_new();
> +}

This allocation will not survive across a branch or label.
So, after any SHIFTR_REG, for instance.

I think you need to allocate this on demand each insn, and
free it as necessary after each insn.

(Although avoiding branches in tcg is always helpful.)

> +/* generic load / store wrapper */
> +static inline void rx_gen_ldst(unsigned int size, unsigned int dir,
> +                        TCGv reg, TCGv mem)
> +{
> +    if (dir) {
> +        tcg_gen_qemu_ld_i32(reg, mem, 0, size | MO_SIGN | MO_TE);
> +    } else {
> +        tcg_gen_qemu_st_i32(reg, mem, 0, size | MO_TE);
> +    }
> +}

It would probably be worthwhile to split this function, and drop the "dir"
parameter.  It is always a constant, so instead of

  rx_gen_ldst(mi, RX_MEMORY_LD, ctx->src, ctx->src);
  rx_gen_ldst(a->sz, RX_MEMORY_ST, cpu_regs[a->rs], ctx->src);

use the shorter

  rx_gen_ld(mi, ctx->src, ctx->src);
  rx_gen_st(a->sz, cpu_regs[a->rs], ctx->src);


> +/* mov.l #uimm4,rd */
> +/* mov.l #uimm8,rd */
> +static bool trans_MOV_ri(DisasContext *ctx, arg_MOV_ri *a)
> +{
> +    tcg_gen_movi_i32(cpu_regs[a->rd], a->imm & 0xff);
> +    return true;
> +}

a->imm will already have been masked by the decode.
You can drop the & 0xff here.

> +/* mov.l #imm,rd */
> +static bool trans_MOV_rli(DisasContext *ctx, arg_MOV_rli *a)
> +{
> +    tcg_gen_movi_i32(cpu_regs[a->rd], a->imm);
> +    return true;
> +}

As written, this function is redundant.  We should be using the same MOV_ri,
with the immediate loaded from %b2_li_2.

Likewise for trans_MOV_mi vs trans_MOV_mli.  That said...

> +static bool trans_MOV_mli(DisasContext *ctx, arg_MOV_mli *a)
> +{
> +    TCGv imm = tcg_const_i32(a->imm);
> +    if (a->ld == 2) {
> +        a->dsp = bswap_16(a->dsp);
> +    }

This suggests that the decode is incorrect.  Or perhaps the construction of the
32-bit insn passed into decode.  In decode_load_bytes, we construct a
big-endian value, so it would seem this dsp field should be loaded as a
little-endian value.

This could be fixed by not attempting to load the LI constant in decodetree
(for this insn), which would in turn not require that you decode the LD operand
by hand in decodetree.  E.g.

static bool trans_MOV_mli(DisasContext *ctx, arg_MOV_mli *a)
{
    TCGv imm;

    /* dsp operand comes before immediate operand */
    rx_index_addr(a->ld, a->sz, a->rd, s);
    imm = tcg_const_i32(li(s, a->li));
    rx_gen_st(a->sz, imm, ctx->src);
    tcg_temp_free_i32(imm);
    return true;
}

This will be easiest if you ever support the big-endian version of RX.

> +/* push rs */
> +static bool trans_PUSH_r(DisasContext *ctx, arg_PUSH_r *a)
> +{
> +    if (a->rs != 0) {
> +        tcg_gen_subi_i32(cpu_regs[0], cpu_regs[0], 4);
> +        rx_gen_ldst(a->sz, RX_MEMORY_ST, cpu_regs[a->rs], cpu_regs[0]);
> +    } else {
> +        tcg_gen_mov_i32(ctx->src, cpu_regs[a->rs]);
> +        tcg_gen_subi_i32(cpu_regs[0], cpu_regs[0], 4);
> +        rx_gen_ldst(a->sz, RX_MEMORY_ST, ctx->src, cpu_regs[0]);
> +    }
> +    return true;

As far as I can see the THEN and ELSE cases have identical operation.

> +static bool trans_XCHG_rl(DisasContext *ctx, arg_XCHG_rl *a)
> +{
> +    int sz;
> +    TCGv tmp;
> +    tmp = tcg_temp_new();
> +    if (a->ld == 3) {
> +       /* xchg rs,rd */
> +        tcg_gen_mov_i32(tmp, cpu_regs[a->rs]);
> +        tcg_gen_mov_i32(cpu_regs[a->rs], cpu_regs[a->rd]);
> +        tcg_gen_mov_i32(cpu_regs[a->rd], tmp);
> +    } else {
> +        switch (a->mi) {
> +        case 0 ... 2:
> +            rx_index_addr(a->ld, a->mi, a->rs, ctx);
> +            rx_gen_ldst(a->mi, RX_MEMORY_LD, tmp, ctx->src);
> +            sz = a->mi;
> +            break;
> +        case 3:
> +            rx_index_addr(a->ld, RX_MEMORY_WORD, a->rs, ctx);
> +            rx_gen_ldu(RX_MEMORY_WORD, tmp, ctx->src);
> +            sz = RX_MEMORY_WORD;
> +            break;
> +        case 4:
> +            rx_index_addr(a->ld, RX_MEMORY_BYTE, a->rs, ctx);
> +            rx_gen_ldu(RX_MEMORY_BYTE, tmp, ctx->src);
> +            sz = RX_MEMORY_BYTE;
> +            break;
> +        }
> +        rx_gen_ldst(sz, RX_MEMORY_ST, cpu_regs[a->rd], ctx->src);
> +        tcg_gen_mov_i32(cpu_regs[a->rd], tmp);
> +    }

While I doubt that there is an SMP RX cpu, I think this should still implement
an atomic xchg operation.  This will allow rx-linux-user to run on SMP host
cpus.  Thus use

static inline TCGMemOp mi_to_mop(unsigned mi)
{
    static const TCGMemOp mop[5] = { MO_SB, MO_SW, MO_UL, MO_UH, MO_UB };
    tcg_debug_assert(mi < 5);
    return mop[mi];
}

  tcg_gen_atomic_xchg_i32(cpu_regs[a->rd], ctx->src, cpu_regs[a->rd],
                          0, mi_to_mop(a->mi));


> +#define STCOND(cond)                                                    \
> +    do {                                                                \
> +        tcg_gen_movi_i32(ctx->imm, a->imm);            \
> +        tcg_gen_movi_i32(ctx->one, 1);                                  \
> +        tcg_gen_movcond_i32(cond, cpu_regs[a->rd], cpu_psw_z,           \
> +                            ctx->one, ctx->imm, cpu_regs[a->rd]);       \
> +        return true;                                                    \
> +    } while(0)

Unused?  This seems close to what you wanted instead of...

> +#define STZFN(maskop)                                                   \
> +    do {                                                                \
> +        TCGv mask, imm;                                                 \
> +        mask = tcg_temp_new();                                          \
> +        imm = tcg_temp_new();                                           \
> +        maskop;                                                         \
> +        tcg_gen_andi_i32(imm, mask, a->imm);                            \
> +        tcg_gen_andc_i32(cpu_regs[a->rd], cpu_regs[a->rd], mask);       \
> +        tcg_gen_or_i32(cpu_regs[a->rd], cpu_regs[a->rd], imm);          \
> +        tcg_temp_free(mask);                                            \
> +        tcg_temp_free(imm);                                             \
> +        return true;                                                    \
> +    } while(0)

... this.

  TCGv zero = tcg_const_i32(0);
  TCGv imm = tcg_const_i32(a->imm);
  tcg_gen_movcond(COND, cpu_regs[a->rd], cpu_psw_z, zero,
                  imm, cpu_regs[a->rd]);

where trans_STZ passes TCG_COND_NE and STNZ passes TCG_COND_EQ.

In addition, there's no point in using a #define when a function would work
just as well.

> +#define GEN_LOGIC_OP(opr, ret, arg1, arg2) \
> +    do {                                                                \
> +        opr(ret, arg1, arg2);                                       \
> +        tcg_gen_setcondi_i32(TCG_COND_EQ, cpu_psw_z, ret, 0);           \
> +        tcg_gen_setcondi_i32(TCG_COND_GEU, cpu_psw_s, ret, 0x80000000UL); \
> +    } while(0)

Note that the second is the same as

  tcg_gen_setcondi_i32(TCG_COND_LT, cpu_psw_s, ret, 0);
or
  tcg_gen_shri_i32(cpu_psw_s, ret, 31);

That said, this extra computation is exactly why ARM (and others) represent the
flag differently in the field.  The inline setting would be

  tcg_gen_mov_i32(cpu_psw_z, ret);
  tcg_gen_mov_i32(cpu_psw_s, ret);

And then whenever we *use* these flags, we only look at the sign bit of S, and
we see if the whole of Z is zero/non-zero.  Similarly, the overflow flag is
also stored in the sign bit of O, so that is computed more easily:

  O = (ret ^ in1) & ~(in1 & in2)

with no right-shift required at the end.

This also means that we've reduced the amount of inline computation to the
point that there is no point in *not* doing the computation every time it is
required, and so update_psw_o and cpu->psw_v[] goes away.

Again, no reason for a macro when a function will do.
They are easier to debug.

> +    rs = (a->rs2 < 16) ? a->rs2 : a->rd;

Is this due to

> @b2_rds_uimm4   .... .... imm:4 rd:4            &rri rs2=255 len=2

this?  A better decode is

@b2_rds_uimm4   .... .... imm:4 rd:4    &rri rs2=%b2_r_0 len=2

i.e. extract the rd field twice.  Once directly into rd and again from the same
bits into rs2.

> +/* revw rs, rd */
> +static bool trans_REVW(DisasContext *ctx, arg_REVW *a)
> +{
> +    TCGv hi, lo;
> +
> +    hi = tcg_temp_new();
> +    lo = tcg_temp_new();
> +    tcg_gen_shri_i32(hi, cpu_regs[a->rs], 16);
> +    tcg_gen_bswap16_i32(hi, hi);
> +    tcg_gen_shli_i32(hi, hi, 16);
> +    tcg_gen_bswap16_i32(lo, cpu_regs[a->rs]);
> +    tcg_gen_or_i32(cpu_regs[a->rd], hi, lo);
> +    tcg_temp_free(hi);
> +    tcg_temp_free(lo);
> +    return true;
> +}

The bswap16 primitive requires the input to be zero-extended.
Thus this second bswap16 may fail.  Perhaps better as

  T0 = 0x00ff00ff;
  T1 = RS & T0;
  T2 = RS >> 8;
  T1 = T1 << 8;
  T2 = T2 & T0;
  RD = T1 | T2;

> +/* conditional branch helper */
> +static void rx_bcnd_main(DisasContext *ctx, int cd, int dst, int len)
> +{
> +        TCGv t, f, cond;
> +        switch(cd) {
> +        case 0 ... 13:

Indentation is off?

> +static bool trans_BCnd_s(DisasContext *ctx, arg_BCnd_s *a)
> +{
> +    if (a->dsp < 3)
> +        a->dsp += 8;

This bit should probably be in the decode, via !function.

> +static int16_t rev16(uint16_t dsp)
> +{
> +    return ((dsp << 8) & 0xff00) | ((dsp >> 8) & 0x00ff);
> +}
> +
> +static int32_t rev24(uint32_t dsp)
> +{
> +    dsp = ((dsp << 16) & 0xff0000) |
> +        (dsp & 0x00ff00) |
> +        ((dsp >> 16) & 0x0000ff);
> +    dsp |= (dsp & 0x00800000) ? 0xff000000 : 0x00000000;
> +    return dsp;
> +}

These could also be in the decode, probably with %some_field.
But as with MOV above, maybe ought to be done via an explicit
call to li(), in order to get a (potential) big-endian RX to work.

Also, watch out for CODING_STYLE violations.
Please recheck everything with ./scripts/checkpatch.pl.

> +/* mvtachi rs */
> +static bool trans_MVTACHI(DisasContext *ctx, arg_MVTACHI *a)
> +{
> +    TCGv_i32 hi, lo;
> +    hi = tcg_temp_new_i32();
> +    lo = tcg_temp_new_i32();
> +    tcg_gen_extr_i64_i32(lo, hi, cpu_acc);
> +    tcg_gen_concat_i32_i64(cpu_acc, lo, cpu_regs[a->rs]);
> +    tcg_temp_free_i32(hi);
> +    tcg_temp_free_i32(lo);
> +    return true;
> +}

Hmm.  How about

   TCGv_i64 rs64 = tcg_temp_new_i64();
   tcg_gen_extu_i32_i64(rs64, cpu_regs[a->rs]);
   tcg_gen_deposit_i64(cpu_acc, cpu_acc, rs64, 32, 32);

> +static inline void clrsetpsw(int dst, int mode)
> +{
> +    TCGv psw[] = {
> +        cpu_psw_c, cpu_psw_z, cpu_psw_s, cpu_psw_o,
> +        NULL, NULL, NULL, NULL,
> +        cpu_psw_i, cpu_psw_u, NULL, NULL,
> +        NULL, NULL, NULL, NULL
> +    };
> +    TCGLabel *skip;
> +
> +    skip = gen_new_label();
> +    if (dst >= 8)
> +        tcg_gen_brcondi_i32(TCG_COND_NE, cpu_psw_pm, 0, skip);
> +    tcg_gen_movi_i32(psw[dst], mode);
> +    if (dst == 3)
> +        tcg_gen_movi_i32(cpu_pswop, RX_PSW_OP_NONE);
> +    gen_set_label(skip);
> +}

When clearing psw_i, you'll need to exit to the main loop so that any pending
interrupts are recognized.  Otherwise you risk going around in a loop and never
delivering the interrupt at all.

> +/* rtfi */
> +static bool trans_RTFI(DisasContext *ctx, arg_RTFI *a)
> +{
> +    check_previleged();
> +    tcg_gen_mov_i32(cpu_pc, cpu_bpc);
> +    tcg_gen_mov_i32(cpu_psw, cpu_bpsw);
> +    gen_helper_unpack_psw(cpu_env);
> +    ctx->base.is_jmp = DISAS_JUMP;
> +    return true;
> +}

Likewise, you need to exit to the main loop, because psw_i changed.

> +    cpu_pswop = tcg_global_mem_new_i32(cpu_env,
> +                                     offsetof(CPURXState, psw_op), "");
> +    for (i = 0; i < 3; i++) {
> +        cpu_pswop_v[i] = tcg_global_mem_new_i32(cpu_env,
> +                                                offsetof(CPURXState, psw_v[i]), "");

Using the empty string for the name of a temp is cruel.
That'll be very hard to debug.

> +uint32_t psw_cond(CPURXState *env, uint32_t cond)

In general this can be inlined fairly easily.
See, for instance, target/arm/translate.c, arm_test_cc.

> +uint32_t helper_div(CPURXState *env, uint32_t num, uint32_t den)
> +{
> +    uint32_t ret = num;
> +    if (den != 0) {
> +        ret = (int32_t)num / (int32_t)den;
> +    }
> +    env->psw_o = (ret == 0 || den == 0);

That is not the correct overflow condition.
ret == 0 is irrelevant, but INT_MIN / -1 is relevant.


Finally, this was way too big to review properly.  This patch set will need to
be broken up into smaller pieces.  Perhaps introducing just one, or a few
related instructions with each patch.


r~
Yoshinori Sato March 20, 2019, 2:05 p.m. UTC | #5
On Fri, 08 Mar 2019 10:24:23 +0900,
Richard Henderson wrote:
> 
> On 3/1/19 10:21 PM, Yoshinori Sato wrote:
> > My git repository is bellow.
> > git://git.pf.osdn.net/gitroot/y/ys/ysato/qemu.git
> 
> Somehow patch 1 did not arrive, so I am reviewing based on
> rebasing this branch against master, and then looking at the diff.
> 
> > +struct CCop;
> 
> Unused?

Yes. I forgot remove.
Remove it.

> > +static inline void cpu_get_tb_cpu_state(CPURXState *env, target_ulong *pc,
> > +                                        target_ulong *cs_base, uint32_t *flags)
> > +{
> > +    *pc = env->pc;
> > +    *cs_base = 0;
> > +    *flags = 0;
> > +}
> 
> *flags should contain PSW[PM], I think, so that knowledge of the
> privilege level is given to the translator.
> 
> Looking forward I see that you're currently testing cpu_psw_pm dynamically for
> instructions that require privs, so what you have is not wrong, but this is
> exactly the sort of thing that TB flags are for.

OK.
Update PSW.PM copy to DisasContextBase.flags.

> > +#define RX_PSW_OP_NEG 4
> 
> Unused?

Yes. Remove it.

> > +#define RX_BYTE 0
> > +#define RX_WORD 1
> > +#define RX_LONG 2
> 
> Redundant with TCGMemOps: MO_8, MO_16, MO_32?

OK. Convert it.

> > +++ b/target/rx/insns.decode
> 
> Should have a copyright + license notice.
> 
> > +BCnd_b         0010 cd:4 dsp:8                         &bcnd
> ...
> > +#BRA_b         0010 1110 dsp:8         # overlaps BCnd_b
> 
> FYI, using pattern groups this can be written
> 
> {
>   BRA_b        0010 1110 dsp:8
>   Bcnd_b       0010 cd:4 dsp:8
> }
> 
> I expect to have pattern groups merged this week.

OK. Update pattern groups.

> > +static void rx_tr_init_disas_context(DisasContextBase *dcbase, CPUState *cs)
> > +{
> > +    DisasContext *ctx = container_of(dcbase, DisasContext, base);
> > +
> > +    ctx->src = tcg_temp_new();
> > +}
> 
> This allocation will not survive across a branch or label.
> So, after any SHIFTR_REG, for instance.
> 
> I think you need to allocate this on demand each insn, and
> free it as necessary after each insn.
> 
> (Although avoiding branches in tcg is always helpful.)

OK. Remove it.

> > +/* generic load / store wrapper */
> > +static inline void rx_gen_ldst(unsigned int size, unsigned int dir,
> > +                        TCGv reg, TCGv mem)
> > +{
> > +    if (dir) {
> > +        tcg_gen_qemu_ld_i32(reg, mem, 0, size | MO_SIGN | MO_TE);
> > +    } else {
> > +        tcg_gen_qemu_st_i32(reg, mem, 0, size | MO_TE);
> > +    }
> > +}
> 
> It would probably be worthwhile to split this function, and drop the "dir"
> parameter.  It is always a constant, so instead of
> 
>   rx_gen_ldst(mi, RX_MEMORY_LD, ctx->src, ctx->src);
>   rx_gen_ldst(a->sz, RX_MEMORY_ST, cpu_regs[a->rs], ctx->src);
> 
> use the shorter
> 
>   rx_gen_ld(mi, ctx->src, ctx->src);
>   rx_gen_st(a->sz, cpu_regs[a->rs], ctx->src);

OK. fixed

> 
> > +/* mov.l #uimm4,rd */
> > +/* mov.l #uimm8,rd */
> > +static bool trans_MOV_ri(DisasContext *ctx, arg_MOV_ri *a)
> > +{
> > +    tcg_gen_movi_i32(cpu_regs[a->rd], a->imm & 0xff);
> > +    return true;
> > +}
> 
> a->imm will already have been masked by the decode.
> You can drop the & 0xff here.

That was right.
It seems that I was misunderstood.

> > +/* mov.l #imm,rd */
> > +static bool trans_MOV_rli(DisasContext *ctx, arg_MOV_rli *a)
> > +{
> > +    tcg_gen_movi_i32(cpu_regs[a->rd], a->imm);
> > +    return true;
> > +}
> 
> As written, this function is redundant.  We should be using the same MOV_ri,
> with the immediate loaded from %b2_li_2.
> 
> Likewise for trans_MOV_mi vs trans_MOV_mli.  That said...

OK. Unified various interger instructions.

> > +static bool trans_MOV_mli(DisasContext *ctx, arg_MOV_mli *a)
> > +{
> > +    TCGv imm = tcg_const_i32(a->imm);
> > +    if (a->ld == 2) {
> > +        a->dsp = bswap_16(a->dsp);
> > +    }
> 
> This suggests that the decode is incorrect.  Or perhaps the construction of the
> 32-bit insn passed into decode.  In decode_load_bytes, we construct a
> big-endian value, so it would seem this dsp field should be loaded as a
> little-endian value.
> 
> This could be fixed by not attempting to load the LI constant in decodetree
> (for this insn), which would in turn not require that you decode the LD operand
> by hand in decodetree.  E.g.
> 
> static bool trans_MOV_mli(DisasContext *ctx, arg_MOV_mli *a)
> {
>     TCGv imm;
> 
>     /* dsp operand comes before immediate operand */
>     rx_index_addr(a->ld, a->sz, a->rd, s);
>     imm = tcg_const_i32(li(s, a->li));
>     rx_gen_st(a->sz, imm, ctx->src);
>     tcg_temp_free_i32(imm);
>     return true;
> }
> 
> This will be easiest if you ever support the big-endian version of RX.

OK. fixed another way.
But RX big-endian mode only data access.
So operand value always little-endian order.

> > +/* push rs */
> > +static bool trans_PUSH_r(DisasContext *ctx, arg_PUSH_r *a)
> > +{
> > +    if (a->rs != 0) {
> > +        tcg_gen_subi_i32(cpu_regs[0], cpu_regs[0], 4);
> > +        rx_gen_ldst(a->sz, RX_MEMORY_ST, cpu_regs[a->rs], cpu_regs[0]);
> > +    } else {
> > +        tcg_gen_mov_i32(ctx->src, cpu_regs[a->rs]);
> > +        tcg_gen_subi_i32(cpu_regs[0], cpu_regs[0], 4);
> > +        rx_gen_ldst(a->sz, RX_MEMORY_ST, ctx->src, cpu_regs[0]);
> > +    }
> > +    return true;
> 
> As far as I can see the THEN and ELSE cases have identical operation.

It little different.
In the case of r0, the value before decrementing is stored in memory.
I added comment.

> > +static bool trans_XCHG_rl(DisasContext *ctx, arg_XCHG_rl *a)
> > +{
> > +    int sz;
> > +    TCGv tmp;
> > +    tmp = tcg_temp_new();
> > +    if (a->ld == 3) {
> > +       /* xchg rs,rd */
> > +        tcg_gen_mov_i32(tmp, cpu_regs[a->rs]);
> > +        tcg_gen_mov_i32(cpu_regs[a->rs], cpu_regs[a->rd]);
> > +        tcg_gen_mov_i32(cpu_regs[a->rd], tmp);
> > +    } else {
> > +        switch (a->mi) {
> > +        case 0 ... 2:
> > +            rx_index_addr(a->ld, a->mi, a->rs, ctx);
> > +            rx_gen_ldst(a->mi, RX_MEMORY_LD, tmp, ctx->src);
> > +            sz = a->mi;
> > +            break;
> > +        case 3:
> > +            rx_index_addr(a->ld, RX_MEMORY_WORD, a->rs, ctx);
> > +            rx_gen_ldu(RX_MEMORY_WORD, tmp, ctx->src);
> > +            sz = RX_MEMORY_WORD;
> > +            break;
> > +        case 4:
> > +            rx_index_addr(a->ld, RX_MEMORY_BYTE, a->rs, ctx);
> > +            rx_gen_ldu(RX_MEMORY_BYTE, tmp, ctx->src);
> > +            sz = RX_MEMORY_BYTE;
> > +            break;
> > +        }
> > +        rx_gen_ldst(sz, RX_MEMORY_ST, cpu_regs[a->rd], ctx->src);
> > +        tcg_gen_mov_i32(cpu_regs[a->rd], tmp);
> > +    }
> 
> While I doubt that there is an SMP RX cpu, I think this should still implement
> an atomic xchg operation.  This will allow rx-linux-user to run on SMP host
> cpus.  Thus use
> 
> static inline TCGMemOp mi_to_mop(unsigned mi)
> {
>     static const TCGMemOp mop[5] = { MO_SB, MO_SW, MO_UL, MO_UH, MO_UB };
>     tcg_debug_assert(mi < 5);
>     return mop[mi];
> }
> 
>   tcg_gen_atomic_xchg_i32(cpu_regs[a->rd], ctx->src, cpu_regs[a->rd],
>                           0, mi_to_mop(a->mi));
>

OK.

> 
> > +#define STCOND(cond)                                                    \
> > +    do {                                                                \
> > +        tcg_gen_movi_i32(ctx->imm, a->imm);            \
> > +        tcg_gen_movi_i32(ctx->one, 1);                                  \
> > +        tcg_gen_movcond_i32(cond, cpu_regs[a->rd], cpu_psw_z,           \
> > +                            ctx->one, ctx->imm, cpu_regs[a->rd]);       \
> > +        return true;                                                    \
> > +    } while(0)
> 
> Unused?  This seems close to what you wanted instead of...

Yes. Removed.

> > +#define STZFN(maskop)                                                   \
> > +    do {                                                                \
> > +        TCGv mask, imm;                                                 \
> > +        mask = tcg_temp_new();                                          \
> > +        imm = tcg_temp_new();                                           \
> > +        maskop;                                                         \
> > +        tcg_gen_andi_i32(imm, mask, a->imm);                            \
> > +        tcg_gen_andc_i32(cpu_regs[a->rd], cpu_regs[a->rd], mask);       \
> > +        tcg_gen_or_i32(cpu_regs[a->rd], cpu_regs[a->rd], imm);          \
> > +        tcg_temp_free(mask);                                            \
> > +        tcg_temp_free(imm);                                             \
> > +        return true;                                                    \
> > +    } while(0)
> 
> ... this.
> 
>   TCGv zero = tcg_const_i32(0);
>   TCGv imm = tcg_const_i32(a->imm);
>   tcg_gen_movcond(COND, cpu_regs[a->rd], cpu_psw_z, zero,
>                   imm, cpu_regs[a->rd]);
> 
> where trans_STZ passes TCG_COND_NE and STNZ passes TCG_COND_EQ.
> 
> In addition, there's no point in using a #define when a function would work
> just as well.

OK. fixed.

> > +#define GEN_LOGIC_OP(opr, ret, arg1, arg2) \
> > +    do {                                                                \
> > +        opr(ret, arg1, arg2);                                       \
> > +        tcg_gen_setcondi_i32(TCG_COND_EQ, cpu_psw_z, ret, 0);           \
> > +        tcg_gen_setcondi_i32(TCG_COND_GEU, cpu_psw_s, ret, 0x80000000UL); \
> > +    } while(0)
> 
> Note that the second is the same as
> 
>   tcg_gen_setcondi_i32(TCG_COND_LT, cpu_psw_s, ret, 0);
> or
>   tcg_gen_shri_i32(cpu_psw_s, ret, 31);
> 
> That said, this extra computation is exactly why ARM (and others) represent the
> flag differently in the field.  The inline setting would be
> 
>   tcg_gen_mov_i32(cpu_psw_z, ret);
>   tcg_gen_mov_i32(cpu_psw_s, ret);
> 
> And then whenever we *use* these flags, we only look at the sign bit of S, and
> we see if the whole of Z is zero/non-zero.  Similarly, the overflow flag is
> also stored in the sign bit of O, so that is computed more easily:
> 
>   O = (ret ^ in1) & ~(in1 & in2)
> 
> with no right-shift required at the end.
> 
> This also means that we've reduced the amount of inline computation to the
> point that there is no point in *not* doing the computation every time it is
> required, and so update_psw_o and cpu->psw_v[] goes away.

OK. fixed psw operation.

> Again, no reason for a macro when a function will do.
> They are easier to debug.

OK.

> > +    rs = (a->rs2 < 16) ? a->rs2 : a->rd;
> 
> Is this due to
> 
> > @b2_rds_uimm4   .... .... imm:4 rd:4            &rri rs2=255 len=2
> 
> this?  A better decode is
> 
> @b2_rds_uimm4   .... .... imm:4 rd:4    &rri rs2=%b2_r_0 len=2
> 
> i.e. extract the rd field twice.  Once directly into rd and again from the same
> bits into rs2.

OK. It works fine.

> > +/* revw rs, rd */
> > +static bool trans_REVW(DisasContext *ctx, arg_REVW *a)
> > +{
> > +    TCGv hi, lo;
> > +
> > +    hi = tcg_temp_new();
> > +    lo = tcg_temp_new();
> > +    tcg_gen_shri_i32(hi, cpu_regs[a->rs], 16);
> > +    tcg_gen_bswap16_i32(hi, hi);
> > +    tcg_gen_shli_i32(hi, hi, 16);
> > +    tcg_gen_bswap16_i32(lo, cpu_regs[a->rs]);
> > +    tcg_gen_or_i32(cpu_regs[a->rd], hi, lo);
> > +    tcg_temp_free(hi);
> > +    tcg_temp_free(lo);
> > +    return true;
> > +}
> 
> The bswap16 primitive requires the input to be zero-extended.
> Thus this second bswap16 may fail.  Perhaps better as
> 
>   T0 = 0x00ff00ff;
>   T1 = RS & T0;
>   T2 = RS >> 8;
>   T1 = T1 << 8;
>   T2 = T2 & T0;
>   RD = T1 | T2;

OK.

> > +/* conditional branch helper */
> > +static void rx_bcnd_main(DisasContext *ctx, int cd, int dst, int len)
> > +{
> > +        TCGv t, f, cond;
> > +        switch(cd) {
> > +        case 0 ... 13:
> 
> Indentation is off?

checkpatch.pl said "switch and case should be at the same indent".

> > +static bool trans_BCnd_s(DisasContext *ctx, arg_BCnd_s *a)
> > +{
> > +    if (a->dsp < 3)
> > +        a->dsp += 8;
> 
> This bit should probably be in the decode, via !function.

OK.

> > +static int16_t rev16(uint16_t dsp)
> > +{
> > +    return ((dsp << 8) & 0xff00) | ((dsp >> 8) & 0x00ff);
> > +}
> > +
> > +static int32_t rev24(uint32_t dsp)
> > +{
> > +    dsp = ((dsp << 16) & 0xff0000) |
> > +        (dsp & 0x00ff00) |
> > +        ((dsp >> 16) & 0x0000ff);
> > +    dsp |= (dsp & 0x00800000) ? 0xff000000 : 0x00000000;
> > +    return dsp;
> > +}
> 
> These could also be in the decode, probably with %some_field.
> But as with MOV above, maybe ought to be done via an explicit
> call to li(), in order to get a (potential) big-endian RX to work.

This field always little-endian. So it needed.

> Also, watch out for CODING_STYLE violations.
> Please recheck everything with ./scripts/checkpatch.pl.

OK.

> > +/* mvtachi rs */
> > +static bool trans_MVTACHI(DisasContext *ctx, arg_MVTACHI *a)
> > +{
> > +    TCGv_i32 hi, lo;
> > +    hi = tcg_temp_new_i32();
> > +    lo = tcg_temp_new_i32();
> > +    tcg_gen_extr_i64_i32(lo, hi, cpu_acc);
> > +    tcg_gen_concat_i32_i64(cpu_acc, lo, cpu_regs[a->rs]);
> > +    tcg_temp_free_i32(hi);
> > +    tcg_temp_free_i32(lo);
> > +    return true;
> > +}
> 
> Hmm.  How about
> 
>    TCGv_i64 rs64 = tcg_temp_new_i64();
>    tcg_gen_extu_i32_i64(rs64, cpu_regs[a->rs]);
>    tcg_gen_deposit_i64(cpu_acc, cpu_acc, rs64, 32, 32);

OK. Fixed it.

> > +static inline void clrsetpsw(int dst, int mode)
> > +{
> > +    TCGv psw[] = {
> > +        cpu_psw_c, cpu_psw_z, cpu_psw_s, cpu_psw_o,
> > +        NULL, NULL, NULL, NULL,
> > +        cpu_psw_i, cpu_psw_u, NULL, NULL,
> > +        NULL, NULL, NULL, NULL
> > +    };
> > +    TCGLabel *skip;
> > +
> > +    skip = gen_new_label();
> > +    if (dst >= 8)
> > +        tcg_gen_brcondi_i32(TCG_COND_NE, cpu_psw_pm, 0, skip);
> > +    tcg_gen_movi_i32(psw[dst], mode);
> > +    if (dst == 3)
> > +        tcg_gen_movi_i32(cpu_pswop, RX_PSW_OP_NONE);
> > +    gen_set_label(skip);
> > +}
> 
> When clearing psw_i, you'll need to exit to the main loop so that any pending
> interrupts are recognized.  Otherwise you risk going around in a loop and never
> delivering the interrupt at all.

OK.

> > +/* rtfi */
> > +static bool trans_RTFI(DisasContext *ctx, arg_RTFI *a)
> > +{
> > +    check_previleged();
> > +    tcg_gen_mov_i32(cpu_pc, cpu_bpc);
> > +    tcg_gen_mov_i32(cpu_psw, cpu_bpsw);
> > +    gen_helper_unpack_psw(cpu_env);
> > +    ctx->base.is_jmp = DISAS_JUMP;
> > +    return true;
> > +}
> 
> Likewise, you need to exit to the main loop, because psw_i changed.

OK. All psw_i changed instruction exit to loop.

> > +    cpu_pswop = tcg_global_mem_new_i32(cpu_env,
> > +                                     offsetof(CPURXState, psw_op), "");
> > +    for (i = 0; i < 3; i++) {
> > +        cpu_pswop_v[i] = tcg_global_mem_new_i32(cpu_env,
> > +                                                offsetof(CPURXState, psw_v[i]), "");
> 
> Using the empty string for the name of a temp is cruel.
> That'll be very hard to debug.

OK. Removed this variable.

> > +uint32_t psw_cond(CPURXState *env, uint32_t cond)
> 
> In general this can be inlined fairly easily.
> See, for instance, target/arm/translate.c, arm_test_cc.

OK. It very helpful.
I reworked psw operation.

> > +uint32_t helper_div(CPURXState *env, uint32_t num, uint32_t den)
> > +{
> > +    uint32_t ret = num;
> > +    if (den != 0) {
> > +        ret = (int32_t)num / (int32_t)den;
> > +    }
> > +    env->psw_o = (ret == 0 || den == 0);
> 
> That is not the correct overflow condition.
> ret == 0 is irrelevant, but INT_MIN / -1 is relevant.
>

Yes. It my mistake.
Fixed it.

> 
> Finally, this was way too big to review properly.  This patch set will need to
> be broken up into smaller pieces.  Perhaps introducing just one, or a few
> related instructions with each patch.
>

Thanks.
I reworked your comment.
I will send v4 patch.

> 
> r~
>
Richard Henderson March 21, 2019, 1:35 a.m. UTC | #6
On 3/20/19 7:05 AM, Yoshinori Sato wrote:
> OK. fixed another way.
> But RX big-endian mode only data access.
> So operand value always little-endian order.

Oh that is convenient.
Therefore the operand can always be put together by pieces.  E.g.

-%b4_dsp_16     0:16 !function=dsp16
-%b4_bdsp       0:24 !function=bdsp_a
+%b4_dsp16      0:s8 8:8
+%b4_dsp24      0:s8 8:8 16:8

Also note the 's' qualifier that defines signed fields.

-%b2_bdsp       16:8 !function=bdsp_b
...
-@b2_bcnd_b     .... cd:4 .... ....             &bcnd dsp=%b2_bdsp sz=2
-@b2_bra_b      .... .... .... ....             &jdsp dsp=%b2_bdsp sz=2
+@b2_bcnd_b     .... cd:4 dsp:s8                &bcnd sz=2
+@b2_bra_b      .... .... dsp:s8                &jdsp sz=2


>>> +/* push rs */
>>> +static bool trans_PUSH_r(DisasContext *ctx, arg_PUSH_r *a)
>>> +{
>>> +    if (a->rs != 0) {
>>> +        tcg_gen_subi_i32(cpu_regs[0], cpu_regs[0], 4);
>>> +        rx_gen_ldst(a->sz, RX_MEMORY_ST, cpu_regs[a->rs], cpu_regs[0]);
>>> +    } else {
>>> +        tcg_gen_mov_i32(ctx->src, cpu_regs[a->rs]);
>>> +        tcg_gen_subi_i32(cpu_regs[0], cpu_regs[0], 4);
>>> +        rx_gen_ldst(a->sz, RX_MEMORY_ST, ctx->src, cpu_regs[0]);
>>> +    }
>>> +    return true;
>>
>> As far as I can see the THEN and ELSE cases have identical operation.
> 
> It little different.
> In the case of r0, the value before decrementing is stored in memory.
> I added comment.

What I mean is that the sequence that you use for r0 could also be used for all
other rN.

I understand that RX does not have an mmu, but the normal way we handle this is

  tcg_gen_subi_i32(addr, cpu_regs[0], 4);
  rx_gen_st(a->sz, cpu_regs[a->rs], addr);
  tcg_gen_mov_i32(cpu_regs[0], addr);

so that the stack pointer is not modified if the store raises an exception.


r~
Yoshinori Sato March 22, 2019, 2:14 p.m. UTC | #7
On Thu, 21 Mar 2019 10:35:07 +0900,
Richard Henderson wrote:
> 
> On 3/20/19 7:05 AM, Yoshinori Sato wrote:
> > OK. fixed another way.
> > But RX big-endian mode only data access.
> > So operand value always little-endian order.
> 
> Oh that is convenient.
> Therefore the operand can always be put together by pieces.  E.g.
> 
> -%b4_dsp_16     0:16 !function=dsp16
> -%b4_bdsp       0:24 !function=bdsp_a
> +%b4_dsp16      0:s8 8:8
> +%b4_dsp24      0:s8 8:8 16:8
> 
> Also note the 's' qualifier that defines signed fields.
> 
> -%b2_bdsp       16:8 !function=bdsp_b
> ...
> -@b2_bcnd_b     .... cd:4 .... ....             &bcnd dsp=%b2_bdsp sz=2
> -@b2_bra_b      .... .... .... ....             &jdsp dsp=%b2_bdsp sz=2
> +@b2_bcnd_b     .... cd:4 dsp:s8                &bcnd sz=2
> +@b2_bra_b      .... .... dsp:s8                &jdsp sz=2
>

OK.

> 
> >>> +/* push rs */
> >>> +static bool trans_PUSH_r(DisasContext *ctx, arg_PUSH_r *a)
> >>> +{
> >>> +    if (a->rs != 0) {
> >>> +        tcg_gen_subi_i32(cpu_regs[0], cpu_regs[0], 4);
> >>> +        rx_gen_ldst(a->sz, RX_MEMORY_ST, cpu_regs[a->rs], cpu_regs[0]);
> >>> +    } else {
> >>> +        tcg_gen_mov_i32(ctx->src, cpu_regs[a->rs]);
> >>> +        tcg_gen_subi_i32(cpu_regs[0], cpu_regs[0], 4);
> >>> +        rx_gen_ldst(a->sz, RX_MEMORY_ST, ctx->src, cpu_regs[0]);
> >>> +    }
> >>> +    return true;
> >>
> >> As far as I can see the THEN and ELSE cases have identical operation.
> > 
> > It little different.
> > In the case of r0, the value before decrementing is stored in memory.
> > I added comment.
> 
> What I mean is that the sequence that you use for r0 could also be used for all
> other rN.
> 
> I understand that RX does not have an mmu, but the normal way we handle this is
> 
>   tcg_gen_subi_i32(addr, cpu_regs[0], 4);
>   rx_gen_st(a->sz, cpu_regs[a->rs], addr);
>   tcg_gen_mov_i32(cpu_regs[0], addr);
> 
> so that the stack pointer is not modified if the store raises an exception.
>

r0 is stack pointer.
The push / pop instructions read and write the address indicated by r0.

This part is complicated, so let's fix it a little more.
It should be able to expand into transfer instruction of
pre-decrement and post-increment.
> 
> r~
>