From patchwork Sun Mar 25 22:27:46 2012 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Richard Henderson X-Patchwork-Id: 148611 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (Client did not present a certificate) by ozlabs.org (Postfix) with ESMTPS id 71253B6F98 for ; Mon, 26 Mar 2012 10:35:26 +1100 (EST) Received: from localhost ([::1]:34058 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SBvwr-000673-HG for incoming@patchwork.ozlabs.org; Sun, 25 Mar 2012 18:30:05 -0400 Received: from eggs.gnu.org ([208.118.235.92]:51100) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SBvvi-0003qS-V6 for qemu-devel@nongnu.org; Sun, 25 Mar 2012 18:28:57 -0400 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1SBvvb-0007vF-ED for qemu-devel@nongnu.org; Sun, 25 Mar 2012 18:28:54 -0400 Received: from mail-pb0-f45.google.com ([209.85.160.45]:34267) by eggs.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1SBvva-0007uF-VR for qemu-devel@nongnu.org; Sun, 25 Mar 2012 18:28:47 -0400 Received: by mail-pb0-f45.google.com with SMTP id uo5so6347893pbc.4 for ; Sun, 25 Mar 2012 15:28:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=sender:from:to:cc:subject:date:message-id:x-mailer:in-reply-to :references; bh=X9cGkl2avji5/tVgdSIgKI/qOL5Me8Begr2IF39oLPQ=; b=bDyjI7g5dRwv6GB5e8TyorLmli6k+NnphRarJtf9Tx83hDUrOqJh888DgFGYGx/+8L 1+0lLNrW55Y14fTy2vQHMBcSB+IZDxZupK8Poj5akt3qItlgZgZOJowHxvhpv8rjbup7 1DtvkIfgMBGjL25EQYeyODBnFa2MolzEkeFFR87FWILqmBs5wak1Hr3zfA4nglMev3fm jEGAmw7V3ikZuTdPXwCHedUwuVp9shpd5iA3Vdc76+DooMFDqFJPAe1pauMX0xzNXWTm BbYko15Y5nV9dyh3YwyHw7tRknUHLolOCAjFG7gYj4zC0oikcRb9g4aVOw1PE/EFXKlU 9y7w== Received: by 10.68.136.162 with SMTP id qb2mr48344725pbb.67.1332714526120; Sun, 25 Mar 2012 15:28:46 -0700 (PDT) Received: from pebble.com ([173.160.232.49]) by mx.google.com with ESMTPS id l4sm11018797pbl.27.2012.03.25.15.28.44 (version=TLSv1/SSLv3 cipher=OTHER); Sun, 25 Mar 2012 15:28:45 -0700 (PDT) From: Richard Henderson To: qemu-devel@nongnu.org Date: Sun, 25 Mar 2012 15:27:46 -0700 Message-Id: <1332714477-30079-5-git-send-email-rth@twiddle.net> X-Mailer: git-send-email 1.7.7.6 In-Reply-To: <1332714477-30079-1-git-send-email-rth@twiddle.net> References: <1332714477-30079-1-git-send-email-rth@twiddle.net> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 209.85.160.45 Cc: Blue Swirl Subject: [Qemu-devel] [PATCH 04/15] tcg-sparc: Fix qemu_ld/st to handle 32-bit host. X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.14 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org At the same time, split out the tlb load logic to a new function. Fixes the cases of two data registers and two address registers. Fixes the signature of, and adds missing, qemu_ld/st opcodes. Signed-off-by: Richard Henderson --- tcg/sparc/tcg-target.c | 751 ++++++++++++++++++++++++------------------------ 1 files changed, 378 insertions(+), 373 deletions(-) diff --git a/tcg/sparc/tcg-target.c b/tcg/sparc/tcg-target.c index 257d20a..8763b03 100644 --- a/tcg/sparc/tcg-target.c +++ b/tcg/sparc/tcg-target.c @@ -448,14 +448,15 @@ static inline void tcg_out_addi(TCGContext *s, int reg, tcg_target_long val) } } -static inline void tcg_out_andi(TCGContext *s, int reg, tcg_target_long val) +static inline void tcg_out_andi(TCGContext *s, int rd, int rs, + tcg_target_long val) { if (val != 0) { if (check_fit_tl(val, 13)) - tcg_out_arithi(s, reg, reg, val, ARITH_AND); + tcg_out_arithi(s, rd, rs, val, ARITH_AND); else { tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_I5, val); - tcg_out_arith(s, reg, reg, TCG_REG_I5, ARITH_AND); + tcg_out_arith(s, rd, rs, TCG_REG_I5, ARITH_AND); } } } @@ -744,422 +745,405 @@ static const void * const qemu_st_helpers[4] = { __stq_mmu, }; #endif -#endif -#if TARGET_LONG_BITS == 32 -#define TARGET_LD_OP LDUW -#else -#define TARGET_LD_OP LDX -#endif +/* Perform the TLB load and compare. -#if defined(CONFIG_SOFTMMU) -#if HOST_LONG_BITS == 32 -#define TARGET_ADDEND_LD_OP LDUW -#else -#define TARGET_ADDEND_LD_OP LDX -#endif -#endif + Inputs: + ADDRLO_IDX contains the index into ARGS of the low part of the + address; the high part of the address is at ADDR_LOW_IDX+1. -#if TCG_TARGET_REG_BITS == 64 -#define HOST_LD_OP LDX -#define HOST_ST_OP STX -#define HOST_SLL_OP SHIFT_SLLX -#define HOST_SRA_OP SHIFT_SRAX + MEM_INDEX and S_BITS are the memory context and log2 size of the load. + + WHICH is the offset into the CPUTLBEntry structure of the slot to read. + This should be offsetof addr_read or addr_write. + + Outputs: + LABEL_PTRS is filled with the position of the forward jumps to the + TLB miss case. This will always be a ,PN insn, so a 19-bit offset. + + Returns a register loaded with the low part of the address, adjusted + as indicated by the TLB and so is a host address. Undefined in the + TLB miss case. */ + +static int tcg_out_tlb_load(TCGContext *s, int addrlo_idx, int mem_index, + int s_bits, const TCGArg *args, + uint32_t **label_ptr, int which) +{ + const int addrlo = args[addrlo_idx]; + const int r0 = tcg_target_call_iarg_regs[0]; + const int r1 = tcg_target_call_iarg_regs[1]; + const int r2 = tcg_target_call_iarg_regs[2]; + int addr = addrlo; + int tlb_ofs; + + if (TCG_TARGET_REG_BITS == 32 && TARGET_LONG_BITS == 64) { + /* Assemble the 64-bit address in R0. */ + tcg_out_arithi(s, r0, addrlo, 0, SHIFT_SRL); + tcg_out_arithi(s, r1, args[addrlo_idx + 1], 32, SHIFT_SLLX); + tcg_out_arith(s, r0, r0, r1, ARITH_OR); + } + + /* Shift the page number down to tlb-entry. */ + tcg_out_arithi(s, r1, addrlo, + TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS, SHIFT_SRL); + + /* Mask out the page offset, except for the required alignment. */ + tcg_out_andi(s, r0, addr, TARGET_PAGE_MASK | ((1 << s_bits) - 1)); + + /* Compute tlb index, modulo tlb size. */ + tcg_out_andi(s, r1, r1, (CPU_TLB_SIZE - 1) << CPU_TLB_ENTRY_BITS); + + /* Relative to the current ENV. */ + tcg_out_arith(s, r1, TCG_AREG0, r1, ARITH_ADD); + + /* Find a base address that can load both tlb comparator and addend. */ + tlb_ofs = offsetof(CPUArchState, tlb_table[mem_index][0]); + if (!check_fit_tl(tlb_ofs + sizeof(CPUTLBEntry), 13)) { + tcg_out_addi(s, r1, tlb_ofs); + tlb_ofs = 0; + } + + /* ld [arg1 + which], arg2 */ + tcg_out_ld(s, TCG_TYPE_TL, r2, r1, tlb_ofs + which); + + /* subcc arg0, arg2, %g0 */ + tcg_out_cmp(s, r0, r2, 0); + + /* bne,pn %[ix]cc, label0 */ + *label_ptr = (uint32_t *)s->code_ptr; + tcg_out32(s, (INSN_OP(0) | INSN_COND(COND_NE, 0) | INSN_OP2(0x1) | + ((TARGET_LONG_BITS == 64) << 21))); + + /* TLB Hit. Compute the host address into r1. The ld is in the + branch delay slot; harmless for the TLB miss case. */ + tcg_out_ld(s, TCG_TYPE_PTR, r1, r1, tlb_ofs+offsetof(CPUTLBEntry, addend)); + + if (TCG_TARGET_REG_BITS == 64 && TARGET_LONG_BITS == 32) { + tcg_out_arithi(s, r0, addrlo, 0, SHIFT_SRL); + tcg_out_arith(s, r1, r0, r1, ARITH_ADD); + } else { + tcg_out_arith(s, r1, addrlo, r1, ARITH_ADD); + } + + return r1; +} +#endif /* CONFIG_SOFTMMU */ + +static void tcg_out_qemu_ld_direct(TCGContext *s, int addr, int datalo, + int datahi, int sizeop) +{ +#ifdef TARGET_WORDS_BIGENDIAN + const int bigendian = 1; #else -#define HOST_LD_OP LDUW -#define HOST_ST_OP STW -#define HOST_SLL_OP SHIFT_SLL -#define HOST_SRA_OP SHIFT_SRA + const int bigendian = 0; #endif + switch (sizeop) { + case 0: + /* ldub [addr], datalo */ + tcg_out_ldst(s, datalo, addr, 0, LDUB); + break; + case 0 | 4: + /* ldsb [addr], datalo */ + tcg_out_ldst(s, datalo, addr, 0, LDSB); + break; + case 1: + if (bigendian) { + /* lduh [addr], datalo */ + tcg_out_ldst(s, datalo, addr, 0, LDUH); + } else { + /* lduha [addr] ASI_PRIMARY_LITTLE, datalo */ + tcg_out_ldst_asi(s, datalo, addr, 0, LDUHA, ASI_PRIMARY_LITTLE); + } + break; + case 1 | 4: + if (bigendian) { + /* ldsh [addr], datalo */ + tcg_out_ldst(s, datalo, addr, 0, LDSH); + } else { + /* ldsha [addr] ASI_PRIMARY_LITTLE, datalo */ + tcg_out_ldst_asi(s, datalo, addr, 0, LDSHA, ASI_PRIMARY_LITTLE); + } + break; + case 2: + if (bigendian) { + /* lduw [addr], datalo */ + tcg_out_ldst(s, datalo, addr, 0, LDUW); + } else { + /* lduwa [addr] ASI_PRIMARY_LITTLE, datalo */ + tcg_out_ldst_asi(s, datalo, addr, 0, LDUWA, ASI_PRIMARY_LITTLE); + } + break; + case 2 | 4: + if (bigendian) { + /* ldsw [addr], datalo */ + tcg_out_ldst(s, datalo, addr, 0, LDSW); + } else { + /* ldswa [addr] ASI_PRIMARY_LITTLE, datalo */ + tcg_out_ldst_asi(s, datalo, addr, 0, LDSWA, ASI_PRIMARY_LITTLE); + } + break; + case 3: + if (TCG_TARGET_REG_BITS == 64) { + if (bigendian) { + /* ldx [addr], datalo */ + tcg_out_ldst(s, datalo, addr, 0, LDX); + } else { + /* ldxa [addr] ASI_PRIMARY_LITTLE, datalo */ + tcg_out_ldst_asi(s, datalo, addr, 0, LDXA, ASI_PRIMARY_LITTLE); + } + } else { + if (bigendian) { + tcg_out_ldst(s, datahi, addr, 0, LDUW); + tcg_out_ldst(s, datalo, addr, 4, LDUW); + } else { + tcg_out_ldst_asi(s, datalo, addr, 0, LDUWA, ASI_PRIMARY_LITTLE); + tcg_out_ldst_asi(s, datahi, addr, 4, LDUWA, ASI_PRIMARY_LITTLE); + } + } + break; + default: + tcg_abort(); + } +} -static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, - int opc) +static void tcg_out_qemu_ld(TCGContext *s, const TCGArg *args, int opc) { - int addr_reg, data_reg, arg0, arg1, arg2, mem_index, s_bits; + int addrlo_idx = 1, datalo, datahi, addr_reg; #if defined(CONFIG_SOFTMMU) - uint32_t *label1_ptr, *label2_ptr; + int memi_idx, memi, s_bits, n; + uint32_t *label_ptr[2]; #endif - data_reg = *args++; - addr_reg = *args++; - mem_index = *args; - s_bits = opc & 3; - - arg0 = TCG_REG_O0; - arg1 = TCG_REG_O1; - arg2 = TCG_REG_O2; + datahi = datalo = args[0]; + if (TCG_TARGET_REG_BITS == 32 && opc == 3) { + datahi = args[1]; + addrlo_idx = 2; + } #if defined(CONFIG_SOFTMMU) - /* srl addr_reg, x, arg1 */ - tcg_out_arithi(s, arg1, addr_reg, TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS, - SHIFT_SRL); - /* and addr_reg, x, arg0 */ - tcg_out_arithi(s, arg0, addr_reg, TARGET_PAGE_MASK | ((1 << s_bits) - 1), - ARITH_AND); - - /* and arg1, x, arg1 */ - tcg_out_andi(s, arg1, (CPU_TLB_SIZE - 1) << CPU_TLB_ENTRY_BITS); - - /* add arg1, x, arg1 */ - tcg_out_addi(s, arg1, offsetof(CPUArchState, - tlb_table[mem_index][0].addr_read)); - - /* add env, arg1, arg1 */ - tcg_out_arith(s, arg1, TCG_AREG0, arg1, ARITH_ADD); + memi_idx = addrlo_idx + 1 + (TARGET_LONG_BITS > TCG_TARGET_REG_BITS); + memi = args[memi_idx]; + s_bits = opc & 3; - /* ld [arg1], arg2 */ - tcg_out32(s, TARGET_LD_OP | INSN_RD(arg2) | INSN_RS1(arg1) | - INSN_RS2(TCG_REG_G0)); + addr_reg = tcg_out_tlb_load(s, addrlo_idx, memi, s_bits, args, + label_ptr, offsetof(CPUTLBEntry, addr_read)); - /* subcc arg0, arg2, %g0 */ - tcg_out_arith(s, TCG_REG_G0, arg0, arg2, ARITH_SUBCC); + /* TLB Hit. */ + tcg_out_qemu_ld_direct(s, addr_reg, datalo, datahi, opc); - /* will become: - be label1 - or - be,pt %xcc label1 */ - label1_ptr = (uint32_t *)s->code_ptr; - tcg_out32(s, 0); + /* b,pt,n label1 */ + label_ptr[1] = (uint32_t *)s->code_ptr; + tcg_out32(s, (INSN_OP(0) | INSN_COND(COND_A, 0) | INSN_OP2(0x1) + | (1 << 29) | (1 << 19))); - /* mov (delay slot) */ - tcg_out_mov(s, TCG_TYPE_PTR, arg0, addr_reg); + /* TLB Miss. */ - /* mov */ - tcg_out_movi(s, TCG_TYPE_I32, arg1, mem_index); + *label_ptr[0] |= INSN_OFF19((unsigned long)s->code_ptr - + (unsigned long)label_ptr[0]); + n = 0; #ifdef CONFIG_TCG_PASS_AREG0 - /* XXX/FIXME: suboptimal */ - tcg_out_mov(s, TCG_TYPE_I32, tcg_target_call_iarg_regs[3], - tcg_target_call_iarg_regs[2]); - tcg_out_mov(s, TCG_TYPE_I64, tcg_target_call_iarg_regs[2], - tcg_target_call_iarg_regs[1]); - tcg_out_mov(s, TCG_TYPE_TL, tcg_target_call_iarg_regs[1], - tcg_target_call_iarg_regs[0]); - tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], - TCG_AREG0); + tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[n++], TCG_AREG0); #endif + if (TARGET_LONG_BITS > TCG_TARGET_REG_BITS) { + tcg_out_mov(s, TCG_TYPE_REG, tcg_target_call_iarg_regs[n++], + args[addrlo_idx + 1]); + } + tcg_out_mov(s, TCG_TYPE_REG, tcg_target_call_iarg_regs[n++], + args[addrlo_idx]); + + /* Store AREG0 in stack to avoid ugly glibc bugs that mangle + global registers */ + tcg_out_st(s, TCG_TYPE_REG, TCG_AREG0, TCG_REG_CALL_STACK, + TCG_TARGET_CALL_STACK_OFFSET - TCG_STATIC_CALL_ARGS_SIZE - + sizeof(long)); - /* XXX: move that code at the end of the TB */ /* qemu_ld_helper[s_bits](arg0, arg1) */ tcg_out32(s, CALL | ((((tcg_target_ulong)qemu_ld_helpers[s_bits] - (tcg_target_ulong)s->code_ptr) >> 2) & 0x3fffffff)); - /* Store AREG0 in stack to avoid ugly glibc bugs that mangle - global registers */ - // delay slot - tcg_out_ldst(s, TCG_AREG0, TCG_REG_CALL_STACK, - TCG_TARGET_CALL_STACK_OFFSET - TCG_STATIC_CALL_ARGS_SIZE - - sizeof(long), HOST_ST_OP); - tcg_out_ldst(s, TCG_AREG0, TCG_REG_CALL_STACK, - TCG_TARGET_CALL_STACK_OFFSET - TCG_STATIC_CALL_ARGS_SIZE - - sizeof(long), HOST_LD_OP); - - /* data_reg = sign_extend(arg0) */ + /* delay slot */ + tcg_out_movi(s, TCG_TYPE_I32, tcg_target_call_iarg_regs[n], memi); + + /* Reload AREG0. */ + tcg_out_ld(s, TCG_TYPE_REG, TCG_AREG0, TCG_REG_CALL_STACK, + TCG_TARGET_CALL_STACK_OFFSET - TCG_STATIC_CALL_ARGS_SIZE - + sizeof(long)); + + n = tcg_target_call_oarg_regs[0]; + /* datalo = sign_extend(arg0) */ switch(opc) { case 0 | 4: - /* sll arg0, 24/56, data_reg */ - tcg_out_arithi(s, data_reg, arg0, (int)sizeof(tcg_target_long) * 8 - 8, - HOST_SLL_OP); - /* sra data_reg, 24/56, data_reg */ - tcg_out_arithi(s, data_reg, data_reg, - (int)sizeof(tcg_target_long) * 8 - 8, HOST_SRA_OP); + /* Recall that SRA sign extends from bit 31 through bit 63. */ + tcg_out_arithi(s, datalo, n, 24, SHIFT_SLL); + tcg_out_arithi(s, datalo, datalo, 24, SHIFT_SRA); break; case 1 | 4: - /* sll arg0, 16/48, data_reg */ - tcg_out_arithi(s, data_reg, arg0, - (int)sizeof(tcg_target_long) * 8 - 16, HOST_SLL_OP); - /* sra data_reg, 16/48, data_reg */ - tcg_out_arithi(s, data_reg, data_reg, - (int)sizeof(tcg_target_long) * 8 - 16, HOST_SRA_OP); + tcg_out_arithi(s, datalo, n, 16, SHIFT_SLL); + tcg_out_arithi(s, datalo, datalo, 16, SHIFT_SRA); break; case 2 | 4: - /* sll arg0, 32, data_reg */ - tcg_out_arithi(s, data_reg, arg0, 32, HOST_SLL_OP); - /* sra data_reg, 32, data_reg */ - tcg_out_arithi(s, data_reg, data_reg, 32, HOST_SRA_OP); + tcg_out_arithi(s, datalo, n, 0, SHIFT_SRA); break; + case 3: + if (TCG_TARGET_REG_BITS == 32) { + tcg_out_mov(s, TCG_TYPE_REG, datahi, n); + tcg_out_mov(s, TCG_TYPE_REG, datalo, n + 1); + break; + } + /* FALLTHRU */ case 0: case 1: case 2: - case 3: default: /* mov */ - tcg_out_mov(s, TCG_TYPE_REG, data_reg, arg0); + tcg_out_mov(s, TCG_TYPE_REG, datalo, n); break; } - /* will become: - ba label2 */ - label2_ptr = (uint32_t *)s->code_ptr; - tcg_out32(s, 0); - - /* nop (delay slot */ - tcg_out_nop(s); - - /* label1: */ -#if TARGET_LONG_BITS == 32 - /* be label1 */ - *label1_ptr = (INSN_OP(0) | INSN_COND(COND_E, 0) | INSN_OP2(0x2) | - INSN_OFF22((unsigned long)s->code_ptr - - (unsigned long)label1_ptr)); + *label_ptr[1] |= INSN_OFF19((unsigned long)s->code_ptr - + (unsigned long)label_ptr[1]); #else - /* be,pt %xcc label1 */ - *label1_ptr = (INSN_OP(0) | INSN_COND(COND_E, 0) | INSN_OP2(0x1) | - (0x5 << 19) | INSN_OFF19((unsigned long)s->code_ptr - - (unsigned long)label1_ptr)); -#endif - - /* ld [arg1 + x], arg1 */ - tcg_out_ldst(s, arg1, arg1, offsetof(CPUTLBEntry, addend) - - offsetof(CPUTLBEntry, addr_read), TARGET_ADDEND_LD_OP); - -#if TARGET_LONG_BITS == 32 - /* and addr_reg, x, arg0 */ - tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_I5, 0xffffffff); - tcg_out_arith(s, arg0, addr_reg, TCG_REG_I5, ARITH_AND); - /* add arg0, arg1, arg0 */ - tcg_out_arith(s, arg0, arg0, arg1, ARITH_ADD); -#else - /* add addr_reg, arg1, arg0 */ - tcg_out_arith(s, arg0, addr_reg, arg1, ARITH_ADD); -#endif + addr_reg = args[addrlo_idx]; + if (TCG_TARGET_REG_BITS == 64 && TARGET_LONG_BITS == 32) { + tcg_out_arithi(s, TCG_REG_I5, addr_reg, 0, SHIFT_SRL); + addr_reg = TCG_REG_I5; + } + tcg_out_qemu_ld_direct(s, addr_reg, datalo, datahi, opc); +#endif /* CONFIG_SOFTMMU */ +} +static void tcg_out_qemu_st_direct(TCGContext *s, int addr, int datalo, + int datahi, int sizeop) +{ +#ifdef TARGET_WORDS_BIGENDIAN + const int bigendian = 1; #else - arg0 = addr_reg; + const int bigendian = 0; #endif - - switch(opc) { + switch (sizeop) { case 0: - /* ldub [arg0], data_reg */ - tcg_out_ldst(s, data_reg, arg0, 0, LDUB); - break; - case 0 | 4: - /* ldsb [arg0], data_reg */ - tcg_out_ldst(s, data_reg, arg0, 0, LDSB); + /* stb datalo, [addr] */ + tcg_out_ldst(s, datalo, addr, 0, STB); break; case 1: -#ifdef TARGET_WORDS_BIGENDIAN - /* lduh [arg0], data_reg */ - tcg_out_ldst(s, data_reg, arg0, 0, LDUH); -#else - /* lduha [arg0] ASI_PRIMARY_LITTLE, data_reg */ - tcg_out_ldst_asi(s, data_reg, arg0, 0, LDUHA, ASI_PRIMARY_LITTLE); -#endif - break; - case 1 | 4: -#ifdef TARGET_WORDS_BIGENDIAN - /* ldsh [arg0], data_reg */ - tcg_out_ldst(s, data_reg, arg0, 0, LDSH); -#else - /* ldsha [arg0] ASI_PRIMARY_LITTLE, data_reg */ - tcg_out_ldst_asi(s, data_reg, arg0, 0, LDSHA, ASI_PRIMARY_LITTLE); -#endif + if (bigendian) { + /* sth datalo, [addr] */ + tcg_out_ldst(s, datalo, addr, 0, STH); + } else { + /* stha datalo, [addr] ASI_PRIMARY_LITTLE */ + tcg_out_ldst_asi(s, datalo, addr, 0, STHA, ASI_PRIMARY_LITTLE); + } break; case 2: -#ifdef TARGET_WORDS_BIGENDIAN - /* lduw [arg0], data_reg */ - tcg_out_ldst(s, data_reg, arg0, 0, LDUW); -#else - /* lduwa [arg0] ASI_PRIMARY_LITTLE, data_reg */ - tcg_out_ldst_asi(s, data_reg, arg0, 0, LDUWA, ASI_PRIMARY_LITTLE); -#endif - break; - case 2 | 4: -#ifdef TARGET_WORDS_BIGENDIAN - /* ldsw [arg0], data_reg */ - tcg_out_ldst(s, data_reg, arg0, 0, LDSW); -#else - /* ldswa [arg0] ASI_PRIMARY_LITTLE, data_reg */ - tcg_out_ldst_asi(s, data_reg, arg0, 0, LDSWA, ASI_PRIMARY_LITTLE); -#endif + if (bigendian) { + /* stw datalo, [addr] */ + tcg_out_ldst(s, datalo, addr, 0, STW); + } else { + /* stwa datalo, [addr] ASI_PRIMARY_LITTLE */ + tcg_out_ldst_asi(s, datalo, addr, 0, STWA, ASI_PRIMARY_LITTLE); + } break; case 3: -#ifdef TARGET_WORDS_BIGENDIAN - /* ldx [arg0], data_reg */ - tcg_out_ldst(s, data_reg, arg0, 0, LDX); -#else - /* ldxa [arg0] ASI_PRIMARY_LITTLE, data_reg */ - tcg_out_ldst_asi(s, data_reg, arg0, 0, LDXA, ASI_PRIMARY_LITTLE); -#endif + if (TCG_TARGET_REG_BITS == 64) { + if (bigendian) { + /* stx datalo, [addr] */ + tcg_out_ldst(s, datalo, addr, 0, STX); + } else { + /* stxa datalo, [addr] ASI_PRIMARY_LITTLE */ + tcg_out_ldst_asi(s, datalo, addr, 0, STXA, ASI_PRIMARY_LITTLE); + } + } else { + if (bigendian) { + tcg_out_ldst(s, datahi, addr, 0, STW); + tcg_out_ldst(s, datalo, addr, 4, STW); + } else { + tcg_out_ldst_asi(s, datalo, addr, 0, STWA, ASI_PRIMARY_LITTLE); + tcg_out_ldst_asi(s, datahi, addr, 4, STWA, ASI_PRIMARY_LITTLE); + } + } break; default: tcg_abort(); } - -#if defined(CONFIG_SOFTMMU) - /* label2: */ - *label2_ptr = (INSN_OP(0) | INSN_COND(COND_A, 0) | INSN_OP2(0x2) | - INSN_OFF22((unsigned long)s->code_ptr - - (unsigned long)label2_ptr)); -#endif } -static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, - int opc) +static void tcg_out_qemu_st(TCGContext *s, const TCGArg *args, int opc) { - int addr_reg, data_reg, arg0, arg1, arg2, mem_index, s_bits; + int addrlo_idx = 1, datalo, datahi, addr_reg; #if defined(CONFIG_SOFTMMU) - uint32_t *label1_ptr, *label2_ptr; + int memi_idx, memi, n; + uint32_t *label_ptr[2]; #endif - data_reg = *args++; - addr_reg = *args++; - mem_index = *args; - - s_bits = opc; - - arg0 = TCG_REG_O0; - arg1 = TCG_REG_O1; - arg2 = TCG_REG_O2; + datahi = datalo = args[0]; + if (TCG_TARGET_REG_BITS == 32 && opc == 3) { + datahi = args[1]; + addrlo_idx = 2; + } #if defined(CONFIG_SOFTMMU) - /* srl addr_reg, x, arg1 */ - tcg_out_arithi(s, arg1, addr_reg, TARGET_PAGE_BITS - CPU_TLB_ENTRY_BITS, - SHIFT_SRL); + memi_idx = addrlo_idx + 1 + (TARGET_LONG_BITS > TCG_TARGET_REG_BITS); + memi = args[memi_idx]; - /* and addr_reg, x, arg0 */ - tcg_out_arithi(s, arg0, addr_reg, TARGET_PAGE_MASK | ((1 << s_bits) - 1), - ARITH_AND); + addr_reg = tcg_out_tlb_load(s, addrlo_idx, memi, opc, args, + label_ptr, offsetof(CPUTLBEntry, addr_write)); - /* and arg1, x, arg1 */ - tcg_out_andi(s, arg1, (CPU_TLB_SIZE - 1) << CPU_TLB_ENTRY_BITS); + /* TLB Hit. */ + tcg_out_qemu_st_direct(s, addr_reg, datalo, datahi, opc); - /* add arg1, x, arg1 */ - tcg_out_addi(s, arg1, offsetof(CPUArchState, - tlb_table[mem_index][0].addr_write)); + /* b,pt,n label1 */ + label_ptr[1] = (uint32_t *)s->code_ptr; + tcg_out32(s, (INSN_OP(0) | INSN_COND(COND_A, 0) | INSN_OP2(0x1) + | (1 << 29) | (1 << 19))); - /* add env, arg1, arg1 */ - tcg_out_arith(s, arg1, TCG_AREG0, arg1, ARITH_ADD); + /* TLB Miss. */ - /* ld [arg1], arg2 */ - tcg_out32(s, TARGET_LD_OP | INSN_RD(arg2) | INSN_RS1(arg1) | - INSN_RS2(TCG_REG_G0)); - - /* subcc arg0, arg2, %g0 */ - tcg_out_arith(s, TCG_REG_G0, arg0, arg2, ARITH_SUBCC); - - /* will become: - be label1 - or - be,pt %xcc label1 */ - label1_ptr = (uint32_t *)s->code_ptr; - tcg_out32(s, 0); - - /* mov (delay slot) */ - tcg_out_mov(s, TCG_TYPE_PTR, arg0, addr_reg); - - /* mov */ - tcg_out_mov(s, TCG_TYPE_REG, arg1, data_reg); - - /* mov */ - tcg_out_movi(s, TCG_TYPE_I32, arg2, mem_index); + *label_ptr[0] |= INSN_OFF19((unsigned long)s->code_ptr - + (unsigned long)label_ptr[0]); + n = 0; #ifdef CONFIG_TCG_PASS_AREG0 - /* XXX/FIXME: suboptimal */ - tcg_out_mov(s, TCG_TYPE_I32, tcg_target_call_iarg_regs[3], - tcg_target_call_iarg_regs[2]); - tcg_out_mov(s, TCG_TYPE_I64, tcg_target_call_iarg_regs[2], - tcg_target_call_iarg_regs[1]); - tcg_out_mov(s, TCG_TYPE_TL, tcg_target_call_iarg_regs[1], - tcg_target_call_iarg_regs[0]); - tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[0], - TCG_AREG0); + tcg_out_mov(s, TCG_TYPE_PTR, tcg_target_call_iarg_regs[n++], TCG_AREG0); #endif - /* XXX: move that code at the end of the TB */ - /* qemu_st_helper[s_bits](arg0, arg1, arg2) */ - tcg_out32(s, CALL | ((((tcg_target_ulong)qemu_st_helpers[s_bits] - - (tcg_target_ulong)s->code_ptr) >> 2) - & 0x3fffffff)); + if (TARGET_LONG_BITS > TCG_TARGET_REG_BITS) { + tcg_out_mov(s, TCG_TYPE_REG, tcg_target_call_iarg_regs[n++], + args[addrlo_idx + 1]); + } + tcg_out_mov(s, TCG_TYPE_REG, tcg_target_call_iarg_regs[n++], + args[addrlo_idx]); + if (TCG_TARGET_REG_BITS == 32 && opc == 3) { + tcg_out_mov(s, TCG_TYPE_REG, tcg_target_call_iarg_regs[n++], datahi); + } + tcg_out_mov(s, TCG_TYPE_REG, tcg_target_call_iarg_regs[n++], datalo); + /* Store AREG0 in stack to avoid ugly glibc bugs that mangle global registers */ - // delay slot - tcg_out_ldst(s, TCG_AREG0, TCG_REG_CALL_STACK, - TCG_TARGET_CALL_STACK_OFFSET - TCG_STATIC_CALL_ARGS_SIZE - - sizeof(long), HOST_ST_OP); - tcg_out_ldst(s, TCG_AREG0, TCG_REG_CALL_STACK, - TCG_TARGET_CALL_STACK_OFFSET - TCG_STATIC_CALL_ARGS_SIZE - - sizeof(long), HOST_LD_OP); - - /* will become: - ba label2 */ - label2_ptr = (uint32_t *)s->code_ptr; - tcg_out32(s, 0); - - /* nop (delay slot) */ - tcg_out_nop(s); - -#if TARGET_LONG_BITS == 32 - /* be label1 */ - *label1_ptr = (INSN_OP(0) | INSN_COND(COND_E, 0) | INSN_OP2(0x2) | - INSN_OFF22((unsigned long)s->code_ptr - - (unsigned long)label1_ptr)); -#else - /* be,pt %xcc label1 */ - *label1_ptr = (INSN_OP(0) | INSN_COND(COND_E, 0) | INSN_OP2(0x1) | - (0x5 << 19) | INSN_OFF19((unsigned long)s->code_ptr - - (unsigned long)label1_ptr)); -#endif + tcg_out_st(s, TCG_TYPE_REG, TCG_AREG0, TCG_REG_CALL_STACK, + TCG_TARGET_CALL_STACK_OFFSET - TCG_STATIC_CALL_ARGS_SIZE - + sizeof(long)); - /* ld [arg1 + x], arg1 */ - tcg_out_ldst(s, arg1, arg1, offsetof(CPUTLBEntry, addend) - - offsetof(CPUTLBEntry, addr_write), TARGET_ADDEND_LD_OP); + /* qemu_st_helper[s_bits](arg0, arg1, arg2) */ + tcg_out32(s, CALL | ((((tcg_target_ulong)qemu_st_helpers[opc] + - (tcg_target_ulong)s->code_ptr) >> 2) + & 0x3fffffff)); + /* delay slot */ + tcg_out_movi(s, TCG_TYPE_REG, tcg_target_call_iarg_regs[n], memi); -#if TARGET_LONG_BITS == 32 - /* and addr_reg, x, arg0 */ - tcg_out_movi(s, TCG_TYPE_I32, TCG_REG_I5, 0xffffffff); - tcg_out_arith(s, arg0, addr_reg, TCG_REG_I5, ARITH_AND); - /* add arg0, arg1, arg0 */ - tcg_out_arith(s, arg0, arg0, arg1, ARITH_ADD); -#else - /* add addr_reg, arg1, arg0 */ - tcg_out_arith(s, arg0, addr_reg, arg1, ARITH_ADD); -#endif + /* Reload AREG0. */ + tcg_out_ld(s, TCG_TYPE_REG, TCG_AREG0, TCG_REG_CALL_STACK, + TCG_TARGET_CALL_STACK_OFFSET - TCG_STATIC_CALL_ARGS_SIZE - + sizeof(long)); + *label_ptr[1] |= INSN_OFF19((unsigned long)s->code_ptr - + (unsigned long)label_ptr[1]); #else - arg0 = addr_reg; -#endif - - switch(opc) { - case 0: - /* stb data_reg, [arg0] */ - tcg_out_ldst(s, data_reg, arg0, 0, STB); - break; - case 1: -#ifdef TARGET_WORDS_BIGENDIAN - /* sth data_reg, [arg0] */ - tcg_out_ldst(s, data_reg, arg0, 0, STH); -#else - /* stha data_reg, [arg0] ASI_PRIMARY_LITTLE */ - tcg_out_ldst_asi(s, data_reg, arg0, 0, STHA, ASI_PRIMARY_LITTLE); -#endif - break; - case 2: -#ifdef TARGET_WORDS_BIGENDIAN - /* stw data_reg, [arg0] */ - tcg_out_ldst(s, data_reg, arg0, 0, STW); -#else - /* stwa data_reg, [arg0] ASI_PRIMARY_LITTLE */ - tcg_out_ldst_asi(s, data_reg, arg0, 0, STWA, ASI_PRIMARY_LITTLE); -#endif - break; - case 3: -#ifdef TARGET_WORDS_BIGENDIAN - /* stx data_reg, [arg0] */ - tcg_out_ldst(s, data_reg, arg0, 0, STX); -#else - /* stxa data_reg, [arg0] ASI_PRIMARY_LITTLE */ - tcg_out_ldst_asi(s, data_reg, arg0, 0, STXA, ASI_PRIMARY_LITTLE); -#endif - break; - default: - tcg_abort(); + addr_reg = args[addrlo_idx]; + if (TCG_TARGET_REG_BITS == 64 && TARGET_LONG_BITS == 32) { + tcg_out_arithi(s, TCG_REG_I5, addr_reg, 0, SHIFT_SRL); + addr_reg = TCG_REG_I5; } - -#if defined(CONFIG_SOFTMMU) - /* label2: */ - *label2_ptr = (INSN_OP(0) | INSN_COND(COND_A, 0) | INSN_OP2(0x2) | - INSN_OFF22((unsigned long)s->code_ptr - - (unsigned long)label2_ptr)); -#endif + tcg_out_qemu_st_direct(s, addr_reg, datalo, datahi, opc); +#endif /* CONFIG_SOFTMMU */ } static inline void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args, @@ -1205,12 +1189,12 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args, /* Store AREG0 in stack to avoid ugly glibc bugs that mangle global registers */ // delay slot - tcg_out_ldst(s, TCG_AREG0, TCG_REG_CALL_STACK, - TCG_TARGET_CALL_STACK_OFFSET - TCG_STATIC_CALL_ARGS_SIZE - - sizeof(long), HOST_ST_OP); - tcg_out_ldst(s, TCG_AREG0, TCG_REG_CALL_STACK, - TCG_TARGET_CALL_STACK_OFFSET - TCG_STATIC_CALL_ARGS_SIZE - - sizeof(long), HOST_LD_OP); + tcg_out_st(s, TCG_TYPE_REG, TCG_AREG0, TCG_REG_CALL_STACK, + TCG_TARGET_CALL_STACK_OFFSET - TCG_STATIC_CALL_ARGS_SIZE - + sizeof(long)); + tcg_out_ld(s, TCG_TYPE_REG, TCG_AREG0, TCG_REG_CALL_STACK, + TCG_TARGET_CALL_STACK_OFFSET - TCG_STATIC_CALL_ARGS_SIZE - + sizeof(long)); break; case INDEX_op_jmp: case INDEX_op_br: @@ -1378,6 +1362,9 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args, tcg_out_qemu_ld(s, args, 2 | 4); break; #endif + case INDEX_op_qemu_ld64: + tcg_out_qemu_ld(s, args, 3); + break; case INDEX_op_qemu_st8: tcg_out_qemu_st(s, args, 0); break; @@ -1387,6 +1374,9 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args, case INDEX_op_qemu_st32: tcg_out_qemu_st(s, args, 2); break; + case INDEX_op_qemu_st64: + tcg_out_qemu_st(s, args, 3); + break; #if TCG_TARGET_REG_BITS == 64 case INDEX_op_movi_i64: @@ -1451,13 +1441,6 @@ static inline void tcg_out_op(TCGContext *s, TCGOpcode opc, const TCGArg *args, args[2], const_args[2]); break; - case INDEX_op_qemu_ld64: - tcg_out_qemu_ld(s, args, 3); - break; - case INDEX_op_qemu_st64: - tcg_out_qemu_st(s, args, 3); - break; - #endif gen_arith: tcg_out_arithc(s, args[0], args[1], args[2], const_args[2], c); @@ -1522,20 +1505,6 @@ static const TCGTargetOpDef sparc_op_defs[] = { { INDEX_op_mulu2_i32, { "r", "r", "r", "rJ" } }, #endif - { INDEX_op_qemu_ld8u, { "r", "L" } }, - { INDEX_op_qemu_ld8s, { "r", "L" } }, - { INDEX_op_qemu_ld16u, { "r", "L" } }, - { INDEX_op_qemu_ld16s, { "r", "L" } }, - { INDEX_op_qemu_ld32, { "r", "L" } }, -#if TCG_TARGET_REG_BITS == 64 - { INDEX_op_qemu_ld32u, { "r", "L" } }, - { INDEX_op_qemu_ld32s, { "r", "L" } }, -#endif - - { INDEX_op_qemu_st8, { "L", "L" } }, - { INDEX_op_qemu_st16, { "L", "L" } }, - { INDEX_op_qemu_st32, { "L", "L" } }, - #if TCG_TARGET_REG_BITS == 64 { INDEX_op_mov_i64, { "r", "r" } }, { INDEX_op_movi_i64, { "r" } }, @@ -1550,8 +1519,6 @@ static const TCGTargetOpDef sparc_op_defs[] = { { INDEX_op_st16_i64, { "r", "r" } }, { INDEX_op_st32_i64, { "r", "r" } }, { INDEX_op_st_i64, { "r", "r" } }, - { INDEX_op_qemu_ld64, { "L", "L" } }, - { INDEX_op_qemu_st64, { "L", "L" } }, { INDEX_op_add_i64, { "r", "r", "rJ" } }, { INDEX_op_mul_i64, { "r", "r", "rJ" } }, @@ -1578,10 +1545,48 @@ static const TCGTargetOpDef sparc_op_defs[] = { { INDEX_op_brcond_i64, { "r", "rJ" } }, { INDEX_op_setcond_i64, { "r", "r", "rJ" } }, -#else - { INDEX_op_qemu_ld64, { "L", "L", "L" } }, +#endif + +#if TCG_TARGET_REG_BITS == 64 + { INDEX_op_qemu_ld8u, { "r", "L" } }, + { INDEX_op_qemu_ld8s, { "r", "L" } }, + { INDEX_op_qemu_ld16u, { "r", "L" } }, + { INDEX_op_qemu_ld16s, { "r", "L" } }, + { INDEX_op_qemu_ld32, { "r", "L" } }, + { INDEX_op_qemu_ld32u, { "r", "L" } }, + { INDEX_op_qemu_ld32s, { "r", "L" } }, + { INDEX_op_qemu_ld64, { "r", "L" } }, + + { INDEX_op_qemu_st8, { "L", "L" } }, + { INDEX_op_qemu_st16, { "L", "L" } }, + { INDEX_op_qemu_st32, { "L", "L" } }, + { INDEX_op_qemu_st64, { "L", "L" } }, +#elif TARGET_LONG_BITS <= TCG_TARGET_REG_BITS + { INDEX_op_qemu_ld8u, { "r", "L" } }, + { INDEX_op_qemu_ld8s, { "r", "L" } }, + { INDEX_op_qemu_ld16u, { "r", "L" } }, + { INDEX_op_qemu_ld16s, { "r", "L" } }, + { INDEX_op_qemu_ld32, { "r", "L" } }, + { INDEX_op_qemu_ld64, { "r", "r", "L" } }, + + { INDEX_op_qemu_st8, { "L", "L" } }, + { INDEX_op_qemu_st16, { "L", "L" } }, + { INDEX_op_qemu_st32, { "L", "L" } }, { INDEX_op_qemu_st64, { "L", "L", "L" } }, +#else + { INDEX_op_qemu_ld8u, { "r", "L", "L" } }, + { INDEX_op_qemu_ld8s, { "r", "L", "L" } }, + { INDEX_op_qemu_ld16u, { "r", "L", "L" } }, + { INDEX_op_qemu_ld16s, { "r", "L", "L" } }, + { INDEX_op_qemu_ld32, { "r", "L", "L" } }, + { INDEX_op_qemu_ld64, { "L", "L", "L", "L" } }, + + { INDEX_op_qemu_st8, { "L", "L", "L" } }, + { INDEX_op_qemu_st16, { "L", "L", "L" } }, + { INDEX_op_qemu_st32, { "L", "L", "L" } }, + { INDEX_op_qemu_st64, { "L", "L", "L", "L" } }, #endif + { -1 }, };