From patchwork Thu May 27 20:46:34 2010
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Richard Henderson
X-Patchwork-Id: 53844
Return-Path:
X-Original-To: incoming@patchwork.ozlabs.org
Delivered-To: patchwork-incoming@bilbo.ozlabs.org
Received: from lists.gnu.org (lists.gnu.org [199.232.76.165])
    (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
    (Client did not present a certificate)
    by ozlabs.org (Postfix) with ESMTPS id D5B12B6EFF
    for ; Fri, 28 May 2010 08:44:56 +1000 (EST)
Received: from localhost ([127.0.0.1]:34760 helo=lists.gnu.org)
    by lists.gnu.org with esmtp (Exim 4.43) id 1OHloq-0002tH-Hk
    for incoming@patchwork.ozlabs.org; Thu, 27 May 2010 18:44:52 -0400
Received: from [140.186.70.92] (port=56409 helo=eggs.gnu.org)
    by lists.gnu.org with esmtp (Exim 4.43) id 1OHjzj-0002Qc-RQ
    for qemu-devel@nongnu.org; Thu, 27 May 2010 16:48:01 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.69)
    (envelope-from ) id 1OHjzh-0005UQ-H7
    for qemu-devel@nongnu.org; Thu, 27 May 2010 16:47:59 -0400
Received: from are.twiddle.net ([75.149.56.221]:51272)
    by eggs.gnu.org with esmtp (Exim 4.69)
    (envelope-from ) id 1OHjzh-0005UJ-5k
    for qemu-devel@nongnu.org; Thu, 27 May 2010 16:47:57 -0400
Received: from anchor.twiddle.home (anchor.twiddle.home [172.31.0.4])
    by are.twiddle.net (Postfix) with ESMTPS id 9109BA21;
    Thu, 27 May 2010 13:47:56 -0700 (PDT)
Received: from anchor.twiddle.home (anchor.twiddle.home [127.0.0.1])
    by anchor.twiddle.home (8.14.4/8.14.4) with ESMTP id o4RKlute031056;
    Thu, 27 May 2010 13:47:56 -0700
Received: (from rth@localhost)
    by anchor.twiddle.home (8.14.4/8.14.4/Submit) id o4RKltpc031055;
    Thu, 27 May 2010 13:47:55 -0700
From: Richard Henderson
To: qemu-devel@nongnu.org
Date: Thu, 27 May 2010 13:46:34 -0700
Message-Id: <1274993204-30766-53-git-send-email-rth@twiddle.net>
X-Mailer: git-send-email 1.7.0.1
In-Reply-To: <1274993204-30766-1-git-send-email-rth@twiddle.net>
References: <1274993204-30766-1-git-send-email-rth@twiddle.net>
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.6 (newer, 2)
Cc: agraf@suse.de, aurelien@aurel32.net
Subject: [Qemu-devel] [PATCH 52/62] tcg-s390: Conditionalize OR IMMEDIATE
    instructions.
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.5
Precedence: list
List-Id: qemu-devel.nongnu.org
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org
Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org

The 32-bit immediate OR instructions are in the extended-immediate
facility. Use these only if present.

At the same time, pull the logic to load immediates into registers
into a constraint letter for TCG.

Signed-off-by: Richard Henderson
---
 tcg/s390/tcg-target.c | 92 +++++++++++++++++++++++++++++++++++++------------
 1 files changed, 70 insertions(+), 22 deletions(-)

diff --git a/tcg/s390/tcg-target.c b/tcg/s390/tcg-target.c
index 359f6d1..36d4ad0 100644
--- a/tcg/s390/tcg-target.c
+++ b/tcg/s390/tcg-target.c
@@ -38,6 +38,7 @@
 #define TCG_CT_CONST_ADDI 0x0400
 #define TCG_CT_CONST_MULI 0x0800
 #define TCG_CT_CONST_ANDI 0x1000
+#define TCG_CT_CONST_ORI 0x2000
 
 #define TCG_TMP0 TCG_REG_R14
 
@@ -358,6 +359,10 @@ static int target_parse_constraint(TCGArgConstraint *ct, const char **pct_str)
         ct->ct &= ~TCG_CT_REG;
         ct->ct |= TCG_CT_CONST_ANDI;
         break;
+    case 'O':
+        ct->ct &= ~TCG_CT_REG;
+        ct->ct |= TCG_CT_CONST_ORI;
+        break;
     default:
         break;
     }
@@ -424,6 +429,36 @@ static int tcg_match_andi(int ct, tcg_target_ulong val)
     return 1;
 }
 
+/* Immediates to be used with logical OR. This is an optimization only,
+   since a full 64-bit immediate OR can always be performed with 4 sequential
+   OI[LH][LH] instructions. What we're looking for is immediates that we
+   can load efficiently, and the immediate load plus the reg-reg OR is
+   smaller than the sequential OI's. */
+
+static int tcg_match_ori(int ct, tcg_target_long val)
+{
+    if (facilities & FACILITY_EXT_IMM) {
+        if (ct & TCG_CT_CONST_32) {
+            /* All 32-bit ORs can be performed with 1 48-bit insn. */
+            return 1;
+        }
+    }
+
+    /* Look for negative values. These are best to load with LGHI. */
+    if (val < 0) {
+        if (val == (int16_t)val) {
+            return 0;
+        }
+        if (facilities & FACILITY_EXT_IMM) {
+            if (val == (int32_t)val) {
+                return 0;
+            }
+        }
+    }
+
+    return 1;
+}
+
 /* Test if a constant matches the constraint. */
 static int tcg_target_const_match(tcg_target_long val,
                                   const TCGArgConstraint *arg_ct)
@@ -465,6 +500,8 @@ static int tcg_target_const_match(tcg_target_long val,
         }
     } else if (ct & TCG_CT_CONST_ANDI) {
         return tcg_match_andi(ct, val);
+    } else if (ct & TCG_CT_CONST_ORI) {
+        return tcg_match_ori(ct, val);
     }
 
     return 0;
@@ -907,34 +944,45 @@ static void tgen64_ori(TCGContext *s, TCGReg dest, tcg_target_ulong val)
 
     int i;
 
-    /* Zero-th, look for no-op. */
+    /* Look for no-op. */
     if (val == 0) {
         return;
     }
 
-    /* First, try all 32-bit insns that can perform it in one go. */
-    for (i = 0; i < 4; i++) {
-        tcg_target_ulong mask = (0xffffull << i*16);
-        if ((val & mask) != 0 && (val & ~mask) == 0) {
-            tcg_out_insn_RI(s, oi_insns[i], dest, val >> i*16);
-            return;
+    if (facilities & FACILITY_EXT_IMM) {
+        /* Try all 32-bit insns that can perform it in one go. */
+        for (i = 0; i < 4; i++) {
+            tcg_target_ulong mask = (0xffffull << i*16);
+            if ((val & mask) != 0 && (val & ~mask) == 0) {
+                tcg_out_insn_RI(s, oi_insns[i], dest, val >> i*16);
+                return;
+            }
         }
-    }
 
-    /* Second, try all 48-bit insns that can perform it in one go. */
-    for (i = 0; i < 2; i++) {
-        tcg_target_ulong mask = (0xffffffffull << i*32);
-        if ((val & mask) != 0 && (val & ~mask) == 0) {
-            tcg_out_insn_RIL(s, nif_insns[i], dest, val >> i*32);
-            return;
+        /* Try all 48-bit insns that can perform it in one go. */
+        for (i = 0; i < 2; i++) {
+            tcg_target_ulong mask = (0xffffffffull << i*32);
+            if ((val & mask) != 0 && (val & ~mask) == 0) {
+                tcg_out_insn_RIL(s, nif_insns[i], dest, val >> i*32);
+                return;
+            }
         }
-    }
 
-    /* Last, perform the OR via sequential modifications to the
-       high and low parts. Do this via recursion to handle 16-bit
-       vs 32-bit masks in each half. */
-    tgen64_ori(s, dest, val & 0x00000000ffffffffull);
-    tgen64_ori(s, dest, val & 0xffffffff00000000ull);
+        /* Perform the OR via sequential modifications to the high and
+           low parts. Do this via recursion to handle 16-bit vs 32-bit
+           masks in each half. */
+        tgen64_ori(s, dest, val & 0x00000000ffffffffull);
+        tgen64_ori(s, dest, val & 0xffffffff00000000ull);
+    } else {
+        /* With no extended-immediate facility, we don't need to be so
+           clever. Just iterate over the insns and mask in the constant. */
+        for (i = 0; i < 4; i++) {
+            tcg_target_ulong mask = (0xffffull << i*16);
+            if ((val & mask) != 0) {
+                tcg_out_insn_RI(s, oi_insns[i], dest, val >> i*16);
+            }
+        }
+    }
 }
 
 static void tgen64_xori(TCGContext *s, TCGReg dest, tcg_target_ulong val)
@@ -1764,7 +1812,7 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_divu2_i32, { "b", "a", "0", "1", "r" } },
 
     { INDEX_op_and_i32, { "r", "0", "rWA" } },
-    { INDEX_op_or_i32, { "r", "0", "ri" } },
+    { INDEX_op_or_i32, { "r", "0", "rWO" } },
     { INDEX_op_xor_i32, { "r", "0", "ri" } },
 
     { INDEX_op_neg_i32, { "r", "r" } },
@@ -1825,7 +1873,7 @@ static const TCGTargetOpDef s390_op_defs[] = {
     { INDEX_op_divu2_i64, { "b", "a", "0", "1", "r" } },
 
     { INDEX_op_and_i64, { "r", "0", "rA" } },
-    { INDEX_op_or_i64, { "r", "0", "ri" } },
+    { INDEX_op_or_i64, { "r", "0", "rO" } },
     { INDEX_op_xor_i64, { "r", "0", "ri" } },
 
     { INDEX_op_neg_i64, { "r", "r" } },
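
For readers who want to try the new 'O' constraint outside of QEMU, the decision made by tcg_match_ori can be sketched as a standalone function. This is only an illustration of the logic in the patch, not code from the tree: the names sketch_match_ori, SKETCH_CT_CONST_32 and SKETCH_FACILITY_EXT_IMM are hypothetical, the flag values are arbitrary, and the facility bits are passed as a parameter rather than read from the backend's global.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the backend's constraint bit and facility
   flag; the numeric values are arbitrary and chosen for this sketch. */
#define SKETCH_CT_CONST_32       0x0100
#define SKETCH_FACILITY_EXT_IMM  0x0001

/* Return 1 if the OR should keep VAL as an immediate operand, or 0 if
   loading VAL into a register and doing a reg-reg OR is expected to be
   shorter. Follows the structure of tcg_match_ori in the patch. */
int sketch_match_ori(unsigned facilities, int ct, int64_t val)
{
    /* With extended-immediate, any 32-bit OR fits in one 48-bit insn. */
    if ((facilities & SKETCH_FACILITY_EXT_IMM) && (ct & SKETCH_CT_CONST_32)) {
        return 1;
    }

    /* Negative values have many bits set, so a sign-extending load is
       cheaper than a series of 16-bit OR immediates. The narrowing casts
       rely on the usual two's-complement truncation, as the patch does. */
    if (val < 0) {
        if (val == (int16_t)val) {
            return 0;   /* loadable with a 16-bit sign-extended immediate */
        }
        if ((facilities & SKETCH_FACILITY_EXT_IMM) && val == (int32_t)val) {
            return 0;   /* loadable with a 32-bit sign-extended immediate */
        }
    }

    return 1;
}
```

Under these assumptions, sketch_match_ori(0, 0, -1) rejects the immediate (the value is cheaper to load into a register), while a positive constant such as 0x10000 is accepted with or without the extended-immediate facility.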