From patchwork Thu Mar 27 09:29:18 2014
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 334272
From: Ard Biesheuvel
To: qemu-devel@nongnu.org, peter.maydell@linaro.org
Cc: afaerber@suse.de, christoffer.dall@linaro.org, Ard Biesheuvel
Date: Thu, 27 Mar 2014 10:29:18 +0100
Message-Id: <1395912558-1041-1-git-send-email-ard.biesheuvel@linaro.org>
Subject: [Qemu-devel] [PATCH] target-arm: add support for v8 VMULL.P64 instruction

This adds support for the VMULL.P64 polynomial 64x64 to 128 bit
multiplication instruction, which is an optional feature that is part
of the v8 Crypto Extensions.

Signed-off-by: Ard Biesheuvel
---
This is an incremental patch on top of the SHA-1/SHA-256 patch I sent
earlier this week.

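For reference, a minimal standalone sketch of the 64x64 to 128 bit
polynomial (carry-less) multiply that VMULL.P64 computes might look like
the following. It is illustrative only and not part of the patch: the
type clmul128 and the function clmul64 are hypothetical names, and it
works on plain uint64_t values rather than on QEMU's register file.

```c
#include <stdint.h>

/* Hypothetical holder for the 128-bit product; not a QEMU type. */
typedef struct {
    uint64_t lo;    /* bits 0..63 of the product */
    uint64_t hi;    /* bits 64..127 of the product */
} clmul128;

/*
 * Multiply two 64-bit polynomials over GF(2).  Each set bit of n
 * contributes a shifted copy of m, and the copies are combined with
 * XOR instead of addition, so no carries propagate between bits.
 */
clmul128 clmul64(uint64_t n, uint64_t m)
{
    clmul128 r = { 0, 0 };
    int shift;

    for (shift = 0; shift < 64; shift++) {
        if ((n >> shift) & 1) {
            r.lo ^= m << shift;
            /* A shift by 64 would be undefined, so skip shift == 0. */
            if (shift > 0) {
                r.hi ^= m >> (64 - shift);
            }
        }
    }
    return r;
}
```

For example, clmul64(3, 3) gives lo = 5 and hi = 0, because
(x + 1)^2 = x^2 + 1 over GF(2), whereas an ordinary integer multiply
would give 9.
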
 target-arm/cpu.c           |  1 +
 target-arm/cpu.h           |  1 +
 target-arm/crypto_helper.c | 19 +++++++++++++++++++
 target-arm/helper.h        |  2 ++
 target-arm/translate.c     | 18 +++++++++++++++++-
 5 files changed, 40 insertions(+), 1 deletion(-)

diff --git a/target-arm/cpu.c b/target-arm/cpu.c
index 58c4584ac3bc..60244c7ffc82 100644
--- a/target-arm/cpu.c
+++ b/target-arm/cpu.c
@@ -293,6 +293,7 @@ static void arm_cpu_realizefn(DeviceState *dev, Error **errp)
         set_feature(env, ARM_FEATURE_V8_AES);
         set_feature(env, ARM_FEATURE_V8_SHA1);
         set_feature(env, ARM_FEATURE_V8_SHA256);
+        set_feature(env, ARM_FEATURE_V8_PMULL);
     }
     if (arm_feature(env, ARM_FEATURE_V7)) {
         set_feature(env, ARM_FEATURE_VAPA);
diff --git a/target-arm/cpu.h b/target-arm/cpu.h
index f5039d8b0177..d8add6d565a6 100644
--- a/target-arm/cpu.h
+++ b/target-arm/cpu.h
@@ -632,6 +632,7 @@ enum arm_features {
     ARM_FEATURE_CRC, /* ARMv8 CRC instructions */
     ARM_FEATURE_V8_SHA1, /* implements SHA1 part of v8 Crypto Extensions */
     ARM_FEATURE_V8_SHA256, /* implements SHA256 part of v8 Crypto Extensions */
+    ARM_FEATURE_V8_PMULL, /* implements PMULL part of v8 Crypto Extensions */
 };
 
 static inline int arm_feature(CPUARMState *env, int feature)
diff --git a/target-arm/crypto_helper.c b/target-arm/crypto_helper.c
index 211be36ebda8..b56a767b527e 100644
--- a/target-arm/crypto_helper.c
+++ b/target-arm/crypto_helper.c
@@ -522,3 +522,22 @@ void HELPER(crypto_sha256su1)(CPUARMState *env, uint32_t rd, uint32_t rn,
     env->vfp.regs[rd] = make_float64(d.l[0]);
     env->vfp.regs[rd + 1] = make_float64(d.l[1]);
 }
+
+void HELPER(crypto_pmull)(CPUARMState *env, uint32_t rd, uint32_t rn,
+                          uint32_t rm)
+{
+    uint64_t n = float64_val(env->vfp.regs[rn]);
+    uint64_t m = float64_val(env->vfp.regs[rm]);
+    uint64_t d0 = (n & 1) ? m : 0;
+    uint64_t d1 = 0;
+    int shift;
+
+    for (shift = 1; (n >>= 1); shift++) {
+        if (n & 1) {
+            d0 ^= m << shift;
+            d1 ^= m >> (64 - shift);
+        }
+    }
+    env->vfp.regs[rd] = make_float64(d0);
+    env->vfp.regs[rd + 1] = make_float64(d1);
+}
diff --git a/target-arm/helper.h b/target-arm/helper.h
index 9024aef75157..8333f7dd0be2 100644
--- a/target-arm/helper.h
+++ b/target-arm/helper.h
@@ -521,6 +521,8 @@ DEF_HELPER_4(crypto_sha256h2, void, env, i32, i32, i32)
 DEF_HELPER_3(crypto_sha256su0, void, env, i32, i32)
 DEF_HELPER_4(crypto_sha256su1, void, env, i32, i32, i32)
 
+DEF_HELPER_4(crypto_pmull, void, env, i32, i32, i32)
+
 DEF_HELPER_FLAGS_3(crc32, TCG_CALL_NO_RWG_SE, i32, i32, i32, i32)
 DEF_HELPER_FLAGS_3(crc32c, TCG_CALL_NO_RWG_SE, i32, i32, i32, i32)
 
diff --git a/target-arm/translate.c b/target-arm/translate.c
index e79241402da8..576cdc24b530 100644
--- a/target-arm/translate.c
+++ b/target-arm/translate.c
@@ -5917,7 +5917,7 @@ static int disas_neon_data_insn(CPUARMState * env, DisasContext *s, uint32_t ins
                         {0, 0, 0, 6}, /* VQDMLSL */
                         {0, 0, 0, 0}, /* Integer VMULL */
                         {0, 0, 0, 2}, /* VQDMULL */
-                        {0, 0, 0, 5}, /* Polynomial VMULL */
+                        {0, 0, 0, 4}, /* Polynomial VMULL */
                         {0, 0, 0, 3}, /* Reserved: always UNDEF */
                     };
 
@@ -5937,6 +5937,22 @@ static int disas_neon_data_insn(CPUARMState * env, DisasContext *s, uint32_t ins
                         return 1;
                     }
 
+                    /* Handle VMULL.P64 (Polynomial 64x64 to 128 bit multiply)
+                       outside the loop below as it only performs a single pass.  */
+                    if (op == 14 && size == 2) {
+                        if (!arm_feature(env, ARM_FEATURE_V8_PMULL)) {
+                            return 1;
+                        }
+                        tmp = tcg_const_i32(rd);
+                        tmp2 = tcg_const_i32(rn);
+                        tmp3 = tcg_const_i32(rm);
+                        gen_helper_crypto_pmull(cpu_env, tmp, tmp2, tmp3);
+                        tcg_temp_free_i32(tmp);
+                        tcg_temp_free_i32(tmp2);
+                        tcg_temp_free_i32(tmp3);
+                        return 0;
+                    }
+
                     /* Avoid overlapping operands.  Wide source operands are
                        always aligned so will never overlap with wide
                        destinations in problematic ways.  */