From patchwork Sat Jan 7 21:24:02 2012
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Aurelien Jarno
X-Patchwork-Id: 134872
Date: Sat, 7 Jan 2012 22:24:02 +0100
From: Aurelien Jarno
To: Peter Maydell
Cc: qemu-devel@nongnu.org, qemu-stable@nongnu.org
Message-ID: <20120107212402.GF20302@volta.aurel32.net>
References: <1325966978-940-1-git-send-email-aurelien@aurel32.net>
 <1325966978-940-2-git-send-email-aurelien@aurel32.net>
User-Agent: Mutt/1.5.21 (2010-09-15)
Subject: [Qemu-devel] [PATCH 1/4 v2] target-i386: fix {min, max}{pd, ps, sd, ss} SSE2 instructions

On Sat, Jan 07, 2012 at 08:22:53PM +0000, Peter Maydell wrote:
> On 7 January 2012 20:09, Aurelien Jarno wrote:
> > minpd, minps, minsd, minss and maxpd, maxps, maxsd, maxss SSE2
> > instructions have been broken when switching target-i386 to softfloat.
> > It's not possible to use comparison instructions on float types anymore
> > to softfloat, so use the floatXX_min and floatXX_max functions instead.
>
> Nope, this gets the x86 special cases wrong. This has been discussed
> here before:
>
> http://www.mail-archive.com/qemu-devel@nongnu.org/msg85557.html
> has the right implementation (from Jason Wessel) and a comment
> (from me) about why it's right.
>

Good catch, the patch below should implement the correct behaviour.


target-i386: fix {min,max}{pd,ps,sd,ss} SSE2 instructions

The minpd, minps, minsd, minss and maxpd, maxps, maxsd, maxss SSE2
instructions have been broken since target-i386 was switched to
softfloat. With softfloat it is no longer possible to use the C
comparison operators on float types, so use the floatXX_lt function
instead. The floatXX_min and floatXX_max functions can't be used
because of the Intel-specific behaviour.

As this implements the correct NaN behaviour, remove the corresponding
entry from the TODO.

This fixes the GDM screen display on Debian Lenny.

Thanks to Peter Maydell and Jason Wessel for their analysis of the
problem.

Signed-off-by: Aurelien Jarno
---
 target-i386/TODO      |    1 -
 target-i386/ops_sse.h |    9 +++++++--
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/target-i386/TODO b/target-i386/TODO
index c8ada07..a8d69cf 100644
--- a/target-i386/TODO
+++ b/target-i386/TODO
@@ -15,7 +15,6 @@ Correctness issues:
 - DRx register support
 - CR0.AC emulation
 - SSE alignment checks
-- fix SSE min/max with nans
 
 Optimizations/Features:
 
diff --git a/target-i386/ops_sse.h b/target-i386/ops_sse.h
index 47dde78..8ed231d 100644
--- a/target-i386/ops_sse.h
+++ b/target-i386/ops_sse.h
@@ -584,10 +584,15 @@ void helper_ ## name ## sd (Reg *d, Reg *s)\
 #define FPU_SUB(size, a, b) float ## size ## _sub(a, b, &env->sse_status)
 #define FPU_MUL(size, a, b) float ## size ## _mul(a, b, &env->sse_status)
 #define FPU_DIV(size, a, b) float ## size ## _div(a, b, &env->sse_status)
-#define FPU_MIN(size, a, b) (a) < (b) ? (a) : (b)
-#define FPU_MAX(size, a, b) (a) > (b) ? (a) : (b)
 #define FPU_SQRT(size, a, b) float ## size ## _sqrt(b, &env->sse_status)
 
+/* Note that the choice of comparison op here is important to get the
+ * special cases right: for min and max Intel specifies that (-0,0),
+ * (NaN, anything) and (anything, NaN) return the second argument.
+ */
+#define FPU_MIN(size, a, b) float ## size ## _lt(a, b, &env->sse_status) ? (a) : (b)
+#define FPU_MAX(size, a, b) float ## size ## _lt(b, a, &env->sse_status) ? (a) : (b)
+
 SSE_HELPER_S(add, FPU_ADD)
 SSE_HELPER_S(sub, FPU_SUB)
 SSE_HELPER_S(mul, FPU_MUL)