Message ID | 1302645571-20500-9-git-send-email-aurelien@aurel32.net |
---|---|
State | New |
Headers | show |
On 12 April 2011 22:59, Aurelien Jarno <aurelien@aurel32.net> wrote: > Use float64_unordered() in helper_cmptun() instead of doing the > the comparison manually. This also fixes the wrong behaviours with > sNaNs. > > Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> > --- > target-alpha/op_helper.c | 5 +++-- > 1 files changed, 3 insertions(+), 2 deletions(-) > > diff --git a/target-alpha/op_helper.c b/target-alpha/op_helper.c > index 6c2ae20..e07ae69 100644 > --- a/target-alpha/op_helper.c > +++ b/target-alpha/op_helper.c > @@ -904,10 +904,11 @@ uint64_t helper_cmptun (uint64_t a, uint64_t b) > fa = t_to_float64(a); > fb = t_to_float64(b); > > - if (float64_is_quiet_nan(fa) || float64_is_quiet_nan(fb)) > + if (float64_unordered(fa, fb, &FP_STATUS)) { > return 0x4000000000000000ULL; > - else > + } else { > return 0; > + } > } I'm not sure this is right. The Alpha Architecture Handbook is a bit opaque, but the Compiler Writer's Guide is clearer: www.compaq.com/cpq-alphaserver/technology/literature/cmpwrgd.pdf page D-4 says that CMPTUN (like CMPTEQ) should generate InvalidOp for SNaNs but not for QNaNs. (Contrast CMPTLT, CMPTLE, which set InvalidOp for both SNaNs and QNaNs.) So I think you want the _quiet version here. (And helper_cmpteq needs to use float64_eq_quiet rather than float64_eq.) -- PMM
[ Odd, the original thread doesn't seem to have arrived here. ] On 04/13/2011 07:52 AM, Peter Maydell wrote: > So I think you want the _quiet version here. (And helper_cmpteq > needs to use float64_eq_quiet rather than float64_eq.) Yes, the _quiet version is what's needed for all comparisons. For the record, the goal for QEMU should not be to emulate the hardware exactly, but to emulate the entire HW+OS system. Thus when looking at Table B-2 one should look at the OS Completion Handler and User Signal Handler columns. That's for /s qualified instructions anyway. For non-ieee code we take care to inject the extra traps via helper_ieee_input*. (It looks like some of the Alpha code can be cleaned up a bit. I don't recall flush_inputs_to_zero option being there before, and we do that by hand in helper_ieee_input*.) r~
On 13 April 2011 16:38, Richard Henderson <rth@twiddle.net> wrote: > [ Odd, the original thread doesn't seem to have arrived here. ] > > On 04/13/2011 07:52 AM, Peter Maydell wrote: >> So I think you want the _quiet version here. (And helper_cmpteq >> needs to use float64_eq_quiet rather than float64_eq.) > > Yes, the _quiet version is what's needed for all comparisons. Really all comparisons, including CMPTLT, CMPTLE? > (It looks like some of the Alpha code can be cleaned up a bit. > I don't recall flush_inputs_to_zero option being there before, > and we do that by hand in helper_ieee_input*.) Yes, I added that softfloat feature for the benefit of ARM's flush-to-zero mode. -- PMM
On 04/13/2011 08:42 AM, Peter Maydell wrote: > On 13 April 2011 16:38, Richard Henderson <rth@twiddle.net> wrote: >> [ Odd, the original thread doesn't seem to have arrived here. ] >> >> On 04/13/2011 07:52 AM, Peter Maydell wrote: >>> So I think you want the _quiet version here. (And helper_cmpteq >>> needs to use float64_eq_quiet rather than float64_eq.) >> >> Yes, the _quiet version is what's needed for all comparisons. > > Really all comparisons, including CMPTLT, CMPTLE? Oops, no. CMPTLE and CMPTLT in Table B-2 are on the next page, and clearly indicate that they signal InvalidOP for QNaN. So it's just CMPTUN and CMPTEQ that should not signal on QNaN. r~
On Wed, Apr 13, 2011 at 08:53:27AM -0700, Richard Henderson wrote: > On 04/13/2011 08:42 AM, Peter Maydell wrote: > > On 13 April 2011 16:38, Richard Henderson <rth@twiddle.net> wrote: > >> [ Odd, the original thread doesn't seem to have arrived here. ] > >> > >> On 04/13/2011 07:52 AM, Peter Maydell wrote: > >>> So I think you want the _quiet version here. (And helper_cmpteq > >>> needs to use float64_eq_quiet rather than float64_eq.) > >> > >> Yes, the _quiet version is what's needed for all comparisons. > > > > Really all comparisons, including CMPTLT, CMPTLE? > > Oops, no. CMPTLE and CMPTLT in Table B-2 are on the next page, > and clearly indicate that they signal InvalidOP for QNaN. > > So it's just CMPTUN and CMPTEQ that should not signal on QNaN. > Thanks to both of you for digging into the manuals, the information I have found in the manual I have were not so clear. I'll fix that in the next version.
On 13 April 2011 16:38, Richard Henderson <rth@twiddle.net> wrote: > (It looks like some of the Alpha code can be cleaned up a bit. > I don't recall flush_inputs_to_zero option being there before, > and we do that by hand in helper_ieee_input*.) While we're on the subject of Alpha and flush-to-zero modes, do you know what exception bits should get set when Alpha flushes a denormal output (not input) to zero? At the moment softfloat doesn't set any flags when it does this, which is definitely wrong for ARM. I had a look at the manuals for the other archs which might set flush_to_zero: SH: should set underflow flag ARM: should set underflow flag PPC: should set underflow (I think) MIPS: docs didn't say I think Alpha should set Inexact and not Underflow, but I'm not sure -- can you confirm/deny? (Obviously I would prefer it if every architecture just needed to set underflow, but somehow I don't think it will work out that cleanly :-)) thanks -- PMM
On 04/14/2011 02:14 AM, Peter Maydell wrote: > While we're on the subject of Alpha and flush-to-zero modes, > do you know what exception bits should get set when Alpha > flushes a denormal output (not input) to zero? ... > I think Alpha should set Inexact and not Underflow, but > I'm not sure -- can you confirm/deny? Deny. Page B-8, Add Sub Output Exceptions, Exponent underflow and disabled: Supply +0, no exception delivered to user. Exponent underflow and enabled: Supply +-MIN denorm, Underflow delivered to user. Footnote 3, Overflow and Underflow have priority over Inexact. r~
On 14 April 2011 16:14, Richard Henderson <rth@twiddle.net> wrote: > On 04/14/2011 02:14 AM, Peter Maydell wrote: >> While we're on the subject of Alpha and flush-to-zero modes, >> do you know what exception bits should get set when Alpha >> flushes a denormal output (not input) to zero? > ... >> I think Alpha should set Inexact and not Underflow, but >> I'm not sure -- can you confirm/deny? > > Deny. > > Page B-8, Add Sub Output Exceptions, > > Exponent underflow and disabled: > Supply +0, no exception delivered to user. > > Exponent underflow and enabled: > Supply +-MIN denorm, Underflow delivered to user. > > Footnote 3, Overflow and Underflow have priority over Inexact. Thanks. Does "no exception delivered to user" mean also "and do not set FPCR bit UNF" ? The reason I thought it might set Inexact is that I was looking at page 4-79, which says: "If both the UNFD (underflow disable) bit and the UNDZ (underflow to zero) bit are set in the FPCR, the implementation sets the result of an underflow operation to a true zero result. The zeroing of a denormal result by UNDZ must also be treated as an inexact result." -- PMM
On 04/14/2011 08:39 AM, Peter Maydell wrote: >> Exponent underflow and disabled: >> Supply +0, no exception delivered to user. >> >> Exponent underflow and enabled: >> Supply +-MIN denorm, Underflow delivered to user. >> >> Footnote 3, Overflow and Underflow have priority over Inexact. > > Thanks. Does "no exception delivered to user" mean also > "and do not set FPCR bit UNF" ? Yes. > The reason I thought it might set Inexact is that I was looking > at page 4-79, which says: > > "If both the UNFD (underflow disable) bit and the UNDZ (underflow > to zero) bit are set in the FPCR, the implementation sets the > result of an underflow operation to a true zero result. The > zeroing of a denormal result by UNDZ must also be treated as an > inexact result." Hum. It looks like we can choose between these results then, depending on the intersection of the FPCR disable bits, and the per-instruction trapping mode bits (see section 4.7.7.2). I *think* what would be best for Alpha is if, within softfloat, both conditions are signaled, and then we can filter the result that is actually needed via helper_fp_exc_raise? It's hard to say without actually doing the work... Unfortunately, I suspect that the Correct result on real HW also depends on the OS completion handler, and I know that at least for Linux that code was written before UNDZ was added. So I don't know if even real HW produces the correct result when considering Underflow priority over Inexact. r~
On 14 April 2011 17:27, Richard Henderson <rth@twiddle.net> wrote: > On 04/14/2011 08:39 AM, Peter Maydell wrote: >>> Exponent underflow and disabled: >>> Supply +0, no exception delivered to user. >>> >>> Exponent underflow and enabled: >>> Supply +-MIN denorm, Underflow delivered to user. >>> >>> Footnote 3, Overflow and Underflow have priority over Inexact. >> >> Thanks. Does "no exception delivered to user" mean also >> "and do not set FPCR bit UNF" ? > > Yes. > >> The reason I thought it might set Inexact is that I was looking >> at page 4-79, which says: >> >> "If both the UNFD (underflow disable) bit and the UNDZ (underflow >> to zero) bit are set in the FPCR, the implementation sets the >> result of an underflow operation to a true zero result. The >> zeroing of a denormal result by UNDZ must also be treated as an >> inexact result." > > Hum. It looks like we can choose between these results then, > depending on the intersection of the FPCR disable bits, and > the per-instruction trapping mode bits (see section 4.7.7.2). > > I *think* what would be best for Alpha is if, within softfloat, > both conditions are signaled, and then we can filter the result > that is actually needed via helper_fp_exc_raise? It's hard to > say without actually doing the work... Unfortunately doing that is bad for other architectures because they tend to let the softfloat status flags act directly as the cumulative exception flags, so if softfloat sets flags they don't want you've suddenly imposed a burden of saving and restoring flags or something similarly ugly. It sounds like maybe what we need is to have a signed char float_ftz_flags or similar in float_status which should be set to the float_flags which should be passed to float_raise() when we flush an output denormal to zero. Then we can set that to float_flag_underflow for ARM, leave it at 0 for other architectures [preserve existing semantics] and set it to float_flag_underflow | float_flag_inexact for Alpha. Or the other approach would be to do something like how the 'denormal-input-went-to-zero' case is handled: have an extra float_flag for float_flag_output_denormal, and let targets either (a) special-case it or (b) merge it with whichever of underflow or inexact they like when constructing the guest-visible floating point status flags. I kind of like that better, actually. -- PMM
On 04/14/2011 09:48 AM, Peter Maydell wrote: > Or the other approach would be to do something like > how the 'denormal-input-went-to-zero' case is handled: have > an extra float_flag for float_flag_output_denormal, and let > targets either (a) special-case it or (b) merge it with > whichever of underflow or inexact they like when constructing > the guest-visible floating point status flags. I kind of like > that better, actually. Either of your proposed solutions works for me. Either allows significant freedom in the backend to decide on how to implement the per-instruction selection. r~
diff --git a/target-alpha/op_helper.c b/target-alpha/op_helper.c index 6c2ae20..e07ae69 100644 --- a/target-alpha/op_helper.c +++ b/target-alpha/op_helper.c @@ -904,10 +904,11 @@ uint64_t helper_cmptun (uint64_t a, uint64_t b) fa = t_to_float64(a); fb = t_to_float64(b); - if (float64_is_quiet_nan(fa) || float64_is_quiet_nan(fb)) + if (float64_unordered(fa, fb, &FP_STATUS)) { return 0x4000000000000000ULL; - else + } else { return 0; + } } uint64_t helper_cmpteq(uint64_t a, uint64_t b)
Use float64_unordered() in helper_cmptun() instead of doing the the comparison manually. This also fixes the wrong behaviours with sNaNs. Signed-off-by: Aurelien Jarno <aurelien@aurel32.net> --- target-alpha/op_helper.c | 5 +++-- 1 files changed, 3 insertions(+), 2 deletions(-)