Message ID | 20211213121320.947362-1-matheus.ferst@eldorado.org.br |
---|---|
State | Superseded |
Headers | show |
Series | [RFC] target/ppc: do not silence snan in xscvspdpn | expand |
On 12/13/21 13:13, matheus.ferst@eldorado.org.br wrote: > From: Matheus Ferst <matheus.ferst@eldorado.org.br> > > The non-signalling versions of VSX scalar convert to shorter/longer > precision insns doesn't silence SNaNs in the hardware. We are currently > honoring this behavior in xscvdpspn, since helper_xscvdpspn handles the > conversion with extracts/deposits/etc. OTOH, xscvspdpn uses > float32_to_float64 that calls parts_float_to_float, which uses > parts_return_nan that finally calls parts_silence_nan if the input is an > SNaN. > > To address this problem, this patch adds a new float_status flag > (return_snan) to avoid the call to parts_silence_nan in the > float_class_snan case of parts_return_nan. > > Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> > --- > This behavior was observed in a Power9 and can be reproduced with the > following code: > > int main(void) > { > __uint128_t t, b = 0x7f80000200000000; > > asm("xscvspdpn %x0, %x1\n\t" > : "=wa" (t) > : "wa" (b << 64)); > printf("0x%016" PRIx64 "%016" PRIx64 "\n", > (uint64_t)(t >> 64), (uint64_t)t); > > b = 0x7ff0000000000002; > asm("xscvdpspn %x0, %x1\n\t" > : "=wa" (t) > : "wa" (b << 64)); > printf("0x%016" PRIx64 "%016" PRIx64 "\n", > (uint64_t)(t >> 64), (uint64_t)t); > > return 0; > } Why not add this test in tests/tcg/ppc64le/ ? > > That results in: > $ ./xscv_non_signaling > xscvspdpn: 0x7ff00000400000000000000000000000 > xscvdpspn: 0x7f8000007f8000000000000000000000 > $ qemu-ppc64le ./xscv_non_signaling > xscvspdpn: 0x7ff80000400000000000000000000000 > xscvdpspn: 0x7f8000007f8000000000000000000000 > > PowerISA is more descriptive of xscvdpspn wrt SNaN but doesn't say > anything about NaNs in xscvspdpn description. Should we match the > hardware behavior? If so, does it worth adding a new flag in > float_status for a single instruction? We could also handle that in > helper_xscvdpspn, e.g.: > > float32 sp = xb >> 32; > float64 dp; > > dp = float32_to_float64(xb >> 32, &tstat); > if (float32_classify(sp) == is_snan) { > dp &= ~(1ULL << 51); > dp |= (dp & 0x7ffffffffffffULL) == 0; > } > > return dp; > > but it feels kind hacky.
On 12/13/21 4:13 AM, matheus.ferst@eldorado.org.br wrote: > From: Matheus Ferst <matheus.ferst@eldorado.org.br> > > The non-signalling versions of VSX scalar convert to shorter/longer > precision insns doesn't silence SNaNs in the hardware. We are currently > honoring this behavior in xscvdpspn, since helper_xscvdpspn handles the > conversion with extracts/deposits/etc. OTOH, xscvspdpn uses > float32_to_float64 that calls parts_float_to_float, which uses > parts_return_nan that finally calls parts_silence_nan if the input is an > SNaN. > > To address this problem, this patch adds a new float_status flag > (return_snan) to avoid the call to parts_silence_nan in the > float_class_snan case of parts_return_nan. > > Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> I think you shouldn't be using softfloat at all for this, but use the existing helper_todouble function. r~
On 13/12/2021 09:36, Philippe Mathieu-Daudé wrote: > On 12/13/21 13:13, matheus.ferst@eldorado.org.br wrote: >> From: Matheus Ferst <matheus.ferst@eldorado.org.br> >> >> The non-signalling versions of VSX scalar convert to shorter/longer >> precision insns doesn't silence SNaNs in the hardware. We are currently >> honoring this behavior in xscvdpspn, since helper_xscvdpspn handles the >> conversion with extracts/deposits/etc. OTOH, xscvspdpn uses >> float32_to_float64 that calls parts_float_to_float, which uses >> parts_return_nan that finally calls parts_silence_nan if the input is an >> SNaN. >> >> To address this problem, this patch adds a new float_status flag >> (return_snan) to avoid the call to parts_silence_nan in the >> float_class_snan case of parts_return_nan. >> >> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> >> --- >> This behavior was observed in a Power9 and can be reproduced with the >> following code: >> >> int main(void) >> { >> __uint128_t t, b = 0x7f80000200000000; >> >> asm("xscvspdpn %x0, %x1\n\t" >> : "=wa" (t) >> : "wa" (b << 64)); >> printf("0x%016" PRIx64 "%016" PRIx64 "\n", >> (uint64_t)(t >> 64), (uint64_t)t); >> >> b = 0x7ff0000000000002; >> asm("xscvdpspn %x0, %x1\n\t" >> : "=wa" (t) >> : "wa" (b << 64)); >> printf("0x%016" PRIx64 "%016" PRIx64 "\n", >> (uint64_t)(t >> 64), (uint64_t)t); >> >> return 0; >> } > > Why not add this test in tests/tcg/ppc64le/ ? I'll add it in v2. Is it ok to use __uint128_t in tests? Thanks, Matheus K. Ferst Instituto de Pesquisas ELDORADO <http://www.eldorado.org.br/> Analista de Software Aviso Legal - Disclaimer <https://www.eldorado.org.br/disclaimer.html>
On 13/12/2021 12:46, Richard Henderson wrote: > On 12/13/21 4:13 AM, matheus.ferst@eldorado.org.br wrote: >> From: Matheus Ferst <matheus.ferst@eldorado.org.br> >> >> The non-signalling versions of VSX scalar convert to shorter/longer >> precision insns doesn't silence SNaNs in the hardware. We are currently >> honoring this behavior in xscvdpspn, since helper_xscvdpspn handles the >> conversion with extracts/deposits/etc. OTOH, xscvspdpn uses >> float32_to_float64 that calls parts_float_to_float, which uses >> parts_return_nan that finally calls parts_silence_nan if the input is an >> SNaN. >> >> To address this problem, this patch adds a new float_status flag >> (return_snan) to avoid the call to parts_silence_nan in the >> float_class_snan case of parts_return_nan. >> >> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> > > I think you shouldn't be using softfloat at all for this, but use the > existing > helper_todouble function. > Oh, that's better, I'll send a v2. Thanks, Matheus K. Ferst Instituto de Pesquisas ELDORADO <http://www.eldorado.org.br/> Analista de Software Aviso Legal - Disclaimer <https://www.eldorado.org.br/disclaimer.html>
On 12/13/21 21:15, Matheus K. Ferst wrote: > On 13/12/2021 09:36, Philippe Mathieu-Daudé wrote: >> On 12/13/21 13:13, matheus.ferst@eldorado.org.br wrote: >>> From: Matheus Ferst <matheus.ferst@eldorado.org.br> >>> >>> The non-signalling versions of VSX scalar convert to shorter/longer >>> precision insns doesn't silence SNaNs in the hardware. We are currently >>> honoring this behavior in xscvdpspn, since helper_xscvdpspn handles the >>> conversion with extracts/deposits/etc. OTOH, xscvspdpn uses >>> float32_to_float64 that calls parts_float_to_float, which uses >>> parts_return_nan that finally calls parts_silence_nan if the input is an >>> SNaN. >>> >>> To address this problem, this patch adds a new float_status flag >>> (return_snan) to avoid the call to parts_silence_nan in the >>> float_class_snan case of parts_return_nan. >>> >>> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> >>> --- >>> This behavior was observed in a Power9 and can be reproduced with the >>> following code: >>> >>> int main(void) >>> { >>> __uint128_t t, b = 0x7f80000200000000; >>> >>> asm("xscvspdpn %x0, %x1\n\t" >>> : "=wa" (t) >>> : "wa" (b << 64)); >>> printf("0x%016" PRIx64 "%016" PRIx64 "\n", >>> (uint64_t)(t >> 64), (uint64_t)t); >>> >>> b = 0x7ff0000000000002; >>> asm("xscvdpspn %x0, %x1\n\t" >>> : "=wa" (t) >>> : "wa" (b << 64)); >>> printf("0x%016" PRIx64 "%016" PRIx64 "\n", >>> (uint64_t)(t >> 64), (uint64_t)t); >>> >>> return 0; >>> } >> >> Why not add this test in tests/tcg/ppc64le/ ? > > I'll add it in v2. Is it ok to use __uint128_t in tests? What about: int main(void) { #ifndef __SIZEOF_INT128__ printf("uint128_t not available, skipping...\n"); #else ... #endif return 0; }
Philippe Mathieu-Daudé <f4bug@amsat.org> writes: > On 12/13/21 21:15, Matheus K. Ferst wrote: >> On 13/12/2021 09:36, Philippe Mathieu-Daudé wrote: >>> On 12/13/21 13:13, matheus.ferst@eldorado.org.br wrote: >>>> From: Matheus Ferst <matheus.ferst@eldorado.org.br> >>>> >>>> The non-signalling versions of VSX scalar convert to shorter/longer >>>> precision insns doesn't silence SNaNs in the hardware. We are currently >>>> honoring this behavior in xscvdpspn, since helper_xscvdpspn handles the >>>> conversion with extracts/deposits/etc. OTOH, xscvspdpn uses >>>> float32_to_float64 that calls parts_float_to_float, which uses >>>> parts_return_nan that finally calls parts_silence_nan if the input is an >>>> SNaN. >>>> >>>> To address this problem, this patch adds a new float_status flag >>>> (return_snan) to avoid the call to parts_silence_nan in the >>>> float_class_snan case of parts_return_nan. >>>> >>>> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> >>>> --- >>>> This behavior was observed in a Power9 and can be reproduced with the >>>> following code: >>>> >>>> int main(void) >>>> { >>>> __uint128_t t, b = 0x7f80000200000000; >>>> >>>> asm("xscvspdpn %x0, %x1\n\t" >>>> : "=wa" (t) >>>> : "wa" (b << 64)); >>>> printf("0x%016" PRIx64 "%016" PRIx64 "\n", >>>> (uint64_t)(t >> 64), (uint64_t)t); >>>> >>>> b = 0x7ff0000000000002; >>>> asm("xscvdpspn %x0, %x1\n\t" >>>> : "=wa" (t) >>>> : "wa" (b << 64)); >>>> printf("0x%016" PRIx64 "%016" PRIx64 "\n", >>>> (uint64_t)(t >> 64), (uint64_t)t); >>>> >>>> return 0; >>>> } >>> >>> Why not add this test in tests/tcg/ppc64le/ ? >> >> I'll add it in v2. Is it ok to use __uint128_t in tests? > > What about: > > int main(void) > { > #ifndef __SIZEOF_INT128__ > printf("uint128_t not available, skipping...\n"); > #else > ... > #endif > return 0; > } We have a tests/tcg/configure.sh which does feature tests although it is mainly testing for the presence of compiler target flags. We do this because while the docker compilers are all pretty modern the user might be using their own cross compiler. I'm generally not keen on the tests silently skipping because it gives a false impression things have been tested. If it is possible to avoid INT128 shenanigans to load the values into the inline assembler lets do that, otherwise lets feature test and ifdef a skip-test in the makefile.
On 15/12/2021 12:55, Alex Bennée wrote: > Philippe Mathieu-Daudé <f4bug@amsat.org> writes: > >> On 12/13/21 21:15, Matheus K. Ferst wrote: >>> On 13/12/2021 09:36, Philippe Mathieu-Daudé wrote: >>>> On 12/13/21 13:13, matheus.ferst@eldorado.org.br wrote: >>>>> From: Matheus Ferst <matheus.ferst@eldorado.org.br> >>>>> >>>>> The non-signalling versions of VSX scalar convert to shorter/longer >>>>> precision insns doesn't silence SNaNs in the hardware. We are currently >>>>> honoring this behavior in xscvdpspn, since helper_xscvdpspn handles the >>>>> conversion with extracts/deposits/etc. OTOH, xscvspdpn uses >>>>> float32_to_float64 that calls parts_float_to_float, which uses >>>>> parts_return_nan that finally calls parts_silence_nan if the input is an >>>>> SNaN. >>>>> >>>>> To address this problem, this patch adds a new float_status flag >>>>> (return_snan) to avoid the call to parts_silence_nan in the >>>>> float_class_snan case of parts_return_nan. >>>>> >>>>> Signed-off-by: Matheus Ferst <matheus.ferst@eldorado.org.br> >>>>> --- >>>>> This behavior was observed in a Power9 and can be reproduced with the >>>>> following code: >>>>> >>>>> int main(void) >>>>> { >>>>> __uint128_t t, b = 0x7f80000200000000; >>>>> >>>>> asm("xscvspdpn %x0, %x1\n\t" >>>>> : "=wa" (t) >>>>> : "wa" (b << 64)); >>>>> printf("0x%016" PRIx64 "%016" PRIx64 "\n", >>>>> (uint64_t)(t >> 64), (uint64_t)t); >>>>> >>>>> b = 0x7ff0000000000002; >>>>> asm("xscvdpspn %x0, %x1\n\t" >>>>> : "=wa" (t) >>>>> : "wa" (b << 64)); >>>>> printf("0x%016" PRIx64 "%016" PRIx64 "\n", >>>>> (uint64_t)(t >> 64), (uint64_t)t); >>>>> >>>>> return 0; >>>>> } >>>> >>>> Why not add this test in tests/tcg/ppc64le/ ? >>> >>> I'll add it in v2. Is it ok to use __uint128_t in tests? >> >> What about: >> >> int main(void) >> { >> #ifndef __SIZEOF_INT128__ >> printf("uint128_t not available, skipping...\n"); >> #else >> ... >> #endif >> return 0; >> } > > We have a tests/tcg/configure.sh which does feature tests although it is > mainly testing for the presence of compiler target flags. We do this > because while the docker compilers are all pretty modern the user might > be using their own cross compiler. > > I'm generally not keen on the tests silently skipping because it gives a > false impression things have been tested. If it is possible to avoid > INT128 shenanigans to load the values into the inline assembler lets do > that, otherwise lets feature test and ifdef a skip-test in the makefile. > > -- > Alex Bennée > I think we can pass the value via register and use mtvsrd/mfvsrd to avoid the INT128 part. There was a v2 of this patch, but I messed up the CC and only included get_maintainer.pl result and Philippe. The patch is now applied, so I'll send this change as a follow-up. Thanks, Matheus K. Ferst Instituto de Pesquisas ELDORADO <http://www.eldorado.org.br/> Analista de Software Aviso Legal - Disclaimer <https://www.eldorado.org.br/disclaimer.html>
diff --git a/fpu/softfloat-parts.c.inc b/fpu/softfloat-parts.c.inc index 41d4b17e41..ccc24f9153 100644 --- a/fpu/softfloat-parts.c.inc +++ b/fpu/softfloat-parts.c.inc @@ -20,10 +20,12 @@ static void partsN(return_nan)(FloatPartsN *a, float_status *s) switch (a->cls) { case float_class_snan: float_raise(float_flag_invalid, s); - if (s->default_nan_mode) { - parts_default_nan(a, s); - } else { - parts_silence_nan(a, s); + if (!return_snan(s)) { + if (s->default_nan_mode) { + parts_default_nan(a, s); + } else { + parts_silence_nan(a, s); + } } break; case float_class_qnan: diff --git a/fpu/softfloat-specialize.c.inc b/fpu/softfloat-specialize.c.inc index f2ad0f335e..70f686e0a3 100644 --- a/fpu/softfloat-specialize.c.inc +++ b/fpu/softfloat-specialize.c.inc @@ -92,6 +92,15 @@ static inline bool no_signaling_nans(float_status *status) #endif } +static inline bool return_snan(float_status *status) +{ +#if defined(TARGET_PPC) + return status->return_snan; +#else + return false; +#endif +} + /* Define how the architecture discriminates signaling NaNs. * This done with the most significant bit of the fraction. * In IEEE 754-1985 this was implementation defined, but in IEEE 754-2008 diff --git a/include/fpu/softfloat-types.h b/include/fpu/softfloat-types.h index 5bcbd041f7..eb55298348 100644 --- a/include/fpu/softfloat-types.h +++ b/include/fpu/softfloat-types.h @@ -188,6 +188,7 @@ typedef struct float_status { bool snan_bit_is_one; bool use_first_nan; bool no_signaling_nans; + bool return_snan; } float_status; #endif /* SOFTFLOAT_TYPES_H */ diff --git a/target/ppc/fpu_helper.c b/target/ppc/fpu_helper.c index c4896cecc8..9fc774fdb0 100644 --- a/target/ppc/fpu_helper.c +++ b/target/ppc/fpu_helper.c @@ -2746,6 +2746,8 @@ uint64_t helper_xscvdpspn(CPUPPCState *env, uint64_t xb) uint64_t helper_xscvspdpn(CPUPPCState *env, uint64_t xb) { float_status tstat = env->fp_status; + + tstat.return_snan = true; set_float_exception_flags(0, &tstat); return float32_to_float64(xb >> 32, &tstat);