Message ID | 20200820183700.115087-1-msc@linux.ibm.com |
---|---|
State | New |
Series | Update powerpc libm-test-ulps |
On 8/20/20 2:37 PM, Matheus Castanho via Libc-alpha wrote:
> Before this patch, the following tests were failing:
>
> ppc and ppc64:
> FAIL: math/test-ldouble-j0
>
> ppc64le:
> FAIL: math/test-ibm128-j0
> ---
>  sysdeps/powerpc/fpu/libm-test-ulps | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/sysdeps/powerpc/fpu/libm-test-ulps b/sysdeps/powerpc/fpu/libm-test-ulps
> index cd2a5fed45..0b82c3f107 100644
> --- a/sysdeps/powerpc/fpu/libm-test-ulps
> +++ b/sysdeps/powerpc/fpu/libm-test-ulps
> @@ -1317,13 +1317,13 @@ Function: "j0_downward":
>  double: 2
>  float: 4
>  float128: 4
> -ldouble: 11
> +ldouble: 12
>
>  Function: "j0_towardzero":
>  double: 5
>  float: 6
>  float128: 2
> -ldouble: 8
> +ldouble: 16

We should not have ULPs higher than 9.

I see Adhemerval added some 11 ULPs here for cexp.

We should be able to achieve <= 9 ULPs on these algorithms, otherwise there are
compiler problems that need fixing?

>  Function: "j0_upward":
>  double: 4
>
On 20/08/2020 15:39, Carlos O'Donell wrote:
> On 8/20/20 2:37 PM, Matheus Castanho via Libc-alpha wrote:
>> Before this patch, the following tests were failing:
>>
>> ppc and ppc64:
>> FAIL: math/test-ldouble-j0
>>
>> ppc64le:
>> FAIL: math/test-ibm128-j0
>> ---
>>  sysdeps/powerpc/fpu/libm-test-ulps | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/sysdeps/powerpc/fpu/libm-test-ulps b/sysdeps/powerpc/fpu/libm-test-ulps
>> index cd2a5fed45..0b82c3f107 100644
>> --- a/sysdeps/powerpc/fpu/libm-test-ulps
>> +++ b/sysdeps/powerpc/fpu/libm-test-ulps
>> @@ -1317,13 +1317,13 @@ Function: "j0_downward":
>>  double: 2
>>  float: 4
>>  float128: 4
>> -ldouble: 11
>> +ldouble: 12
>>
>>  Function: "j0_towardzero":
>>  double: 5
>>  float: 6
>>  float128: 2
>> -ldouble: 8
>> +ldouble: 16
>
> We should not have ULPs higher than 9.
>
> I see Adhemerval added some 11 ULPs here for cexp.
>
> We should be able to achieve <= 9 ULPs on these algorithms, otherwise there are
> compiler problems that need fixing?

We are more forgiving for IBM long double due to its inherent precision issues:

math/libm-test-support.c

  if (testing_ibm128)
    /* The documented accuracy of IBM long double division is 3ulp
       (see libgcc/config/rs6000/ibm-ldouble-format), so do not
       require better accuracy for libm functions that are exactly
       defined for other formats.  */
    max_valid_error = exact ? 3 : 16;
  else
    max_valid_error = exact ? 0 : 9;

And the jN implementation also has low precision for some inputs. With both constraints
I think 16ulps should be ok.
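As an illustration of the threshold logic quoted above, here is a minimal standalone sketch. It is not the glibc test harness: the function names `max_valid_error_for` and `ulps_entry_ok` are hypothetical, and only the constants 3, 16, 0, and 9 are taken from the quoted math/libm-test-support.c snippet.

```c
#include <stdbool.h>

/* Hypothetical helper mirroring the quoted snippet: return the largest
   ULP bound accepted for a function, depending on whether IBM long
   double (ibm128) is being tested and whether the function is one that
   is exactly defined for other formats.  */
int
max_valid_error_for (bool testing_ibm128, bool exact)
{
  if (testing_ibm128)
    return exact ? 3 : 16;
  return exact ? 0 : 9;
}

/* Hypothetical check: would a libm-test-ulps entry of ULPS stay within
   the bound for this format?  */
bool
ulps_entry_ok (int ulps, bool testing_ibm128, bool exact)
{
  return ulps <= max_valid_error_for (testing_ibm128, exact);
}
```

Under these assumptions, `ulps_entry_ok (16, true, false)` holds, so the proposed j0_towardzero value fits the ibm128 bound for inexact functions, whereas the same value would exceed the bound of 9 used for the other formats.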
Dear Carlos,

> We should be able to achieve <= 9 ULPs on these algorithms, otherwise there are
> compiler problems that need fixing?

We are far from <= 9 ULPs for j0, j1, y0, y1, even in binary32.
See https://members.loria.fr/PZimmermann/papers/accuracy.pdf.

Paul
Adhemerval Zanella via Libc-alpha <libc-alpha@sourceware.org> writes:

> On 20/08/2020 15:39, Carlos O'Donell wrote:
>> On 8/20/20 2:37 PM, Matheus Castanho via Libc-alpha wrote:
>>> Before this patch, the following tests were failing:
>>>
>>> ppc and ppc64:
>>> FAIL: math/test-ldouble-j0
>>>
>>> ppc64le:
>>> FAIL: math/test-ibm128-j0
>>> ---
>>>  sysdeps/powerpc/fpu/libm-test-ulps | 4 ++--
>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/sysdeps/powerpc/fpu/libm-test-ulps b/sysdeps/powerpc/fpu/libm-test-ulps
>>> index cd2a5fed45..0b82c3f107 100644
>>> --- a/sysdeps/powerpc/fpu/libm-test-ulps
>>> +++ b/sysdeps/powerpc/fpu/libm-test-ulps
>>> @@ -1317,13 +1317,13 @@ Function: "j0_downward":
>>>  double: 2
>>>  float: 4
>>>  float128: 4
>>> -ldouble: 11
>>> +ldouble: 12
>>>
>>>  Function: "j0_towardzero":
>>>  double: 5
>>>  float: 6
>>>  float128: 2
>>> -ldouble: 8
>>> +ldouble: 16
>>
>> We should not have ULPs higher than 9.
>>
>> I see Adhemerval added some 11 ULPs here for cexp.
>>
>> We should be able to achieve <= 9 ULPs on these algorithms, otherwise there are
>> compiler problems that need fixing?
>
> We are more forgiving for IBM long double due to its inherent precision issues:
>
> math/libm-test-support.c
>
>   if (testing_ibm128)
>     /* The documented accuracy of IBM long double division is 3ulp
>        (see libgcc/config/rs6000/ibm-ldouble-format), so do not
>        require better accuracy for libm functions that are exactly
>        defined for other formats.  */
>     max_valid_error = exact ? 3 : 16;
>   else
>     max_valid_error = exact ? 0 : 9;
>
> And the jN implementation also has low precision for some inputs. With both constraints
> I think 16ulps should be ok.

There is also a loss of precision with different rounding modes in libgcc.

There are currently 30 entries for ibm128 with ULP between 10 and 16 (without
counting this patch). Maybe some of these should actually be marked as
xfail-rounding:ibm128-libgcc instead.

The only way to validate this is by compiling glibc with a libgcc that has a patch from Joseph. I have an up-to-date version of that patch in https://github.com/tuliom/gcc/commit/ca42479cae3c2b56651c3e97bb5eeaf24ca4bb61
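
A count like the one mentioned above (ibm128 entries with ULPs between 10 and 16) could be reproduced with a small scan over the contents of the ulps file. The sketch below is only illustrative: the function name `count_ldouble_in_range` is hypothetical, and it assumes, based on the diff in this thread, that the ibm128 values in sysdeps/powerpc/fpu/libm-test-ulps appear on lines of the form `ldouble: N`.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: count the lines of TEXT that have the form
   "ldouble: N" with LO <= N <= HI.  TEXT is assumed to hold the
   contents of a libm-test-ulps file, one entry per line.  */
int
count_ldouble_in_range (const char *text, int lo, int hi)
{
  int count = 0;
  const char *line = text;

  while (*line != '\0')
    {
      int ulps;

      /* sscanf returns 1 only if the line starts with "ldouble:"
         followed by an integer.  */
      if (sscanf (line, "ldouble: %d", &ulps) == 1
          && ulps >= lo && ulps <= hi)
        count++;

      /* Advance to the start of the next line, if any.  */
      const char *newline = strchr (line, '\n');
      if (newline == NULL)
        break;
      line = newline + 1;
    }

  return count;
}
```

Calling `count_ldouble_in_range (contents, 10, 16)` on the file contents would give the number of candidate entries to review; deciding which of them should instead be marked xfail-rounding:ibm128-libgcc still requires the patched libgcc described above.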
On 8/20/20 3:44 PM, Tulio Magno Quites Machado Filho wrote:
> Adhemerval Zanella via Libc-alpha <libc-alpha@sourceware.org> writes:
>
>> On 20/08/2020 15:39, Carlos O'Donell wrote:
>>> On 8/20/20 2:37 PM, Matheus Castanho via Libc-alpha wrote:
>>>> Before this patch, the following tests were failing:
>>>>
>>>> ppc and ppc64:
>>>> FAIL: math/test-ldouble-j0
>>>>
>>>> ppc64le:
>>>> FAIL: math/test-ibm128-j0
>>>> ---
>>>>  sysdeps/powerpc/fpu/libm-test-ulps | 4 ++--
>>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/sysdeps/powerpc/fpu/libm-test-ulps b/sysdeps/powerpc/fpu/libm-test-ulps
>>>> index cd2a5fed45..0b82c3f107 100644
>>>> --- a/sysdeps/powerpc/fpu/libm-test-ulps
>>>> +++ b/sysdeps/powerpc/fpu/libm-test-ulps
>>>> @@ -1317,13 +1317,13 @@ Function: "j0_downward":
>>>>  double: 2
>>>>  float: 4
>>>>  float128: 4
>>>> -ldouble: 11
>>>> +ldouble: 12
>>>>
>>>>  Function: "j0_towardzero":
>>>>  double: 5
>>>>  float: 6
>>>>  float128: 2
>>>> -ldouble: 8
>>>> +ldouble: 16
>>>
>>> We should not have ULPs higher than 9.
>>>
>>> I see Adhemerval added some 11 ULPs here for cexp.
>>>
>>> We should be able to achieve <= 9 ULPs on these algorithms, otherwise there are
>>> compiler problems that need fixing?
>>
>> We are more forgiving for IBM long double due to its inherent precision issues:
>>
>> math/libm-test-support.c
>>
>>   if (testing_ibm128)
>>     /* The documented accuracy of IBM long double division is 3ulp
>>        (see libgcc/config/rs6000/ibm-ldouble-format), so do not
>>        require better accuracy for libm functions that are exactly
>>        defined for other formats.  */
>>     max_valid_error = exact ? 3 : 16;
>>   else
>>     max_valid_error = exact ? 0 : 9;

Thanks. I didn't know that.

>>
>> And the jN implementation also has low precision for some inputs. With both constraints
>> I think 16ulps should be ok.
>
> There is also a loss of precision with different rounding modes in libgcc.
>
> There are currently 30 entries for ibm128 with ULP between 10 and 16 (without
> counting this patch). Maybe some of these should actually be marked as
> xfail-rounding:ibm128-libgcc instead.

If the loss of precision is due to the implementation then it seems like using
an XFAIL is not an accurate representation of the state.

> The only way to validate this is by compiling glibc with a libgcc that has a
> patch from Joseph. I have an up-to-date version of that patch in
> https://github.com/tuliom/gcc/commit/ca42479cae3c2b56651c3e97bb5eeaf24ca4bb61

Interesting patch, it looks similar to what we do in glibc for some of the
libm functions.
On 8/20/20 6:25 PM, Carlos O'Donell wrote:
> On 8/20/20 3:44 PM, Tulio Magno Quites Machado Filho wrote:
>> Adhemerval Zanella via Libc-alpha <libc-alpha@sourceware.org> writes:
>>
>>> On 20/08/2020 15:39, Carlos O'Donell wrote:
>>>> On 8/20/20 2:37 PM, Matheus Castanho via Libc-alpha wrote:
>>>>> Before this patch, the following tests were failing:
>>>>>
>>>>> ppc and ppc64:
>>>>> FAIL: math/test-ldouble-j0
>>>>>
>>>>> ppc64le:
>>>>> FAIL: math/test-ibm128-j0
>>>>> ---
>>>>>  sysdeps/powerpc/fpu/libm-test-ulps | 4 ++--
>>>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/sysdeps/powerpc/fpu/libm-test-ulps b/sysdeps/powerpc/fpu/libm-test-ulps
>>>>> index cd2a5fed45..0b82c3f107 100644
>>>>> --- a/sysdeps/powerpc/fpu/libm-test-ulps
>>>>> +++ b/sysdeps/powerpc/fpu/libm-test-ulps
>>>>> @@ -1317,13 +1317,13 @@ Function: "j0_downward":
>>>>>  double: 2
>>>>>  float: 4
>>>>>  float128: 4
>>>>> -ldouble: 11
>>>>> +ldouble: 12
>>>>>
>>>>>  Function: "j0_towardzero":
>>>>>  double: 5
>>>>>  float: 6
>>>>>  float128: 2
>>>>> -ldouble: 8
>>>>> +ldouble: 16
>>>>
>>>> We should not have ULPs higher than 9.
>>>>
>>>> I see Adhemerval added some 11 ULPs here for cexp.
>>>>
>>>> We should be able to achieve <= 9 ULPs on these algorithms, otherwise there are
>>>> compiler problems that need fixing?
>>>
>>> We are more forgiving for IBM long double due to its inherent precision issues:
>>>
>>> math/libm-test-support.c
>>>
>>>   if (testing_ibm128)
>>>     /* The documented accuracy of IBM long double division is 3ulp
>>>        (see libgcc/config/rs6000/ibm-ldouble-format), so do not
>>>        require better accuracy for libm functions that are exactly
>>>        defined for other formats.  */
>>>     max_valid_error = exact ? 3 : 16;
>>>   else
>>>     max_valid_error = exact ? 0 : 9;
>
> Thanks. I didn't know that.
>
>>>
>>> And the jN implementation also has low precision for some inputs. With both constraints
>>> I think 16ulps should be ok.
>>
>> There is also a loss of precision with different rounding modes in libgcc.
>>
>> There are currently 30 entries for ibm128 with ULP between 10 and 16 (without
>> counting this patch). Maybe some of these should actually be marked as
>> xfail-rounding:ibm128-libgcc instead.
>
> If the loss of precision is due to the implementation then it seems like using
> an XFAIL is not an accurate representation of the state.
>
>> The only way to validate this is by compiling glibc with a libgcc that has a
>> patch from Joseph. I have an up-to-date version of that patch in
>> https://github.com/tuliom/gcc/commit/ca42479cae3c2b56651c3e97bb5eeaf24ca4bb61
>
> Interesting patch, it looks similar to what we do in glibc for some of the
> libm functions.
>

What do you think we should do in this case? Looks like with the libgcc
patch Tulio mentioned we can actually calculate the correct ULPs for
ibm128, but we would not be able to get that same precision when
building with a regular GCC, which would still cause issues.

So should we:
1. Use max precision ULPs calculated with the patched libgcc?
   This would probably require adding xfail-rounding:ibm128-libgcc to
   several entries in auto-libm-test-in to guarantee tests pass with
   regular GCC.
2. Do (1) only for entries that have ULPs higher than a threshold (say,
   9 or 16)?
3. Apply the patch as-is?
4. Other?

--
Matheus Castanho
Matheus Castanho via Libc-alpha <libc-alpha@sourceware.org> writes:

> What do you think we should do in this case? Looks like with the libgcc
> patch Tulio mentioned we can actually calculate the correct ULPs for
> ibm128, but we would not be able to get that same precision when
> building with a regular GCC, which would still cause issues.
>
> So should we:
> 1. Use max precision ULPs calculated with the patched libgcc?
>    This would probably require adding xfail-rounding:ibm128-libgcc to
>    several entries in auto-libm-test-in to guarantee tests pass with
>    regular GCC.

I believe this is the best solution if the number of tests marked as xfail is
small, e.g. 100 out of ~8k from math/auto-libm-test-in.
However, if a high percentage of tests are xfail'ed, then I think we should
consider option 2.

> 2. Do (1) only for entries that have ULPs higher than a threshold (say,
>    9 or 16)?

Likewise, if we're able to keep maximum ULPs at 9 without marking too many tests
as xfail'ed, that's better.
Per the contents of sysdeps/powerpc/fpu/libm-test-ulps, this should be possible
and would not need to have a greater max_valid_error for inexact functions just
for ibm128.
On Mon, 31 Aug 2020, Tulio Magno Quites Machado Filho via Libc-alpha wrote:

> > 2. Do (1) only for entries that have ULPs higher than a threshold (say,
> >    9 or 16)?
>
> Likewise, if we're able to keep maximum ULPs at 9 without marking too many tests
> as xfail'ed, that's better.
> Per the contents of sysdeps/powerpc/fpu/libm-test-ulps, this should be possible
> and would not need to have a greater max_valid_error for inexact functions just
> for ibm128.

If the functions for different floating-point formats use similar
algorithms, the error may be a multiple of the error for the basic
arithmetic operations. Since the basic arithmetic operations for
ldbl-128ibm are less accurate than for IEEE formats, it seems reasonable
to allow larger errors for libm functions for that format as well.
Ideally the errors would be smaller than they are for some functions with
larger errors, but that might require algorithmic improvements.

The existing bounds of 9 or 16 ulps are empirical, based on what's seen
with functions where the issue is simply the accumulation of lots of
separate errors rather than algorithms with inherent numerical problems.
Hi,

We're seeing these powerpc test failures in our Fedora testing. Is it
possible to submit the original ulps patch and resolve the other concerns
going forward?

Thank you,
Patsy

On Mon, Aug 31, 2020 at 1:44 PM Tulio Magno Quites Machado Filho via Libc-alpha <libc-alpha@sourceware.org> wrote:

> Matheus Castanho via Libc-alpha <libc-alpha@sourceware.org> writes:
>
> > What do you think we should do in this case? Looks like with the libgcc
> > patch Tulio mentioned we can actually calculate the correct ULPs for
> > ibm128, but we would not be able to get that same precision when
> > building with a regular GCC, which would still cause issues.
> >
> > So should we:
> > 1. Use max precision ULPs calculated with the patched libgcc?
> >    This would probably require adding xfail-rounding:ibm128-libgcc to
> >    several entries in auto-libm-test-in to guarantee tests pass with
> >    regular GCC.
>
> I believe this is the best solution if the number of tests marked as xfail is
> small, e.g. 100 out of ~8k from math/auto-libm-test-in.
> However, if a high percentage of tests are xfail'ed, then I think we should
> consider option 2.
>
> > 2. Do (1) only for entries that have ULPs higher than a threshold (say,
> >    9 or 16)?
>
> Likewise, if we're able to keep maximum ULPs at 9 without marking too many tests
> as xfail'ed, that's better.
> Per the contents of sysdeps/powerpc/fpu/libm-test-ulps, this should be possible
> and would not need to have a greater max_valid_error for inexact functions just
> for ibm128.
>
> --
> Tulio Magno
>
diff --git a/sysdeps/powerpc/fpu/libm-test-ulps b/sysdeps/powerpc/fpu/libm-test-ulps
index cd2a5fed45..0b82c3f107 100644
--- a/sysdeps/powerpc/fpu/libm-test-ulps
+++ b/sysdeps/powerpc/fpu/libm-test-ulps
@@ -1317,13 +1317,13 @@ Function: "j0_downward":
 double: 2
 float: 4
 float128: 4
-ldouble: 11
+ldouble: 12
 
 Function: "j0_towardzero":
 double: 5
 float: 6
 float128: 2
-ldouble: 8
+ldouble: 16
 
 Function: "j0_upward":
 double: 4