Message ID | 20231023213659.3236496-3-adhemerval.zanella@linaro.org |
---|---|
State | New |
Headers | show |
Series | Fix fesetexcept/fesetexceptflag on powerpc and x86 | expand |
On Mon, 23 Oct 2023, Adhemerval Zanella wrote: > + if (temp.__control_word & temp.__status_word & excepts) > + /* Setting the exception flags may trigger a trap (at the next > + floating-point instruction, but that does not matter). > + ISO C 23 ?? 7.6.4.5 does not allow it. */ > + return -1; Don't you need to check (~temp.__control_word), as in patch 3, since the bits in the control word are set for masked exceptions, clear for trapping?
Adhemerval Zanella wrote: > diff --git a/math/test-fesetexcept-traps.c b/math/test-fesetexcept-traps.c > index 96f6c4752f..122c23eb7e 100644 > --- a/math/test-fesetexcept-traps.c > +++ b/math/test-fesetexcept-traps.c > @@ -43,6 +44,16 @@ do_test (void) > where setting the exception might result in traps the function should > return a nonzero value. */ > ret = fesetexcept (FE_ALL_EXCEPT); > + > + /* Execute some floating-point operations, since on some CPUs exceptions > + triggers a trap only at the next floating-point instruction. */ > + double a = 1.0; > + double b = a + a; > + math_force_eval (b); > + long double al = 1.0L; > + long double bl = al + al; I would mark the variables a, b, al, bl as 'volatile', otherwise it's too easy for GCC to do constant-folding and thus eliminate the floating-point operations. > diff --git a/sysdeps/i386/fpu/fesetexcept.c b/sysdeps/i386/fpu/fesetexcept.c > index 18949e982a..a4c70cd1d1 100644 > --- a/sysdeps/i386/fpu/fesetexcept.c > +++ b/sysdeps/i386/fpu/fesetexcept.c > @@ -17,15 +17,54 @@ > <https://www.gnu.org/licenses/>. */ > > #include <fenv.h> > +#include <ldsodefs.h> > > int > fesetexcept (int excepts) > { > + /* The flags can be set in the 387 unit or in the SSE unit. To set a flag, > + it is sufficient to do it in the SSE unit, because that is guaranteed to > + not trap. However, on i386 CPUs that have only a 387 unit, set the flags > + in the 387, as long as this cannot trap. */ > + > fenv_t temp; > > + excepts &= FE_ALL_EXCEPT; > + > __asm__ ("fnstenv %0" : "=m" (*&temp)); The variable 'temp' and this __asm__ statement can be moved to the 'else' branch, since in the case that an SSE unit is present, we don't need to touch the 387 unit at all. (Since the job here is to _set_ some exception flag bits.) > + if (CPU_FEATURE_USABLE (SSE)) > + { > + /* Clear relevant flags. */ > + temp.__status_word &= ~excepts; > + > + /* Store the new status word (along with the rest of the environment). */ > + __asm__ ("fldenv %0" : : "m" (*&temp)); > + These last two statements can be removed, since in this case, we don't need to touch the 387 unit at all. > + /* And now similarly for SSE. */ > + unsigned int mxcsr; > + __asm__ ("stmxcsr %0" : "=m" (*&mxcsr)); > + > + /* Set relevant flags. */ > + mxcsr |= excepts & FE_ALL_EXCEPT; No need for the ' & FE_ALL_EXCEPT' here, since it was already done at function entry. > + /* Put the new data in effect. */ > + __asm__ ("ldmxcsr %0" : : "m" (*&mxcsr)); > + } > + else > + { > + /* Clear or set relevant flags. */ > + temp.__status_word ^= temp.__status_word & excepts; This last statement is not right: It clears bits from temp.__status_word, but should *set* these bits instead. Change this to: /* Set relevant flags. */ temp.__status_word |= excepts; > + if (temp.__control_word & temp.__status_word & excepts) The temp.__status_word does not need to be tested here, since we just set all EXCEPTS bit in it just before. With Joseph's remark, this line becomes if ((~temp.__control_word) & excepts) > + /* Setting the exception flags may trigger a trap (at the next > + floating-point instruction, but that does not matter). > + ISO C 23 § 7.6.4.5 does not allow it. */ In this function, we need to reference § 7.6.4.4. > diff --git a/sysdeps/i386/fpu/math-tests-trap-force.h b/sysdeps/i386/fpu/math-tests-trap-force.h > new file mode 100644 > index 0000000000..d88229c271 > --- /dev/null > +++ b/sysdeps/i386/fpu/math-tests-trap-force.h > @@ -0,0 +1,29 @@ > +/* Configuration for math tests: support for setting exception flags > + without causing enabled traps. i686 version. > + Copyright (C) 2023 Free Software Foundation, Inc. > + This file is part of the GNU C Library. > + > + The GNU C Library is free software; you can redistribute it and/or > + modify it under the terms of the GNU Lesser General Public > + License as published by the Free Software Foundation; either > + version 2.1 of the License, or (at your option) any later version. > + > + The GNU C Library is distributed in the hope that it will be useful, > + but WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + Lesser General Public License for more details. > + > + You should have received a copy of the GNU Lesser General Public > + License along with the GNU C Library; if not, see > + <https://www.gnu.org/licenses/>. */ > + > +#ifndef I386_FPU_MATH_TESTS_TRAP_FORCE_H > +#define I386_FPU_MATH_TESTS_TRAP_FORCE_H 1 > + > +#include <cpu-features.h> > + > +/* Setting exception flags in FPSCR results in enabled traps for those > + exceptions being taken. */ The i386 does not have an FPSCR register. The exception flags are stored in the register that gdb calls 'fstat' instead. I would use the same name. (The Intel reference does not have a short name for this register; it calls it "FPU Status Register".) Bruno
On 23/10/23 21:17, Bruno Haible wrote: > Adhemerval Zanella wrote: >> diff --git a/math/test-fesetexcept-traps.c b/math/test-fesetexcept-traps.c >> index 96f6c4752f..122c23eb7e 100644 >> --- a/math/test-fesetexcept-traps.c >> +++ b/math/test-fesetexcept-traps.c >> @@ -43,6 +44,16 @@ do_test (void) >> where setting the exception might result in traps the function should >> return a nonzero value. */ >> ret = fesetexcept (FE_ALL_EXCEPT); >> + >> + /* Execute some floating-point operations, since on some CPUs exceptions >> + triggers a trap only at the next floating-point instruction. */ >> + double a = 1.0; >> + double b = a + a; >> + math_force_eval (b); >> + long double al = 1.0L; >> + long double bl = al + al; > > I would mark the variables a, b, al, bl as 'volatile', otherwise it's too > easy for GCC to do constant-folding and thus eliminate the floating-point > operations. It should not matter for i386, since it still generates a floating-point loading, but I agree that forcing the fp operation seems better. > >> diff --git a/sysdeps/i386/fpu/fesetexcept.c b/sysdeps/i386/fpu/fesetexcept.c >> index 18949e982a..a4c70cd1d1 100644 >> --- a/sysdeps/i386/fpu/fesetexcept.c >> +++ b/sysdeps/i386/fpu/fesetexcept.c >> @@ -17,15 +17,54 @@ >> <https://www.gnu.org/licenses/>. */ >> >> #include <fenv.h> >> +#include <ldsodefs.h> >> >> int >> fesetexcept (int excepts) >> { >> + /* The flags can be set in the 387 unit or in the SSE unit. To set a flag, >> + it is sufficient to do it in the SSE unit, because that is guaranteed to >> + not trap. However, on i386 CPUs that have only a 387 unit, set the flags >> + in the 387, as long as this cannot trap. */ >> + >> fenv_t temp; >> >> + excepts &= FE_ALL_EXCEPT; >> + >> __asm__ ("fnstenv %0" : "=m" (*&temp)); > > The variable 'temp' and this __asm__ statement can be moved to the 'else' > branch, since in the case that an SSE unit is present, we don't need to > touch the 387 unit at all. (Since the job here is to _set_ some exception > flag bits.) Indeed, I will change it. > >> + if (CPU_FEATURE_USABLE (SSE)) >> + { >> + /* Clear relevant flags. */ >> + temp.__status_word &= ~excepts; >> + >> + /* Store the new status word (along with the rest of the environment). */ >> + __asm__ ("fldenv %0" : : "m" (*&temp)); >> + > > These last two statements can be removed, since in this case, we don't need to > touch the 387 unit at all. Ack. > >> + /* And now similarly for SSE. */ >> + unsigned int mxcsr; >> + __asm__ ("stmxcsr %0" : "=m" (*&mxcsr)); >> + >> + /* Set relevant flags. */ >> + mxcsr |= excepts & FE_ALL_EXCEPT; > > No need for the ' & FE_ALL_EXCEPT' here, since it was already done at function > entry. Ack. > >> + /* Put the new data in effect. */ >> + __asm__ ("ldmxcsr %0" : : "m" (*&mxcsr)); >> + } >> + else >> + { >> + /* Clear or set relevant flags. */ >> + temp.__status_word ^= temp.__status_word & excepts; > > This last statement is not right: It clears bits from temp.__status_word, > but should *set* these bits instead. Change this to: > > /* Set relevant flags. */ > temp.__status_word |= excepts; Indeed, I think I missed it because I don't have some old chip that actually stress it. I will check if qemu-user can emulated one. > >> + if (temp.__control_word & temp.__status_word & excepts) > > The temp.__status_word does not need to be tested here, since we just > set all EXCEPTS bit in it just before. With Joseph's remark, this line > becomes > > if ((~temp.__control_word) & excepts) Ack. > >> + /* Setting the exception flags may trigger a trap (at the next >> + floating-point instruction, but that does not matter). >> + ISO C 23 § 7.6.4.5 does not allow it. */ > > In this function, we need to reference § 7.6.4.4. Ack (and I will remove § to avoid non ascii characteres). > >> diff --git a/sysdeps/i386/fpu/math-tests-trap-force.h b/sysdeps/i386/fpu/math-tests-trap-force.h >> new file mode 100644 >> index 0000000000..d88229c271 >> --- /dev/null >> +++ b/sysdeps/i386/fpu/math-tests-trap-force.h >> @@ -0,0 +1,29 @@ >> +/* Configuration for math tests: support for setting exception flags >> + without causing enabled traps. i686 version. >> + Copyright (C) 2023 Free Software Foundation, Inc. >> + This file is part of the GNU C Library. >> + >> + The GNU C Library is free software; you can redistribute it and/or >> + modify it under the terms of the GNU Lesser General Public >> + License as published by the Free Software Foundation; either >> + version 2.1 of the License, or (at your option) any later version. >> + >> + The GNU C Library is distributed in the hope that it will be useful, >> + but WITHOUT ANY WARRANTY; without even the implied warranty of >> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU >> + Lesser General Public License for more details. >> + >> + You should have received a copy of the GNU Lesser General Public >> + License along with the GNU C Library; if not, see >> + <https://www.gnu.org/licenses/>. */ >> + >> +#ifndef I386_FPU_MATH_TESTS_TRAP_FORCE_H >> +#define I386_FPU_MATH_TESTS_TRAP_FORCE_H 1 >> + >> +#include <cpu-features.h> >> + >> +/* Setting exception flags in FPSCR results in enabled traps for those >> + exceptions being taken. */ > > The i386 does not have an FPSCR register. The exception flags are stored > in the register that gdb calls 'fstat' instead. I would use the same name. > (The Intel reference does not have a short name for this register; it > calls it "FPU Status Register".) > Ack. > Bruno > > >
Adhemerval Zanella Netto wrote: > > In this function, we need to reference § 7.6.4.4. > > Ack (and I will remove § to avoid non ascii characteres). We can have UTF-8 encoded non-ASCII characters in the source code comments nowadays. Such as in glibc/include/idx.h, glibc/string/strverscmp.c, etc. Bruno
diff --git a/math/test-fesetexcept-traps.c b/math/test-fesetexcept-traps.c index 96f6c4752f..122c23eb7e 100644 --- a/math/test-fesetexcept-traps.c +++ b/math/test-fesetexcept-traps.c @@ -19,6 +19,7 @@ #include <fenv.h> #include <stdio.h> #include <math-tests.h> +#include <math-barriers.h> static int do_test (void) @@ -43,6 +44,16 @@ do_test (void) where setting the exception might result in traps the function should return a nonzero value. */ ret = fesetexcept (FE_ALL_EXCEPT); + + /* Execute some floating-point operations, since on some CPUs exceptions + triggers a trap only at the next floating-point instruction. */ + double a = 1.0; + double b = a + a; + math_force_eval (b); + long double al = 1.0L; + long double bl = al + al; + math_force_eval (bl); + if (ret == 0) puts ("fesetexcept (FE_ALL_EXCEPT) succeeded"); else if (!EXCEPTION_SET_FORCES_TRAP) diff --git a/sysdeps/i386/fpu/fesetexcept.c b/sysdeps/i386/fpu/fesetexcept.c index 18949e982a..a4c70cd1d1 100644 --- a/sysdeps/i386/fpu/fesetexcept.c +++ b/sysdeps/i386/fpu/fesetexcept.c @@ -17,15 +17,54 @@ <https://www.gnu.org/licenses/>. */ #include <fenv.h> +#include <ldsodefs.h> int fesetexcept (int excepts) { + /* The flags can be set in the 387 unit or in the SSE unit. To set a flag, + it is sufficient to do it in the SSE unit, because that is guaranteed to + not trap. However, on i386 CPUs that have only a 387 unit, set the flags + in the 387, as long as this cannot trap. */ + fenv_t temp; + excepts &= FE_ALL_EXCEPT; + __asm__ ("fnstenv %0" : "=m" (*&temp)); - temp.__status_word |= excepts & FE_ALL_EXCEPT; - __asm__ ("fldenv %0" : : "m" (*&temp)); + + if (CPU_FEATURE_USABLE (SSE)) + { + /* Clear relevant flags. */ + temp.__status_word &= ~excepts; + + /* Store the new status word (along with the rest of the environment). */ + __asm__ ("fldenv %0" : : "m" (*&temp)); + + /* And now similarly for SSE. */ + unsigned int mxcsr; + __asm__ ("stmxcsr %0" : "=m" (*&mxcsr)); + + /* Set relevant flags. */ + mxcsr |= excepts & FE_ALL_EXCEPT; + + /* Put the new data in effect. */ + __asm__ ("ldmxcsr %0" : : "m" (*&mxcsr)); + } + else + { + /* Clear or set relevant flags. */ + temp.__status_word ^= temp.__status_word & excepts; + + if (temp.__control_word & temp.__status_word & excepts) + /* Setting the exception flags may trigger a trap (at the next + floating-point instruction, but that does not matter). + ISO C 23 § 7.6.4.5 does not allow it. */ + return -1; + + /* Store the new status word (along with the rest of the environment). */ + __asm__ ("fldenv %0" : : "m" (*&temp)); + } return 0; } diff --git a/sysdeps/i386/fpu/math-tests-trap-force.h b/sysdeps/i386/fpu/math-tests-trap-force.h new file mode 100644 index 0000000000..d88229c271 --- /dev/null +++ b/sysdeps/i386/fpu/math-tests-trap-force.h @@ -0,0 +1,29 @@ +/* Configuration for math tests: support for setting exception flags + without causing enabled traps. i686 version. + Copyright (C) 2023 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <https://www.gnu.org/licenses/>. */ + +#ifndef I386_FPU_MATH_TESTS_TRAP_FORCE_H +#define I386_FPU_MATH_TESTS_TRAP_FORCE_H 1 + +#include <cpu-features.h> + +/* Setting exception flags in FPSCR results in enabled traps for those + exceptions being taken. */ +#define EXCEPTION_SET_FORCES_TRAP (CPU_FEATURE_USABLE (SSE)) + +#endif /* math-tests-trap-force.h. */