Message ID | 000e01cfeee7$99383f10$cba8bd30$@com |
---|---|
State | New |
Headers | show |
On 10/23/2014 01:34 PM, Wilco Dijkstra wrote: > Cleanup feclearexcept to use the same logic as the ARM version. No functional changes. > > ChangeLog: > 2014-10-23 Wilco Dijkstra <wdijkstr@arm.com> > > * sysdeps/aarch64/fpu/fclrexcpt.c (feclearexcept): > Simplify logic. Looks good to me. Reviewed-by: Carlos O'Donell <carlos@redhat.com> > --- > sysdeps/aarch64/fpu/fclrexcpt.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/sysdeps/aarch64/fpu/fclrexcpt.c b/sysdeps/aarch64/fpu/fclrexcpt.c > index b24f0ff..4471373 100644 > --- a/sysdeps/aarch64/fpu/fclrexcpt.c > +++ b/sysdeps/aarch64/fpu/fclrexcpt.c > @@ -28,7 +28,7 @@ feclearexcept (int excepts) > excepts &= FE_ALL_EXCEPT; > > _FPU_GETFPSR (fpsr); > - fpsr_new = (fpsr & ~FE_ALL_EXCEPT) | (fpsr & FE_ALL_EXCEPT & ~excepts); > + fpsr_new = fpsr & ~excepts; OK. The logic does seem to collapse down nicely. No need to assembly the final fpsr_new from the two halves. Is the generated code better? > > if (fpsr != fpsr_new) > _FPU_SETFPSR (fpsr_new); > Cheers, Carlos.
> Carlos O'Donell wrote: > > --- > > sysdeps/aarch64/fpu/fclrexcpt.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/sysdeps/aarch64/fpu/fclrexcpt.c b/sysdeps/aarch64/fpu/fclrexcpt.c > > index b24f0ff..4471373 100644 > > --- a/sysdeps/aarch64/fpu/fclrexcpt.c > > +++ b/sysdeps/aarch64/fpu/fclrexcpt.c > > @@ -28,7 +28,7 @@ feclearexcept (int excepts) > > excepts &= FE_ALL_EXCEPT; > > > > _FPU_GETFPSR (fpsr); > > - fpsr_new = (fpsr & ~FE_ALL_EXCEPT) | (fpsr & FE_ALL_EXCEPT & ~excepts); > > + fpsr_new = fpsr & ~excepts; > > OK. > > The logic does seem to collapse down nicely. No need to assembly the final > fpsr_new from the two halves. > > Is the generated code better? Absolutely - it saves 3 instructions. GCC understands ((X & ~Y) | (X & Y)) == X, but it doesn't do (X & ~Y) | (X & Y & Z) -> X & (Z | ~Y). In any case the new version is much easier to understand as you no longer have to figure out what it is trying to achieve! Thanks for the link btw, I know what to do in the future for trivial patches. Patch 1-4 have been committed. Wilco
On 10/24/2014 11:27 AM, Wilco Dijkstra wrote: >> Carlos O'Donell wrote: >>> --- >>> sysdeps/aarch64/fpu/fclrexcpt.c | 2 +- >>> 1 file changed, 1 insertion(+), 1 deletion(-) >>> >>> diff --git a/sysdeps/aarch64/fpu/fclrexcpt.c b/sysdeps/aarch64/fpu/fclrexcpt.c >>> index b24f0ff..4471373 100644 >>> --- a/sysdeps/aarch64/fpu/fclrexcpt.c >>> +++ b/sysdeps/aarch64/fpu/fclrexcpt.c >>> @@ -28,7 +28,7 @@ feclearexcept (int excepts) >>> excepts &= FE_ALL_EXCEPT; >>> >>> _FPU_GETFPSR (fpsr); >>> - fpsr_new = (fpsr & ~FE_ALL_EXCEPT) | (fpsr & FE_ALL_EXCEPT & ~excepts); >>> + fpsr_new = fpsr & ~excepts; >> >> OK. >> >> The logic does seem to collapse down nicely. No need to assembly the final >> fpsr_new from the two halves. >> >> Is the generated code better? > > Absolutely - it saves 3 instructions. GCC understands ((X & ~Y) | (X & Y)) == X, > but it doesn't do (X & ~Y) | (X & Y & Z) -> X & (Z | ~Y). In any case the new > version is much easier to understand as you no longer have to figure out what > it is trying to achieve! Agreed. Thanks for verifying. > Thanks for the link btw, I know what to do in the future for trivial patches. > Patch 1-4 have been committed. My pleasure. I want to make developing for glibc as painless as possible, but no less ;-) Cheers, Carlos.
diff --git a/sysdeps/aarch64/fpu/fclrexcpt.c b/sysdeps/aarch64/fpu/fclrexcpt.c index b24f0ff..4471373 100644 --- a/sysdeps/aarch64/fpu/fclrexcpt.c +++ b/sysdeps/aarch64/fpu/fclrexcpt.c @@ -28,7 +28,7 @@ feclearexcept (int excepts) excepts &= FE_ALL_EXCEPT; _FPU_GETFPSR (fpsr); - fpsr_new = (fpsr & ~FE_ALL_EXCEPT) | (fpsr & FE_ALL_EXCEPT & ~excepts); + fpsr_new = fpsr & ~excepts; if (fpsr != fpsr_new) _FPU_SETFPSR (fpsr_new);