Message ID | df790a93-52bf-761a-6586-78d540934f96@linux.ibm.com |
---|---|
State | New |
Headers | show |
Series | [v6,rs6000] Implemented f[min/max]_optab by xs[min/max]dp [PR103605] | expand |
Hi, Gentle ping this: https://gcc.gnu.org/pipermail/gcc-patches/2022-June/597158.html Thanks. On 24/6/2022 上午 10:02, HAO CHEN GUI wrote: > Hi, > This patch implements optab f[min/max]_optab by xs[min/max]dp on rs6000. > Tests show that outputs of xs[min/max]dp are consistent with the standard > of C99 fmin/max. > > This patch also binds __builtin_vsx_xs[min/max]dp to fmin/max instead > of smin/max. So the builtins always generate xs[min/max]dp on all > platforms. > > Bootstrapped and tested on ppc64 Linux BE and LE with no regressions. > Is this okay for trunk? Any recommendations? Thanks a lot. > > ChangeLog > 2022-06-24 Haochen Gui <guihaoc@linux.ibm.com> > > gcc/ > PR target/103605 > * config/rs6000/rs6000.md (FMINMAX): New. > (minmax_op): New. > (f<minmax_op><mode>3): New pattern by UNSPEC_FMAX and UNSPEC_FMIN. > * config/rs6000/rs6000-builtins.def (__builtin_vsx_xsmaxdp): Set > pattern to fmaxdf3. > (__builtin_vsx_xsmindp): Set pattern to fmindf3. > > gcc/testsuite/ > PR target/103605 > * gcc.dg/powerpc/pr103605.c: New. > > > patch.diff > diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def > index f4a9f24bcc5..8b735493b40 100644 > --- a/gcc/config/rs6000/rs6000-builtins.def > +++ b/gcc/config/rs6000/rs6000-builtins.def > @@ -1613,10 +1613,10 @@ > XSCVSPDP vsx_xscvspdp {} > > const double __builtin_vsx_xsmaxdp (double, double); > - XSMAXDP smaxdf3 {} > + XSMAXDP fmaxdf3 {} > > const double __builtin_vsx_xsmindp (double, double); > - XSMINDP smindf3 {} > + XSMINDP fmindf3 {} > > const double __builtin_vsx_xsrdpi (double); > XSRDPI vsx_xsrdpi {} > diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md > index bf85baa5370..ae0dd98f0f9 100644 > --- a/gcc/config/rs6000/rs6000.md > +++ b/gcc/config/rs6000/rs6000.md > @@ -158,6 +158,8 @@ (define_c_enum "unspec" > UNSPEC_HASHCHK > UNSPEC_XXSPLTIDP_CONST > UNSPEC_XXSPLTIW_CONST > + UNSPEC_FMAX > + UNSPEC_FMIN > ]) > > ;; > @@ -5341,6 +5343,22 @@ (define_insn_and_split "*s<minmax><mode>3_fpr" > DONE; > }) > > + > +(define_int_iterator FMINMAX [UNSPEC_FMAX UNSPEC_FMIN]) > + > +(define_int_attr minmax_op [(UNSPEC_FMAX "max") > + (UNSPEC_FMIN "min")]) > + > +(define_insn "f<minmax_op><mode>3" > + [(set (match_operand:SFDF 0 "vsx_register_operand" "=wa") > + (unspec:SFDF [(match_operand:SFDF 1 "vsx_register_operand" "wa") > + (match_operand:SFDF 2 "vsx_register_operand" "wa")] > + FMINMAX))] > + "TARGET_VSX && !flag_finite_math_only" > + "xs<minmax_op>dp %x0,%x1,%x2" > + [(set_attr "type" "fp")] > +) > + > (define_expand "mov<mode>cc" > [(set (match_operand:GPR 0 "gpc_reg_operand") > (if_then_else:GPR (match_operand 1 "comparison_operator") > diff --git a/gcc/testsuite/gcc.target/powerpc/pr103605.c b/gcc/testsuite/gcc.target/powerpc/pr103605.c > new file mode 100644 > index 00000000000..1c938d40e61 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/pr103605.c > @@ -0,0 +1,37 @@ > +/* { dg-do compile } */ > +/* { dg-require-effective-target powerpc_vsx_ok } */ > +/* { dg-options "-O2 -mvsx" } */ > +/* { dg-final { scan-assembler-times {\mxsmaxdp\M} 3 } } */ > +/* { dg-final { scan-assembler-times {\mxsmindp\M} 3 } } */ > + > +#include <math.h> > + > +double test1 (double d0, double d1) > +{ > + return fmin (d0, d1); > +} > + > +float test2 (float d0, float d1) > +{ > + return fmin (d0, d1); > +} > + > +double test3 (double d0, double d1) > +{ > + return fmax (d0, d1); > +} > + > +float test4 (float d0, float d1) > +{ > + return fmax (d0, d1); > +} > + > +double test5 (double d0, double d1) > +{ > + return __builtin_vsx_xsmindp (d0, d1); > +} > + > +double test6 (double d0, double d1) > +{ > + return __builtin_vsx_xsmaxdp (d0, d1); > +}
Hi, Gentle ping this: https://gcc.gnu.org/pipermail/gcc-patches/2022-June/597158.html Thanks. On 4/7/2022 下午 2:32, HAO CHEN GUI wrote: > Hi, > Gentle ping this: > https://gcc.gnu.org/pipermail/gcc-patches/2022-June/597158.html > Thanks. > > On 24/6/2022 上午 10:02, HAO CHEN GUI wrote: >> Hi, >> This patch implements optab f[min/max]_optab by xs[min/max]dp on rs6000. >> Tests show that outputs of xs[min/max]dp are consistent with the standard >> of C99 fmin/max. >> >> This patch also binds __builtin_vsx_xs[min/max]dp to fmin/max instead >> of smin/max. So the builtins always generate xs[min/max]dp on all >> platforms. >> >> Bootstrapped and tested on ppc64 Linux BE and LE with no regressions. >> Is this okay for trunk? Any recommendations? Thanks a lot. >> >> ChangeLog >> 2022-06-24 Haochen Gui <guihaoc@linux.ibm.com> >> >> gcc/ >> PR target/103605 >> * config/rs6000/rs6000.md (FMINMAX): New. >> (minmax_op): New. >> (f<minmax_op><mode>3): New pattern by UNSPEC_FMAX and UNSPEC_FMIN. >> * config/rs6000/rs6000-builtins.def (__builtin_vsx_xsmaxdp): Set >> pattern to fmaxdf3. >> (__builtin_vsx_xsmindp): Set pattern to fmindf3. >> >> gcc/testsuite/ >> PR target/103605 >> * gcc.dg/powerpc/pr103605.c: New. >> >> >> patch.diff >> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def >> index f4a9f24bcc5..8b735493b40 100644 >> --- a/gcc/config/rs6000/rs6000-builtins.def >> +++ b/gcc/config/rs6000/rs6000-builtins.def >> @@ -1613,10 +1613,10 @@ >> XSCVSPDP vsx_xscvspdp {} >> >> const double __builtin_vsx_xsmaxdp (double, double); >> - XSMAXDP smaxdf3 {} >> + XSMAXDP fmaxdf3 {} >> >> const double __builtin_vsx_xsmindp (double, double); >> - XSMINDP smindf3 {} >> + XSMINDP fmindf3 {} >> >> const double __builtin_vsx_xsrdpi (double); >> XSRDPI vsx_xsrdpi {} >> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md >> index bf85baa5370..ae0dd98f0f9 100644 >> --- a/gcc/config/rs6000/rs6000.md >> +++ b/gcc/config/rs6000/rs6000.md >> @@ -158,6 +158,8 @@ (define_c_enum "unspec" >> UNSPEC_HASHCHK >> UNSPEC_XXSPLTIDP_CONST >> UNSPEC_XXSPLTIW_CONST >> + UNSPEC_FMAX >> + UNSPEC_FMIN >> ]) >> >> ;; >> @@ -5341,6 +5343,22 @@ (define_insn_and_split "*s<minmax><mode>3_fpr" >> DONE; >> }) >> >> + >> +(define_int_iterator FMINMAX [UNSPEC_FMAX UNSPEC_FMIN]) >> + >> +(define_int_attr minmax_op [(UNSPEC_FMAX "max") >> + (UNSPEC_FMIN "min")]) >> + >> +(define_insn "f<minmax_op><mode>3" >> + [(set (match_operand:SFDF 0 "vsx_register_operand" "=wa") >> + (unspec:SFDF [(match_operand:SFDF 1 "vsx_register_operand" "wa") >> + (match_operand:SFDF 2 "vsx_register_operand" "wa")] >> + FMINMAX))] >> + "TARGET_VSX && !flag_finite_math_only" >> + "xs<minmax_op>dp %x0,%x1,%x2" >> + [(set_attr "type" "fp")] >> +) >> + >> (define_expand "mov<mode>cc" >> [(set (match_operand:GPR 0 "gpc_reg_operand") >> (if_then_else:GPR (match_operand 1 "comparison_operator") >> diff --git a/gcc/testsuite/gcc.target/powerpc/pr103605.c b/gcc/testsuite/gcc.target/powerpc/pr103605.c >> new file mode 100644 >> index 00000000000..1c938d40e61 >> --- /dev/null >> +++ b/gcc/testsuite/gcc.target/powerpc/pr103605.c >> @@ -0,0 +1,37 @@ >> +/* { dg-do compile } */ >> +/* { dg-require-effective-target powerpc_vsx_ok } */ >> +/* { dg-options "-O2 -mvsx" } */ >> +/* { dg-final { scan-assembler-times {\mxsmaxdp\M} 3 } } */ >> +/* { dg-final { scan-assembler-times {\mxsmindp\M} 3 } } */ >> + >> +#include <math.h> >> + >> +double test1 (double d0, double d1) >> +{ >> + return fmin (d0, d1); >> +} >> + >> +float test2 (float d0, float d1) >> +{ >> + return fmin (d0, d1); >> +} >> + >> +double test3 (double d0, double d1) >> +{ >> + return fmax (d0, d1); >> +} >> + >> +float test4 (float d0, float d1) >> +{ >> + return fmax (d0, d1); >> +} >> + >> +double test5 (double d0, double d1) >> +{ >> + return __builtin_vsx_xsmindp (d0, d1); >> +} >> + >> +double test6 (double d0, double d1) >> +{ >> + return __builtin_vsx_xsmaxdp (d0, d1); >> +}
Hi, Gentle ping this: https://gcc.gnu.org/pipermail/gcc-patches/2022-June/597158.html Thanks. On 1/8/2022 上午 10:03, HAO CHEN GUI wrote: > Hi, > Gentle ping this: > https://gcc.gnu.org/pipermail/gcc-patches/2022-June/597158.html > Thanks. > > > On 4/7/2022 下午 2:32, HAO CHEN GUI wrote: >> Hi, >> Gentle ping this: >> https://gcc.gnu.org/pipermail/gcc-patches/2022-June/597158.html >> Thanks. >> >> On 24/6/2022 上午 10:02, HAO CHEN GUI wrote: >>> Hi, >>> This patch implements optab f[min/max]_optab by xs[min/max]dp on rs6000. >>> Tests show that outputs of xs[min/max]dp are consistent with the standard >>> of C99 fmin/max. >>> >>> This patch also binds __builtin_vsx_xs[min/max]dp to fmin/max instead >>> of smin/max. So the builtins always generate xs[min/max]dp on all >>> platforms. >>> >>> Bootstrapped and tested on ppc64 Linux BE and LE with no regressions. >>> Is this okay for trunk? Any recommendations? Thanks a lot. >>> >>> ChangeLog >>> 2022-06-24 Haochen Gui <guihaoc@linux.ibm.com> >>> >>> gcc/ >>> PR target/103605 >>> * config/rs6000/rs6000.md (FMINMAX): New. >>> (minmax_op): New. >>> (f<minmax_op><mode>3): New pattern by UNSPEC_FMAX and UNSPEC_FMIN. >>> * config/rs6000/rs6000-builtins.def (__builtin_vsx_xsmaxdp): Set >>> pattern to fmaxdf3. >>> (__builtin_vsx_xsmindp): Set pattern to fmindf3. >>> >>> gcc/testsuite/ >>> PR target/103605 >>> * gcc.dg/powerpc/pr103605.c: New. >>> >>> >>> patch.diff >>> diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def >>> index f4a9f24bcc5..8b735493b40 100644 >>> --- a/gcc/config/rs6000/rs6000-builtins.def >>> +++ b/gcc/config/rs6000/rs6000-builtins.def >>> @@ -1613,10 +1613,10 @@ >>> XSCVSPDP vsx_xscvspdp {} >>> >>> const double __builtin_vsx_xsmaxdp (double, double); >>> - XSMAXDP smaxdf3 {} >>> + XSMAXDP fmaxdf3 {} >>> >>> const double __builtin_vsx_xsmindp (double, double); >>> - XSMINDP smindf3 {} >>> + XSMINDP fmindf3 {} >>> >>> const double __builtin_vsx_xsrdpi (double); >>> XSRDPI vsx_xsrdpi {} >>> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md >>> index bf85baa5370..ae0dd98f0f9 100644 >>> --- a/gcc/config/rs6000/rs6000.md >>> +++ b/gcc/config/rs6000/rs6000.md >>> @@ -158,6 +158,8 @@ (define_c_enum "unspec" >>> UNSPEC_HASHCHK >>> UNSPEC_XXSPLTIDP_CONST >>> UNSPEC_XXSPLTIW_CONST >>> + UNSPEC_FMAX >>> + UNSPEC_FMIN >>> ]) >>> >>> ;; >>> @@ -5341,6 +5343,22 @@ (define_insn_and_split "*s<minmax><mode>3_fpr" >>> DONE; >>> }) >>> >>> + >>> +(define_int_iterator FMINMAX [UNSPEC_FMAX UNSPEC_FMIN]) >>> + >>> +(define_int_attr minmax_op [(UNSPEC_FMAX "max") >>> + (UNSPEC_FMIN "min")]) >>> + >>> +(define_insn "f<minmax_op><mode>3" >>> + [(set (match_operand:SFDF 0 "vsx_register_operand" "=wa") >>> + (unspec:SFDF [(match_operand:SFDF 1 "vsx_register_operand" "wa") >>> + (match_operand:SFDF 2 "vsx_register_operand" "wa")] >>> + FMINMAX))] >>> + "TARGET_VSX && !flag_finite_math_only" >>> + "xs<minmax_op>dp %x0,%x1,%x2" >>> + [(set_attr "type" "fp")] >>> +) >>> + >>> (define_expand "mov<mode>cc" >>> [(set (match_operand:GPR 0 "gpc_reg_operand") >>> (if_then_else:GPR (match_operand 1 "comparison_operator") >>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr103605.c b/gcc/testsuite/gcc.target/powerpc/pr103605.c >>> new file mode 100644 >>> index 00000000000..1c938d40e61 >>> --- /dev/null >>> +++ b/gcc/testsuite/gcc.target/powerpc/pr103605.c >>> @@ -0,0 +1,37 @@ >>> +/* { dg-do compile } */ >>> +/* { dg-require-effective-target powerpc_vsx_ok } */ >>> +/* { dg-options "-O2 -mvsx" } */ >>> +/* { dg-final { scan-assembler-times {\mxsmaxdp\M} 3 } } */ >>> +/* { dg-final { scan-assembler-times {\mxsmindp\M} 3 } } */ >>> + >>> +#include <math.h> >>> + >>> +double test1 (double d0, double d1) >>> +{ >>> + return fmin (d0, d1); >>> +} >>> + >>> +float test2 (float d0, float d1) >>> +{ >>> + return fmin (d0, d1); >>> +} >>> + >>> +double test3 (double d0, double d1) >>> +{ >>> + return fmax (d0, d1); >>> +} >>> + >>> +float test4 (float d0, float d1) >>> +{ >>> + return fmax (d0, d1); >>> +} >>> + >>> +double test5 (double d0, double d1) >>> +{ >>> + return __builtin_vsx_xsmindp (d0, d1); >>> +} >>> + >>> +double test6 (double d0, double d1) >>> +{ >>> + return __builtin_vsx_xsmaxdp (d0, d1); >>> +}
Hi Haochen, on 2022/6/24 10:02, HAO CHEN GUI wrote: > Hi, > This patch implements optab f[min/max]_optab by xs[min/max]dp on rs6000. > Tests show that outputs of xs[min/max]dp are consistent with the standard > of C99 fmin/max. > > This patch also binds __builtin_vsx_xs[min/max]dp to fmin/max instead > of smin/max. So the builtins always generate xs[min/max]dp on all > platforms. > > Bootstrapped and tested on ppc64 Linux BE and LE with no regressions. > Is this okay for trunk? Any recommendations? Thanks a lot. > > ChangeLog > 2022-06-24 Haochen Gui <guihaoc@linux.ibm.com> > > gcc/ > PR target/103605 > * config/rs6000/rs6000.md (FMINMAX): New. > (minmax_op): New. > (f<minmax_op><mode>3): New pattern by UNSPEC_FMAX and UNSPEC_FMIN. Nit: here miss UNSPEC_FMAX and UNSPEC_FMIN. > * config/rs6000/rs6000-builtins.def (__builtin_vsx_xsmaxdp): Set > pattern to fmaxdf3. > (__builtin_vsx_xsmindp): Set pattern to fmindf3. > > gcc/testsuite/ > PR target/103605 > * gcc.dg/powerpc/pr103605.c: New. > > > patch.diff > diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def > index f4a9f24bcc5..8b735493b40 100644 > --- a/gcc/config/rs6000/rs6000-builtins.def > +++ b/gcc/config/rs6000/rs6000-builtins.def > @@ -1613,10 +1613,10 @@ > XSCVSPDP vsx_xscvspdp {} > > const double __builtin_vsx_xsmaxdp (double, double); > - XSMAXDP smaxdf3 {} > + XSMAXDP fmaxdf3 {} > > const double __builtin_vsx_xsmindp (double, double); > - XSMINDP smindf3 {} > + XSMINDP fmindf3 {} > > const double __builtin_vsx_xsrdpi (double); > XSRDPI vsx_xsrdpi {} > diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md > index bf85baa5370..ae0dd98f0f9 100644 > --- a/gcc/config/rs6000/rs6000.md > +++ b/gcc/config/rs6000/rs6000.md > @@ -158,6 +158,8 @@ (define_c_enum "unspec" > UNSPEC_HASHCHK > UNSPEC_XXSPLTIDP_CONST > UNSPEC_XXSPLTIW_CONST > + UNSPEC_FMAX > + UNSPEC_FMIN > ]) > > ;; > @@ -5341,6 +5343,22 @@ (define_insn_and_split "*s<minmax><mode>3_fpr" > DONE; > }) > > + > +(define_int_iterator FMINMAX [UNSPEC_FMAX UNSPEC_FMIN]) > + > +(define_int_attr minmax_op [(UNSPEC_FMAX "max") > + (UNSPEC_FMIN "min")]) > + > +(define_insn "f<minmax_op><mode>3" > + [(set (match_operand:SFDF 0 "vsx_register_operand" "=wa") > + (unspec:SFDF [(match_operand:SFDF 1 "vsx_register_operand" "wa") > + (match_operand:SFDF 2 "vsx_register_operand" "wa")] > + FMINMAX))] > + "TARGET_VSX && !flag_finite_math_only" > + "xs<minmax_op>dp %x0,%x1,%x2" > + [(set_attr "type" "fp")] > +) > + > (define_expand "mov<mode>cc" > [(set (match_operand:GPR 0 "gpc_reg_operand") > (if_then_else:GPR (match_operand 1 "comparison_operator") > diff --git a/gcc/testsuite/gcc.target/powerpc/pr103605.c b/gcc/testsuite/gcc.target/powerpc/pr103605.c > new file mode 100644 > index 00000000000..1c938d40e61 > --- /dev/null > +++ b/gcc/testsuite/gcc.target/powerpc/pr103605.c > @@ -0,0 +1,37 @@ > +/* { dg-do compile } */ Nit: This dg-do line isn't needed. OK with or without two nits fixed. Thanks! BR, Kewen
Hi! On Fri, Jun 24, 2022 at 10:02:19AM +0800, HAO CHEN GUI wrote: > This patch also binds __builtin_vsx_xs[min/max]dp to fmin/max instead > of smin/max. So the builtins always generate xs[min/max]dp on all > platforms. But how does this not blow up with -ffast-math? In the other direction I am worried that the unspecs will degrade performance (relative to smin/smax) when -ffast-math *is* active (and this new builtin code and pattern doesn't blow up). I still think we should get RTL codes for this, to have access to proper floating point min/max semantics always and everywhere. "fmin" and "fmax" seem to be good names :-) Segher
on 2022/9/22 05:56, Segher Boessenkool wrote: > Hi! > > On Fri, Jun 24, 2022 at 10:02:19AM +0800, HAO CHEN GUI wrote: >> This patch also binds __builtin_vsx_xs[min/max]dp to fmin/max instead >> of smin/max. So the builtins always generate xs[min/max]dp on all >> platforms. > > But how does this not blow up with -ffast-math? Indeed. Since it guards with "TARGET_VSX && !flag_finite_math_only", the bifs seem to cause ICE at -ffast-math. Haochen, could you double check it? > > In the other direction I am worried that the unspecs will degrade > performance (relative to smin/smax) when -ffast-math *is* active (and > this new builtin code and pattern doesn't blow up). For fmin/fmax it would be fine, since they are transformed to {MAX,MIN} EXPR in middle end, and yes, it can degrade for the bifs, although IMHO the previous expansion to smin/smax contradicts with the bif names (users expect to map them to xs{min,max}dp than others). > > I still think we should get RTL codes for this, to have access to proper > floating point min/max semantics always and everywhere. "fmin" and > "fmax" seem to be good names :-) It would be good, especially if we have observed some uses of these bifs and further opportunities around them. :) BR, Kewen
Hi Kewen & Segher, Thanks so much for your review comments. On 22/9/2022 上午 10:28, Kewen.Lin wrote: > on 2022/9/22 05:56, Segher Boessenkool wrote: >> Hi! >> >> On Fri, Jun 24, 2022 at 10:02:19AM +0800, HAO CHEN GUI wrote: >>> This patch also binds __builtin_vsx_xs[min/max]dp to fmin/max instead >>> of smin/max. So the builtins always generate xs[min/max]dp on all >>> platforms. >> >> But how does this not blow up with -ffast-math? > > Indeed. Since it guards with "TARGET_VSX && !flag_finite_math_only", > the bifs seem to cause ICE at -ffast-math. > > Haochen, could you double check it? I tested it with "-ffast-math". fmin/max functions are converted to MIN/MAX_EXPR in gimple lower pass. But the built-ins are not and hit the ICE. I thought the built-ins are folded to MIN/MAX_EXPR like vec_ versions' when fast-math is set. In fact they're not. Sorry for that. I made a patch to fold these two built-ins to MIN/MAX_EXPR when fast-math is set. Then the built-ins are converted to MIN/MAX_EXPR and expanded to smin/max. Thanks for pointing out the problem! > >> >> In the other direction I am worried that the unspecs will degrade >> performance (relative to smin/smax) when -ffast-math *is* active (and >> this new builtin code and pattern doesn't blow up). > > For fmin/fmax it would be fine, since they are transformed to {MAX,MIN} > EXPR in middle end, and yes, it can degrade for the bifs, although IMHO > the previous expansion to smin/smax contradicts with the bif names (users > expect to map them to xs{min,max}dp than others). > >> >> I still think we should get RTL codes for this, to have access to proper >> floating point min/max semantics always and everywhere. "fmin" and >> "fmax" seem to be good names :-) > > It would be good, especially if we have observed some uses of these bifs > and further opportunities around them. :) > Shall we submit a PR to add fmin/fmax to RTL codes? > BR, > Kewen
Hi! On Thu, Sep 22, 2022 at 05:59:07PM +0800, HAO CHEN GUI wrote: > >> I still think we should get RTL codes for this, to have access to proper > >> floating point min/max semantics always and everywhere. "fmin" and > >> "fmax" seem to be good names :-) > > > > It would be good, especially if we have observed some uses of these bifs > > and further opportunities around them. :) > > > Shall we submit a PR to add fmin/fmax to RTL codes? Yes, please do. If we have fmin/fmax RTL codes that describe the standard semantics, we can generate code for that with -ffast-math as well, since the code generated is optimal in either case; it's just the *generic* optimisations that fall behind. Segher
Hi! On Thu, Sep 22, 2022 at 10:28:23AM +0800, Kewen.Lin wrote: > on 2022/9/22 05:56, Segher Boessenkool wrote: > > On Fri, Jun 24, 2022 at 10:02:19AM +0800, HAO CHEN GUI wrote: > > In the other direction I am worried that the unspecs will degrade > > performance (relative to smin/smax) when -ffast-math *is* active (and > > this new builtin code and pattern doesn't blow up). > > For fmin/fmax it would be fine, since they are transformed to {MAX,MIN} > EXPR in middle end, and yes, it can degrade for the bifs, although IMHO > the previous expansion to smin/smax contradicts with the bif names (users > expect to map them to xs{min,max}dp than others). But builtins *never* say to generate any particular instruction. They say to generate code that implements certain functionality. For many builtins this does of course boil down to specific instructions, but even then it could be optimised away completely or replace with something more specific if things can be folded or such. > > I still think we should get RTL codes for this, to have access to proper > > floating point min/max semantics always and everywhere. "fmin" and > > "fmax" seem to be good names :-) > > It would be good, especially if we have observed some uses of these bifs > and further opportunities around them. :) Currently we only have smin/smax for float, and those are not valid for NaNs, or when the sign of zeros is relevant. On the other hand the semantics of fmin/fmax are settled and in most standards nowadays. So it is time we did this I would say :-) Segher
Hi Segher, on 2022/9/22 22:05, Segher Boessenkool wrote: > Hi! > > On Thu, Sep 22, 2022 at 10:28:23AM +0800, Kewen.Lin wrote: >> on 2022/9/22 05:56, Segher Boessenkool wrote: >>> On Fri, Jun 24, 2022 at 10:02:19AM +0800, HAO CHEN GUI wrote: >>> In the other direction I am worried that the unspecs will degrade >>> performance (relative to smin/smax) when -ffast-math *is* active (and >>> this new builtin code and pattern doesn't blow up). >> >> For fmin/fmax it would be fine, since they are transformed to {MAX,MIN} >> EXPR in middle end, and yes, it can degrade for the bifs, although IMHO >> the previous expansion to smin/smax contradicts with the bif names (users >> expect to map them to xs{min,max}dp than others). > > But builtins *never* say to generate any particular instruction. They > say to generate code that implements certain functionality. For many > builtins this does of course boil down to specific instructions, but > even then it could be optimised away completely or replace with > something more specific if things can be folded or such. ah, your explanation refreshed my mind, thanks! Previously I thought the bifs with specific mnemonic as part of their names should be used to generate specific instructions, it's to save users' efforts using inline-asm, if we want them to represent the generic functionality (not bind with specific), we can use some generic names instead. As your explanation, binding at fast-math isn't needed, then I think Haochen's patch v7 with gimple folding can avoid the concern on degradation at fast-math (still smax/smin), nice. :) BR, Kewen
diff --git a/gcc/config/rs6000/rs6000-builtins.def b/gcc/config/rs6000/rs6000-builtins.def index f4a9f24bcc5..8b735493b40 100644 --- a/gcc/config/rs6000/rs6000-builtins.def +++ b/gcc/config/rs6000/rs6000-builtins.def @@ -1613,10 +1613,10 @@ XSCVSPDP vsx_xscvspdp {} const double __builtin_vsx_xsmaxdp (double, double); - XSMAXDP smaxdf3 {} + XSMAXDP fmaxdf3 {} const double __builtin_vsx_xsmindp (double, double); - XSMINDP smindf3 {} + XSMINDP fmindf3 {} const double __builtin_vsx_xsrdpi (double); XSRDPI vsx_xsrdpi {} diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md index bf85baa5370..ae0dd98f0f9 100644 --- a/gcc/config/rs6000/rs6000.md +++ b/gcc/config/rs6000/rs6000.md @@ -158,6 +158,8 @@ (define_c_enum "unspec" UNSPEC_HASHCHK UNSPEC_XXSPLTIDP_CONST UNSPEC_XXSPLTIW_CONST + UNSPEC_FMAX + UNSPEC_FMIN ]) ;; @@ -5341,6 +5343,22 @@ (define_insn_and_split "*s<minmax><mode>3_fpr" DONE; }) + +(define_int_iterator FMINMAX [UNSPEC_FMAX UNSPEC_FMIN]) + +(define_int_attr minmax_op [(UNSPEC_FMAX "max") + (UNSPEC_FMIN "min")]) + +(define_insn "f<minmax_op><mode>3" + [(set (match_operand:SFDF 0 "vsx_register_operand" "=wa") + (unspec:SFDF [(match_operand:SFDF 1 "vsx_register_operand" "wa") + (match_operand:SFDF 2 "vsx_register_operand" "wa")] + FMINMAX))] + "TARGET_VSX && !flag_finite_math_only" + "xs<minmax_op>dp %x0,%x1,%x2" + [(set_attr "type" "fp")] +) + (define_expand "mov<mode>cc" [(set (match_operand:GPR 0 "gpc_reg_operand") (if_then_else:GPR (match_operand 1 "comparison_operator") diff --git a/gcc/testsuite/gcc.target/powerpc/pr103605.c b/gcc/testsuite/gcc.target/powerpc/pr103605.c new file mode 100644 index 00000000000..1c938d40e61 --- /dev/null +++ b/gcc/testsuite/gcc.target/powerpc/pr103605.c @@ -0,0 +1,37 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target powerpc_vsx_ok } */ +/* { dg-options "-O2 -mvsx" } */ +/* { dg-final { scan-assembler-times {\mxsmaxdp\M} 3 } } */ +/* { dg-final { scan-assembler-times {\mxsmindp\M} 3 } } */ + +#include <math.h> + +double test1 (double d0, double d1) +{ + return fmin (d0, d1); +} + +float test2 (float d0, float d1) +{ + return fmin (d0, d1); +} + +double test3 (double d0, double d1) +{ + return fmax (d0, d1); +} + +float test4 (float d0, float d1) +{ + return fmax (d0, d1); +} + +double test5 (double d0, double d1) +{ + return __builtin_vsx_xsmindp (d0, d1); +} + +double test6 (double d0, double d1) +{ + return __builtin_vsx_xsmaxdp (d0, d1); +}