Message ID | 20240712211247.106814-1-vineetg@rivosinc.com |
---|---|
State | New |
Headers | show |
Series | [v3] RISC-V: use fclass insns to implement isfinite, isnormal and isinf builtins | expand |
On 7/12/24 3:12 PM, Vineet Gupta wrote: > Changes since v2: > - fclass define_insn tightened to check op0 mode "X" with additional > expander w/o mode for callers. > - builtins expander explicit mode check and FAIL if mode not appropriate. > - subreg promoted handling to elide potential extension of ret val. > - Added isinf builtin with bimodal retun value as others. > > Changes since v1: > - Removed UNSPEC_{INFINITE,ISNORMAL} > - Don't hardcode SI in patterns, try to keep X to avoid potential > sign extension pitfalls. Implementation wise requires skipping > :MODE specifier in match_operand which is flagged as missing mode > warning. > --- > > Currently thsse builtins use float compare instructions which require > FP flags to be save/restore around them. > Our perf team complained this could be costly in uarch. > RV Base ISA already has FCLASS.{d,s,h} instruction to compare/identify FP > values w/o disturbing FP exception flags. > > Coincidently, upstream very recently got support for the corresponding > optabs. So this just requires wiring up in the backend. > > Tested for rv64, one additioal failure g++.dg/opt/pr107569.C needs > upstream ranger fix for the new optab. > > gcc/ChangeLog: > * config/riscv/riscv.md: Add UNSPEC_FCLASS. > define_insn and define_expand for fclass. > define_expand for isfinite, isnormal, isinf. > > gcc/testsuite/ChangeLog: > * gcc.target/riscv/fclass.c: New tests. > > Signed-off-by: Vineet Gupta <vineetg@rivosinc.com> > --- > gcc/config/riscv/riscv.md | 134 ++++++++++++++++++++++++ > gcc/testsuite/gcc.target/riscv/fclass.c | 38 +++++++ > 2 files changed, 172 insertions(+) > create mode 100644 gcc/testsuite/gcc.target/riscv/fclass.c > > diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md > index ff37125e3f28..c67e5129753a 100644 > --- a/gcc/config/riscv/riscv.md > +++ b/gcc/config/riscv/riscv.md > @@ -68,6 +68,7 @@ > UNSPEC_FMAX > UNSPEC_FMINM > UNSPEC_FMAXM > + UNSPEC_FCLASS > > ;; Stack tie > UNSPEC_TIE > @@ -3436,6 +3437,139 @@ > (set_attr "mode" "<UNITMODE>") > (set (attr "length") (const_int 16))]) > > +;; fclass instruction output bitmap > +;; 0 negative infinity > +;; 1 negative normal number. > +;; 2 negative subnormal number. > +;; 3 -0 > +;; 4 +0 > +;; 5 positive subnormal number. > +;; 6 positive normal number. > +;; 7 positive infinity > +;; 8 signaling NaN. > +;; 9 quiet NaN > + > +(define_insn "*fclass<ANYF:mode><X:mode>" > + [(set (match_operand:X 0 "register_operand" "=r") > + (unspec [(match_operand:ANYF 1 "register_operand" " f")] > + UNSPEC_FCLASS))] > + "TARGET_HARD_FLOAT" > + "fclass.<fmt>\t%0,%1"; > + [(set_attr "type" "fcmp") > + (set_attr "mode" "<UNITMODE>")]) > + > +;; Implementation note: > +;; Indirection via the expander needed due to md syntax limitations. > +;; define_insn needs tighter mode check (X for operand0) requiring <X:mode> > +;; in name. This forces callers like isfinite to specify mode but can't due > +;; to their own standard pattern name requirement. The additional expander > +;; with matching RTL template (but elided mode check) allows callers to use > +;; expander's name with actual asm coming from stricter define_insn. > + > +(define_expand "fclass<ANYF:mode>" > + [(set (match_operand 0 "register_operand" "=r") > + (unspec [(match_operand:ANYF 1 "register_operand" " f")] > + UNSPEC_FCLASS))] > + "TARGET_HARD_FLOAT") Note that for fclass, which isn't a standard pattern name, you have additional text in the same -- say for example a mode. It's just those standard names defined in md.texi where we have naming restrictions. The net is you can avoid the fclass expander entirely. So you can drop the '*' in the fclass pattern. Then call it as gen_fclass<ANYF:mode>di or gen_fclass<ANYF:mode>si in the is* expanders. > + /* Allow SI for rv32/rv64 and DI for rv64 > + Explicit mode check (vs. specfying modes in RTL template match_operand) > + due to syntax limitations. > + - Specifying X would cause multiple definitions > + - And to disambuguate that requires adding <X:mode> to name which > + no longer matches the "standard pattern name". */ > + if (GET_MODE (operands[0]) != SImode > + && GET_MODE (operands[0]) != word_mode) > + FAIL; > + > + rtx t = gen_reg_rtx (word_mode); > + rtx t_op0 = gen_reg_rtx (word_mode); > + > + emit_insn (gen_fclass<ANYF:mode> (t, operands[1])); > + riscv_emit_binary (AND, t, t, GEN_INT (0x7e)); > + rtx cmp = gen_rtx_NE (word_mode, t, const0_rtx); > + emit_insn (gen_cstore<mode>4 (t_op0, cmp, t, const0_rtx)); > + > + if (TARGET_64BIT) > + { > + t_op0 = gen_lowpart (SImode, t_op0); > + SUBREG_PROMOTED_VAR_P (t_op0) = 1; > + SUBREG_PROMOTED_SET (t_op0, SRP_SIGNED); > + } > + > + emit_move_insn (operands[0], t_op0); > + DONE; > +}) So this is repeated nearly verbatim in all 3 is* expanders. Seems ripe for refactoring where you pass in the magic constant (which is all that seems to change across those code fragments). Hate to ask for another spin, but the refactoring seems like its worth doing to avoid the duplication. And if we're going to spin again, might as well drop the fclass expander that we really don't need. Jeff
On 7/12/24 14:52, Jeff Law wrote: >> +(define_insn "*fclass<ANYF:mode><X:mode>" >> + [(set (match_operand:X 0 "register_operand" "=r") >> + (unspec [(match_operand:ANYF 1 "register_operand" " f")] >> + UNSPEC_FCLASS))] >> + "TARGET_HARD_FLOAT" >> + "fclass.<fmt>\t%0,%1"; >> + [(set_attr "type" "fcmp") >> + (set_attr "mode" "<UNITMODE>")]) >> + >> +;; Implementation note: >> +;; Indirection via the expander needed due to md syntax limitations. >> +;; define_insn needs tighter mode check (X for operand0) requiring <X:mode> >> +;; in name. This forces callers like isfinite to specify mode but can't due >> +;; to their own standard pattern name requirement. The additional expander >> +;; with matching RTL template (but elided mode check) allows callers to use >> +;; expander's name with actual asm coming from stricter define_insn. >> + >> +(define_expand "fclass<ANYF:mode>" >> + [(set (match_operand 0 "register_operand" "=r") >> + (unspec [(match_operand:ANYF 1 "register_operand" " f")] >> + UNSPEC_FCLASS))] >> + "TARGET_HARD_FLOAT") > Note that for fclass, which isn't a standard pattern name, you have > additional text in the same -- say for example a mode. It's just those > standard names defined in md.texi where we have naming restrictions. > > The net is you can avoid the fclass expander entirely. So you can drop > the '*' in the fclass pattern. Note to myself (after verifying): - dropping of '*' is also needed to be able to call them explicitly from other expanders. - '*' prefix patterns are purely for internal recog() > Then call it as > gen_fclass<ANYF:mode>di or gen_fclass<ANYF:mode>si in the is* expanders. Meaning this requires callers to be if (TARGET_64BIT) emit_insn (gen_fclass<ANYF:mode>di (t, operands[1])); else emit_insn (gen_fclass<ANYF:mode>si (t, operands[1])); since they can't do the following due to their own Std pattern name reqmt lest redefinition issue. emit_insn (gen_fclass<ANYF:mode><X:mode> (t, operands[1])); > >> + /* Allow SI for rv32/rv64 and DI for rv64 >> + Explicit mode check (vs. specfying modes in RTL template match_operand) >> + due to syntax limitations. >> + - Specifying X would cause multiple definitions >> + - And to disambuguate that requires adding <X:mode> to name which >> + no longer matches the "standard pattern name". */ >> + if (GET_MODE (operands[0]) != SImode >> + && GET_MODE (operands[0]) != word_mode) >> + FAIL; >> + >> + rtx t = gen_reg_rtx (word_mode); >> + rtx t_op0 = gen_reg_rtx (word_mode); >> + >> + emit_insn (gen_fclass<ANYF:mode> (t, operands[1])); >> + riscv_emit_binary (AND, t, t, GEN_INT (0x7e)); >> + rtx cmp = gen_rtx_NE (word_mode, t, const0_rtx); >> + emit_insn (gen_cstore<mode>4 (t_op0, cmp, t, const0_rtx)); >> + >> + if (TARGET_64BIT) >> + { >> + t_op0 = gen_lowpart (SImode, t_op0); >> + SUBREG_PROMOTED_VAR_P (t_op0) = 1; >> + SUBREG_PROMOTED_SET (t_op0, SRP_SIGNED); >> + } >> + >> + emit_move_insn (operands[0], t_op0); >> + DONE; >> +}) > So this is repeated nearly verbatim in all 3 is* expanders. Seems ripe > for refactoring where you pass in the magic constant (which is all that > seems to change across those code fragments). Yeah I should have thought about it myself. The prior tri-modal return was different but now clearly this can all be combined. I was hoping to use int iterators, but it seems this will have be a helper in riscv.cc which is called from 3 explicit expanders. > Hate to ask for another spin, but the refactoring seems like its worth > doing to avoid the duplication. Not a problem, I'll eventually get the md syntax :-) > And if we're going to spin again, might > as well drop the fclass expander that we really don't need. Of course. -Vineet
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md index ff37125e3f28..c67e5129753a 100644 --- a/gcc/config/riscv/riscv.md +++ b/gcc/config/riscv/riscv.md @@ -68,6 +68,7 @@ UNSPEC_FMAX UNSPEC_FMINM UNSPEC_FMAXM + UNSPEC_FCLASS ;; Stack tie UNSPEC_TIE @@ -3436,6 +3437,139 @@ (set_attr "mode" "<UNITMODE>") (set (attr "length") (const_int 16))]) +;; fclass instruction output bitmap +;; 0 negative infinity +;; 1 negative normal number. +;; 2 negative subnormal number. +;; 3 -0 +;; 4 +0 +;; 5 positive subnormal number. +;; 6 positive normal number. +;; 7 positive infinity +;; 8 signaling NaN. +;; 9 quiet NaN + +(define_insn "*fclass<ANYF:mode><X:mode>" + [(set (match_operand:X 0 "register_operand" "=r") + (unspec [(match_operand:ANYF 1 "register_operand" " f")] + UNSPEC_FCLASS))] + "TARGET_HARD_FLOAT" + "fclass.<fmt>\t%0,%1"; + [(set_attr "type" "fcmp") + (set_attr "mode" "<UNITMODE>")]) + +;; Implementation note: +;; Indirection via the expander needed due to md syntax limitations. +;; define_insn needs tighter mode check (X for operand0) requiring <X:mode> +;; in name. This forces callers like isfinite to specify mode but can't due +;; to their own standard pattern name requirement. The additional expander +;; with matching RTL template (but elided mode check) allows callers to use +;; expander's name with actual asm coming from stricter define_insn. + +(define_expand "fclass<ANYF:mode>" + [(set (match_operand 0 "register_operand" "=r") + (unspec [(match_operand:ANYF 1 "register_operand" " f")] + UNSPEC_FCLASS))] + "TARGET_HARD_FLOAT") + +;; Implements optab for isfinite + +(define_expand "isfinite<ANYF:mode>2" + [(set (match_operand 0 "register_operand" "=r") + (match_operand:ANYF 1 "register_operand" " f"))] + "TARGET_HARD_FLOAT" +{ + /* Allow SI for rv32/rv64 and DI for rv64 + Explicit mode check (vs. specfying modes in RTL template match_operand) + due to syntax limitations. + - Specifying X would cause multiple definitions + - And to disambuguate that requires adding <X:mode> to name which + no longer matches the "standard pattern name". */ + if (GET_MODE (operands[0]) != SImode + && GET_MODE (operands[0]) != word_mode) + FAIL; + + rtx t = gen_reg_rtx (word_mode); + rtx t_op0 = gen_reg_rtx (word_mode); + + emit_insn (gen_fclass<ANYF:mode> (t, operands[1])); + riscv_emit_binary (AND, t, t, GEN_INT (0x7e)); + rtx cmp = gen_rtx_NE (word_mode, t, const0_rtx); + emit_insn (gen_cstore<mode>4 (t_op0, cmp, t, const0_rtx)); + + if (TARGET_64BIT) + { + t_op0 = gen_lowpart (SImode, t_op0); + SUBREG_PROMOTED_VAR_P (t_op0) = 1; + SUBREG_PROMOTED_SET (t_op0, SRP_SIGNED); + } + + emit_move_insn (operands[0], t_op0); + DONE; +}) + +;; Implements optab for isnormal + +(define_expand "isnormal<ANYF:mode>2" + [(set (match_operand 0 "register_operand" "=r") + (match_operand:ANYF 1 "register_operand" " f"))] + "TARGET_HARD_FLOAT" +{ + if (GET_MODE (operands[0]) != SImode + && GET_MODE (operands[0]) != word_mode) + FAIL; + + rtx t = gen_reg_rtx (word_mode); + rtx t_op0 = gen_reg_rtx (word_mode); + + emit_insn (gen_fclass<ANYF:mode> (t, operands[1])); + riscv_emit_binary (AND, t, t, GEN_INT (0x42)); + rtx cmp = gen_rtx_NE (word_mode, t, const0_rtx); + emit_insn (gen_cstore<mode>4 (t_op0, cmp, t, const0_rtx)); + + if (TARGET_64BIT) + { + t_op0 = gen_lowpart (SImode, t_op0); + SUBREG_PROMOTED_VAR_P (t_op0) = 1; + SUBREG_PROMOTED_SET (t_op0, SRP_SIGNED); + } + + emit_move_insn (operands[0], t_op0); + DONE; +}) + +;; Implements optab for isinf +;; Note: glibc man page states tri-modal (+ve inf ? 1 : -ve inf ? -1 : 0) +;; However gcc testsuite tg-test.h expect 1 for -ve. + +(define_expand "isinf<ANYF:mode>2" + [(set (match_operand 0 "register_operand" "=r") + (match_operand:ANYF 1 "register_operand" " f"))] + "TARGET_HARD_FLOAT" +{ + if (GET_MODE (operands[0]) != SImode + && GET_MODE (operands[0]) != word_mode) + FAIL; + + rtx t = gen_reg_rtx (word_mode); + rtx t_op0 = gen_reg_rtx (word_mode); + + emit_insn (gen_fclass<ANYF:mode> (t, operands[1])); + riscv_emit_binary (AND, t, t, GEN_INT (0x81)); + rtx cmp = gen_rtx_NE (word_mode, t, const0_rtx); + emit_insn (gen_cstore<mode>4 (t_op0, cmp, t, const0_rtx)); + + if (TARGET_64BIT) + { + t_op0 = gen_lowpart (SImode, t_op0); + SUBREG_PROMOTED_VAR_P (t_op0) = 1; + SUBREG_PROMOTED_SET (t_op0, SRP_SIGNED); + } + + emit_move_insn (operands[0], t_op0); + DONE; +}) + (define_insn "*seq_zero_<X:mode><GPR:mode>" [(set (match_operand:GPR 0 "register_operand" "=r") (eq:GPR (match_operand:X 1 "register_operand" " r") diff --git a/gcc/testsuite/gcc.target/riscv/fclass.c b/gcc/testsuite/gcc.target/riscv/fclass.c new file mode 100644 index 000000000000..ea0f173ecf4b --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/fclass.c @@ -0,0 +1,38 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target hard_float } */ +/* { dg-options "-march=rv64gc -mabi=lp64d -ftrapping-math" { target { rv64 } } } */ +/* { dg-options "-march=rv32gc -mabi=ilp32d -ftrapping-math" { target { rv32 } } } */ + +int d_isfinite(double a) +{ + return __builtin_isfinite(a); +} + +int d_isnormal(double a) +{ + return __builtin_isnormal(a); +} + +int d_isinf(double a) +{ + return __builtin_isinf(a); +} + +int f_isfinite(float a) +{ + return __builtin_isfinite(a); +} + +int f_isnormal(float a) +{ + return __builtin_isnormal(a); +} + +int f_isinf(float a) +{ + return __builtin_isinf(a); +} + +/* { dg-final { scan-assembler-not {\mfrflags} } } */ +/* { dg-final { scan-assembler-not {\mfsflags} } } */ +/* { dg-final { scan-assembler-times {\tfclass} 6 } } */
Changes since v2: - fclass define_insn tightened to check op0 mode "X" with additional expander w/o mode for callers. - builtins expander explicit mode check and FAIL if mode not appropriate. - subreg promoted handling to elide potential extension of ret val. - Added isinf builtin with bimodal retun value as others. Changes since v1: - Removed UNSPEC_{INFINITE,ISNORMAL} - Don't hardcode SI in patterns, try to keep X to avoid potential sign extension pitfalls. Implementation wise requires skipping :MODE specifier in match_operand which is flagged as missing mode warning. --- Currently thsse builtins use float compare instructions which require FP flags to be save/restore around them. Our perf team complained this could be costly in uarch. RV Base ISA already has FCLASS.{d,s,h} instruction to compare/identify FP values w/o disturbing FP exception flags. Coincidently, upstream very recently got support for the corresponding optabs. So this just requires wiring up in the backend. Tested for rv64, one additioal failure g++.dg/opt/pr107569.C needs upstream ranger fix for the new optab. gcc/ChangeLog: * config/riscv/riscv.md: Add UNSPEC_FCLASS. define_insn and define_expand for fclass. define_expand for isfinite, isnormal, isinf. gcc/testsuite/ChangeLog: * gcc.target/riscv/fclass.c: New tests. Signed-off-by: Vineet Gupta <vineetg@rivosinc.com> --- gcc/config/riscv/riscv.md | 134 ++++++++++++++++++++++++ gcc/testsuite/gcc.target/riscv/fclass.c | 38 +++++++ 2 files changed, 172 insertions(+) create mode 100644 gcc/testsuite/gcc.target/riscv/fclass.c