diff mbox series

[RFC/PATCH] libgcc: sh: Use soft-fp for non-hosted SH3/SH4

Message ID 20240703100855.3855337-1-sebastien.michelland@lcis.grenoble-inp.fr
State New

Commit Message

Sébastien Michelland July 3, 2024, 9:59 a.m. UTC
libgcc's fp-bit.c is quite slow and most modern/developed architectures
have switched to using the soft-fp library. This patch does so for
free-standing/unknown-OS SH3/SH4 builds, using soft-fp's default parameters
for the most part, most notably no exceptions.

A quick run of Whetstone (built with OpenLibm) on an SH4 machine shows
about a 3x speedup (~320 -> 1050 Kwhets/s).

I'm sending this as RFC because I'm quite unsure about testing. I built
the compiler and ran the benchmark, but I don't know if GCC has a test
for soft-fp correctness and whether I can run that in my non-hosted
environment. Any advice?

Cheers,
Sébastien

libgcc/ChangeLog:

        * config.host: Use soft-fp library for non-hosted SH3/SH4
        instead of fdpbit.
        * config/sh/sfp-machine.h: New.

Signed-off-by: Sébastien Michelland <sebastien.michelland@lcis.grenoble-inp.fr>
---
 libgcc/config.host             | 10 +++-
 libgcc/config/sh/sfp-machine.h | 83 ++++++++++++++++++++++++++++++++++
 2 files changed, 92 insertions(+), 1 deletion(-)
 create mode 100644 libgcc/config/sh/sfp-machine.h

Comments

Jeff Law July 3, 2024, 3:59 p.m. UTC | #1
On 7/3/24 3:59 AM, Sébastien Michelland wrote:
> libgcc's fp-bit.c is quite slow and most modern/developed architectures
> have switched to using the soft-fp library. This patch does so for
> free-standing/unknown-OS SH3/SH4 builds, using soft-fp's default parameters
> for the most part, most notably no exceptions.
> 
> A quick run of Whetstone (built with OpenLibm) on an SH4 machine shows
> about x3 speedup (~320 -> 1050 Kwhets/s).
> 
> I'm sending this as RFC because I'm quite unsure about testing. I built
> the compiler and ran the benchmark, but I don't know if GCC has a test
> for soft-fp correctness and whether I can run that in my non-hosted
> environment. Any advice?
> 
> Cheers,
> Sébastien
> 
> libgcc/ChangeLog:
> 
>          * config.host: Use soft-fp library for non-hosted SH3/SH4
>          instead of fpdbit.
>          * config/sh/sfp-machine.h: New.
I'd really like to hear from Oleg on this, though given we're using the 
soft-fp library on other targets it seems reasonable at a high level.

As far as testing, the GCC testsuite has some FP components which would 
implicitly test soft fp on any target that doesn't have hardware 
floating point.
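
For a quick sanity check before running the full testsuite, a small C sketch along these lines can exercise the emulation routines directly. It is only an illustration: the function name softfp_smoke_test is made up here, and it assumes the code is built for a no-FPU multilib (for example -m2) so that each operation becomes a libgcc call rather than an FPU instruction.

```c
#include <assert.h>

/* Hypothetical smoke test; returns 0 if every check passed.
   The volatile qualifiers keep the compiler from folding the
   arithmetic at compile time.  */
int
softfp_smoke_test (void)
{
  volatile double a = 1.5, b = 0.25, zero = 0.0;
  volatile float f = 3.0f;
  volatile int i = 7;

  assert (a + b == 1.75);           /* __adddf3 */
  assert (a - b == 1.25);           /* __subdf3 */
  assert (a * b == 0.375);          /* __muldf3 */
  assert (a / b == 6.0);            /* __divdf3 */
  assert ((double) f == 3.0);       /* __extendsfdf2 */
  assert ((float) a == 1.5f);       /* __truncdfsf2 */
  assert ((double) i == 7.0);       /* __floatsidf */
  assert ((int) (a * 5.0) == 7);    /* __fixdfsi: 7.5 truncates to 7 */
  assert (a > b);                   /* __gtdf2 */

  /* 0.0 / 0.0 must produce a NaN, which compares unequal to itself.  */
  double nan = zero / zero;
  assert (nan != nan);

  return 0;
}
```

All operands are exactly representable in binary, so the comparisons do not depend on rounding. On a target with an FPU the same code still passes, but it then checks the hardware rather than soft-fp.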



Jeff
Sébastien Michelland July 3, 2024, 5:28 p.m. UTC | #2
On 2024-07-03 17:59, Jeff Law wrote:
> On 7/3/24 3:59 AM, Sébastien Michelland wrote:
>> libgcc's fp-bit.c is quite slow and most modern/developed architectures
>> have switched to using the soft-fp library. This patch does so for
>> free-standing/unknown-OS SH3/SH4 builds, using soft-fp's default 
>> parameters
>> for the most part, most notably no exceptions.
>>
>> A quick run of Whetstone (built with OpenLibm) on an SH4 machine shows
>> about x3 speedup (~320 -> 1050 Kwhets/s).
>>
>> I'm sending this as RFC because I'm quite unsure about testing. I built
>> the compiler and ran the benchmark, but I don't know if GCC has a test
>> for soft-fp correctness and whether I can run that in my non-hosted
>> environment. Any advice?
>>
>> Cheers,
>> Sébastien
>>
>> libgcc/ChangeLog:
>>
>>          * config.host: Use soft-fp library for non-hosted SH3/SH4
>>          instead of fpdbit.
>>          * config/sh/sfp-machine.h: New.
> I'd really like to hear from Oleg on this, though given we're using the 
> soft-fp library on other targets it seems reasonable at a high level.
> 
> As far as testing, the GCC testsuite has some FP components which would 
> implicitly test soft fp on any target that doesn't have hardware 
> floating point.

Thank you. I went this route, following the guide [1] and the 
instructions for cross-compiling [2] before hitting "Newlib does not 
support CPU sh3eb" which I should have seen coming.

There are plenty of random ports lying around but just grabbing one 
doesn't feel right (and I don't have a canonical one to go to as I 
usually run a custom libc for... mostly bad reasons).

Deferring maybe again to the few SH users... how do you usually do it?

Sébastien

[1] https://gcc.gnu.org/install/test.html
[2] https://gcc.gnu.org/simtest-howto.html
Oleg Endo July 4, 2024, 12:21 a.m. UTC | #3
Hi!

On Wed, 2024-07-03 at 19:28 +0200, Sébastien Michelland wrote:
> On 2024-07-03 17:59, Jeff Law wrote:
> > On 7/3/24 3:59 AM, Sébastien Michelland wrote:
> > > libgcc's fp-bit.c is quite slow and most modern/developed architectures
> > > have switched to using the soft-fp library. This patch does so for
> > > free-standing/unknown-OS SH3/SH4 builds, using soft-fp's default 
> > > parameters
> > > for the most part, most notably no exceptions.
> > > 
> > > A quick run of Whetstone (built with OpenLibm) on an SH4 machine shows
> > > about x3 speedup (~320 -> 1050 Kwhets/s).
> > > 
> > > I'm sending this as RFC because I'm quite unsure about testing. I built
> > > the compiler and ran the benchmark, but I don't know if GCC has a test
> > > for soft-fp correctness and whether I can run that in my non-hosted
> > > environment. Any advice?
> > > 
> > > Cheers,
> > > Sébastien
> > > 
> > > libgcc/ChangeLog:
> > > 
> > >          * config.host: Use soft-fp library for non-hosted SH3/SH4
> > >          instead of fpdbit.
> > >          * config/sh/sfp-machine.h: New.

> > I'd really like to hear from Oleg on this, though given we're using the 
> > soft-fp library on other targets it seems reasonable at a high level.

I don't understand why this is being limited to SH3 and SH4 only?
Almost all SH4 systems out there have an FPU (unless special configurations
are used).  So I'd say if switching to soft-fp, then for SH-anything, not
just SH3/SH4.

If it yields some improvements for some users, I'm all for it.

> > As far as testing, the GCC testsuite has some FP components which would 
> > implicitly test soft fp on any target that doesn't have hardware 
> > floating point.
> 
> Thank you. I went this route, following the guide [1] and the 
> instructions for cross-compiling [2] before hitting "Newlib does not 
> support CPU sh3eb" which I should have seen coming.
> 
> There are plenty of random ports lying around but just grabbing one 
> doesn't feel right (and I don't have a canonical one to go to as I 
> usually run a custom libc for... mostly bad reasons).
> 
> Deferring maybe again to the few SH users... how do you usually do it?
> 
> 

I think it would make sense to test it using sh-sim on SH2 big-endian and
little endian at least, as that doesn't have an FPU and hence would run
tests utilizing soft-fp.

After building the toolchain for --target=sh-elf, you can use this to run
the testsuite in the simulator:

make -k check RUNTESTFLAGS="--target_board=sh-sim\{-m2/-ml,-m2/-mb}"

(add make -j parameter according to your needs -- it will be slow)

Let me know if you have any further questions.

Best regards,
Oleg Endo
Sébastien Michelland July 5, 2024, 7:28 a.m. UTC | #4
Hi Oleg!

> I don't understand why this is being limited to SH3 and SH4 only?
> Almost all SH4 systems out there have an FPU (unless special configurations
> are used).  So I'd say if switching to soft-fp, then for SH-anything, not
> just SH3/SH4.
> 
> If it yields some improvements for some users, I'm all for it.

Yeah I just defaulted to SH3/SH4 conservatively because that's the only 
hardware I have. (My main platform also happens to be one of these SH4 
without an FPU, the SH4AL-DSP.)

Once this is tested/validated on simulator, I'll happily simplify the 
patch to apply to all SH.

> I think it would make sense to test it using sh-sim on SH2 big-endian and
> little endian at least, as that doesn't have an FPU and hence would run
> tests utilizing soft-fp.
> 
> After building the toolchain for --target=sh-elf, you can use this to run
> the testsuite in the simulator:
> 
> make -k check RUNTESTFLAGS="--target_board=sh-sim\{-m2/-ml,-m2/-mb}"
> 
> (add make -j parameter according to you needs -- it will be slow)

Alright, it might take a little bit.

Building the combined tree of gcc/binutils/newlib masters (again 
following [1]) gives me an ICE in libstdc++-v3/src/libbacktrace, 
irrespective of my libgcc change:

---
during RTL pass: final
elf.c: In function ‘elf_zstd_decompress’:
elf.c:4999:1: internal compiler error: in output_296, at 
config/sh/sh.md:8408
  4999 | }
       | ^
0x1c8765e internal_error(char const*, ...)
	../../combined/gcc/diagnostic-global-context.cc:491
0x881269 fancy_abort(char const*, int, char const*)
	../../combined/gcc/diagnostic.cc:1725
0x83b73b output_296
	../../combined/gcc/config/sh/sh.md:8408
0x83b73b output_296
	../../combined/gcc/config/sh/sh.md:8063
0xb783c2 final_scan_insn_1
	../../combined/gcc/final.cc:2773
0xb78938 final_scan_insn(rtx_insn*, _IO_FILE*, int, int, int*)
	../../combined/gcc/final.cc:2886
0xb78b5f final_1
	../../combined/gcc/final.cc:1977
0xb796a8 rest_of_handle_final
	../../combined/gcc/final.cc:4239
0xb796a8 execute
	../../combined/gcc/final.cc:4317
Please submit a full bug report, with preprocessed source (by using 
-freport-bug).
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.
make[9]: *** [Makefile:628: std_stacktrace-elf.lo] Error 1
make[9]: *** Waiting for unfinished jobs....
make[9]: Leaving directory 
'/home/el/Programs/sh-elf-gcc/build-combined2/sh-elf/m2a/libstdc++-v3/src/libbacktrace'
---

My configure, for reference (--disable-source-highlight came up from a 
configure error earlier):

../combined/configure                       \
     --prefix="$PREFIX"                      \
     --target="sh-elf"                       \
     --enable-languages="c,c++"              \
     --disable-gdb                           \
     --disable-source-highlight

The libbacktrace build in gcc (make all-libbacktrace) works without an 
issue.

I'll have to prepare a bug report (I couldn't find anything related), 
but bisecting on a triplet of repos doesn't sound very fun and I believe 
I do need the newlib to build libstdc++ in a reproducible way.

Any advice before I go on that tangent?

Sébastien

[1] https://gcc.gnu.org/simtest-howto.html
Jeff Law July 6, 2024, 1:35 p.m. UTC | #5
On 7/5/24 1:28 AM, Sébastien Michelland wrote:
> Hi Oleg!
> 
>> I don't understand why this is being limited to SH3 and SH4 only?
>> Almost all SH4 systems out there have an FPU (unless special 
>> configurations
>> are used).  So I'd say if switching to soft-fp, then for SH-anything, not
>> just SH3/SH4.
>>
>> If it yields some improvements for some users, I'm all for it.
> 
> Yeah I just defaulted to SH3/SH4 conservatively because that's the only 
> hardware I have. (My main platform also happens to be one of these SH4 
> without an FPU, the SH4AL-DSP.)
> 
> Once this is tested/validated on simulator, I'll happily simplify the 
> patch to apply to all SH.
> 
>> I think it would make sense to test it using sh-sim on SH2 big-endian and
>> little endian at least, as that doesn't have an FPU and hence would run
>> tests utilizing soft-fp.
>>
>> After building the toolchain for --target=sh-elf, you can use this to run
>> the testsuite in the simulator:
>>
>> make -k check RUNTESTFLAGS="--target_board=sh-sim\{-m2/-ml,-m2/-mb}"
>>
>> (add make -j parameter according to you needs -- it will be slow)
> 
> Alright, it might take a little bit.
> 
> Building the combined tree of gcc/binutils/newlib masters (again 
> following [1]) gives me an ICE in libstdc++v3/src/libbacktrace, 
> irrespective of my libgcc change:
This is almost certainly a poorly written pattern.  I just fixed a bunch 
of these, but not this one.  Essentially a recent change in the generic 
parts of the compiler is exposing some bugs in the SH backend. 
Specifically:

> ;; Store (negated) T bit as all zeros or ones in a reg.  
> ;;      subc    Rn,Rn   ! Rn = Rn - Rn - T; T = T
> ;;      not     Rn,Rn   ! Rn = 0 - Rn
> ;; 
> ;; Note the call to sh_split_treg_set_expr may clobber
> ;; the T reg.  We must express this, even though it's
> ;; not immediately obvious this pattern changes the
> ;; T register.
> (define_insn_and_split "mov_neg_si_t"
>   [(set (match_operand:SI 0 "arith_reg_dest" "=r")
>         (neg:SI (match_operand 1 "treg_set_expr")))
>    (clobber (reg:SI T_REG))] 
>   "TARGET_SH1" 
> {
>   gcc_assert (t_reg_operand (operands[1], VOIDmode));
>   return "subc  %0,%0";
> }
>   "&& can_create_pseudo_p () && !t_reg_operand (operands[1], VOIDmode)"
>   [(const_int 0)]
> {
>   sh_treg_insns ti = sh_split_treg_set_expr (operands[1], curr_insn);
>   emit_insn (gen_mov_neg_si_t (operands[0], get_t_reg_rtx ()));
> 
>   if (ti.remove_trailing_nott ())
>     emit_insn (gen_one_cmplsi2 (operands[0], operands[0]));
> 
>   DONE; 
> }
>   [(set_attr "type" "arith")])


As written, this pattern can match after register allocation is 
complete, when we can no longer create new pseudos (the insn condition 
is just TARGET_SH1, so nothing prevents that).  operands[1] won't 
necessarily be the T register in that case.

The split condition fails because we can't create new pseudos, so it's 
left as-is.  At final assembly time the assertion triggers.

The "&& can_create_pseudo_p ()" part of the split condition should be 
moved into the main condition.  I think that's all that's necessary to 
fix this problem.  It'd probably be best if Oleg went through the 
various define_insn_and_split patterns that utilize can_create_pseudo_p in 
their split condition and evaluated them.
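
To make the failure mode concrete, here is a toy C model of the two conditions. It is a sketch of the reasoning above, not of GCC's recog or split machinery: every name in it is hypothetical, and TARGET_SH1 is assumed to be true throughout.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model; none of these names exist in GCC.  */
struct insn_state
{
  bool can_create_pseudo; /* false once register allocation is complete */
  bool op1_is_t_reg;      /* operands[1] is already the T register */
};

/* Insn condition as written: "TARGET_SH1" (assumed true).  */
static bool
matches_as_written (struct insn_state s)
{
  (void) s;
  return true;
}

/* Insn condition with can_create_pseudo_p () moved into it.  */
static bool
matches_as_suggested (struct insn_state s)
{
  return s.can_create_pseudo;
}

/* Effective split test; it comes out the same in both variants.  */
static bool
splits (struct insn_state s)
{
  return s.can_create_pseudo && !s.op1_is_t_reg;
}

/* True if the insn reaches final unsplit with a non-T operand,
   which is when the gcc_assert in the output code fires.  */
bool
ices_as_written (struct insn_state s)
{
  return matches_as_written (s) && !splits (s) && !s.op1_is_t_reg;
}

bool
ices_as_suggested (struct insn_state s)
{
  return matches_as_suggested (s) && !splits (s) && !s.op1_is_t_reg;
}

/* Walks all four states; returns 0 if the model behaves as described.  */
int
check_model (void)
{
  for (int ccp = 0; ccp <= 1; ccp++)
    for (int t = 0; t <= 1; t++)
      {
        struct insn_state s = { ccp != 0, t != 0 };
        /* As written, only the post-RA, non-T case is bad.  */
        assert (ices_as_written (s) == (!ccp && !t));
        /* With the suggested condition, no state is bad.  */
        assert (!ices_as_suggested (s));
      }
  return 0;
}
```

In this model the only bad state is "matched after register allocation with a non-T operand", and moving the pseudo check into the insn condition removes it because the pattern no longer matches there at all.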

I only fixed the most obvious cases in my change from this morning.  I 
don't typically work on the SH port and for changes which aren't 
obviously correct, Oleg is in a better position to evaluate the proper fix.

jeff
Oleg Endo July 7, 2024, 12:12 a.m. UTC | #6
Hi,

( For some weird reason I keep losing Sebastien's messages ... )

On Sat, 2024-07-06 at 07:35 -0600, Jeff Law wrote:
> 
> On 7/5/24 1:28 AM, Sébastien Michelland wrote:
> > Hi Oleg!
> > 
> > > I don't understand why this is being limited to SH3 and SH4 only?
> > > Almost all SH4 systems out there have an FPU (unless special 
> > > configurations
> > > are used).  So I'd say if switching to soft-fp, then for SH-anything, not
> > > just SH3/SH4.
> > > 
> > > If it yields some improvements for some users, I'm all for it.
> > 
> > Yeah I just defaulted to SH3/SH4 conservatively because that's the only 
> > hardware I have. (My main platform also happens to be one of these SH4 
> > without an FPU, the SH4AL-DSP.)

Oh, wow, especially rare type!

> > 
> > Once this is tested/validated on simulator, I'll happily simplify the 
> > patch to apply to all SH.

The default sh-elf configuration has no multi-libs for SH3 and SH4 variants
without FPU (from what I can see).  So it won't use soft-fp so much during
sim testing.  So please change to soft-fp for sh*, not just SH3/SH4.

> > 
> > > I think it would make sense to test it using sh-sim on SH2 big-endian and
> > > little endian at least, as that doesn't have an FPU and hence would run
> > > tests utilizing soft-fp.
> > > 
> > > After building the toolchain for --target=sh-elf, you can use this to run
> > > the testsuite in the simulator:
> > > 
> > > make -k check RUNTESTFLAGS="--target_board=sh-sim\{-m2/-ml,-m2/-mb}"
> > > 
> > > (add make -j parameter according to you needs -- it will be slow)
> > 
> > Alright, it might take a little bit.
> > 
> > Building the combined tree of gcc/binutils/newlib masters (again 
> > following [1])
> > 

I have never built the toolchain using a combined tree.  Like you said, it's
difficult to debug and so on.  I've only built it separately and never had
any issues with this approach on multiple platforms/targets.

Here's an old proposed change to the simtest instructions to not use
combined trees:

https://gcc.gnu.org/pipermail/gcc-patches/attachments/20140815/fb38918e/attachment.bin


> 

> This is almost certainly a poorly written pattern.  I just fixed a bunch 
> of these, but not this one.  Essentially a recent change in the generic 
> parts of the compiler is exposing some bugs in the SH backend. 

The patterns were written and tested to the best of our knowledge at that
time many years ago.  Nobody thought that we'll get a 2nd combine pass after
RA.  Anyway, I'll have a look at the remaining patterns.

Sebastien, in the meantime you could also try out and test your changes on
the latest GCC 14 branch, which shouldn't have those issues.

Best regards,
Oleg Endo
Sébastien Michelland July 7, 2024, 10:50 a.m. UTC | #7
Hi!

> The default sh-elf configuration has no multi-libs for SH3 and SH4 variants
> without FPU (from what I can see).  So it won't use soft-fp so much during
> sim testing.  So please change to soft-fp for sh*, not just SH3/SH4.

Got it, done that locally, and will update patch once tested.

> Here's an old proposed change to the simtest instructions to not use
> combined trees:
> 
> https://gcc.gnu.org/pipermail/gcc-patches/attachments/20140815/fb38918e/attachment.bin

Thanks for the instructions. Apologies for the back-and-forth as I'm 
pretty new with this infrastructure (I usually do research stuff on LLVM).

The split-tree build goes better, still fails with GCC 15 (as expected, 
though somehow my custom toolchain did build originally) and sort of 
works with GCC 14.

The binutils/gdb repos have been merged since that attachment, and 
while I can build binutils only with --disable-gdb, building gdb (in 
another build folder, reconfiguring from scratch) seems iffy. The global 
CFLAGS/CXXFLAGS to switch to 32-bit affects at least parts of binutils, 
resulting in a broken toolchain due to architecture mixup:

---
% sh-elf-g++ /tmp/test.cc -o /tmp/test
$PREFIX/lib/gcc/sh-elf/14.1.1/../../../../sh-elf/bin/ld: 
$PREFIX/libexec/gcc/sh-elf/14.1.1/liblto_plugin.so: error loading 
plugin: $PREFIX/libexec/gcc/sh-elf/14.1.1/liblto_plugin.so: wrong ELF 
class: ELFCLASS64
---

My first build kept GDB as 64-bit, but running the test binary in 
sh-elf-run gave a Bus error. Even with the 32-bit GDB build, ignoring 
the broken toolchain and running that old binary still gives a Bus error.

For reference, here's my latest attempt.

---
% cd ${TOP}
% git clone git://sourceware.org/git/binutils-gdb.git
% git clone https://sourceware.org/git/newlib-cygwin.git newlib
% ln -sf PATH_TO_MY_GCC_14.1 gcc

% cd ${TOP}/build-binutils
% ../binutils-gdb/configure --target=sh-elf --prefix="$PREFIX" 
--disable-nls --disable-werror --disable-gdb
% make all && make install

% cd ${TOP}/build-gcc
% ../gcc/configure --target=sh-elf --prefix="$PREFIX" 
--enable-languages=c,c++ --disable-nls --disable-werror --with-newlib 
--enable-lto --enable-multilib
% make all-gcc && make install-gcc

% cd ${TOP}/build-newlib
% ../newlib/configure --target=sh-elf --prefix="$PREFIX" --enable-lto 
--enable-multilib
% make all && make install

% cd ${TOP}/build-gcc
% rm -rf *
% ../gcc/configure --target=sh-elf --prefix="$PREFIX" 
--enable-languages=c,c++ --disable-nls --disable-werror --with-newlib 
--enable-lto --enable-multilib
% make all && make install

% cd ${TOP}/build-gdb
% CFLAGS="-O2 -m32 -msse -mfpmath=sse" CXXFLAGS="-O2 -m32 -msse 
-mfpmath=sse" ../binutils-gdb/configure --target=sh-elf 
--prefix="$PREFIX" --enable-interwork --enable-multilib --disable-nls 
--disable-werror
% make all && make install
---

>>> Yeah I just defaulted to SH3/SH4 conservatively because that's the only 
>>> hardware I have. (My main platform also happens to be one of these SH4 
>>> without an FPU, the SH4AL-DSP.)
> 
> Oh, wow, especially rare type!

How active are the main types? Like are there still new products 
designed with these (maybe the J2)?

> The patterns were written and tested to the best of our knowledge at that
> time many years ago.  Nobody thought that we'll get a 2nd combine pass after
> RA.  Anyway, I'll have a look at the remaining patterns.

I'd be interested to learn more about the history of the SH backend, if 
anyone wrote that up somewhere...

Thanks again,
Sébastien
Jeff Law July 7, 2024, 3:14 p.m. UTC | #8
On 7/6/24 6:12 PM, Oleg Endo wrote:

> 
>> This is almost certainly a poorly written pattern.  I just fixed a bunch
>> of these, but not this one.  Essentially a recent change in the generic
>> parts of the compiler is exposing some bugs in the SH backend.
> 
> The patterns were written and tested to the best of our knowledge at that
> time many years ago.  Nobody thought that we'll get a 2nd combine pass after
> RA.  Anyway, I'll have a look at the remaining patterns.
My comment wasn't meant to disparage you or anyone.  Just to note that 
the patterns need fixing as they are incorrect.

We have certainly had cases where the model used by those patterns would 
cause problems in the past.   Hard register cprop, compare elimination 
and others could trigger the same kind of problem.  late-combine is just 
more likely to expose these kinds of latent bugs.

Jeff
Oleg Endo July 8, 2024, 11:52 a.m. UTC | #9
Hi,

> > > > The default sh-elf configuration has no multi-libs for SH3 and SH4 variants
> > > > without FPU (from what I can see).  So it won't use soft-fp so much during
> > > > sim testing.  So please change to soft-fp for sh*, not just SH3/SH4.
> > 
> > Got it, done that locally, and will update patch once tested.
> > 
> > > > Here's an old proposed change to the simtest instructions to not use
> > > > combined trees:
> > > > 
> > > > https://gcc.gnu.org/pipermail/gcc-patches/attachments/20140815/fb38918e/attachment.bin
> > 
> > Thanks for the instructions. Apologies for the back-and-forth as I'm 
> > pretty new with this infrastructure (I usually do research stuff on LLVM).

No need to apologize.  I know this is a tedious and annoying thing to go
through and there is only very little useful information out there.

> > The split-tree build goes better, still fails with GCC 15 (as expected, 
> > though somehow my custom toolchain did build originally) and sort of 
> > works with GCC 14.

> > The binutils/gdb repos have been merged since that attachement, and 
> > while I can build binutils only with --disable-gdb, building gdb (in 
> > another build folder, reconfiguring from scratch) seems iffy. The global 
> > CFLAGS/CXXFLAGS to switch to 32-bit affects at least parts of binutils, 
> > resulting in a broken toolchain due to architecture mixup:

It shouldn't be necessary to build GDB separately or to specify the -m32 flags.
Not sure why you have to do that.

I've just tried the following configure lines:

binutils-gdb (binutils-2_41-release)
<..>/configure --target=sh-elf --prefix=/usr/local --disable-nls --disable-werror --enable-initfini-array

gcc (any version)
<..>/configure --target=sh-elf --prefix=/usr/local --enable-languages=c,c++,lto --disable-nls --disable-werror --with-newlib --enable-lto --enable-multilib --with-system-zlib --disable-libstdcxx-verbose --disable-symvers

newlib (latest)
CFLAGS_FOR_TARGET="-Wno-error=implicit-function-declaration -Wno-implicit-int -ffunction-sections -fdata-sections -flto" <..>/newlib/configure --host=sh-elf --target=sh-elf --prefix=/usr/local --enable-multilib --enable-newlib-io-c99-formats

Note that the latest newlib version will try to create multilib directories
one directory above its current build directory for some reason.  So just
create another sub-directory in the build directory and do the config and
build from there.

Other than that, the build steps are the same as before.


I could reproduce the issue with the latest GCC when building libstdc++. 
I'm working on a fix for it.


Unfortunately I'm also getting the SIGBUS error when running a C++ program
that uses std::cout / std::cerr.

To be honest, I don't remember what the issue was/is, whether this has ever
worked at all or not.  I've tried rewinding everything back ~10 years ago
but was still getting the same error.  Using printf from the simulator seems
to work fine though.  So I guess a bunch of C++ tests of the GCC testsuite
will fail on the simulator, but that could be tolerable -- it never passed
all the tests on the simulator anyway.  It's still a good way to test for
regressions that could be introduced by a patch.

> > How active are the main types? Like are there still new products 
> > designed with these (maybe the J2)?

There is some activity on the software side which mainly stems from folks
using old parts and systems.  I'd say the biggest activity is now people
hammering on Sega 32X (SH2), Saturn (SH2) and Dreamcast (SH4), but I might
be biased here.

As for new hardware, I'm not sure.  Apparently it's still possible to
license SH4A(+FPU) and SH4AL-DSP IP cores from Renesas, but I doubt anybody
is really doing that.  Some parts are still being manufactured, like SH2A
for some niche applications. Don't know what j-core people are up to these
days.  Some of the SH MCUs have been re-implemented as open source gateware
for the MiSTer FPGA project.

> > 
> > I'd be interested to learn more about the history of the SH backend, if 
> > anyone wrote that up somewhere...
> > 

From what I know it started during the earlier cygwin days in the 90s,
originally contracted by Hitachi to complement their own in-house C compiler
and also to allow sh-linux to happen at some point.  It was entertained by
Renesas for a while through further contracted support work but eventually
they have abandoned it.  STmicro was also a licensee of the SH4 CPU for
their TV set top boxes and had a few guys submitting patches now and then
for a while.  But the whole thing basically went on life support about 10
years ago.

Perhaps Jeff or others can give more insight on the historical parts.


Best regards,
Oleg Endo
Jeff Law July 8, 2024, 2:08 p.m. UTC | #10
On 7/8/24 5:52 AM, Oleg Endo wrote:

> 
>  From what I know it started during the earlier cygwin days in the 90s,
> originally contracted by Hitachi to complement their own in-house C compiler
> and also to allow sh-linux to happen at some point.  It was entertained by
> Renesas for a while through further contracted support work but eventually
> they have abandoned it.  STmicro was also a licensee of the SH4 CPU for
> their TV set top boxes and had a few guys submitting patches now and then
> for a while.  But the whole thing basically went on life support about 10
> years ago.
> 
> Perhaps Jeff or others can give more insight on the historical parts.
IIRC Joern Rennecke was doing most of the GCC work for SH back in the 
90s, with perhaps Steve Chamberlain pitching in (though I think he did 
more on the BFD side).  I never really did anything with it as I was 
focused more on other systems.


Jeff
Sébastien Michelland July 8, 2024, 6:48 p.m. UTC | #11
Hi again!

> It shouldn't be needed to build GDB separately or to specify the -m32 flags.
> Not sure why you have to do that.

It was in the document you sent, especially some warning about 
sh-elf-run not working on 64-bit hosts. Guess that's solved by now.

> I've just tried the following configure lines: (...)

Thanks, this flow worked for me as well. I'm now running the tests; 
quite a few fail, all C++. I'll have to compare runs with and without my 
change, try a few architectures, and report back. (Wow it's slow.)

> There is some activity on the software side which mainly stems from folks
> using old parts and systems.  I'd say the biggest activity is now people
> hammering on Sega 32X (SH2), Saturn (SH2) and Dreamcast (SH4), but I might
> be biased here.

I come from a community that works on CASIO graphing calculators. CASIO 
is still producing new models with SH4AL-DSP cores (SH7305), including 
some this year and next year. Seeing the latest updates they don't seem 
to have any intention to change it anytime soon either. Not a large 
community (I'd say a few dozen rotating people a year) but still there.

>  From what I know it started during the earlier cygwin days in the 90s,
> originally contracted by Hitachi to complement their own in-house C compiler
> and also to allow sh-linux to happen at some point.  It was entertained by
> Renesas for a while through further contracted support work but eventually
> they have abandoned it.  STmicro was also a licensee of the SH4 CPU for
> their TV set top boxes and had a few guys submitting patches now and then
> for a while.  But the whole thing basically went on life support about 10
> years ago.

Thanks for the insight. It's quite telling that things are still working 
after 10 years of "life support". :-)

Sébastien
Oleg Endo Oct. 10, 2024, 12:48 a.m. UTC | #12
On Wed, 2024-07-03 at 11:59 +0200, Sébastien Michelland wrote:
> libgcc's fp-bit.c is quite slow and most modern/developed architectures
> have switched to using the soft-fp library. This patch does so for
> free-standing/unknown-OS SH3/SH4 builds, using soft-fp's default parameters
> for the most part, most notably no exceptions.
> 
> A quick run of Whetstone (built with OpenLibm) on an SH4 machine shows
> about x3 speedup (~320 -> 1050 Kwhets/s).
> 
> I'm sending this as RFC because I'm quite unsure about testing. I built
> the compiler and ran the benchmark, but I don't know if GCC has a test
> for soft-fp correctness and whether I can run that in my non-hosted
> environment. Any advice?
> 

As discussed, the patch was changed to use soft-fp not only for SH3/SH4 but
generally for all sh-elf targets.

sh-*-linux* and sh-*-rtems* are not affected by that and continue using the
fdpbit library for floating-point emulation on no-fpu variants.  If that
should be changed as well, please let me know.


Tested with

make -k check RUNTESTFLAGS="--target_board=sh-sim\{-m2/-ml,-m2/-mb}"

committed & pushed the attached version to master.


Best regards,
Oleg Endo

Patch

diff --git a/libgcc/config.host b/libgcc/config.host
index 9fae51d4c..fee3bf0c0 100644
--- a/libgcc/config.host
+++ b/libgcc/config.host
@@ -1399,7 +1399,15 @@  s390x-ibm-tpf*)
 	md_unwind_header=s390/tpf-unwind.h
 	;;
 sh-*-elf* | sh[12346l]*-*-elf*)
-	tmake_file="$tmake_file sh/t-sh t-crtstuff-pic t-fdpbit"
+	tmake_file="$tmake_file sh/t-sh t-crtstuff-pic"
+	case ${host} in
+	sh[34]*-*-elf*)
+		tmake_file="${tmake_file} t-softfp-sfdf t-softfp"
+		;;
+	*)
+		tmake_file="${tmake_file} t-fdpbit"
+		;;
+	esac
 	extra_parts="$extra_parts crt1.o crti.o crtn.o crtbeginS.o crtendS.o \
 		libic_invalidate_array_4-100.a \
 		libic_invalidate_array_4-200.a \
diff --git a/libgcc/config/sh/sfp-machine.h b/libgcc/config/sh/sfp-machine.h
new file mode 100644
index 000000000..c1aa428c0
--- /dev/null
+++ b/libgcc/config/sh/sfp-machine.h
@@ -0,0 +1,83 @@ 
+/* Software floating-point machine description for SuperH.
+
+   Copyright (C) 2016-2024 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+<http://www.gnu.org/licenses/>.  */
+
+#define _FP_W_TYPE_SIZE     32
+#define _FP_W_TYPE      unsigned long
+#define _FP_WS_TYPE     signed long
+#define _FP_I_TYPE      long
+
+#define _FP_MUL_MEAT_S(R,X,Y)               \
+  _FP_MUL_MEAT_1_wide(_FP_WFRACBITS_S,R,X,Y,umul_ppmm)
+#define _FP_MUL_MEAT_D(R,X,Y)               \
+  _FP_MUL_MEAT_2_wide(_FP_WFRACBITS_D,R,X,Y,umul_ppmm)
+#define _FP_MUL_MEAT_Q(R,X,Y)               \
+  _FP_MUL_MEAT_4_wide(_FP_WFRACBITS_Q,R,X,Y,umul_ppmm)
+
+#define _FP_DIV_MEAT_S(R,X,Y)   _FP_DIV_MEAT_1_udiv_norm(S,R,X,Y)
+#define _FP_DIV_MEAT_D(R,X,Y)   _FP_DIV_MEAT_2_udiv(D,R,X,Y)
+#define _FP_DIV_MEAT_Q(R,X,Y)   _FP_DIV_MEAT_4_udiv(Q,R,X,Y)
+
+#define _FP_NANFRAC_B       _FP_QNANBIT_B
+#define _FP_NANFRAC_H       _FP_QNANBIT_H
+#define _FP_NANFRAC_S       _FP_QNANBIT_S
+#define _FP_NANFRAC_D       _FP_QNANBIT_D, 0
+#define _FP_NANFRAC_Q       _FP_QNANBIT_Q, 0, 0, 0
+
+/* The type of the result of a floating point comparison.  This must
+   match __libgcc_cmp_return__ in GCC for the target.  */
+typedef int __gcc_CMPtype __attribute__ ((mode (__libgcc_cmp_return__)));
+#define CMPtype __gcc_CMPtype
+
+#define _FP_NANSIGN_B       0
+#define _FP_NANSIGN_H       0
+#define _FP_NANSIGN_S       0
+#define _FP_NANSIGN_D       0
+#define _FP_NANSIGN_Q       0
+
+#define _FP_KEEPNANFRACP 0
+#define _FP_QNANNEGATEDP 0
+
+#define _FP_CHOOSENAN(fs, wc, R, X, Y, OP)  \
+  do {                      \
+    R##_s = _FP_NANSIGN_##fs;           \
+    _FP_FRAC_SET_##wc(R,_FP_NANFRAC_##fs);  \
+    R##_c = FP_CLS_NAN;             \
+  } while (0)
+
+#define _FP_TININESS_AFTER_ROUNDING 1
+
+#define __LITTLE_ENDIAN 1234
+#define __BIG_ENDIAN    4321
+
+#if defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
+#define __BYTE_ORDER __BIG_ENDIAN
+#else
+#define __BYTE_ORDER __LITTLE_ENDIAN
+#endif
+
+/* Define ALIASNAME as a strong alias for NAME.  */
+# define strong_alias(name, aliasname) _strong_alias(name, aliasname)
+# define _strong_alias(name, aliasname) \
+  extern __typeof (name) aliasname __attribute__ ((alias (#name)));
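
The byte-order fallback near the end of sfp-machine.h can be sanity-checked on its own. The sketch below repeats the same preprocessor test under made-up SKETCH_* names (to avoid clashing with any libc endian header) and compares the result with the byte order seen at run time. It assumes a GCC-compatible compiler that predefines __BYTE_ORDER__ and __ORDER_BIG_ENDIAN__.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for __LITTLE_ENDIAN / __BIG_ENDIAN.  */
#define SKETCH_LITTLE_ENDIAN 1234
#define SKETCH_BIG_ENDIAN    4321

/* Same selection logic as the patch: big-endian only when the
   compiler says so, little-endian otherwise.  */
#if defined(__BYTE_ORDER__) && (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
#define SKETCH_BYTE_ORDER SKETCH_BIG_ENDIAN
#else
#define SKETCH_BYTE_ORDER SKETCH_LITTLE_ENDIAN
#endif

/* Byte order observed at run time, from the first byte of a
   known 32-bit value.  */
int
runtime_byte_order (void)
{
  uint32_t word = 0x01020304u;
  unsigned char first;

  memcpy (&first, &word, 1);
  return first == 0x01 ? SKETCH_BIG_ENDIAN : SKETCH_LITTLE_ENDIAN;
}

/* Returns 1 if the compile-time choice agrees with the run-time one.  */
int
byte_order_matches (void)
{
  int ok = runtime_byte_order () == SKETCH_BYTE_ORDER;

  assert (ok);
  return ok;
}
```

Built once with -ml and once with -mb and run under sh-sim, this would be expected to pass for both endiannesses if the macro selection is right.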