RISC-V: Enable static-pie.

Message ID 20230810233348.1214955-1-yanzhang.wang@intel.com
State New

Commit Message

yanzhang.wang--- via Libc-alpha Aug. 10, 2023, 11:33 p.m. UTC
From: Yanzhang Wang <yanzhang.wang@intel.com>

This patch uses commit 374cef3 as a reference to add static-pie support.
Because the dummy link map is used when relocating ourselves, there is no
need to set __global_pointer$ at that point.
---
 sysdeps/riscv/configure    | 2 ++
 sysdeps/riscv/configure.ac | 3 +++
 sysdeps/riscv/dl-machine.h | 2 +-
 3 files changed, 6 insertions(+), 1 deletion(-)

Comments

Palmer Dabbelt Aug. 11, 2023, 1:57 a.m. UTC | #1
On Thu, 10 Aug 2023 16:33:48 PDT (-0700), libc-alpha@sourceware.org wrote:
> From: Yanzhang Wang <yanzhang.wang@intel.com>
>
> This patch uses commit 374cef3 as a reference to add static-pie support.
> Because the dummy link map is used when relocating ourselves, there is no
> need to set __global_pointer$ at that point.

Do you have test results?  IIRC the only reason we didn't enable this 
when submitting the original port was because we didn't have time to 
test it.  If it's passing a reasonable amount of the tests it's fine 
with me, but we need to at least look because there could be lurking 
issues.

> ---
>  sysdeps/riscv/configure    | 2 ++
>  sysdeps/riscv/configure.ac | 3 +++
>  sysdeps/riscv/dl-machine.h | 2 +-
>  3 files changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/sysdeps/riscv/configure b/sysdeps/riscv/configure
> index 2372225a26..340163779f 100644
> --- a/sysdeps/riscv/configure
> +++ b/sysdeps/riscv/configure
> @@ -29,3 +29,5 @@ fi
>  $as_echo "$libc_cv_riscv_r_align" >&6; }
>  config_vars="$config_vars
>  riscv-r-align = $libc_cv_riscv_r_align"
> +
> +$as_echo "#define SUPPORT_STATIC_PIE 1" >>confdefs.h
> diff --git a/sysdeps/riscv/configure.ac b/sysdeps/riscv/configure.ac
> index dbcc216689..36da2b5396 100644
> --- a/sysdeps/riscv/configure.ac
> +++ b/sysdeps/riscv/configure.ac
> @@ -16,3 +16,6 @@ EOF
>    fi
>    rm -rf conftest.*])
>  LIBC_CONFIG_VAR([riscv-r-align], [$libc_cv_riscv_r_align])
> +
> +dnl Static PIE is supported.
> +AC_DEFINE(SUPPORT_STATIC_PIE)
> diff --git a/sysdeps/riscv/dl-machine.h b/sysdeps/riscv/dl-machine.h
> index c0c9bd93ad..ad875c0828 100644
> --- a/sysdeps/riscv/dl-machine.h
> +++ b/sysdeps/riscv/dl-machine.h
> @@ -323,7 +323,7 @@ elf_machine_runtime_setup (struct link_map *l, struct r_scope_elem *scope[],
>        gotplt[1] = (ElfW(Addr)) l;
>      }
>
> -  if (l->l_type == lt_executable)
> +  if (l->l_type == lt_executable && l->l_scope != NULL)
>      {
>        /* The __global_pointer$ may not be defined by the linker if the
>  	 $gp register does not be used to access the global variable
Wang, Yanzhang via Libc-alpha Aug. 13, 2023, 12:20 p.m. UTC | #2
Hi Palmer,

I have tested commit 542b110585 with this patch applied. The results are below:

Summary of test results:
    189 FAIL
   4328 PASS
    101 UNSUPPORTED
     16 XFAIL
      2 XPASS

And the results for commit 542b110585 itself are below:

Summary of test results:
    189 FAIL
   4326 PASS
    101 UNSUPPORTED
     16 XFAIL
      2 XPASS

The binutils commit is 2db20b97f1d and the gcc commit is bf36656a14a.

I ran make check-glibc-linux in riscv-gnu-toolchain. I'm not sure whether
that is acceptable.

Thanks,
Yanzhang

Carlos O'Donell Aug. 14, 2023, 1:12 p.m. UTC | #3
On 8/10/23 19:33, yanzhang.wang--- via Libc-alpha wrote:
> From: Yanzhang Wang <yanzhang.wang@intel.com>
> 
> This patch uses commit 374cef3 as a reference to add static-pie support.
> Because the dummy link map is used when relocating ourselves, there is no
> need to set __global_pointer$ at that point.

This fails pre-commit CI.

https://patchwork.sourceware.org/project/glibc/patch/20230810233348.1214955-1-yanzhang.wang@intel.com/

Patch conflict in configure. Please make sure you are regenerating properly.


Wang, Yanzhang via Libc-alpha Aug. 15, 2023, 1:48 a.m. UTC | #4
Hi Carlos,

Sorry for the inconvenience.

I have pushed a new patch; it should be correct now.

Thanks,
Yanzhang
Adhemerval Zanella Netto Aug. 15, 2023, 11:46 a.m. UTC | #5
On 13/08/23 09:20, Wang, Yanzhang via Libc-alpha wrote:
> Hi Palmer,
> 
> I have tested commit 542b110585 with this patch applied. The results are below:
> 
> Summary of test results:
>     189 FAIL
>    4328 PASS
>     101 UNSUPPORTED
>      16 XFAIL
>       2 XPASS
> 
> And the results for commit 542b110585 itself are below:
> 
> Summary of test results:
>     189 FAIL
>    4326 PASS
>     101 UNSUPPORTED
>      16 XFAIL
>       2 XPASS
> 
> The binutils commit is 2db20b97f1d and the gcc commit is bf36656a14a.
> 
> I ran make check-glibc-linux in riscv-gnu-toolchain. I'm not sure whether
> that is acceptable.

The riscv reports for the 2.38 release [1] list a maximum of 6 FAILs across
all the ABI variants.  The 189 failures you are reporting mean that your
environment is either missing some setup (for instance, copying
libgcc_s.so and libstdc++.so into the build folder so that C++ tests and
tests that require pthread_cancel or backtrace work correctly) or is not
properly configured.

Please sort this out first, since with that many failures it is not
straightforward to check whether static-pie is really working as
intended.
Wang, Yanzhang Sept. 9, 2023, 3:17 a.m. UTC | #6
I took some time to test master with binfmt_misc and with qemu system mode.
Neither of them meets the requirement (<= 6 failures).

- Most cases with binfmt_misc fail with an abort.
- Most cases with qemu system mode fail with a timeout.

I also tested on my RISC-V board; 70+ cases still fail, and most of
them are math accuracy issues.

So Adhemerval, do you know how to set up the environment to reproduce
the <= 6 failures? Maybe I missed some important steps. Thanks very much :).

Thanks,
Yanzhang

Palmer Dabbelt Sept. 9, 2023, 3:30 a.m. UTC | #7
On Fri, 08 Sep 2023 20:17:16 PDT (-0700), yanzhang.wang@intel.com wrote:
> I took some time to test master with binfmt_misc and with qemu system mode.
> Neither of them meets the requirement (<= 6 failures).
> 
> - Most cases with binfmt_misc fail with an abort.

QEMU user mode isn't a valid test suite target for glibc; there are lots
of failures due to the emulation.  I know it's confusing that
riscv-gnu-toolchain uses it; that came up when support was added.

> - Most cases with qemu system mode fail with a timeout.

You can set TIMEOUTFACTOR; qemu-system is a lot slower than hardware.

> I also tested on my RISC-V board; 70+ cases still fail, and most of
> them are math accuracy issues.

Which board are you running on?

> So Adhemerval, do you know how to set up the environment to reproduce
> the <= 6 failures? Maybe I missed some important steps. Thanks very much :).

+DJ and Darius, who usually report test results.  They've probably got 
the best idea of how to set things up, but I don't remember this 
requiring anything fancy.

Wang, Yanzhang Sept. 9, 2023, 6:54 a.m. UTC | #8
Thanks Palmer.

> You can set TIMEOUTFACTOR, qemu-system is a lot slower than hardware.
>

I'll try to set a high value. It's really too slow.

> Which board are you running on?

I'm testing on a Lichee Pi 4A board running Debian 12.

Darius Rad Sept. 11, 2023, 1:34 p.m. UTC | #9
On Fri, Sep 08, 2023 at 08:30:44PM -0700, Palmer Dabbelt wrote:
> +DJ and Darius, who usually report test results.  They've probably got the
> best idea of how to set things up, but I don't remember this requiring
> anything fancy.
> 

Indeed, I don't really do anything unusual, but I do set TIMEOUTFACTOR to
10.  I regularly run on qemu-system (not user) and the Microchip Icicle Kit
board.  I also use scripts/cross-test-ssh.sh for test-wrapper, even when
testing locally (host is localhost), as that simplifies testing the various
ABIs.

// darius
Palmer Dabbelt Sept. 11, 2023, 2:14 p.m. UTC | #10
On Fri, 08 Sep 2023 23:54:49 PDT (-0700), yanzhang.wang@intel.com wrote:
> Thanks Palmer.
> 
>> You can set TIMEOUTFACTOR, qemu-system is a lot slower than hardware.
>>
> 
> I'll try to set a high value. It's really too slow.
> 
>> Which board are you running on?
> 
> I'm testing on a Lichee Pi 4A board running Debian 12.

Those T-Head cores have a bit of a reputation for ignoring the ISA, so 
it wouldn't be super surprising if something odd is going on with 
floating point.  If the tests are clean on QEMU and fail on the C910 
then we should get to the bottom of it, though.

Last time I had to chase down an FPU bug I used TestFloat 
<http://www.jhauser.us/arithmetic/TestFloat.html>.  I'm not sure if 
there's something better?

+Edwin, as I think he's got Lichee Pi 4a up to benchmark misaligned 
accesses for that GCC patch.

Adhemerval Zanella Netto Sept. 11, 2023, 4:17 p.m. UTC | #11
On 09/09/23 00:30, Palmer Dabbelt wrote:
> On Fri, 08 Sep 2023 20:17:16 PDT (-0700), yanzhang.wang@intel.com wrote:
>> I took some time to test master with binfmt_misc and with qemu system mode.
>> Neither of them meets the requirement (<= 6 failures).
>>
>> - Most cases with binfmt_misc fail with an abort.
> 
> QEMU user mode isn't a valid test suite target for glibc; there are lots of failures due to the emulation.  I know it's confusing that riscv-gnu-toolchain uses it; that came up when support was added.
> 
>> - Most cases with qemu system mode fail with a timeout.
> 
> You can set TIMEOUTFACTOR; qemu-system is a lot slower than hardware.
> 
>> I also tested on my RISC-V board; 70+ cases still fail, and most of
>> them are math accuracy issues.
> 
> Which board are you running on?
> 
>> So Adhemerval, do you know how to set up the environment to reproduce
>> the <= 6 failures? Maybe I missed some important steps. Thanks very much :).
> 
> +DJ and Darius, who usually report test results.  They've probably got the best idea of how to set things up, but I don't remember this requiring anything fancy.

For the specific support of static-pie, I expect that qemu-system or even
qemu-user would be a feasible testing platform.  You might need some 
adjustment if the platform implements some math code in assembly, but 
if you filter out the expected failures it should be doable to check
the feature is working as intended.

However, it is hard to filter them out if you just give the number of failures
before/after without breaking down which tests failed and why (Was it
due to a timeout from emulation? Was it due to missing libstdc++.so/libgcc_s.so
support? Was it a math failure due to wrong emulation?).
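
One possible way to produce such a breakdown is to compare the per-test
status lines of two runs rather than only the summary counts.  The sketch
below assumes one "STATUS: subdir/test-name" entry per line, as found in
the tests.sum file of a glibc build tree; the function names and the sample
test names are hypothetical and only illustrate the idea.

```python
from collections import defaultdict


def parse_results(text):
    """Map each status (FAIL, PASS, ...) to the set of tests that had it.

    Assumes one "STATUS: test/name" entry per line; other lines are ignored.
    """
    by_status = defaultdict(set)
    for line in text.splitlines():
        status, sep, name = line.partition(": ")
        if sep and name.strip():
            by_status[status.strip()].add(name.strip())
    return by_status


def compare_failures(baseline_text, patched_text):
    """Return (new, fixed, common) sets of FAIL tests between two runs."""
    base = parse_results(baseline_text)["FAIL"]
    patched = parse_results(patched_text)["FAIL"]
    return patched - base, base - patched, base & patched


# Hypothetical sample data; a real comparison would read two tests.sum files.
baseline = "PASS: elf/tst-a\nFAIL: nptl/tst-b\nFAIL: math/tst-c\n"
patched = "FAIL: elf/tst-a\nFAIL: nptl/tst-b\nPASS: math/tst-c\n"

new, fixed, common = compare_failures(baseline, patched)
print("new failures:      ", sorted(new))
print("no longer failing: ", sorted(fixed))
print("failing in both:   ", sorted(common))
```

Only the "new failures" list needs a per-test explanation (timeout,
missing libstdc++.so/libgcc_s.so, math emulation, or a real regression);
the tests failing in both runs are unrelated to the patch.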

DJ Delorie Sept. 11, 2023, 5:28 p.m. UTC | #12
Darius Rad <darius@bluespec.com> writes:
> Indeed, I don't really do anything unusual, but I do set TIMEOUTFACTOR to
> 10.

Likewise, I set TIMEOUTFACTOR to 20 and do nothing else special.  My
regular board is an Unmatched with an M.2 drive to make it as speedy as
possible, but our (RH) policy is to set TIMEOUTFACTOR for all builds,
even x86.
Palmer Dabbelt Sept. 20, 2023, 1:36 p.m. UTC | #13
On Mon, 11 Sep 2023 09:17:22 PDT (-0700), adhemerval.zanella@linaro.org wrote:
> For the specific support of static-pie, I expect that qemu-system or even
> qemu-user would be a feasible testing platform.  You might need some
> adjustment if the platform implements some math code in assembly, but
> if you filter out the expected failures it should be doable to check
> the feature is working as intended.

It looks like the HW in question likely has some issues in the FPU, see 
<https://github.com/revyos/revyos/issues/17>.  We'll have to figure 
something out (maybe just disable FP until userspace has ack'd that it 
understands the errata?), but for now it's probably best to just test on 
QEMU.

Wang, Yanzhang Sept. 21, 2023, 1:47 p.m. UTC | #14
Thanks for all your comments, Palmer, DJ, Darius and Adhemerval.
Your suggestions were very helpful to me.

Yes, I found this issue on GitHub too, and the math failures didn't
appear with qemu-system. So it's definitely a hardware bug.

I also found the root cause of almost all of the other failures: I was
using sshfs rather than NFS. :( ..

Even though I set a larger TIMEOUTFACTOR as you suggested, there are still
some timeout failures, listed below. The timeouts also seem unstable:
sometimes nptl/tst-stack4 passes on lp4a and sometimes it does not.

  master with qemu-system       master on lp4a                static-pie patch on lp4a     
 ----------------------------- ----------------------------- ----------------------------- 
  resolv/tst-resolv-res_ninit   resolv/tst-resolv-res_ninit   resolv/tst-resolv-res_ninit  
  nptl/tst-stack4               nptl/tst-stack4               iconvdata/tst-loading        
  libio/tst-fopenloc            libio/tst-fopenloc            localedata/tst-leaks         
  iconvdata/tst-loading         iconvdata/tst-loading         malloc/tst-dynarray-fail     
  localedata/tst-leaks          localedata/tst-leaks          posix/tst-fnmatch            
  malloc/tst-dynarray-fail      malloc/tst-dynarray-fail                                   
  posix/tst-glob-tilde          posix/tst-glob-tilde                                       
  posix/tst-fnmatch             posix/tst-fnmatch                                          

The FAIL tests are listed below. The math failures are filtered out
on lp4a and do not appear on qemu-system.

  master with qemu-system                         master on lp4a                                  static-pie patch on lp4a                       
 ----------------------------------------------- ----------------------------------------------- ----------------------------------------------- 
  resolv/mtrace-tst-resolv-res_ninit              resolv/mtrace-tst-resolv-res_ninit              resolv/mtrace-tst-resolv-res_ninit             
  nptl/tst-cancel21-static                        libio/tst-fopenloc-mem                          elf/tst-tls-allocation-failure-static-patched  
  libio/tst-fopenloc-mem                          libio/tst-fopenloc-cmp                          elf/tst-rtld-list-diagnostics                  
  libio/tst-fopenloc-cmp                          elf/tst-tls-allocation-failure-static-patched   elf/tst-sprof-basic                            
  elf/tst-tls-allocation-failure-static-patched   elf/tst-rtld-list-diagnostics                   iconvdata/mtrace-tst-loading                   
  elf/tst-rtld-list-diagnostics                   elf/tst-sprof-basic                             localedata/mtrace-tst-leaks                    
  elf/tst-sprof-basic                             iconvdata/mtrace-tst-loading                    malloc/tst-dynarray-fail-mem                   
  iconvdata/mtrace-tst-loading                    localedata/mtrace-tst-leaks                     posix/tst-fnmatch-mem                          
  localedata/mtrace-tst-leaks                     malloc/tst-dynarray-fail-mem                                                                   
  malloc/tst-dynarray-fail-mem                    posix/tst-glob-tilde-mem                                                                       
  posix/tst-glob-tilde-mem                        posix/tst-fnmatch-mem                                                                          
  posix/tst-fnmatch-mem                                                                                                                          
  posix/globtest                                                                                                                                 

Taking master on lp4a as an example:

- elf/tst-rtld-list-diagnostics: fails due to the missing abnf module
- elf/tst-sprof-basic: prints hello world successfully but returns status 1; root cause still unknown
- elf/tst-tls-allocation-failure-static-patched: exec format error; root cause still unknown
- the others report memory that was not freed

For master, qemu-system and lp4a differ in two cases:

- nptl/tst-cancel21-static: it reports sa_flags = SA_ONSTACK; I haven't investigated it yet.
- posix/globtest: fails because my qemu-system has a different user name.
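
The per-platform differences above were worked out by hand from the tables. As a minimal sketch, the same comparison can be scripted, assuming each run left a tests.sum file whose lines look like "FAIL: elf/tst-sprof-basic". The function name new_failures and the file names in the usage note are only illustrative.

```shell
# List tests that FAIL in the second tests.sum but not in the first.
new_failures() {
    base_list=$(mktemp) && new_list=$(mktemp) || return 1
    # Keep only the test names from the "FAIL: <name>" lines, sorted for comm.
    sed -n 's/^FAIL: //p' "$1" | sort -u > "$base_list"
    sed -n 's/^FAIL: //p' "$2" | sort -u > "$new_list"
    # comm -13 prints the lines that appear only in the second list.
    comm -13 "$base_list" "$new_list"
    rm -f "$base_list" "$new_list"
}
```

For example, `new_failures master/tests.sum static-pie/tests.sum` would print only the failures introduced by the patched build. Swapping the two arguments shows failures that disappeared.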

The XFAILs and XPASSes are the same on all platforms and all branches,
so I don't list them here.

I used commit 4be913652ca115160bae1daf560170ef8b112ccb of the master branch.

Is this the expected test result? Or are there still cases whose FAIL or PASS status is incorrect?

Thanks,
Yanzhang

Wang, Yanzhang Oct. 17, 2023, 8:28 a.m. UTC | #15
Hi,

Are there any further comments on this patch?

I sent a v2 some days ago, and there are no conflicts now.
https://patchwork.sourceware.org/project/glibc/patch/20230815014434.1902446-1-yanzhang.wang@intel.com/

Thanks,
Yanzhang

Adhemerval Zanella Netto Oct. 17, 2023, 1:42 p.m. UTC | #16
On 21/09/23 10:47, Wang, Yanzhang wrote:
> Thanks for all your comments, Palmer, DJ, Dairus and Adhemerval.
> Your suggestions are so helpful to me.
> 
> Yes. I also found this issue on GitHub too and the math failures didn't
> appear with QEMU system. So it's definitely a hardware bug.
> 
> And I found the root cause of almost of the other failures. It's
> because I use sshfs not nfs. :( ..

I don't have access to RISC-V hardware, but we can use the 2.38 release results [1]
as the baseline.

> 
> Even though I set a larger TIMEOUTFACTOR as you said, there're still
> some timeout failures like below. And seems the timeout is not stable.
> Sometimes, nptl/tst-stack4 can pass on lp4a and sometimes not.

The tst-stack4 failure was a long-standing issue that should be fixed on master [2].

> 
>   master with qemu-system       master on lp4a                static-pie patch on lp4a     
>  ----------------------------- ----------------------------- ----------------------------- 
>   resolv/tst-resolv-res_ninit   resolv/tst-resolv-res_ninit   resolv/tst-resolv-res_ninit  
>   nptl/tst-stack4               nptl/tst-stack4               iconvdata/tst-loading        
>   libio/tst-fopenloc            libio/tst-fopenloc            localedata/tst-leaks         
>   iconvdata/tst-loading         iconvdata/tst-loading         malloc/tst-dynarray-fail     
>   localedata/tst-leaks          localedata/tst-leaks          posix/tst-fnmatch            
>   malloc/tst-dynarray-fail      malloc/tst-dynarray-fail                                   
>   posix/tst-glob-tilde          posix/tst-glob-tilde                                       
>   posix/tst-fnmatch             posix/tst-fnmatch   


For static-pie I would focus on the *static* tests and check for any regressions.
All of the tests above are dynamic, and the timeouts you have found are most likely
due to a low TIMEOUTFACTOR value.

You can check by testing each one individually: 

$ TIMEOUTFACTOR=100 make test t=<test> # for instance, posix/tst-fnmatch
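
To repeat this for several tests, a small helper can print one such command per test. This is only a sketch: retest_cmds is a made-up name, and the factor and test names are whatever you pass in.

```shell
# Print one "make test" command per test name, all with the same TIMEOUTFACTOR.
# Usage: retest_cmds <factor> <test>...
retest_cmds() {
    factor=$1
    shift
    for t in "$@"; do
        printf 'TIMEOUTFACTOR=%s make test t=%s\n' "$factor" "$t"
    done
}
```

Calling `retest_cmds 100 posix/tst-fnmatch iconvdata/tst-loading` prints the two command lines. Piping the output to `sh` from the glibc build directory would run them one after another.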
                                       
> 
> For the FAIL tests, it's like below. The math failures are filtered out
> on lp4a and not appear on qemu-system.
> 
>   master with qemu-system                         master on lp4a                                  static-pie patch on lp4a                       
>  ----------------------------------------------- ----------------------------------------------- ----------------------------------------------- 
>   resolv/mtrace-tst-resolv-res_ninit              resolv/mtrace-tst-resolv-res_ninit              resolv/mtrace-tst-resolv-res_ninit             
>   nptl/tst-cancel21-static                        libio/tst-fopenloc-mem                          elf/tst-tls-allocation-failure-static-patched  
>   libio/tst-fopenloc-mem                          libio/tst-fopenloc-cmp                          elf/tst-rtld-list-diagnostics                  
>   libio/tst-fopenloc-cmp                          elf/tst-tls-allocation-failure-static-patched   elf/tst-sprof-basic                            
>   elf/tst-tls-allocation-failure-static-patched   elf/tst-rtld-list-diagnostics                   iconvdata/mtrace-tst-loading                   
>   elf/tst-rtld-list-diagnostics                   elf/tst-sprof-basic                             localedata/mtrace-tst-leaks                    
>   elf/tst-sprof-basic                             iconvdata/mtrace-tst-loading                    malloc/tst-dynarray-fail-mem                   
>   iconvdata/mtrace-tst-loading                    localedata/mtrace-tst-leaks                     posix/tst-fnmatch-mem                          
>   localedata/mtrace-tst-leaks                     malloc/tst-dynarray-fail-mem                                                                   
>   malloc/tst-dynarray-fail-mem                    posix/tst-glob-tilde-mem                                                                       
>   posix/tst-glob-tilde-mem                        posix/tst-fnmatch-mem                                                                          
>   posix/tst-fnmatch-mem                                                                                                                          
>   posix/globtest          

The elf/tst-sprof-basic failure seems to be a known issue based on the 2.38 release wiki, and
most of the others also seem to be related to the low TIMEOUTFACTOR.
                                                                                                                       
> 
> Take master on lp4a as an example,
> 
> - elf/tst-rtld-list-diagnostics, due to missing abnf module
> - elf/tst-sprof-basic, successfully print hello world but return status is 1, still unknown root cause
> - elf/tst-tls-allocation-failure-static-patched, exec format error, still unknown root cause

This seems to be a real regression, and I think you should sort this out before
the patch is installed (I see no failure on qemu-user on master).  The exec format
error seems to come from the kernel, due to the execve failure, and might indicate a
corrupted binary.

> - the others are memory not freed
> 
> The difference between qemu-system and lp4a for master is the two cases,
> 
> - nptl/tst-cancel21-static, it said sa_flags = SA_ONSTACK and haven't investigated.

This might be an unrelated issue [3], caused either by a compiler optimization or by
the long-standing BZ#12683 issue.

> - posix/globtest, because my qemu-system has a different user name.
> 
> The XFAILs and XPASSes are the same on all platforms and all branches.
> So not list here.
> 
> I use the commit 4be913652ca115160bae1daf560170ef8b112ccb of master branch.
> 
> So is this the expected test result? Or is there still any case not correct FAIL or PASS?

I think the output looks pretty ok, the only issue being the
elf/tst-tls-allocation-failure-static failure.


[1] https://sourceware.org/glibc/wiki/Release/2.38#RISC-V_.28rv64imac.2Flp64.29
[2] https://sourceware.org/bugzilla/show_bug.cgi?id=19329
[3] https://sourceware.org/pipermail/libc-alpha/2019-September/106641.html
Wang, Yanzhang Oct. 24, 2023, 5:59 a.m. UTC | #17
Hi Adhemerval,

Thanks for your comments. Both categories of failures are now figured out.

For the timed-out cases, I increased TIMEOUTFACTOR to 300 and they finally passed.
A value of 100 did not work very well.

For elf/tst-tls-allocation-failure-static-patched, the root cause is that it
does not run with test-wrapper. I use cross-test-ssh.sh to run the tests on the board.
Without test-wrapper, the test runs on my local machine, which is why it gets the exec format error.

I need to apply another patch, shown below. Is that acceptable, or do I need to run the full
test suite directly on the board?

diff --git a/elf/Makefile b/elf/Makefile
index 9176cbf1e3..1065b5c123 100644
--- a/elf/Makefile
+++ b/elf/Makefile
@@ -2979,7 +2979,7 @@ $(objpfx)tst-tls-allocation-failure-static-patched: \

 $(objpfx)tst-tls-allocation-failure-static-patched.out: \
   $(objpfx)tst-tls-allocation-failure-static-patched
-       $< > $@ 2>&1; echo "status: $$?" >> $@
+       $(test-wrapper) $< > $@ 2>&1; echo "status: $$?" >> $@
        grep -q '^Fatal glibc error: Cannot allocate TLS block$$' $@ \
          && grep -q '^status: 127$$' $@; \
          $(evaluate-test)
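
The logic of the patched rule can be sketched as a plain shell function. This is an illustration only, not the glibc harness: TEST_WRAPPER is a hypothetical stand-in for $(test-wrapper), and run_and_check is a made-up name. When the wrapper is empty the command runs on the build machine, which is where a RISC-V binary would hit the exec format error.

```shell
# Run a command through an optional wrapper, record its exit status in the
# log, and apply the same two grep checks as the Makefile rule.
# Usage: run_and_check <logfile> <command> [args...]
run_and_check() {
    out=$1
    shift
    status=0
    # TEST_WRAPPER (hypothetical) is left unquoted on purpose so that a value
    # such as "ssh board" splits into words; when empty it expands to nothing.
    ${TEST_WRAPPER:-} "$@" > "$out" 2>&1 || status=$?
    echo "status: $status" >> "$out"
    grep -q '^Fatal glibc error: Cannot allocate TLS block$' "$out" \
        && grep -q '^status: 127$' "$out"
}
```

The function succeeds only when the log contains both the expected fatal message and "status: 127", so a wrapper that fails to forward the remote exit status would also make the check fail.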
 

Adhemerval Zanella Netto Oct. 24, 2023, 11:39 a.m. UTC | #18
On 24/10/23 02:59, Wang, Yanzhang wrote:
> Hi Adhemerval,
> 
> Thanks for your comments and the two categories cases are all figured out.
> 
> For the timed out cases, I increased the TIMEOUTFACTOR to 300 and finally passed.
> The 100 seems not working very well.

Excellent.

> 
> For the case elf/tst-tls-allocation-failure-static-patched, the root cause is it
> does not run with test-wrapper. I use cross-test-ssh.sh to run tests on the board.
> If no test-wrapper, it will run on my local. That's why exec format error.

Alright, so I don't have any further remarks for this patch.

> 
> I need to apply another patch below. Is it acceptable or I need to run the full
> test directly on board?
> 
> diff --git a/elf/Makefile b/elf/Makefile
> index 9176cbf1e3..1065b5c123 100644
> --- a/elf/Makefile
> +++ b/elf/Makefile
> @@ -2979,7 +2979,7 @@ $(objpfx)tst-tls-allocation-failure-static-patched: \
> 
>  $(objpfx)tst-tls-allocation-failure-static-patched.out: \
>    $(objpfx)tst-tls-allocation-failure-static-patched
> -       $< > $@ 2>&1; echo "status: $$?" >> $@
> +       $(test-wrapper) $< > $@ 2>&1; echo "status: $$?" >> $@
>         grep -q '^Fatal glibc error: Cannot allocate TLS block$$' $@ \
>           && grep -q '^status: 127$$' $@; \
>           $(evaluate-test)

This would need to be in a different patch; could you send it as well?

>  
> 
>> -----Original Message-----
>> From: Adhemerval Zanella Netto <adhemerval.zanella@linaro.org>
>> Sent: Tuesday, October 17, 2023 9:42 PM
>> To: Wang, Yanzhang <yanzhang.wang@intel.com>; Palmer Dabbelt
>> <palmer@dabbelt.com>
>> Cc: DJ Delorie <dj@redhat.com>; Darius Rad <darius@bluespec.com>; libc-
>> alpha@sourceware.org
>> Subject: Re: [PATCH] RISC-V: Enable static-pie.
>>
>>
>>
>> On 21/09/23 10:47, Wang, Yanzhang wrote:
>>> Thanks for all your comments, Palmer, DJ, Dairus and Adhemerval.
>>> Your suggestions are so helpful to me.
>>>
>>> Yes. I also found this issue on GitHub too and the math failures
>>> didn't appear with QEMU system. So it's definitely a hardware bug.
>>>
>>> And I found the root cause of almost of the other failures. It's
>>> because I use sshfs not nfs. :( ..
>>
>> I don't have access to RISC-V hardware, but we can use the 2.38 release
>> as the baseline [1].
>>
>>>
>>> Even though I set a larger TIMEOUTFACTOR as you suggested, there are still
>>> some timeout failures, listed below. The timeouts also seem unstable:
>>> sometimes nptl/tst-stack4 passes on lp4a and sometimes it does not.
>>
>> The tst-stack4 was a long standing issue that should be fixed on master
>> [2].
>>
>>>
>>>   master with qemu-system       master on lp4a                static-pie patch on lp4a
>>>  ----------------------------- ----------------------------- -----------------------------
>>>   resolv/tst-resolv-res_ninit   resolv/tst-resolv-res_ninit   resolv/tst-resolv-res_ninit
>>>   nptl/tst-stack4               nptl/tst-stack4               iconvdata/tst-loading
>>>   libio/tst-fopenloc            libio/tst-fopenloc            localedata/tst-leaks
>>>   iconvdata/tst-loading         iconvdata/tst-loading         malloc/tst-dynarray-fail
>>>   localedata/tst-leaks          localedata/tst-leaks          posix/tst-fnmatch
>>>   malloc/tst-dynarray-fail      malloc/tst-dynarray-fail
>>>   posix/tst-glob-tilde          posix/tst-glob-tilde
>>>   posix/tst-fnmatch             posix/tst-fnmatch
>>
>>
>> For static-pie I would focus on the *static* tests and check for any
>> regressions.
>> All of the tests above are dynamic, and the timeouts you have found are
>> most likely due to a low TIMEOUTFACTOR value.
>>
>> You can check by testing each one individually:
>>
>> $ TIMEOUTFACTOR=100 make test t=<test> # for instance, posix/tst-fnmatch
>>
>>>
>>> The FAIL tests are listed below. The math failures are filtered
>>> out on lp4a and do not appear on qemu-system.
>>>
>>>   master with qemu-system                         master on lp4a                                  static-pie patch on lp4a
>>>  ----------------------------------------------- ----------------------------------------------- -----------------------------------------------
>>>   resolv/mtrace-tst-resolv-res_ninit              resolv/mtrace-tst-resolv-res_ninit              resolv/mtrace-tst-resolv-res_ninit
>>>   nptl/tst-cancel21-static                        libio/tst-fopenloc-mem                          elf/tst-tls-allocation-failure-static-patched
>>>   libio/tst-fopenloc-mem                          libio/tst-fopenloc-cmp                          elf/tst-rtld-list-diagnostics
>>>   libio/tst-fopenloc-cmp                          elf/tst-tls-allocation-failure-static-patched   elf/tst-sprof-basic
>>>   elf/tst-tls-allocation-failure-static-patched   elf/tst-rtld-list-diagnostics                   iconvdata/mtrace-tst-loading
>>>   elf/tst-rtld-list-diagnostics                   elf/tst-sprof-basic                             localedata/mtrace-tst-leaks
>>>   elf/tst-sprof-basic                             iconvdata/mtrace-tst-loading                    malloc/tst-dynarray-fail-mem
>>>   iconvdata/mtrace-tst-loading                    localedata/mtrace-tst-leaks                     posix/tst-fnmatch-mem
>>>   localedata/mtrace-tst-leaks                     malloc/tst-dynarray-fail-mem
>>>   malloc/tst-dynarray-fail-mem                    posix/tst-glob-tilde-mem
>>>   posix/tst-glob-tilde-mem                        posix/tst-fnmatch-mem
>>>   posix/tst-fnmatch-mem
>>>   posix/globtest
>>
>> The elf/tst-sprof-basic failure seems to be a known issue according to the
>> 2.38 release wiki, and most of the others also seem to be related to the
>> low TIMEOUTFACTOR.
>>
>>>
>>> Take master on lp4a as an example,
>>>
>>> - elf/tst-rtld-list-diagnostics, due to missing abnf module
>>> - elf/tst-sprof-basic, successfully prints hello world but the return
>>> status is 1; root cause still unknown
>>> - elf/tst-tls-allocation-failure-static-patched, exec format error;
>>> root cause still unknown
>>
>> This seems to be a real regression, and I think you should sort this out
>> before the patch is installed (I see no failure on qemu-user on master).
>> The exec format error seems to come from the kernel, due to the execve
>> failure, and might indicate a corrupted binary.
>>
>>> - the others are due to memory not being freed
>>>
>>> The difference between qemu-system and lp4a for master is these two
>>> cases:
>>>
>>> - nptl/tst-cancel21-static, it said sa_flags = SA_ONSTACK; I haven't
>>> investigated it yet.
>>
>> This might be an unrelated issue [3], either in compiler optimization or
>> due to the long-standing BZ#12683 issue.
>>
>>> - posix/globtest, because my qemu-system has a different user name.
>>>
>>> The XFAILs and XPASSes are the same on all platforms and all branches,
>>> so they are not listed here.
>>>
>>> I used commit 4be913652ca115160bae1daf560170ef8b112ccb of the master
>>> branch.
>>>
>>> So is this the expected test result? Or is there still any case whose
>>> FAIL or PASS status is not correct?
>>
>> I think the output looks pretty OK, the only issue being the
>> elf/tst-tls-allocation-failure-static failure.
>>
>>
>> [1] https://sourceware.org/glibc/wiki/Release/2.38#RISC-V_.28rv64imac.2Flp64.29
>> [2] https://sourceware.org/bugzilla/show_bug.cgi?id=19329
>> [3] https://sourceware.org/pipermail/libc-alpha/2019-September/106641.html
Wang, Yanzhang Oct. 26, 2023, 3:30 a.m. UTC | #19
Of course. I have just sent it and cc'd you as well.


Patch

diff --git a/sysdeps/riscv/configure b/sysdeps/riscv/configure
index 2372225a26..340163779f 100644
--- a/sysdeps/riscv/configure
+++ b/sysdeps/riscv/configure
@@ -29,3 +29,5 @@  fi
 $as_echo "$libc_cv_riscv_r_align" >&6; }
 config_vars="$config_vars
 riscv-r-align = $libc_cv_riscv_r_align"
+
+$as_echo "#define SUPPORT_STATIC_PIE 1" >>confdefs.h
diff --git a/sysdeps/riscv/configure.ac b/sysdeps/riscv/configure.ac
index dbcc216689..36da2b5396 100644
--- a/sysdeps/riscv/configure.ac
+++ b/sysdeps/riscv/configure.ac
@@ -16,3 +16,6 @@  EOF
   fi
   rm -rf conftest.*])
 LIBC_CONFIG_VAR([riscv-r-align], [$libc_cv_riscv_r_align])
+
+dnl Static PIE is supported.
+AC_DEFINE(SUPPORT_STATIC_PIE)
diff --git a/sysdeps/riscv/dl-machine.h b/sysdeps/riscv/dl-machine.h
index c0c9bd93ad..ad875c0828 100644
--- a/sysdeps/riscv/dl-machine.h
+++ b/sysdeps/riscv/dl-machine.h
@@ -323,7 +323,7 @@  elf_machine_runtime_setup (struct link_map *l, struct r_scope_elem *scope[],
       gotplt[1] = (ElfW(Addr)) l;
     }
 
-  if (l->l_type == lt_executable)
+  if (l->l_type == lt_executable && l->l_scope != NULL)
     {
       /* The __global_pointer$ may not be defined by the linker if the
 	 $gp register does not be used to access the global variable