Message ID | 1486598532.2866.66.camel@caviumnetworks.com |
---|---|
State | New |
Headers | show |
On 09/02/17 00:02, Steve Ellcey wrote: > +static inline void __attribute__ ((unused)) > +dl_platform_init (void) > +{ > + if (GLRO(dl_platform) != NULL && *GLRO(dl_platform) == '\0') > + /* Avoid an empty string which would disturb us. */ > + GLRO(dl_platform) = NULL; > + > +#ifdef SHARED > + /* init_cpu_features has been called early from __libc_start_main in > + static executable. */ > + init_cpu_features (&GLRO(dl_aarch64_cpu_features)); > +#endif > +} ... > +static inline void > +init_cpu_features (struct cpu_features *cpu_features) > +{ > + if (GLRO(dl_hwcap) & HWCAP_CPUID) > + { > + register uint64_t id = 0; > + asm volatile ("mrs %0, midr_el1" : "=r"(id)); > + cpu_features->midr_el1 = id; this is a trap into the kernel at every process startup since this is called very early (dynamic linking case above, static linking case below) i wonder if there could be a way for the user to request midr_el1==0 unconditionally (avoiding the overhead and making sure the most generic implementation is used) is there something like that on other targets? > + } > + else > + { > + cpu_features->midr_el1 = 0; > + } > +} ... > +#ifdef SHARED > +# include <csu/libc-start.c> > +# else > +/* The main work is done in the generic function. */ > +# define LIBC_START_DISABLE_INLINE > +# define LIBC_START_MAIN generic_start_main > +# include <csu/libc-start.c> > +# include <cpu-features.c> > + > +extern struct cpu_features _dl_aarch64_cpu_features; > + > +int > +__libc_start_main (int (*main) (int, char **, char ** MAIN_AUXVEC_DECL), > + int argc, char **argv, > + __typeof (main) init, > + void (*fini) (void), > + void (*rtld_fini) (void), void *stack_end) > +{ > + init_cpu_features (&_dl_aarch64_cpu_features); > + return generic_start_main (main, argc, argv, init, fini, rtld_fini, > + stack_end); > +} > +#endif >
On Thursday 09 February 2017 04:20 PM, Szabolcs Nagy wrote: > this is a trap into the kernel at every process startup > > since this is called very early (dynamic linking case > above, static linking case below) i wonder if there > could be a way for the user to request midr_el1==0 > unconditionally (avoiding the overhead and making > sure the most generic implementation is used) Well you could use tunables to avoid it, but then if a single trap is a problem for you then the tunables infra is going to be just as expensive. Why is a single trap at startup such a concern though? > is there something like that on other targets? H.J. Lu had a patch to override IFUNCs using (what will now be) a tunable but that patch did not make progress. I hope it will now since I too am interested in overriding IFUNC selection using tunables. But then this would be an orthogonal discussion to avoiding the trap since the goals of both are different. Siddhesh
On 09/02/17 11:04, Siddhesh Poyarekar wrote: > On Thursday 09 February 2017 04:20 PM, Szabolcs Nagy wrote: >> this is a trap into the kernel at every process startup >> >> since this is called very early (dynamic linking case >> above, static linking case below) i wonder if there >> could be a way for the user to request midr_el1==0 >> unconditionally (avoiding the overhead and making >> sure the most generic implementation is used) > > Well you could use tunables to avoid it, but then if a single trap is a > problem for you then the tunables infra is going to be just as expensive. > > Why is a single trap at startup such a concern though? ok, it is probably not worth worrying about (should be around 0.1% of the minimal startup time now) but doing it just to control ifunc selection is still useful (at least for development) (if eventually there will be widespread use of ifunc based function multi-versioning then the trap will not be per process startup but per module load which is a bit more relevant, but also not something we can control from the libc) >> is there something like that on other targets? > > H.J. Lu had a patch to override IFUNCs using (what will now be) a > tunable but that patch did not make progress. I hope it will now since > I too am interested in overriding IFUNC selection using tunables. But > then this would be an orthogonal discussion to avoiding the trap since > the goals of both are different. i see.
On Thursday 09 February 2017 05:10 PM, Szabolcs Nagy wrote: > ok, it is probably not worth worrying about > (should be around 0.1% of the minimal startup time now) > > but doing it just to control ifunc selection is still > useful (at least for development) A simple tunable that disables CPU selection would be nice: glibc.tune.cpu = default|thunderx|a57 However then you would have to think harder about positioning the cpu structure initialization in static binaries to have them be controlled by tunables and not just blindly put them right at the top of __libc_start as the easy way out. It is something that should be done anyway. However, couldn't we do that as an add-on to this patch? I'd really like this framework to go in early and be exercised a bit because I am interested in pushing (in the coming weeks hopefully) some optimal string routines for aarch64. > (if eventually there will be widespread use of ifunc > based function multi-versioning then the trap will > not be per process startup but per module load which > is a bit more relevant, but also not something we can > control from the libc) Right we cannot control that. Siddhesh
On Thu, Feb 9, 2017 at 2:50 AM, Szabolcs Nagy <szabolcs.nagy@arm.com> wrote: > On 09/02/17 00:02, Steve Ellcey wrote: >> +static inline void __attribute__ ((unused)) >> +dl_platform_init (void) >> +{ >> + if (GLRO(dl_platform) != NULL && *GLRO(dl_platform) == '\0') >> + /* Avoid an empty string which would disturb us. */ >> + GLRO(dl_platform) = NULL; >> + >> +#ifdef SHARED >> + /* init_cpu_features has been called early from __libc_start_main in >> + static executable. */ >> + init_cpu_features (&GLRO(dl_aarch64_cpu_features)); >> +#endif >> +} > ... >> +static inline void >> +init_cpu_features (struct cpu_features *cpu_features) >> +{ >> + if (GLRO(dl_hwcap) & HWCAP_CPUID) >> + { >> + register uint64_t id = 0; >> + asm volatile ("mrs %0, midr_el1" : "=r"(id)); >> + cpu_features->midr_el1 = id; > > this is a trap into the kernel at every process startup > > since this is called very early (dynamic linking case > above, static linking case below) i wonder if there > could be a way for the user to request midr_el1==0 > unconditionally (avoiding the overhead and making > sure the most generic implementation is used) Well the easy way to do this would be use LD_HWCAP_MASK and mask off the HWCAP_CPUID bit. > > is there something like that on other targets? Some targets like PowerPC use the hwcap to say which processor they are being run on so you use the same method as above. Thanks, Andrew > >> + } >> + else >> + { >> + cpu_features->midr_el1 = 0; >> + } >> +} > ... >> +#ifdef SHARED >> +# include <csu/libc-start.c> >> +# else >> +/* The main work is done in the generic function. */ >> +# define LIBC_START_DISABLE_INLINE >> +# define LIBC_START_MAIN generic_start_main >> +# include <csu/libc-start.c> >> +# include <cpu-features.c> >> + >> +extern struct cpu_features _dl_aarch64_cpu_features; >> + >> +int >> +__libc_start_main (int (*main) (int, char **, char ** MAIN_AUXVEC_DECL), >> + int argc, char **argv, >> + __typeof (main) init, >> + void (*fini) (void), >> + void (*rtld_fini) (void), void *stack_end) >> +{ >> + init_cpu_features (&_dl_aarch64_cpu_features); >> + return generic_start_main (main, argc, argv, init, fini, rtld_fini, >> + stack_end); >> +} >> +#endif >> >
On Thu, 2017-02-09 at 07:54 -0800, Andrew Pinski wrote: > On Thu, Feb 9, 2017 at 2:50 AM, Szabolcs Nagy <szabolcs.nagy@arm.com> wrote:> > > since this is called very early (dynamic linking case > > above, static linking case below) i wonder if there > > could be a way for the user to request midr_el1==0 > > unconditionally (avoiding the overhead and making > > sure the most generic implementation is used) > Well the easy way to do this would be use LD_HWCAP_MASK and mask off > the HWCAP_CPUID bit. > > > is there something like that on other targets? > Some targets like PowerPC use the hwcap to say which processor they > are being run on so you use the same method as above. > > Thanks, > Andrew Do you know if LD_HWCAP_MASK actually works on PowerPC to do this? I see where PowerPC is reading GLRO(dl_hwcap) but I don't see them reading GLRO(dl_hwcap_mask). I don't think that the generic parts of libc apply LD_HWCAP_MASK to HWCAP automatically but maybe I missed it somewhere. I changed init_cpu_features in my patch from: if (GLRO(dl_hwcap) & HWCAP_CPUID) to if (GLRO(dl_hwcap) & GLRO(dl_hwcap_mask) & HWCAP_CPUID) to see what would happen. Initially this returned false because we were using the default dl-procinfo.h and HWCAP_IMPORTANT (used to initialize dl_hwcap_mask) was 0. I made a copy of dl-procinfo.h and set HWCAP_IMPORTANT to HPWCAP_CPUID and I got true here and the thunderx ifuncs were run. But if I ran things after setting LD_HWCAP_MASK to 0 it didn't seem to have any affect and I still ran thunderx ifuncs. I am not sure but it seemed like this code in init_cpu_features may have been getting run before LD_HWCAP_MASK was getting read and before GLRO(dl_hwcap_mask) was reset from its initial value. Steve Ellcey
diff --git a/sysdeps/aarch64/dl-machine.h b/sysdeps/aarch64/dl-machine.h index 84b8aec..15d79a6 100644 --- a/sysdeps/aarch64/dl-machine.h +++ b/sysdeps/aarch64/dl-machine.h @@ -25,6 +25,7 @@ #include <tls.h> #include <dl-tlsdesc.h> #include <dl-irel.h> +#include <cpu-features.c> /* Return nonzero iff ELF header is compatible with the running host. */ static inline int __attribute__ ((unused)) @@ -225,6 +226,23 @@ _dl_start_user: \n\ #define ELF_MACHINE_NO_REL 1 #define ELF_MACHINE_NO_RELA 0 +#define DL_PLATFORM_INIT dl_platform_init () + +static inline void __attribute__ ((unused)) +dl_platform_init (void) +{ + if (GLRO(dl_platform) != NULL && *GLRO(dl_platform) == '\0') + /* Avoid an empty string which would disturb us. */ + GLRO(dl_platform) = NULL; + +#ifdef SHARED + /* init_cpu_features has been called early from __libc_start_main in + static executable. */ + init_cpu_features (&GLRO(dl_aarch64_cpu_features)); +#endif +} + + static inline ElfW(Addr) elf_machine_fixup_plt (struct link_map *map, lookup_t t, const ElfW(Rela) *reloc, diff --git a/sysdeps/aarch64/ldsodefs.h b/sysdeps/aarch64/ldsodefs.h index f277074..ba4ada3 100644 --- a/sysdeps/aarch64/ldsodefs.h +++ b/sysdeps/aarch64/ldsodefs.h @@ -20,6 +20,7 @@ #define _AARCH64_LDSODEFS_H 1 #include <elf.h> +#include <cpu-features.h> struct La_aarch64_regs; struct La_aarch64_retval; diff --git a/sysdeps/unix/sysv/linux/aarch64/cpu-features.c b/sysdeps/unix/sysv/linux/aarch64/cpu-features.c index e69de29..8e4b514 100644 --- a/sysdeps/unix/sysv/linux/aarch64/cpu-features.c +++ b/sysdeps/unix/sysv/linux/aarch64/cpu-features.c @@ -0,0 +1,38 @@ +/* Initialize CPU feature data. AArch64 version. + This file is part of the GNU C Library. + Copyright (C) 2017 Free Software Foundation, Inc. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +#include <cpu-features.h> + +#ifndef HWCAP_CPUID +# define HWCAP_CPUID (1 << 11) +#endif + +static inline void +init_cpu_features (struct cpu_features *cpu_features) +{ + if (GLRO(dl_hwcap) & HWCAP_CPUID) + { + register uint64_t id = 0; + asm volatile ("mrs %0, midr_el1" : "=r"(id)); + cpu_features->midr_el1 = id; + } + else + { + cpu_features->midr_el1 = 0; + } +} diff --git a/sysdeps/unix/sysv/linux/aarch64/cpu-features.h b/sysdeps/unix/sysv/linux/aarch64/cpu-features.h index e69de29..c92b650 100644 --- a/sysdeps/unix/sysv/linux/aarch64/cpu-features.h +++ b/sysdeps/unix/sysv/linux/aarch64/cpu-features.h @@ -0,0 +1,49 @@ +/* Initialize CPU feature data. AArch64 version. + This file is part of the GNU C Library. + Copyright (C) 2017 Free Software Foundation, Inc. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +#ifndef _CPU_FEATURES_AARCH64_H +#define _CPU_FEATURES_AARCH64_H + +#include <stdint.h> + +#define MIDR_PARTNUM_SHIFT 4 +#define MIDR_PARTNUM_MASK (0xfff << MIDR_PARTNUM_SHIFT) +#define MIDR_PARTNUM(midr) \ + (((midr) & MIDR_PARTNUM_MASK) >> MIDR_PARTNUM_SHIFT) +#define MIDR_ARCHITECTURE_SHIFT 16 +#define MIDR_ARCHITECTURE_MASK (0xf << MIDR_ARCHITECTURE_SHIFT) +#define MIDR_ARCHITECTURE(midr) \ + (((midr) & MIDR_ARCHITECTURE_MASK) >> MIDR_ARCHITECTURE_SHIFT) +#define MIDR_VARIANT_SHIFT 20 +#define MIDR_VARIANT_MASK (0xf << MIDR_VARIANT_SHIFT) +#define MIDR_VARIANT(midr) \ + (((midr) & MIDR_VARIANT_MASK) >> MIDR_VARIANT_SHIFT) +#define MIDR_IMPLEMENTOR_SHIFT 24 +#define MIDR_IMPLEMENTOR_MASK (0xff << MIDR_IMPLEMENTOR_SHIFT) +#define MIDR_IMPLEMENTOR(midr) \ + (((midr) & MIDR_IMPLEMENTOR_MASK) >> MIDR_IMPLEMENTOR_SHIFT) + +#define IS_THUNDERX(midr) (MIDR_IMPLEMENTOR(midr) == 'C' \ + && MIDR_PARTNUM(midr) == 0x0a1) + +struct cpu_features +{ + uint64_t midr_el1; +}; + +#endif /* _CPU_FEATURES_AARCH64_H */ diff --git a/sysdeps/unix/sysv/linux/aarch64/dl-procinfo.c b/sysdeps/unix/sysv/linux/aarch64/dl-procinfo.c index e69de29..438046a 100644 --- a/sysdeps/unix/sysv/linux/aarch64/dl-procinfo.c +++ b/sysdeps/unix/sysv/linux/aarch64/dl-procinfo.c @@ -0,0 +1,60 @@ +/* Data for AArch64 version of processor capability information. + Linux version. + Copyright (C) 2017 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +/* If anything should be added here check whether the size of each string + is still ok with the given array size. + + All the #ifdefs in the definitions are quite irritating but + necessary if we want to avoid duplicating the information. There + are three different modes: + + - PROCINFO_DECL is defined. This means we are only interested in + declarations. + + - PROCINFO_DECL is not defined: + + + if SHARED is defined the file is included in an array + initializer. The .element = { ... } syntax is needed. + + + if SHARED is not defined a normal array initialization is + needed. + */ + +#ifndef PROCINFO_CLASS +# define PROCINFO_CLASS +#endif + +#if !IS_IN (ldconfig) +# if !defined PROCINFO_DECL && defined SHARED + ._dl_aarch64_cpu_features +# else +PROCINFO_CLASS struct cpu_features _dl_aarch64_cpu_features +# endif +# ifndef PROCINFO_DECL += { } +# endif +# if !defined SHARED || defined PROCINFO_DECL +; +# else +, +# endif +#endif + +#undef PROCINFO_DECL +#undef PROCINFO_CLASS diff --git a/sysdeps/unix/sysv/linux/aarch64/libc-start.c b/sysdeps/unix/sysv/linux/aarch64/libc-start.c index e69de29..a5babd4 100644 --- a/sysdeps/unix/sysv/linux/aarch64/libc-start.c +++ b/sysdeps/unix/sysv/linux/aarch64/libc-start.c @@ -0,0 +1,41 @@ +/* Override csu/libc-start.c on AArch64. + Copyright (C) 2017 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <http://www.gnu.org/licenses/>. */ + +#ifdef SHARED +# include <csu/libc-start.c> +# else +/* The main work is done in the generic function. */ +# define LIBC_START_DISABLE_INLINE +# define LIBC_START_MAIN generic_start_main +# include <csu/libc-start.c> +# include <cpu-features.c> + +extern struct cpu_features _dl_aarch64_cpu_features; + +int +__libc_start_main (int (*main) (int, char **, char ** MAIN_AUXVEC_DECL), + int argc, char **argv, + __typeof (main) init, + void (*fini) (void), + void (*rtld_fini) (void), void *stack_end) +{ + init_cpu_features (&_dl_aarch64_cpu_features); + return generic_start_main (main, argc, argv, init, fini, rtld_fini, + stack_end); +} +#endif
On Wed, 2017-02-08 at 11:15 +0530, Siddhesh Poyarekar wrote: > Looks OK with a couple of nits below. Here is a de-nitted version with the ChangeLog using 'Likewise' instead of 'Ditto' and with a one line description at the top of libc-start.c. Steve Ellcey sellcey@caviium.com 2017-02-08 Steve Ellcey <sellcey@caviumnetworks.com> Adhemerval Zanella <adhemerval.zanella@linaro.org> * sysdeps/aarch64/dl-machine.h: Include cpu-features.c. (DL_PLATFORM_INIT): New define. (dl_platform_init): New function. * sysdeps/aarch64/ldsodefs.h: Include cpu-features.h. * sysdeps/unix/sysv/linux/aarch64/cpu-features.c: New file. * sysdeps/unix/sysv/linux/aarch64/cpu-features.h: Likewise. * sysdeps/unix/sysv/linux/aarch64/dl-procinfo.c: Likewise. * sysdeps/unix/sysv/linux/aarch64/libc-start.c: Likewise.