Message ID | 20180716141633.6948-1-siddhesh@sourceware.org |
---|---|
State | New |
Series | Rename the glibc.tune namespace to glibc.cpu |
On 07/16/2018 10:16 AM, Siddhesh Poyarekar wrote:
> The glibc.tune namespace is vaguely named since it is a 'tunable', so
> give it a more specific name that describes what it refers to. Rename
> the tunable namespace to 'cpu' to more accurately reflect what it
> encompasses. Also rename glibc.tune.cpu to glibc.cpu.name since
> glibc.cpu.cpu is weird.
>
> * NEWS: Mention the change.
> * elf/dl-tunables.list: Rename tune namespace to cpu.
> * sysdeps/powerpc/dl-tunables.list: Likewise.
> * sysdeps/x86/dl-tunables.list: Likewise.
> * sysdeps/aarch64/dl-tunables.list: Rename tune.cpu to
> cpu.name.
> * elf/dl-hwcaps.c (_dl_important_hwcaps): Adjust.
> * elf/dl-hwcaps.h (GET_HWCAP_MASK): Likewise.
> * manual/README.tunables: Likewise.
> * manual/tunables.texi: Likewise.
> * sysdeps/powerpc/cpu-features.c: Likewise.
> * sysdeps/unix/sysv/linux/aarch64/cpu-features.c
> (init_cpu_features): Likewise.
> * sysdeps/x86/cpu-features.c: Likewise.
> * sysdeps/x86/cpu-features.h: Likewise.
> * sysdeps/x86/cpu-tunables.c: Likewise.
> * sysdeps/x86_64/Makefile: Likewise.

This looks good to me. I'd like this to wait until 2.29 opens. I want to
minimize spurious changes. It would also be nice in 2.29 to rename the
options to arch_* prefixed options for those that are arch-specific.

Reviewed-by: Carlos O'Donell <carlos@redhat.com> > --- > NEWS | 3 ++ > elf/dl-hwcaps.c | 2 +- > elf/dl-hwcaps.h | 2 +- > elf/dl-tunables.list | 2 +- > manual/README.tunables | 6 ++-- > manual/tunables.texi | 30 +++++++++---------- > sysdeps/aarch64/dl-tunables.list | 4 +-- > sysdeps/powerpc/cpu-features.c | 2 +- > sysdeps/powerpc/dl-tunables.list | 2 +- > .../unix/sysv/linux/aarch64/cpu-features.c | 2 +- > sysdeps/x86/cpu-features.c | 4 +-- > sysdeps/x86/cpu-features.h | 2 +- > sysdeps/x86/cpu-tunables.c | 4 +-- > sysdeps/x86/dl-tunables.list | 2 +- > sysdeps/x86_64/Makefile | 4 +-- > 15 files changed, 37 insertions(+), 34 deletions(-) > > diff --git a/NEWS b/NEWS > index 5de2c2816f..b5308fd596 100644 > --- a/NEWS > +++ b/NEWS > @@ -173,6 +173,9 @@ Deprecated and removed features, and other changes affecting compatibility: > project's versions of these files. The plan is to make this the default > behavior in a future release. > > +* The glibc.tune tunable namespace has been renamed to glibc.cpu and the > + tunable glibc.tune.cpu has been renamed to glibc.cpu.name. > + > Changes to build and runtime requirements: > > GNU make 4.0 or later is now required to build glibc. > diff --git a/elf/dl-hwcaps.c b/elf/dl-hwcaps.c > index 23482a88a1..ecf00b4577 100644 > --- a/elf/dl-hwcaps.c > +++ b/elf/dl-hwcaps.c > @@ -140,7 +140,7 @@ _dl_important_hwcaps (const char *platform, size_t platform_len, size_t *sz, > string and bit like you can ignore an OS-supplied HWCAP bit. 
*/ > hwcap_mask |= (uint64_t) mask << _DL_FIRST_EXTRA; > #if HAVE_TUNABLES > - TUNABLE_SET (glibc, tune, hwcap_mask, uint64_t, hwcap_mask); > + TUNABLE_SET (glibc, cpu, hwcap_mask, uint64_t, hwcap_mask); > #else > GLRO(dl_hwcap_mask) = hwcap_mask; > #endif > diff --git a/elf/dl-hwcaps.h b/elf/dl-hwcaps.h > index 17f0da4c73..d69ee11dc2 100644 > --- a/elf/dl-hwcaps.h > +++ b/elf/dl-hwcaps.h > @@ -19,7 +19,7 @@ > #include <elf/dl-tunables.h> > > #if HAVE_TUNABLES > -# define GET_HWCAP_MASK() TUNABLE_GET (glibc, tune, hwcap_mask, uint64_t, NULL) > +# define GET_HWCAP_MASK() TUNABLE_GET (glibc, cpu, hwcap_mask, uint64_t, NULL) > #else > # ifdef SHARED > # define GET_HWCAP_MASK() GLRO(dl_hwcap_mask) > diff --git a/elf/dl-tunables.list b/elf/dl-tunables.list > index 1f8ecb8437..b108592b62 100644 > --- a/elf/dl-tunables.list > +++ b/elf/dl-tunables.list > @@ -86,7 +86,7 @@ glibc { > type: SIZE_T > } > } > - tune { > + cpu { > hwcap_mask { > type: UINT_64 > env_alias: LD_HWCAP_MASK > diff --git a/manual/README.tunables b/manual/README.tunables > index 3967679f43..f87a31a65e 100644 > --- a/manual/README.tunables > +++ b/manual/README.tunables > @@ -105,11 +105,11 @@ where 'check' is the tunable name, 'int32_t' is the C type of the tunable and > To get and set tunables in a different namespace from that module, use the full > form of the macros as follows: > > - val = TUNABLE_GET_FULL (glibc, tune, hwcap_mask, uint64_t, NULL) > + val = TUNABLE_GET_FULL (glibc, cpu, hwcap_mask, uint64_t, NULL) > > - TUNABLE_SET_FULL (glibc, tune, hwcap_mask, uint64_t, val) > + TUNABLE_SET_FULL (glibc, cpu, hwcap_mask, uint64_t, val) > > -where 'glibc' is the top namespace, 'tune' is the tunable namespace and the > +where 'glibc' is the top namespace, 'cpu' is the tunable namespace and the > remaining arguments are the same as the short form macros. 
> > When TUNABLE_NAMESPACE is not defined in a module, TUNABLE_GET is equivalent to > diff --git a/manual/tunables.texi b/manual/tunables.texi > index be33c9fc79..9b8f9e4610 100644 > --- a/manual/tunables.texi > +++ b/manual/tunables.texi > @@ -295,23 +295,23 @@ The default value of this tunable is @samp{3}. > @cindex non_temporal_threshold tunables > @cindex tunables, non_temporal_threshold > > -@deftp {Tunable namespace} glibc.tune > +@deftp {Tunable namespace} glibc.cpu > Behavior of @theglibc{} can be tuned to assume specific hardware capabilities > by setting the following tunables in the @code{tune} namespace: > @end deftp > > -@deftp Tunable glibc.tune.hwcap_mask > +@deftp Tunable glibc.cpu.hwcap_mask > This tunable supersedes the @env{LD_HWCAP_MASK} environment variable and is > identical in features. > > The @code{AT_HWCAP} key in the Auxiliary Vector specifies instruction set > extensions available in the processor at runtime for some architectures. The > -@code{glibc.tune.hwcap_mask} tunable allows the user to mask out those > +@code{glibc.cpu.hwcap_mask} tunable allows the user to mask out those > capabilities at runtime, thus disabling use of those extensions. > @end deftp > > -@deftp Tunable glibc.tune.hwcaps > -The @code{glibc.tune.hwcaps=-xxx,yyy,-zzz...} tunable allows the user to > +@deftp Tunable glibc.cpu.hwcaps > +The @code{glibc.cpu.hwcaps=-xxx,yyy,-zzz...} tunable allows the user to > enable CPU/ARCH feature @code{yyy}, disable CPU/ARCH feature @code{xxx} > and @code{zzz} where the feature name is case-sensitive and has to match > the ones in @code{sysdeps/x86/cpu-features.h}. > @@ -319,8 +319,8 @@ the ones in @code{sysdeps/x86/cpu-features.h}. > This tunable is specific to i386 and x86-64. 
> @end deftp > > -@deftp Tunable glibc.tune.cached_memopt > -The @code{glibc.tune.cached_memopt=[0|1]} tunable allows the user to > +@deftp Tunable glibc.cpu.cached_memopt > +The @code{glibc.cpu.cached_memopt=[0|1]} tunable allows the user to > enable optimizations recommended for cacheable memory. If set to > @code{1}, @theglibc{} assumes that the process memory image consists > of cacheable (non-device) memory only. The default, @code{0}, > @@ -329,8 +329,8 @@ indicates that the process may use device memory. > This tunable is specific to powerpc, powerpc64 and powerpc64le. > @end deftp > > -@deftp Tunable glibc.tune.cpu > -The @code{glibc.tune.cpu=xxx} tunable allows the user to tell @theglibc{} to > +@deftp Tunable glibc.cpu.cpu > +The @code{glibc.cpu.cpu=xxx} tunable allows the user to tell @theglibc{} to > assume that the CPU is @code{xxx} where xxx may have one of these values: > @code{generic}, @code{falkor}, @code{thunderxt88}, @code{thunderx2t99}, > @code{thunderx2t99p1}. > @@ -338,20 +338,20 @@ assume that the CPU is @code{xxx} where xxx may have one of these values: > This tunable is specific to aarch64. > @end deftp > > -@deftp Tunable glibc.tune.x86_data_cache_size > -The @code{glibc.tune.x86_data_cache_size} tunable allows the user to set > +@deftp Tunable glibc.cpu.x86_data_cache_size > +The @code{glibc.cpu.x86_data_cache_size} tunable allows the user to set > data cache size in bytes for use in memory and string routines. > > This tunable is specific to i386 and x86-64. > @end deftp > > -@deftp Tunable glibc.tune.x86_shared_cache_size > -The @code{glibc.tune.x86_shared_cache_size} tunable allows the user to > +@deftp Tunable glibc.cpu.x86_shared_cache_size > +The @code{glibc.cpu.x86_shared_cache_size} tunable allows the user to > set shared cache size in bytes for use in memory and string routines. 
> @end deftp > > -@deftp Tunable glibc.tune.x86_non_temporal_threshold > -The @code{glibc.tune.x86_non_temporal_threshold} tunable allows the user > +@deftp Tunable glibc.cpu.x86_non_temporal_threshold > +The @code{glibc.cpu.x86_non_temporal_threshold} tunable allows the user > to set threshold in bytes for non temporal store. > > This tunable is specific to i386 and x86-64. > diff --git a/sysdeps/aarch64/dl-tunables.list b/sysdeps/aarch64/dl-tunables.list > index f6a88168cc..cfcf940ebd 100644 > --- a/sysdeps/aarch64/dl-tunables.list > +++ b/sysdeps/aarch64/dl-tunables.list > @@ -17,8 +17,8 @@ > # <http://www.gnu.org/licenses/>. > > glibc { > - tune { > - cpu { > + cpu { > + name { > type: STRING > } > } > diff --git a/sysdeps/powerpc/cpu-features.c b/sysdeps/powerpc/cpu-features.c > index 955d4778a6..ad809b9815 100644 > --- a/sysdeps/powerpc/cpu-features.c > +++ b/sysdeps/powerpc/cpu-features.c > @@ -30,7 +30,7 @@ init_cpu_features (struct cpu_features *cpu_features) > tunables is enable, since for this case user can explicit disable > unaligned optimizations. */ > #if HAVE_TUNABLES > - int32_t cached_memfunc = TUNABLE_GET (glibc, tune, cached_memopt, int32_t, > + int32_t cached_memfunc = TUNABLE_GET (glibc, cpu, cached_memopt, int32_t, > NULL); > cpu_features->use_cached_memopt = (cached_memfunc > 0); > #else > diff --git a/sysdeps/powerpc/dl-tunables.list b/sysdeps/powerpc/dl-tunables.list > index d26636a16b..b3372555f7 100644 > --- a/sysdeps/powerpc/dl-tunables.list > +++ b/sysdeps/powerpc/dl-tunables.list > @@ -17,7 +17,7 @@ > # <http://www.gnu.org/licenses/>. 
> > glibc { > - tune { > + cpu { > cached_memopt { > type: INT_32 > minval: 0 > diff --git a/sysdeps/unix/sysv/linux/aarch64/cpu-features.c b/sysdeps/unix/sysv/linux/aarch64/cpu-features.c > index 39eba0186f..b4f348509e 100644 > --- a/sysdeps/unix/sysv/linux/aarch64/cpu-features.c > +++ b/sysdeps/unix/sysv/linux/aarch64/cpu-features.c > @@ -57,7 +57,7 @@ init_cpu_features (struct cpu_features *cpu_features) > > #if HAVE_TUNABLES > /* Get the tunable override. */ > - const char *mcpu = TUNABLE_GET (glibc, tune, cpu, const char *, NULL); > + const char *mcpu = TUNABLE_GET (glibc, cpu, name, const char *, NULL); > if (mcpu != NULL) > midr = get_midr_from_mcpu (mcpu); > #endif > diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c > index d41ebde823..b8bef8d54b 100644 > --- a/sysdeps/x86/cpu-features.c > +++ b/sysdeps/x86/cpu-features.c > @@ -22,7 +22,7 @@ > #include <libc-pointer-arith.h> > > #if HAVE_TUNABLES > -# define TUNABLE_NAMESPACE tune > +# define TUNABLE_NAMESPACE cpu > # include <unistd.h> /* Get STDOUT_FILENO for _dl_printf. */ > # include <elf/dl-tunables.h> > > @@ -398,7 +398,7 @@ no_cpuid: > > /* Reuse dl_platform, dl_hwcap and dl_hwcap_mask for x86. */ > #if !HAVE_TUNABLES && defined SHARED > - /* The glibc.tune.hwcap_mask tunable is initialized already, so no need to do > + /* The glibc.cpu.hwcap_mask tunable is initialized already, so no need to do > this. 
*/ > GLRO(dl_hwcap_mask) = HWCAP_IMPORTANT; > #endif > diff --git a/sysdeps/x86/cpu-features.h b/sysdeps/x86/cpu-features.h > index 624e681e96..e9713f6215 100644 > --- a/sysdeps/x86/cpu-features.h > +++ b/sysdeps/x86/cpu-features.h > @@ -141,7 +141,7 @@ struct cpu_features > unsigned long int xsave_state_size; > /* The full state size for XSAVE when XSAVEC is disabled by > > - GLIBC_TUNABLES=glibc.tune.hwcaps=-XSAVEC_Usable > + GLIBC_TUNABLES=glibc.cpu.hwcaps=-XSAVEC_Usable > */ > unsigned int xsave_state_full_size; > unsigned int feature[FEATURE_INDEX_MAX]; > diff --git a/sysdeps/x86/cpu-tunables.c b/sysdeps/x86/cpu-tunables.c > index af761dcbbc..c38af71b8a 100644 > --- a/sysdeps/x86/cpu-tunables.c > +++ b/sysdeps/x86/cpu-tunables.c > @@ -17,7 +17,7 @@ > <http://www.gnu.org/licenses/>. */ > > #if HAVE_TUNABLES > -# define TUNABLE_NAMESPACE tune > +# define TUNABLE_NAMESPACE cpu > # include <stdbool.h> > # include <stdint.h> > # include <unistd.h> /* Get STDOUT_FILENO for _dl_printf. */ > @@ -116,7 +116,7 @@ TUNABLE_CALLBACK (set_hwcaps) (tunable_val_t *valp) > the hardware which wasn't available when the selection was made. > The environment variable: > > - GLIBC_TUNABLES=glibc.tune.hwcaps=-xxx,yyy,-zzz,.... > + GLIBC_TUNABLES=glibc.cpu.hwcaps=-xxx,yyy,-zzz,.... > > can be used to enable CPU/ARCH feature yyy, disable CPU/ARCH feature > yyy and zzz, where the feature name is case-sensitive and has to > diff --git a/sysdeps/x86/dl-tunables.list b/sysdeps/x86/dl-tunables.list > index 7c3236a68f..9a5a0b1a63 100644 > --- a/sysdeps/x86/dl-tunables.list > +++ b/sysdeps/x86/dl-tunables.list > @@ -17,7 +17,7 @@ > # <http://www.gnu.org/licenses/>. 
> > glibc { > - tune { > + cpu { > hwcaps { > type: STRING > } > diff --git a/sysdeps/x86_64/Makefile b/sysdeps/x86_64/Makefile > index 9f1562f1b2..d51cf03ac9 100644 > --- a/sysdeps/x86_64/Makefile > +++ b/sysdeps/x86_64/Makefile > @@ -57,7 +57,7 @@ modules-names += x86_64/tst-x86_64mod-1 > LDFLAGS-tst-x86_64mod-1.so = -Wl,-soname,tst-x86_64mod-1.so > ifneq (no,$(have-tunables)) > # Test the state size for XSAVE when XSAVEC is disabled. > -tst-x86_64-1-ENV = GLIBC_TUNABLES=glibc.tune.hwcaps=-XSAVEC_Usable > +tst-x86_64-1-ENV = GLIBC_TUNABLES=glibc.cpu.hwcaps=-XSAVEC_Usable > endif > > $(objpfx)tst-x86_64-1: $(objpfx)x86_64/tst-x86_64mod-1.so > @@ -74,7 +74,7 @@ $(objpfx)tst-platform-1.out: $(objpfx)x86_64/tst-platformmod-2.so > # Turn off AVX512F_Usable and AVX2_Usable so that GLRO(dl_platform) is > # always set to x86_64. > tst-platform-1-ENV = LD_PRELOAD=$(objpfx)\$$PLATFORM/tst-platformmod-2.so \ > - GLIBC_TUNABLES=glibc.tune.hwcaps=-AVX512F_Usable,-AVX2_Usable > + GLIBC_TUNABLES=glibc.cpu.hwcaps=-AVX512F_Usable,-AVX2_Usable > endif > > tests += tst-audit3 tst-audit4 tst-audit5 tst-audit6 tst-audit7 \ >
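
For users, the rename in this patch amounts to a mechanical mapping of tunable names: everything under glibc.tune moves to glibc.cpu, and glibc.tune.cpu additionally becomes glibc.cpu.name. The following is only a sketch of that mapping; `migrate_tunable_name` is a hypothetical helper written for illustration and is not part of glibc or of this patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of glibc: map one pre-rename tunable
   name to its post-rename spelling.  Returns the number of characters
   written (excluding the terminator), or -1 if OUT is too small.  */
static int
migrate_tunable_name (const char *old, char *out, size_t outsz)
{
  static const char old_ns[] = "glibc.tune.";
  static const char new_ns[] = "glibc.cpu.";
  int n;

  assert (old != NULL && out != NULL);

  /* glibc.tune.cpu is the one tunable whose own name changed too.  */
  if (strcmp (old, "glibc.tune.cpu") == 0)
    n = snprintf (out, outsz, "glibc.cpu.name");
  /* Everything else in the old namespace only changes its prefix.  */
  else if (strncmp (old, old_ns, sizeof old_ns - 1) == 0)
    n = snprintf (out, outsz, "%s%s", new_ns, old + sizeof old_ns - 1);
  /* Names outside glibc.tune are copied through unchanged.  */
  else
    n = snprintf (out, outsz, "%s", old);

  return (n < 0 || (size_t) n >= outsz) ? -1 : n;
}
```

The helper handles bare names only; splitting a full GLIBC_TUNABLES string into name=value pairs is left out to keep the sketch short.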
On 07/16/2018 08:47 PM, Carlos O'Donell wrote:
> This looks good to me. I'd like this to wait until 2.29 opens. I want to

OK, I'll queue it up for 2.29.

> minimize spurious changes. It would also be nice in 2.29 to rename the
> options to arch_* prefixed options for those that are arch-specific.

There's only the powerpc one that's named in a generic manner but is
architecture specific. I'd actually like to discuss dropping the
tunable (and use hwcap_mask instead) but we can have that discussion for
2.29.

Siddhesh

On 07/16/2018 07:16 AM, Siddhesh Poyarekar wrote:
...
> diff --git a/manual/tunables.texi b/manual/tunables.texi
> index be33c9fc79..9b8f9e4610 100644
> --- a/manual/tunables.texi
> +++ b/manual/tunables.texi
> @@ -295,23 +295,23 @@ The default value of this tunable is @samp{3}.
> @cindex non_temporal_threshold tunables
> @cindex tunables, non_temporal_threshold
>
> -@deftp {Tunable namespace} glibc.tune
> +@deftp {Tunable namespace} glibc.cpu
> Behavior of @theglibc{} can be tuned to assume specific hardware capabilities
> by setting the following tunables in the @code{tune} namespace:

Should be @code{cpu} now.

Rical

On 07/16/2018 09:16 PM, Rical Jasan wrote:
> On 07/16/2018 07:16 AM, Siddhesh Poyarekar wrote:
> ...
>> diff --git a/manual/tunables.texi b/manual/tunables.texi
>> index be33c9fc79..9b8f9e4610 100644
>> --- a/manual/tunables.texi
>> +++ b/manual/tunables.texi
>> @@ -295,23 +295,23 @@ The default value of this tunable is @samp{3}.
>> @cindex non_temporal_threshold tunables
>> @cindex tunables, non_temporal_threshold
>>
>> -@deftp {Tunable namespace} glibc.tune
>> +@deftp {Tunable namespace} glibc.cpu
>> Behavior of @theglibc{} can be tuned to assume specific hardware capabilities
>> by setting the following tunables in the @code{tune} namespace:
>
> Should be @code{cpu} now.

Oh yes, thanks. I'll fix it up when I commit.

Siddhesh

Siddhesh Poyarekar <siddhesh@sourceware.org> writes:

> On 07/16/2018 08:47 PM, Carlos O'Donell wrote:
>> This looks good to me. I'd like this to wait until 2.29 opens. I want to
>
> OK, I'll queue it up for 2.29.
>
>> minimize spurious changes. It would also be nice in 2.29 to rename the
>> options to arch_* prefixed options for those that are arch-specific.
>
> There's only the powerpc one that's named in a generic manner but is
> architecture specific.

I'm not following your line of thought here:

- glibc.cpu.hwcaps is specific to i386 and x86-64
- glibc.cpu is specific to aarch64
- glibc.cpu.cached_memopt is specific to powerpc, powerpc64 and powerpc64le

What am I missing?

> I'd actually like to discuss dropping the
> tunable (and use hwcap_mask instead) but we can have that discussion for
> 2.29.

Notice the optimization is not specific to a CPU, but specific to a user
scenario (cacheable memory). In other words, the optimization can't be used
whenever PPC_FEATURE2_ARCH_2_07 because it could downgrade the performance when
cache-inhibited memory is being used.

Siddhesh Poyarekar <siddhesh@sourceware.org> writes:

> diff --git a/manual/tunables.texi b/manual/tunables.texi
> index be33c9fc79..9b8f9e4610 100644
> --- a/manual/tunables.texi
> +++ b/manual/tunables.texi
> @@ -295,23 +295,23 @@ The default value of this tunable is @samp{3}.
> @cindex non_temporal_threshold tunables
> @cindex tunables, non_temporal_threshold
>
> -@deftp {Tunable namespace} glibc.tune
> +@deftp {Tunable namespace} glibc.cpu
> Behavior of @theglibc{} can be tuned to assume specific hardware capabilities
> by setting the following tunables in the @code{tune} namespace:
> @end deftp
>
> -@deftp Tunable glibc.tune.hwcap_mask
> +@deftp Tunable glibc.cpu.hwcap_mask
> This tunable supersedes the @env{LD_HWCAP_MASK} environment variable and is
> identical in features.
>
> The @code{AT_HWCAP} key in the Auxiliary Vector specifies instruction set
> extensions available in the processor at runtime for some architectures. The
> -@code{glibc.tune.hwcap_mask} tunable allows the user to mask out those
> +@code{glibc.cpu.hwcap_mask} tunable allows the user to mask out those
> capabilities at runtime, thus disabling use of those extensions.
> @end deftp
>
> -@deftp Tunable glibc.tune.hwcaps
> -The @code{glibc.tune.hwcaps=-xxx,yyy,-zzz...} tunable allows the user to
> +@deftp Tunable glibc.cpu.hwcaps
> +The @code{glibc.cpu.hwcaps=-xxx,yyy,-zzz...} tunable allows the user to
> enable CPU/ARCH feature @code{yyy}, disable CPU/ARCH feature @code{xxx}
> and @code{zzz} where the feature name is case-sensitive and has to match
> the ones in @code{sysdeps/x86/cpu-features.h}.
> @@ -319,8 +319,8 @@ the ones in @code{sysdeps/x86/cpu-features.h}.
> This tunable is specific to i386 and x86-64.
> @end deftp
>
> -@deftp Tunable glibc.tune.cached_memopt
> -The @code{glibc.tune.cached_memopt=[0|1]} tunable allows the user to
> +@deftp Tunable glibc.cpu.cached_memopt
> +The @code{glibc.cpu.cached_memopt=[0|1]} tunable allows the user to
> enable optimizations recommended for cacheable memory. If set to
> @code{1}, @theglibc{} assumes that the process memory image consists
> of cacheable (non-device) memory only. The default, @code{0},
> @@ -329,8 +329,8 @@ indicates that the process may use device memory.
> This tunable is specific to powerpc, powerpc64 and powerpc64le.
> @end deftp
>
> -@deftp Tunable glibc.tune.cpu
> -The @code{glibc.tune.cpu=xxx} tunable allows the user to tell @theglibc{} to
> +@deftp Tunable glibc.cpu.cpu
> +The @code{glibc.cpu.cpu=xxx} tunable allows the user to tell @theglibc{} to

s/cpu.cpu/cpu.name/ in both lines.

Otherwise, looks good to me.

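
The glibc.cpu.hwcaps value format quoted above (`-xxx,yyy,-zzz`, where a leading `-` disables a feature and a bare name enables it) can be illustrated with a small tokenizer. This is a simplified sketch, not the parser in sysdeps/x86/cpu-tunables.c; `parse_hwcaps`, `hwcaps_cb`, `hwcaps_tally` and `tally_cb` are hypothetical names, and the sketch does not check feature names against cpu-features.h.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: called once per feature token.  ENABLE is
   0 when the token was written as "-name", 1 otherwise.  NAME is not
   NUL-terminated; LEN gives its length.  */
typedef void (*hwcaps_cb) (const char *name, size_t len, int enable,
                           void *arg);

/* Simplified sketch, not the glibc parser: split VALUE on commas and
   report each non-empty token.  Returns the number of tokens reported.  */
static size_t
parse_hwcaps (const char *value, hwcaps_cb cb, void *arg)
{
  size_t count = 0;

  assert (value != NULL && cb != NULL);

  while (*value != '\0')
    {
      size_t len = strcspn (value, ",");
      const char *name = value;
      size_t name_len = len;
      int enable = 1;

      /* A leading '-' means "disable this feature".  */
      if (name_len > 0 && name[0] == '-')
        {
          enable = 0;
          name++;
          name_len--;
        }
      if (name_len > 0)
        {
          cb (name, name_len, enable, arg);
          count++;
        }

      /* Step over the token and the comma that follows it, if any.  */
      value += len;
      if (*value == ',')
        value++;
    }
  return count;
}

/* Example consumer: count how many features are enabled and disabled.  */
struct hwcaps_tally
{
  size_t enabled;
  size_t disabled;
};

static void
tally_cb (const char *name, size_t len, int enable, void *arg)
{
  struct hwcaps_tally *tally = arg;
  (void) name;
  (void) len;
  if (enable)
    tally->enabled++;
  else
    tally->disabled++;
}
```

Calling `parse_hwcaps` with the value used by the tst-platform-1 test in this patch, `-AVX512F_Usable,-AVX2_Usable`, reports two disabled features and no enabled ones.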
On 07/17/2018 01:32 AM, Tulio Magno Quites Machado Filho wrote:
> I'm not following your line of thought here:
>
> - glibc.cpu.hwcaps is specific to i386 and x86-64
> - glibc.cpu is specific to aarch64
> - glibc.cpu.cached_memopt is specific to powerpc, powerpc64 and powerpc64le
>
> What am I missing?

The difference is that glibc.cpu.name and glibc.cpu.hwcaps are
conceptually generic tunables, i.e. there is a reasonable chance that a
couple of releases down the line another architecture may want to
provide a tuning facility for CPUs by name or by HWCAPS. The
cached_memopt one is not very clear to me and seems more like something
that is only useful on power8.

x86-specific tunables, i.e. where the concept is not currently
applicable for other architectures (x86_l2_temporal_threshold), are
prefixed with x86_*.

> Notice the optimization is not specific to a CPU, but specific to a user
> scenario (cacheable memory). In other words, the optimization can't be used
> whenever PPC_FEATURE2_ARCH_2_07 because it could downgrade the performance when
> cache-inhibited memory is being used.

Ahh OK, I got thrown off by the fact that there's a separate routine for
it and assumed that it is Power8-specific.

I have a different concern then; a tunable is process-wide so the
cached_memopt tunable essentially assumes that the entire process is
using cache-inhibited memory. Is that a reasonable assumption? In my
experience a typical process would have only a set of structures in
cache-inhibited memory and most of it would be regular memory. In that
sense it looks more like a tradeoff hack and it would be nice to
consider alternatives. Here are a couple I can think of off the top of
my head:

1. A new relocation that overlays on top of ifuncs and allows selection
   of routines based on specific properties. I have had this idea for a
   while but no time to implement it and it has much more general scope
   than memory type; for example memory alignment could also be a
   factor to short-cut parts of string routines at compile time itself.
   It does not have the runtime flexibility of a tunable but is
   probably far more configurable.

2. If there is a correlation to size then implement something similar
   to the x86 temporal_threshold tunable. This is probably just as
   good or bad as setting a cached_memopt flag but has the effect of
   generalizing what was a tunable.

What do you think?

Siddhesh

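
To make the contrast in this message concrete, here is a minimal sketch of the two selection styles: a process-wide flag in the spirit of cached_memopt, and a per-call size threshold in the spirit of the x86 threshold tunable. Everything in it is an assumption made for illustration: `copy_policy`, `copy_variant` and `select_copy` are hypothetical names, not glibc interfaces, and the choice that large copies take the optimized path is arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum copy_variant { COPY_GENERIC, COPY_CACHED_OPT };

/* Hypothetical policy; neither field is a real glibc interface.  */
struct copy_policy
{
  bool cached_memopt;   /* Process-wide opt-in, like cached_memopt=1.  */
  size_t threshold;     /* Per-call size cutoff in bytes; 0 = unused.  */
};

static enum copy_variant
select_copy (const struct copy_policy *policy, size_t len)
{
  assert (policy != NULL);

  /* Flag semantics: one answer for every call in the process.  */
  if (policy->cached_memopt)
    return COPY_CACHED_OPT;

  /* Threshold semantics: decide per call from the copy length.  */
  if (policy->threshold != 0 && len >= policy->threshold)
    return COPY_CACHED_OPT;

  return COPY_GENERIC;
}
```

With the flag set, the length argument is irrelevant, which is the process-wide assumption questioned above; with only a threshold set, the decision varies from call to call.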
On 07/17/2018 01:46 AM, Tulio Magno Quites Machado Filho wrote:
> s/cpu.cpu/cpu.name/ in both lines.
>
> Otherwise, looks good to me.

Oops, my find/replace skills are getting rusty. I'll fix this, thanks.

Siddhesh

On 07/16/2018 07:46 PM, Siddhesh Poyarekar wrote:
> The glibc.tune namespace is vaguely named since it is a 'tunable', so
> give it a more specific name that describes what it refers to. Rename
> the tunable namespace to 'cpu' to more accurately reflect what it
> encompasses. Also rename glibc.tune.cpu to glibc.cpu.name since
> glibc.cpu.cpu is weird.
>
> * NEWS: Mention the change.
> * elf/dl-tunables.list: Rename tune namespace to cpu.
> * sysdeps/powerpc/dl-tunables.list: Likewise.
> * sysdeps/x86/dl-tunables.list: Likewise.
> * sysdeps/aarch64/dl-tunables.list: Rename tune.cpu to
> cpu.name.
> * elf/dl-hwcaps.c (_dl_important_hwcaps): Adjust.
> * elf/dl-hwcaps.h (GET_HWCAP_MASK): Likewise.
> * manual/README.tunables: Likewise.
> * manual/tunables.texi: Likewise.
> * sysdeps/powerpc/cpu-features.c: Likewise.
> * sysdeps/unix/sysv/linux/aarch64/cpu-features.c
> (init_cpu_features): Likewise.
> * sysdeps/x86/cpu-features.c: Likewise.
> * sysdeps/x86/cpu-features.h: Likewise.
> * sysdeps/x86/cpu-tunables.c: Likewise.
> * sysdeps/x86_64/Makefile: Likewise.

I've pushed this now after adjusting the CET tunable.

Siddhesh

From dce452dc5278f2985d21315721a6ba802537b862 Mon Sep 17 00:00:00 2001
From: Siddhesh Poyarekar <siddhesh@sourceware.org>
Date: Thu, 2 Aug 2018 23:49:19 +0530
Subject: [PATCH] Rename the glibc.tune namespace to glibc.cpu

The glibc.tune namespace is vaguely named since it is a 'tunable', so
give it a more specific name that describes what it refers to. Rename
the tunable namespace to 'cpu' to more accurately reflect what it
encompasses. Also rename glibc.tune.cpu to glibc.cpu.name since
glibc.cpu.cpu is weird.

* NEWS: Mention the change.
* elf/dl-tunables.list: Rename tune namespace to cpu.
* sysdeps/powerpc/dl-tunables.list: Likewise.
* sysdeps/x86/dl-tunables.list: Likewise.
* sysdeps/aarch64/dl-tunables.list: Rename tune.cpu to
cpu.name.
* elf/dl-hwcaps.c (_dl_important_hwcaps): Adjust.
* elf/dl-hwcaps.h (GET_HWCAP_MASK): Likewise.
* manual/README.tunables: Likewise. * manual/tunables.texi: Likewise. * sysdeps/powerpc/cpu-features.c: Likewise. * sysdeps/unix/sysv/linux/aarch64/cpu-features.c (init_cpu_features): Likewise. * sysdeps/x86/cpu-features.c: Likewise. * sysdeps/x86/cpu-features.h: Likewise. * sysdeps/x86/cpu-tunables.c: Likewise. * sysdeps/x86_64/Makefile: Likewise. * sysdeps/x86/dl-cet.c: Likewise. Reviewed-by: Carlos O'Donell <carlos@redhat.com> --- NEWS | 3 +- elf/dl-hwcaps.c | 2 +- elf/dl-hwcaps.h | 2 +- elf/dl-tunables.list | 2 +- manual/README.tunables | 6 +-- manual/tunables.texi | 40 +++++++++---------- sysdeps/aarch64/dl-tunables.list | 4 +- sysdeps/powerpc/cpu-features.c | 2 +- sysdeps/powerpc/dl-tunables.list | 2 +- .../unix/sysv/linux/aarch64/cpu-features.c | 2 +- sysdeps/x86/Makefile | 6 +-- sysdeps/x86/cpu-features.c | 6 +-- sysdeps/x86/cpu-features.h | 2 +- sysdeps/x86/cpu-tunables.c | 4 +- sysdeps/x86/dl-cet.c | 2 +- sysdeps/x86/dl-tunables.list | 2 +- sysdeps/x86_64/Makefile | 4 +- 17 files changed, 46 insertions(+), 45 deletions(-) diff --git a/NEWS b/NEWS index 33f7ba56c0..6c062a5959 100644 --- a/NEWS +++ b/NEWS @@ -13,7 +13,8 @@ Major new features: Deprecated and removed features, and other changes affecting compatibility: - [Add deprecations, removals and changes affecting compatibility here] +* The glibc.tune tunable namespace has been renamed to glibc.cpu and the + tunable glibc.tune.cpu has been renamed to glibc.cpu.name. Changes to build and runtime requirements: diff --git a/elf/dl-hwcaps.c b/elf/dl-hwcaps.c index 23482a88a1..ecf00b4577 100644 --- a/elf/dl-hwcaps.c +++ b/elf/dl-hwcaps.c @@ -140,7 +140,7 @@ _dl_important_hwcaps (const char *platform, size_t platform_len, size_t *sz, string and bit like you can ignore an OS-supplied HWCAP bit. 
*/ hwcap_mask |= (uint64_t) mask << _DL_FIRST_EXTRA; #if HAVE_TUNABLES - TUNABLE_SET (glibc, tune, hwcap_mask, uint64_t, hwcap_mask); + TUNABLE_SET (glibc, cpu, hwcap_mask, uint64_t, hwcap_mask); #else GLRO(dl_hwcap_mask) = hwcap_mask; #endif diff --git a/elf/dl-hwcaps.h b/elf/dl-hwcaps.h index 17f0da4c73..d69ee11dc2 100644 --- a/elf/dl-hwcaps.h +++ b/elf/dl-hwcaps.h @@ -19,7 +19,7 @@ #include <elf/dl-tunables.h> #if HAVE_TUNABLES -# define GET_HWCAP_MASK() TUNABLE_GET (glibc, tune, hwcap_mask, uint64_t, NULL) +# define GET_HWCAP_MASK() TUNABLE_GET (glibc, cpu, hwcap_mask, uint64_t, NULL) #else # ifdef SHARED # define GET_HWCAP_MASK() GLRO(dl_hwcap_mask) diff --git a/elf/dl-tunables.list b/elf/dl-tunables.list index 1f8ecb8437..b108592b62 100644 --- a/elf/dl-tunables.list +++ b/elf/dl-tunables.list @@ -86,7 +86,7 @@ glibc { type: SIZE_T } } - tune { + cpu { hwcap_mask { type: UINT_64 env_alias: LD_HWCAP_MASK diff --git a/manual/README.tunables b/manual/README.tunables index 3967679f43..f87a31a65e 100644 --- a/manual/README.tunables +++ b/manual/README.tunables @@ -105,11 +105,11 @@ where 'check' is the tunable name, 'int32_t' is the C type of the tunable and To get and set tunables in a different namespace from that module, use the full form of the macros as follows: - val = TUNABLE_GET_FULL (glibc, tune, hwcap_mask, uint64_t, NULL) + val = TUNABLE_GET_FULL (glibc, cpu, hwcap_mask, uint64_t, NULL) - TUNABLE_SET_FULL (glibc, tune, hwcap_mask, uint64_t, val) + TUNABLE_SET_FULL (glibc, cpu, hwcap_mask, uint64_t, val) -where 'glibc' is the top namespace, 'tune' is the tunable namespace and the +where 'glibc' is the top namespace, 'cpu' is the tunable namespace and the remaining arguments are the same as the short form macros. 
When TUNABLE_NAMESPACE is not defined in a module, TUNABLE_GET is equivalent to diff --git a/manual/tunables.texi b/manual/tunables.texi index bb4819bdf1..3345a23969 100644 --- a/manual/tunables.texi +++ b/manual/tunables.texi @@ -295,23 +295,23 @@ The default value of this tunable is @samp{3}. @cindex non_temporal_threshold tunables @cindex tunables, non_temporal_threshold -@deftp {Tunable namespace} glibc.tune +@deftp {Tunable namespace} glibc.cpu Behavior of @theglibc{} can be tuned to assume specific hardware capabilities -by setting the following tunables in the @code{tune} namespace: +by setting the following tunables in the @code{cpu} namespace: @end deftp -@deftp Tunable glibc.tune.hwcap_mask +@deftp Tunable glibc.cpu.hwcap_mask This tunable supersedes the @env{LD_HWCAP_MASK} environment variable and is identical in features. The @code{AT_HWCAP} key in the Auxiliary Vector specifies instruction set extensions available in the processor at runtime for some architectures. The -@code{glibc.tune.hwcap_mask} tunable allows the user to mask out those +@code{glibc.cpu.hwcap_mask} tunable allows the user to mask out those capabilities at runtime, thus disabling use of those extensions. @end deftp -@deftp Tunable glibc.tune.hwcaps -The @code{glibc.tune.hwcaps=-xxx,yyy,-zzz...} tunable allows the user to +@deftp Tunable glibc.cpu.hwcaps +The @code{glibc.cpu.hwcaps=-xxx,yyy,-zzz...} tunable allows the user to enable CPU/ARCH feature @code{yyy}, disable CPU/ARCH feature @code{xxx} and @code{zzz} where the feature name is case-sensitive and has to match the ones in @code{sysdeps/x86/cpu-features.h}. @@ -319,8 +319,8 @@ the ones in @code{sysdeps/x86/cpu-features.h}. This tunable is specific to i386 and x86-64. 
@end deftp -@deftp Tunable glibc.tune.cached_memopt -The @code{glibc.tune.cached_memopt=[0|1]} tunable allows the user to +@deftp Tunable glibc.cpu.cached_memopt +The @code{glibc.cpu.cached_memopt=[0|1]} tunable allows the user to enable optimizations recommended for cacheable memory. If set to @code{1}, @theglibc{} assumes that the process memory image consists of cacheable (non-device) memory only. The default, @code{0}, @@ -329,8 +329,8 @@ indicates that the process may use device memory. This tunable is specific to powerpc, powerpc64 and powerpc64le. @end deftp -@deftp Tunable glibc.tune.cpu -The @code{glibc.tune.cpu=xxx} tunable allows the user to tell @theglibc{} to +@deftp Tunable glibc.cpu.name +The @code{glibc.cpu.name=xxx} tunable allows the user to tell @theglibc{} to assume that the CPU is @code{xxx} where xxx may have one of these values: @code{generic}, @code{falkor}, @code{thunderxt88}, @code{thunderx2t99}, @code{thunderx2t99p1}. @@ -338,27 +338,27 @@ assume that the CPU is @code{xxx} where xxx may have one of these values: This tunable is specific to aarch64. @end deftp -@deftp Tunable glibc.tune.x86_data_cache_size -The @code{glibc.tune.x86_data_cache_size} tunable allows the user to set +@deftp Tunable glibc.cpu.x86_data_cache_size +The @code{glibc.cpu.x86_data_cache_size} tunable allows the user to set data cache size in bytes for use in memory and string routines. This tunable is specific to i386 and x86-64. @end deftp -@deftp Tunable glibc.tune.x86_shared_cache_size -The @code{glibc.tune.x86_shared_cache_size} tunable allows the user to +@deftp Tunable glibc.cpu.x86_shared_cache_size +The @code{glibc.cpu.x86_shared_cache_size} tunable allows the user to set shared cache size in bytes for use in memory and string routines. 
@end deftp -@deftp Tunable glibc.tune.x86_non_temporal_threshold -The @code{glibc.tune.x86_non_temporal_threshold} tunable allows the user +@deftp Tunable glibc.cpu.x86_non_temporal_threshold +The @code{glibc.cpu.x86_non_temporal_threshold} tunable allows the user to set threshold in bytes for non temporal store. This tunable is specific to i386 and x86-64. @end deftp -@deftp Tunable glibc.tune.x86_ibt -The @code{glibc.tune.x86_ibt} tunable allows the user to control how +@deftp Tunable glibc.cpu.x86_ibt +The @code{glibc.cpu.x86_ibt} tunable allows the user to control how indirect branch tracking (IBT) should be enabled. Accepted values are @code{on}, @code{off}, and @code{permissive}. @code{on} always turns on IBT regardless of whether IBT is enabled in the executable and its @@ -370,8 +370,8 @@ IBT on non-CET executables and shared libraries. This tunable is specific to i386 and x86-64. @end deftp -@deftp Tunable glibc.tune.x86_shstk -The @code{glibc.tune.x86_shstk} tunable allows the user to control how +@deftp Tunable glibc.cpu.x86_shstk +The @code{glibc.cpu.x86_shstk} tunable allows the user to control how the shadow stack (SHSTK) should be enabled. Accepted values are @code{on}, @code{off}, and @code{permissive}. @code{on} always turns on SHSTK regardless of whether SHSTK is enabled in the executable and its diff --git a/sysdeps/aarch64/dl-tunables.list b/sysdeps/aarch64/dl-tunables.list index f6a88168cc..cfcf940ebd 100644 --- a/sysdeps/aarch64/dl-tunables.list +++ b/sysdeps/aarch64/dl-tunables.list @@ -17,8 +17,8 @@ # <http://www.gnu.org/licenses/>. glibc { - tune { - cpu { + cpu { + name { type: STRING } } diff --git a/sysdeps/powerpc/cpu-features.c b/sysdeps/powerpc/cpu-features.c index 955d4778a6..ad809b9815 100644 --- a/sysdeps/powerpc/cpu-features.c +++ b/sysdeps/powerpc/cpu-features.c @@ -30,7 +30,7 @@ init_cpu_features (struct cpu_features *cpu_features) tunables is enable, since for this case user can explicit disable unaligned optimizations. 
*/ #if HAVE_TUNABLES - int32_t cached_memfunc = TUNABLE_GET (glibc, tune, cached_memopt, int32_t, + int32_t cached_memfunc = TUNABLE_GET (glibc, cpu, cached_memopt, int32_t, NULL); cpu_features->use_cached_memopt = (cached_memfunc > 0); #else diff --git a/sysdeps/powerpc/dl-tunables.list b/sysdeps/powerpc/dl-tunables.list index d26636a16b..b3372555f7 100644 --- a/sysdeps/powerpc/dl-tunables.list +++ b/sysdeps/powerpc/dl-tunables.list @@ -17,7 +17,7 @@ # <http://www.gnu.org/licenses/>. glibc { - tune { + cpu { cached_memopt { type: INT_32 minval: 0 diff --git a/sysdeps/unix/sysv/linux/aarch64/cpu-features.c b/sysdeps/unix/sysv/linux/aarch64/cpu-features.c index 39eba0186f..b4f348509e 100644 --- a/sysdeps/unix/sysv/linux/aarch64/cpu-features.c +++ b/sysdeps/unix/sysv/linux/aarch64/cpu-features.c @@ -57,7 +57,7 @@ init_cpu_features (struct cpu_features *cpu_features) #if HAVE_TUNABLES /* Get the tunable override. */ - const char *mcpu = TUNABLE_GET (glibc, tune, cpu, const char *, NULL); + const char *mcpu = TUNABLE_GET (glibc, cpu, name, const char *, NULL); if (mcpu != NULL) midr = get_midr_from_mcpu (mcpu); #endif diff --git a/sysdeps/x86/Makefile b/sysdeps/x86/Makefile index 337b0b63dc..f2fd031ac7 100644 --- a/sysdeps/x86/Makefile +++ b/sysdeps/x86/Makefile @@ -47,13 +47,13 @@ $(objpfx)tst-cet-legacy-4.out: $(objpfx)tst-cet-legacy-mod-4.so ifneq (no,$(have-tunables)) $(objpfx)tst-cet-legacy-4a: $(libdl) $(objpfx)tst-cet-legacy-4a.out: $(objpfx)tst-cet-legacy-mod-4.so -tst-cet-legacy-4a-ENV = GLIBC_TUNABLES=glibc.tune.x86_shstk=permissive +tst-cet-legacy-4a-ENV = GLIBC_TUNABLES=glibc.cpu.x86_shstk=permissive $(objpfx)tst-cet-legacy-4b: $(libdl) $(objpfx)tst-cet-legacy-4b.out: $(objpfx)tst-cet-legacy-mod-4.so -tst-cet-legacy-4b-ENV = GLIBC_TUNABLES=glibc.tune.x86_shstk=on +tst-cet-legacy-4b-ENV = GLIBC_TUNABLES=glibc.cpu.x86_shstk=on $(objpfx)tst-cet-legacy-4c: $(libdl) $(objpfx)tst-cet-legacy-4c.out: $(objpfx)tst-cet-legacy-mod-4.so -tst-cet-legacy-4c-ENV = 
GLIBC_TUNABLES=glibc.tune.x86_shstk=off +tst-cet-legacy-4c-ENV = GLIBC_TUNABLES=glibc.cpu.x86_shstk=off endif endif diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c index 51642f8b6a..f4e0f5a2ed 100644 --- a/sysdeps/x86/cpu-features.c +++ b/sysdeps/x86/cpu-features.c @@ -22,7 +22,7 @@ #include <libc-pointer-arith.h> #if HAVE_TUNABLES -# define TUNABLE_NAMESPACE tune +# define TUNABLE_NAMESPACE cpu # include <unistd.h> /* Get STDOUT_FILENO for _dl_printf. */ # include <elf/dl-tunables.h> @@ -419,7 +419,7 @@ no_cpuid: /* Reuse dl_platform, dl_hwcap and dl_hwcap_mask for x86. */ #if !HAVE_TUNABLES && defined SHARED - /* The glibc.tune.hwcap_mask tunable is initialized already, so no need to do + /* The glibc.cpu.hwcap_mask tunable is initialized already, so no need to do this. */ GLRO(dl_hwcap_mask) = HWCAP_IMPORTANT; #endif @@ -494,7 +494,7 @@ no_cpuid: /* Disable IBT and/or SHSTK if they are enabled by kernel, but disabled by environment variable: - GLIBC_TUNABLES=glibc.tune.hwcaps=-IBT,-SHSTK + GLIBC_TUNABLES=glibc.cpu.hwcaps=-IBT,-SHSTK */ unsigned int cet_feature = 0; if (!HAS_CPU_FEATURE (IBT)) diff --git a/sysdeps/x86/cpu-features.h b/sysdeps/x86/cpu-features.h index 347a4b118d..4c6d08c709 100644 --- a/sysdeps/x86/cpu-features.h +++ b/sysdeps/x86/cpu-features.h @@ -141,7 +141,7 @@ struct cpu_features unsigned long int xsave_state_size; /* The full state size for XSAVE when XSAVEC is disabled by - GLIBC_TUNABLES=glibc.tune.hwcaps=-XSAVEC_Usable + GLIBC_TUNABLES=glibc.cpu.hwcaps=-XSAVEC_Usable */ unsigned int xsave_state_full_size; unsigned int feature[FEATURE_INDEX_MAX]; diff --git a/sysdeps/x86/cpu-tunables.c b/sysdeps/x86/cpu-tunables.c index 69155a8f44..8e92358c67 100644 --- a/sysdeps/x86/cpu-tunables.c +++ b/sysdeps/x86/cpu-tunables.c @@ -17,7 +17,7 @@ <http://www.gnu.org/licenses/>. 
*/ #if HAVE_TUNABLES -# define TUNABLE_NAMESPACE tune +# define TUNABLE_NAMESPACE cpu # include <stdbool.h> # include <stdint.h> # include <unistd.h> /* Get STDOUT_FILENO for _dl_printf. */ @@ -116,7 +116,7 @@ TUNABLE_CALLBACK (set_hwcaps) (tunable_val_t *valp) the hardware which wasn't available when the selection was made. The environment variable: - GLIBC_TUNABLES=glibc.tune.hwcaps=-xxx,yyy,-zzz,.... + GLIBC_TUNABLES=glibc.cpu.hwcaps=-xxx,yyy,-zzz,.... can be used to enable CPU/ARCH feature yyy, disable CPU/ARCH feature yyy and zzz, where the feature name is case-sensitive and has to diff --git a/sysdeps/x86/dl-cet.c b/sysdeps/x86/dl-cet.c index b82ba14e75..78f36bcf53 100644 --- a/sysdeps/x86/dl-cet.c +++ b/sysdeps/x86/dl-cet.c @@ -128,7 +128,7 @@ dl_cet_check (struct link_map *m, const char *program) /* Enable IBT and SHSTK only if they are enabled in executable. NB: IBT and SHSTK may be disabled by environment variable: - GLIBC_TUNABLES=glibc.tune.hwcaps=-IBT,-SHSTK + GLIBC_TUNABLES=glibc.cpu.hwcaps=-IBT,-SHSTK */ enable_ibt &= (HAS_CPU_FEATURE (IBT) && (enable_ibt_type == CET_ALWAYS_ON diff --git a/sysdeps/x86/dl-tunables.list b/sysdeps/x86/dl-tunables.list index 73886b1352..2a457d0eec 100644 --- a/sysdeps/x86/dl-tunables.list +++ b/sysdeps/x86/dl-tunables.list @@ -17,7 +17,7 @@ # <http://www.gnu.org/licenses/>. glibc { - tune { + cpu { hwcaps { type: STRING } diff --git a/sysdeps/x86_64/Makefile b/sysdeps/x86_64/Makefile index 9f1562f1b2..d51cf03ac9 100644 --- a/sysdeps/x86_64/Makefile +++ b/sysdeps/x86_64/Makefile @@ -57,7 +57,7 @@ modules-names += x86_64/tst-x86_64mod-1 LDFLAGS-tst-x86_64mod-1.so = -Wl,-soname,tst-x86_64mod-1.so ifneq (no,$(have-tunables)) # Test the state size for XSAVE when XSAVEC is disabled. 
-tst-x86_64-1-ENV = GLIBC_TUNABLES=glibc.tune.hwcaps=-XSAVEC_Usable +tst-x86_64-1-ENV = GLIBC_TUNABLES=glibc.cpu.hwcaps=-XSAVEC_Usable endif $(objpfx)tst-x86_64-1: $(objpfx)x86_64/tst-x86_64mod-1.so @@ -74,7 +74,7 @@ $(objpfx)tst-platform-1.out: $(objpfx)x86_64/tst-platformmod-2.so # Turn off AVX512F_Usable and AVX2_Usable so that GLRO(dl_platform) is # always set to x86_64. tst-platform-1-ENV = LD_PRELOAD=$(objpfx)\$$PLATFORM/tst-platformmod-2.so \ - GLIBC_TUNABLES=glibc.tune.hwcaps=-AVX512F_Usable,-AVX2_Usable + GLIBC_TUNABLES=glibc.cpu.hwcaps=-AVX512F_Usable,-AVX2_Usable endif tests += tst-audit3 tst-audit4 tst-audit5 tst-audit6 tst-audit7 \
Siddhesh Poyarekar <siddhesh@sourceware.org> writes:

> On 07/17/2018 01:32 AM, Tulio Magno Quites Machado Filho wrote:
>> I'm not following your line of thought here:
>>
>>  - glibc.cpu.hwcaps is specific to i386 and x86-64
>>  - glibc.cpu is specific to aarch64
>>  - glibc.cpu.cached_memopt is specific to powerpc, powerpc64 and powerpc64le
>>
>> What am I missing?
>
> The difference is that glibc.cpu.name and glibc.cpu.hwcaps are
> conceptually generic tunables, i.e. there is a reasonable chance that
> couple of releases down the line another architecture may want to
> provide tuning facility for CPUs by name or by HWCAPS.  The
> cached_memopt one is not very clear to me and seems more like something
> that is only useful on power8.

Maybe it isn't restricted only to powerpc:
https://sourceware.org/ml/libc-alpha/2018-08/msg00069.html

Obviously other machine maintainers may not be interested in cached_memopt,
but this thread helps me to explain why I was thinking cached_memopt was
generic.

>> Notice the optimization is not specific to a CPU, but specific to a user
>> scenario (cacheable memory).  In other words, the optimization can't be used
>> whenever PPC_FEATURE2_ARCH_2_07 because it could downgrade the performance when
>> cache-inhibited memory is being used.
>
> Ahh OK, I got thrown off by the fact that there's a separate routine for
> it and assumed that it is Power8-specific.  I have a different concern
> then; a tunable is process-wide so the cached_memopt tunable essentially
> assumes that the entire process is using cache-inhibited memory.  Is
> that a reasonable assumption?

It's the opposite.
When cached_memopt=1, it's assumed the process only uses cacheable memory.
If cached_memopt=0 (default value), nothing is assumed and a safe execution
is taken.

> 1. A new relocation that overlays on top of ifuncs and allows selection
> of routines based on specific properties.  I have had this idea for a
> while but no time to implement it and it has much more general scope
> than memory type; for example memory alignment could also be a factor to
> short-cut parts of string routines at compile time itself.  It does not
> have the runtime flexibility of a tunable but is probably far more
> configurable.

Sounds interesting.  Where are these properties coming from?

> 2. If there is a correlation to size then implement something similar to
> the x86 temporal_threshold tunable.  This is probably just as good or
> bad as setting a cached_memopt flag but has the effect of generalizing
> what was a tunable.

I don't think this option would help in this case.
I can't correlate size to cache-inhibited memory.
On 08/04/2018 02:03 AM, Tulio Magno Quites Machado Filho wrote:
> Maybe it isn't restricted only to powerpc:
> https://sourceware.org/ml/libc-alpha/2018-08/msg00069.html
>
> Obviously other machine maintainers may not be interested in cached_memopt,
> but this thread helps me to explain why I was thinking cached_memopt was
> generic.

OK.

>>> Notice the optimization is not specific to a CPU, but specific to a user
>>> scenario (cacheable memory).  In other words, the optimization can't be used
>>> whenever PPC_FEATURE2_ARCH_2_07 because it could downgrade the performance when
>>> cache-inhibited memory is being used.
>>
>> Ahh OK, I got thrown off by the fact that there's a separate routine for
>> it and assumed that it is Power8-specific.  I have a different concern
>> then; a tunable is process-wide so the cached_memopt tunable essentially
>> assumes that the entire process is using cache-inhibited memory.  Is
>> that a reasonable assumption?
>
> It's the opposite.
> When cached_memopt=1, it's assumed the process only uses cacheable memory.
> If cached_memopt=0 (default value), nothing is assumed and a safe execution
> is taken.

OK, thanks for the clarification.  It doesn't change my question though;
is there a performance loss when you do a safe execution, and does it
make sense to fix this in glibc?  I haven't formed a strong opinion
either way for the latter yet, but one thing that would be nice to ensure
is that we don't do different things for different architectures.  There
seems to be scope to come to a consensus across architectures for this
and we should try to do that.  Given that Cauldron is only a month away,
we could have a more detailed conversation on this in the glibc BoF too
if necessary.

>> 1. A new relocation that overlays on top of ifuncs and allows selection
>> of routines based on specific properties.  I have had this idea for a
>> while but no time to implement it and it has much more general scope
>> than memory type; for example memory alignment could also be a factor to
>> short-cut parts of string routines at compile time itself.  It does not
>> have the runtime flexibility of a tunable but is probably far more
>> configurable.
>
> Sounds interesting.  Where are these properties coming from?

I haven't thought this through tbh, but something like this:

- Add new relocations for each special case: R_MEMCPY_REG,
  R_MEMCPY_CACHE_INHIBITED, R_MEMCPY_ALIGN16, etc. that can be generated
  based on properties of the inputs such as volatileness, alignment, etc.

- Create separate entry points memcpy@plt and memcpy_noncached@plt for
  each relocation we end up using for that TU.

- Have the ifunc resolver take into consideration the relocation type
  when patching in the PLT.

It may be simpler to just emit different entry points (similar to the
*_finite math functions) and separate ifunc resolvers if there is no
overlap between ifunc implementations for these entry points.

> I don't think this option would help in this case.
> I can't correlate size to cache-inhibited memory.

Right, I had not understood where you were coming from then and assumed
you were talking about non-temporal accesses.

Siddhesh
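The entry-point idea in the message above can be sketched in plain C.  This is only an illustration under loud assumptions: the R_MEMCPY_* kinds are the names floated in the proposal and exist in no ABI, and memcpy_reg, memcpy_noncached, memcpy_align16 and resolve_memcpy are hypothetical helpers, not glibc symbols.  An ordinary function stands in for the ifunc resolver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Relocation kinds named in the proposal.  They do not exist in any ABI;
   the enum is purely illustrative.  */
enum memcpy_reloc
{
  R_MEMCPY_REG,
  R_MEMCPY_CACHE_INHIBITED,
  R_MEMCPY_ALIGN16
};

typedef void *(*memcpy_fn) (void *, const void *, size_t);

/* Default entry point: free to use any access pattern.  */
static void *
memcpy_reg (void *dst, const void *src, size_t n)
{
  return memcpy (dst, src, n);
}

/* Hypothetical entry point for cache-inhibited memory: byte accesses are
   always naturally aligned.  The volatile qualifiers keep the compiler
   from merging the accesses or replacing the loop with a memcpy call.  */
static void *
memcpy_noncached (void *dst, const void *src, size_t n)
{
  volatile unsigned char *d = dst;
  const volatile unsigned char *s = src;
  for (size_t i = 0; i < n; i++)
    d[i] = s[i];
  return dst;
}

/* Hypothetical entry point for callers that promise 16-byte alignment.  */
static void *
memcpy_align16 (void *dst, const void *src, size_t n)
{
  assert ((((uintptr_t) dst | (uintptr_t) src) & 15) == 0);
  return memcpy (dst, src, n);
}

/* Stand-in for an ifunc resolver that also sees the relocation type.  */
static memcpy_fn
resolve_memcpy (enum memcpy_reloc kind)
{
  switch (kind)
    {
    case R_MEMCPY_CACHE_INHIBITED:
      return memcpy_noncached;
    case R_MEMCPY_ALIGN16:
      return memcpy_align16;
    case R_MEMCPY_REG:
    default:
      return memcpy_reg;
    }
}
```

In a real design the selection would happen once per PLT slot at relocation time rather than per call; the sketch only shows that the relocation kind, not a process-wide tunable, picks the routine.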
Siddhesh Poyarekar <siddhesh@sourceware.org> writes:

> On 08/04/2018 02:03 AM, Tulio Magno Quites Machado Filho wrote:
>>>> Notice the optimization is not specific to a CPU, but specific to a user
>>>> scenario (cacheable memory).  In other words, the optimization can't be used
>>>> whenever PPC_FEATURE2_ARCH_2_07 because it could downgrade the performance when
>>>> cache-inhibited memory is being used.
>>>
>>> Ahh OK, I got thrown off by the fact that there's a separate routine for
>>> it and assumed that it is Power8-specific.  I have a different concern
>>> then; a tunable is process-wide so the cached_memopt tunable essentially
>>> assumes that the entire process is using cache-inhibited memory.  Is
>>> that a reasonable assumption?
>>
>> It's the opposite.
>> When cached_memopt=1, it's assumed the process only uses cacheable memory.
>> If cached_memopt=0 (default value), nothing is assumed and a safe execution
>> is taken.
>
> OK, thanks for the clarification.  It doesn't change my question though;
> is there a performance loss when you do a safe execution

Yes, for cacheable memory.  A safe execution uses only naturally aligned memory
accesses and doesn't provide the best performance we have.

However, an unsafe execution on cache-inhibited memory is catastrophic because
every naturally unaligned memory access generates an alignment interrupt
that is handled by the kernel, causing an even greater performance impact than
a safe execution on cacheable memory.

> does it make sense to fix this in glibc?

IMHO, yes.  I haven't yet seen a good explanation of why userspace programs
should not be using memcpy in these conditions, e.g. AFAIK, ISO C11 does not
prohibit this.

> There seems to be scope to come to a consensus across architectures for this
> and we should try to do that.

Agreed.

>>> 1. A new relocation that overlays on top of ifuncs and allows selection
>>> of routines based on specific properties.  I have had this idea for a
>>> while but no time to implement it and it has much more general scope
>>> than memory type; for example memory alignment could also be a factor to
>>> short-cut parts of string routines at compile time itself.  It does not
>>> have the runtime flexibility of a tunable but is probably far more
>>> configurable.
>>
>> Sounds interesting.  Where are these properties coming from?
>
> I haven't thought this through tbh, but something like this:
>
> - Add new relocations for each special case: R_MEMCPY_REG,
> R_MEMCPY_CACHE_INHIBITED, R_MEMCPY_ALIGN16, etc. that can be generated
> based on properties of the inputs such as volatileness, alignment, etc.
>
> - Create separate entry points memcpy@plt and memcpy_noncached@plt for
> each relocation we end up using for that TU.
>
> - Have the ifunc resolver take into consideration the relocation type
> when patching in the PLT.
>
> It may be simpler to just emit different entry points (similar to the
> *_finite math functions) and separate ifunc resolvers if there is no
> overlap between ifunc implementations for these entry points.

I still believe this could help, but there is still one open issue: how do we
know a memcpy call is accessing cache-inhibited memory?
I'm afraid this property is not that easy to detect.
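To make "only naturally aligned memory accesses" concrete, here is a minimal C sketch of a copy loop that never issues an access wider than the common alignment of both pointers.  It is not the powerpc implementation (which is written in assembly); widest_natural_access and copy_aligned_only are hypothetical names, and the fixed-size memcpy calls are the portable idiom for a single load or store, on the assumption that the compiler lowers each one to one instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Widest power-of-two access (8, 4, 2 or 1 bytes) that is naturally
   aligned for both pointers and does not overrun the remaining length.  */
static size_t
widest_natural_access (const void *dst, const void *src, size_t n)
{
  uintptr_t bits = (uintptr_t) dst | (uintptr_t) src;
  for (size_t w = 8; w > 1; w >>= 1)
    if (n >= w && (bits & (w - 1)) == 0)
      return w;
  return 1;
}

/* Hypothetical "safe" copy: every access is naturally aligned, so it
   would not raise alignment interrupts on cache-inhibited memory, at the
   cost of narrower accesses when the pointers are misaligned.  */
static void *
copy_aligned_only (void *dst, const void *src, size_t n)
{
  unsigned char *d = dst;
  const unsigned char *s = src;

  assert (n == 0 || (dst != NULL && src != NULL));
  while (n > 0)
    {
      size_t w = widest_natural_access (d, s, n);
      uint64_t t8;
      uint32_t t4;
      uint16_t t2;

      switch (w)
        {
        case 8:
          memcpy (&t8, s, 8);
          memcpy (d, &t8, 8);
          break;
        case 4:
          memcpy (&t4, s, 4);
          memcpy (d, &t4, 4);
          break;
        case 2:
          memcpy (&t2, s, 2);
          memcpy (d, &t2, 2);
          break;
        default:
          *d = *s;
          break;
        }
      d += w;
      s += w;
      n -= w;
    }
  return dst;
}
```

When source and destination have different misalignment the loop degrades to byte copies, which illustrates the performance cost of the safe path on cacheable memory that the message describes.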
On Mon, Aug 06, 2018 at 10:33:40AM -0300, Tulio Magno Quites Machado Filho wrote:
> Siddhesh Poyarekar <siddhesh@sourceware.org> writes:
>
> > On 08/04/2018 02:03 AM, Tulio Magno Quites Machado Filho wrote:
> >>>> Notice the optimization is not specific to a CPU, but specific to a user
> >>>> scenario (cacheable memory).  In other words, the optimization can't be used
> >>>> whenever PPC_FEATURE2_ARCH_2_07 because it could downgrade the performance when
> >>>> cache-inhibited memory is being used.
> >>>
> >>> Ahh OK, I got thrown off by the fact that there's a separate routine for
> >>> it and assumed that it is Power8-specific.  I have a different concern
> >>> then; a tunable is process-wide so the cached_memopt tunable essentially
> >>> assumes that the entire process is using cache-inhibited memory.  Is
> >>> that a reasonable assumption?
> >>
> >> It's the opposite.
> >> When cached_memopt=1, it's assumed the process only uses cacheable memory.
> >> If cached_memopt=0 (default value), nothing is assumed and a safe execution
> >> is taken.
> >
> > OK, thanks for the clarification.  It doesn't change my question though;
> > is there a performance loss when you do a safe execution
>
> Yes, for cacheable memory.  A safe execution uses only naturally aligned memory
> accesses and doesn't provide the best performance we have.
>
> However, an unsafe execution on cache-inhibited memory is catastrophic because
> every naturally unaligned memory access generates an alignment interrupt
> that is handled by the kernel, causing an even greater performance impact than
> a safe execution on cacheable memory.
>
> > does it make sense to fix this in glibc?
>
> IMHO, yes.  I haven't yet seen a good explanation of why userspace programs
> should not be using memcpy in these conditions, e.g. AFAIK, ISO C11 does not
> prohibit this.

I'm not entirely clear on what "these conditions" are, but ISO C does
not allow memcpy to be used on volatile objects.  It's not clear how
such objects would come into existence (presumably some sort of mmap),
but if they have weird properties about how you can perform accesses
on them, I think it's reasonable to argue that they have to be
volatile.  The compiler should not be generating calls to memcpy for
volatile objects; if it does, that's a compiler bug.

Rich
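As a small illustration of the point about volatile objects: memcpy takes non-volatile pointers, so handing it a volatile object means casting the qualifier away, and ISO C leaves accesses to a volatile object through a non-volatile lvalue undefined.  A conforming alternative is an explicit element-wise loop.  The helper name copy_from_volatile is hypothetical, and the 32-bit element type is just an assumption about the device's register width.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Copy COUNT 32-bit words out of a volatile region (for example a
   device mapping) with exactly one volatile read per element.  */
static void
copy_from_volatile (uint32_t *dst, const volatile uint32_t *src, size_t count)
{
  assert (count == 0 || (dst != NULL && src != NULL));
  for (size_t i = 0; i < count; i++)
    dst[i] = src[i];
}
```

Because each read goes through a volatile lvalue, the compiler must perform it as written and cannot turn the loop into a memcpy call.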
On 08/06/2018 07:03 PM, Tulio Magno Quites Machado Filho wrote:
> Yes, for cacheable memory.  A safe execution uses only naturally aligned memory
> accesses and doesn't provide the best performance we have.
>
> However, an unsafe execution on cache-inhibited memory is catastrophic because
> every naturally unaligned memory access generates an alignment interrupt
> that is handled by the kernel, causing an even greater performance impact than
> a safe execution on cacheable memory.

There seem to be two slightly orthogonal discussions here.  One is the
issue of using memcpy for volatile objects, because overlapping writes
may not work correctly without barriers.  The other is the question of
ensuring aligned accesses for device memory that may have been mapped in
as cache-inhibited and does not like misaligned access.

It seems to me the issue with Power w.r.t. cache-inhibited memory access
is only the latter.  Is that correct?

>> does it make sense to fix this in glibc?
>
> IMHO, yes.  I haven't yet seen a good explanation of why userspace programs
> should not be using memcpy in these conditions, e.g. AFAIK, ISO C11 does not
> prohibit this.

If it is a question of misaligned accesses only, then there may be a case
to add a memcpy that strictly does aligned accesses only, but a better
name for that would be glibc.cpu.misaligned_access and not cached_memopt
since that has slightly different implications.

If volatile (and overlapping) access is also an issue then there seems
to be some amount of clarity that we need not attempt to support it in
memcpy by default.  I don't know if having support only in Power makes
sense but if there is a strong need for it then the tunable name should
change to something more precise, e.g. glibc.cpu.ppc_allow_volatile_memcpy.

> I still believe this could help, but there is still one open issue: how do we
> know a memcpy call is accessing cache-inhibited memory?
> I'm afraid this property is not that easy to detect.

It's not; it has to be annotated by the developer.

Siddhesh
diff --git a/NEWS b/NEWS index 5de2c2816f..b5308fd596 100644 --- a/NEWS +++ b/NEWS @@ -173,6 +173,9 @@ Deprecated and removed features, and other changes affecting compatibility: project's versions of these files. The plan is to make this the default behavior in a future release. +* The glibc.tune tunable namespace has been renamed to glibc.cpu and the + tunable glibc.tune.cpu has been renamed to glibc.cpu.name. + Changes to build and runtime requirements: GNU make 4.0 or later is now required to build glibc. diff --git a/elf/dl-hwcaps.c b/elf/dl-hwcaps.c index 23482a88a1..ecf00b4577 100644 --- a/elf/dl-hwcaps.c +++ b/elf/dl-hwcaps.c @@ -140,7 +140,7 @@ _dl_important_hwcaps (const char *platform, size_t platform_len, size_t *sz, string and bit like you can ignore an OS-supplied HWCAP bit. */ hwcap_mask |= (uint64_t) mask << _DL_FIRST_EXTRA; #if HAVE_TUNABLES - TUNABLE_SET (glibc, tune, hwcap_mask, uint64_t, hwcap_mask); + TUNABLE_SET (glibc, cpu, hwcap_mask, uint64_t, hwcap_mask); #else GLRO(dl_hwcap_mask) = hwcap_mask; #endif diff --git a/elf/dl-hwcaps.h b/elf/dl-hwcaps.h index 17f0da4c73..d69ee11dc2 100644 --- a/elf/dl-hwcaps.h +++ b/elf/dl-hwcaps.h @@ -19,7 +19,7 @@ #include <elf/dl-tunables.h> #if HAVE_TUNABLES -# define GET_HWCAP_MASK() TUNABLE_GET (glibc, tune, hwcap_mask, uint64_t, NULL) +# define GET_HWCAP_MASK() TUNABLE_GET (glibc, cpu, hwcap_mask, uint64_t, NULL) #else # ifdef SHARED # define GET_HWCAP_MASK() GLRO(dl_hwcap_mask) diff --git a/elf/dl-tunables.list b/elf/dl-tunables.list index 1f8ecb8437..b108592b62 100644 --- a/elf/dl-tunables.list +++ b/elf/dl-tunables.list @@ -86,7 +86,7 @@ glibc { type: SIZE_T } } - tune { + cpu { hwcap_mask { type: UINT_64 env_alias: LD_HWCAP_MASK diff --git a/manual/README.tunables b/manual/README.tunables index 3967679f43..f87a31a65e 100644 --- a/manual/README.tunables +++ b/manual/README.tunables @@ -105,11 +105,11 @@ where 'check' is the tunable name, 'int32_t' is the C type of the tunable and To get and set 
tunables in a different namespace from that module, use the full form of the macros as follows: - val = TUNABLE_GET_FULL (glibc, tune, hwcap_mask, uint64_t, NULL) + val = TUNABLE_GET_FULL (glibc, cpu, hwcap_mask, uint64_t, NULL) - TUNABLE_SET_FULL (glibc, tune, hwcap_mask, uint64_t, val) + TUNABLE_SET_FULL (glibc, cpu, hwcap_mask, uint64_t, val) -where 'glibc' is the top namespace, 'tune' is the tunable namespace and the +where 'glibc' is the top namespace, 'cpu' is the tunable namespace and the remaining arguments are the same as the short form macros. When TUNABLE_NAMESPACE is not defined in a module, TUNABLE_GET is equivalent to diff --git a/manual/tunables.texi b/manual/tunables.texi index be33c9fc79..9b8f9e4610 100644 --- a/manual/tunables.texi +++ b/manual/tunables.texi @@ -295,23 +295,23 @@ The default value of this tunable is @samp{3}. @cindex non_temporal_threshold tunables @cindex tunables, non_temporal_threshold -@deftp {Tunable namespace} glibc.tune +@deftp {Tunable namespace} glibc.cpu Behavior of @theglibc{} can be tuned to assume specific hardware capabilities by setting the following tunables in the @code{tune} namespace: @end deftp -@deftp Tunable glibc.tune.hwcap_mask +@deftp Tunable glibc.cpu.hwcap_mask This tunable supersedes the @env{LD_HWCAP_MASK} environment variable and is identical in features. The @code{AT_HWCAP} key in the Auxiliary Vector specifies instruction set extensions available in the processor at runtime for some architectures. The -@code{glibc.tune.hwcap_mask} tunable allows the user to mask out those +@code{glibc.cpu.hwcap_mask} tunable allows the user to mask out those capabilities at runtime, thus disabling use of those extensions. 
@end deftp -@deftp Tunable glibc.tune.hwcaps -The @code{glibc.tune.hwcaps=-xxx,yyy,-zzz...} tunable allows the user to +@deftp Tunable glibc.cpu.hwcaps +The @code{glibc.cpu.hwcaps=-xxx,yyy,-zzz...} tunable allows the user to enable CPU/ARCH feature @code{yyy}, disable CPU/ARCH feature @code{xxx} and @code{zzz} where the feature name is case-sensitive and has to match the ones in @code{sysdeps/x86/cpu-features.h}. @@ -319,8 +319,8 @@ the ones in @code{sysdeps/x86/cpu-features.h}. This tunable is specific to i386 and x86-64. @end deftp -@deftp Tunable glibc.tune.cached_memopt -The @code{glibc.tune.cached_memopt=[0|1]} tunable allows the user to +@deftp Tunable glibc.cpu.cached_memopt +The @code{glibc.cpu.cached_memopt=[0|1]} tunable allows the user to enable optimizations recommended for cacheable memory. If set to @code{1}, @theglibc{} assumes that the process memory image consists of cacheable (non-device) memory only. The default, @code{0}, @@ -329,8 +329,8 @@ indicates that the process may use device memory. This tunable is specific to powerpc, powerpc64 and powerpc64le. @end deftp -@deftp Tunable glibc.tune.cpu -The @code{glibc.tune.cpu=xxx} tunable allows the user to tell @theglibc{} to +@deftp Tunable glibc.cpu.cpu +The @code{glibc.cpu.cpu=xxx} tunable allows the user to tell @theglibc{} to assume that the CPU is @code{xxx} where xxx may have one of these values: @code{generic}, @code{falkor}, @code{thunderxt88}, @code{thunderx2t99}, @code{thunderx2t99p1}. @@ -338,20 +338,20 @@ assume that the CPU is @code{xxx} where xxx may have one of these values: This tunable is specific to aarch64. @end deftp -@deftp Tunable glibc.tune.x86_data_cache_size -The @code{glibc.tune.x86_data_cache_size} tunable allows the user to set +@deftp Tunable glibc.cpu.x86_data_cache_size +The @code{glibc.cpu.x86_data_cache_size} tunable allows the user to set data cache size in bytes for use in memory and string routines. This tunable is specific to i386 and x86-64. 
@end deftp -@deftp Tunable glibc.tune.x86_shared_cache_size -The @code{glibc.tune.x86_shared_cache_size} tunable allows the user to +@deftp Tunable glibc.cpu.x86_shared_cache_size +The @code{glibc.cpu.x86_shared_cache_size} tunable allows the user to set shared cache size in bytes for use in memory and string routines. @end deftp -@deftp Tunable glibc.tune.x86_non_temporal_threshold -The @code{glibc.tune.x86_non_temporal_threshold} tunable allows the user +@deftp Tunable glibc.cpu.x86_non_temporal_threshold +The @code{glibc.cpu.x86_non_temporal_threshold} tunable allows the user to set threshold in bytes for non temporal store. This tunable is specific to i386 and x86-64. diff --git a/sysdeps/aarch64/dl-tunables.list b/sysdeps/aarch64/dl-tunables.list index f6a88168cc..cfcf940ebd 100644 --- a/sysdeps/aarch64/dl-tunables.list +++ b/sysdeps/aarch64/dl-tunables.list @@ -17,8 +17,8 @@ # <http://www.gnu.org/licenses/>. glibc { - tune { - cpu { + cpu { + name { type: STRING } } diff --git a/sysdeps/powerpc/cpu-features.c b/sysdeps/powerpc/cpu-features.c index 955d4778a6..ad809b9815 100644 --- a/sysdeps/powerpc/cpu-features.c +++ b/sysdeps/powerpc/cpu-features.c @@ -30,7 +30,7 @@ init_cpu_features (struct cpu_features *cpu_features) tunables is enable, since for this case user can explicit disable unaligned optimizations. */ #if HAVE_TUNABLES - int32_t cached_memfunc = TUNABLE_GET (glibc, tune, cached_memopt, int32_t, + int32_t cached_memfunc = TUNABLE_GET (glibc, cpu, cached_memopt, int32_t, NULL); cpu_features->use_cached_memopt = (cached_memfunc > 0); #else diff --git a/sysdeps/powerpc/dl-tunables.list b/sysdeps/powerpc/dl-tunables.list index d26636a16b..b3372555f7 100644 --- a/sysdeps/powerpc/dl-tunables.list +++ b/sysdeps/powerpc/dl-tunables.list @@ -17,7 +17,7 @@ # <http://www.gnu.org/licenses/>. 
glibc { - tune { + cpu { cached_memopt { type: INT_32 minval: 0 diff --git a/sysdeps/unix/sysv/linux/aarch64/cpu-features.c b/sysdeps/unix/sysv/linux/aarch64/cpu-features.c index 39eba0186f..b4f348509e 100644 --- a/sysdeps/unix/sysv/linux/aarch64/cpu-features.c +++ b/sysdeps/unix/sysv/linux/aarch64/cpu-features.c @@ -57,7 +57,7 @@ init_cpu_features (struct cpu_features *cpu_features) #if HAVE_TUNABLES /* Get the tunable override. */ - const char *mcpu = TUNABLE_GET (glibc, tune, cpu, const char *, NULL); + const char *mcpu = TUNABLE_GET (glibc, cpu, name, const char *, NULL); if (mcpu != NULL) midr = get_midr_from_mcpu (mcpu); #endif diff --git a/sysdeps/x86/cpu-features.c b/sysdeps/x86/cpu-features.c index d41ebde823..b8bef8d54b 100644 --- a/sysdeps/x86/cpu-features.c +++ b/sysdeps/x86/cpu-features.c @@ -22,7 +22,7 @@ #include <libc-pointer-arith.h> #if HAVE_TUNABLES -# define TUNABLE_NAMESPACE tune +# define TUNABLE_NAMESPACE cpu # include <unistd.h> /* Get STDOUT_FILENO for _dl_printf. */ # include <elf/dl-tunables.h> @@ -398,7 +398,7 @@ no_cpuid: /* Reuse dl_platform, dl_hwcap and dl_hwcap_mask for x86. */ #if !HAVE_TUNABLES && defined SHARED - /* The glibc.tune.hwcap_mask tunable is initialized already, so no need to do + /* The glibc.cpu.hwcap_mask tunable is initialized already, so no need to do this. 
*/ GLRO(dl_hwcap_mask) = HWCAP_IMPORTANT; #endif diff --git a/sysdeps/x86/cpu-features.h b/sysdeps/x86/cpu-features.h index 624e681e96..e9713f6215 100644 --- a/sysdeps/x86/cpu-features.h +++ b/sysdeps/x86/cpu-features.h @@ -141,7 +141,7 @@ struct cpu_features unsigned long int xsave_state_size; /* The full state size for XSAVE when XSAVEC is disabled by - GLIBC_TUNABLES=glibc.tune.hwcaps=-XSAVEC_Usable + GLIBC_TUNABLES=glibc.cpu.hwcaps=-XSAVEC_Usable */ unsigned int xsave_state_full_size; unsigned int feature[FEATURE_INDEX_MAX]; diff --git a/sysdeps/x86/cpu-tunables.c b/sysdeps/x86/cpu-tunables.c index af761dcbbc..c38af71b8a 100644 --- a/sysdeps/x86/cpu-tunables.c +++ b/sysdeps/x86/cpu-tunables.c @@ -17,7 +17,7 @@ <http://www.gnu.org/licenses/>. */ #if HAVE_TUNABLES -# define TUNABLE_NAMESPACE tune +# define TUNABLE_NAMESPACE cpu # include <stdbool.h> # include <stdint.h> # include <unistd.h> /* Get STDOUT_FILENO for _dl_printf. */ @@ -116,7 +116,7 @@ TUNABLE_CALLBACK (set_hwcaps) (tunable_val_t *valp) the hardware which wasn't available when the selection was made. The environment variable: - GLIBC_TUNABLES=glibc.tune.hwcaps=-xxx,yyy,-zzz,.... + GLIBC_TUNABLES=glibc.cpu.hwcaps=-xxx,yyy,-zzz,.... can be used to enable CPU/ARCH feature yyy, disable CPU/ARCH feature yyy and zzz, where the feature name is case-sensitive and has to diff --git a/sysdeps/x86/dl-tunables.list b/sysdeps/x86/dl-tunables.list index 7c3236a68f..9a5a0b1a63 100644 --- a/sysdeps/x86/dl-tunables.list +++ b/sysdeps/x86/dl-tunables.list @@ -17,7 +17,7 @@ # <http://www.gnu.org/licenses/>. glibc { - tune { + cpu { hwcaps { type: STRING } diff --git a/sysdeps/x86_64/Makefile b/sysdeps/x86_64/Makefile index 9f1562f1b2..d51cf03ac9 100644 --- a/sysdeps/x86_64/Makefile +++ b/sysdeps/x86_64/Makefile @@ -57,7 +57,7 @@ modules-names += x86_64/tst-x86_64mod-1 LDFLAGS-tst-x86_64mod-1.so = -Wl,-soname,tst-x86_64mod-1.so ifneq (no,$(have-tunables)) # Test the state size for XSAVE when XSAVEC is disabled. 
-tst-x86_64-1-ENV = GLIBC_TUNABLES=glibc.tune.hwcaps=-XSAVEC_Usable +tst-x86_64-1-ENV = GLIBC_TUNABLES=glibc.cpu.hwcaps=-XSAVEC_Usable endif $(objpfx)tst-x86_64-1: $(objpfx)x86_64/tst-x86_64mod-1.so @@ -74,7 +74,7 @@ $(objpfx)tst-platform-1.out: $(objpfx)x86_64/tst-platformmod-2.so # Turn off AVX512F_Usable and AVX2_Usable so that GLRO(dl_platform) is # always set to x86_64. tst-platform-1-ENV = LD_PRELOAD=$(objpfx)\$$PLATFORM/tst-platformmod-2.so \ - GLIBC_TUNABLES=glibc.tune.hwcaps=-AVX512F_Usable,-AVX2_Usable + GLIBC_TUNABLES=glibc.cpu.hwcaps=-AVX512F_Usable,-AVX2_Usable endif tests += tst-audit3 tst-audit4 tst-audit5 tst-audit6 tst-audit7 \
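For scripts that still set the old names, the rename described in the NEWS entry can be sketched as a small string rewrite.  This is a hypothetical user-side helper, not part of the patch or of glibc: rename_tunable is an invented name, and it handles a single "name" or "name=value" item rather than a full colon-separated GLIBC_TUNABLES string.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Rewrite one tunable item from the old glibc.tune namespace to the new
   glibc.cpu one.  glibc.tune.cpu is special-cased because it became
   glibc.cpu.name.  Returns 0 on success, -1 if OUT is too small.  */
static int
rename_tunable (const char *item, char *out, size_t outsz)
{
  static const char old_ns[] = "glibc.tune.";
  static const char old_cpu[] = "glibc.tune.cpu";
  size_t name_len = strcspn (item, "=");
  int n;

  assert (out != NULL && outsz > 0);
  if (name_len == sizeof old_cpu - 1
      && strncmp (item, old_cpu, name_len) == 0)
    n = snprintf (out, outsz, "glibc.cpu.name%s", item + name_len);
  else if (strncmp (item, old_ns, sizeof old_ns - 1) == 0)
    n = snprintf (out, outsz, "glibc.cpu.%s", item + sizeof old_ns - 1);
  else
    n = snprintf (out, outsz, "%s", item);

  return (n >= 0 && (size_t) n < outsz) ? 0 : -1;
}
```

Items outside the old namespace pass through unchanged, so the helper can be applied to every item of an existing setting.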