Message ID | 20230718041433.217111-1-goldstein.w.n@gmail.com |
---|---|
State | New |
Series | [v1] x86: Fix slight bug in `shared_per_thread` cache size calculation. |
On Mon, Jul 17, 2023 at 11:14 PM Noah Goldstein <goldstein.w.n@gmail.com> wrote:
>
> After:
> ```
> commit af992e7abdc9049714da76cae1e5e18bc4838fb8
> Author: Noah Goldstein <goldstein.w.n@gmail.com>
> Date:   Wed Jun 7 13:18:01 2023 -0500
>
>     x86: Increase `non_temporal_threshold` to roughly `sizeof_L3 / 4`
> ```
>
> Split `shared` (cumulative cache size) from `shared_per_thread` (cache
> size per socket), the `shared_per_thread` *can* be slightly off from
> the previous calculation.
>
> Previously we added `core` even if `threads_l2` was invalid, and only
> used `threads_l2` to divide `core` if it was present. The changed
> version only included `core` if `threads_l2` was valid.
>
> This change restores the old behavior if `threads_l2` is invalid by
> adding the entire value of `core`.
> ---
>  sysdeps/x86/dl-cacheinfo.h | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/sysdeps/x86/dl-cacheinfo.h b/sysdeps/x86/dl-cacheinfo.h
> index c98fa57a7b..43be2c1229 100644
> --- a/sysdeps/x86/dl-cacheinfo.h
> +++ b/sysdeps/x86/dl-cacheinfo.h
> @@ -614,8 +614,8 @@ get_common_cache_info (long int *shared_ptr, long int * shared_per_thread_ptr, u
>    /* Account for non-inclusive L2 and L3 caches.  */
>    if (!inclusive_cache)
>      {
> -      if (threads_l2 > 0)
> -        shared_per_thread += core / threads_l2;
> +      long int core_per_thread = threads_l2 > 0 ? (core / threads_l2) : core;
> +      shared_per_thread += core_per_thread;
>        shared += core;
>      }
>
> --
> 2.34.1
>

Noticed this when working on the AMD patch, but I think the fix should be separate.
diff --git a/sysdeps/x86/dl-cacheinfo.h b/sysdeps/x86/dl-cacheinfo.h
index c98fa57a7b..43be2c1229 100644
--- a/sysdeps/x86/dl-cacheinfo.h
+++ b/sysdeps/x86/dl-cacheinfo.h
@@ -614,8 +614,8 @@ get_common_cache_info (long int *shared_ptr, long int * shared_per_thread_ptr, u
   /* Account for non-inclusive L2 and L3 caches.  */
   if (!inclusive_cache)
     {
-      if (threads_l2 > 0)
-        shared_per_thread += core / threads_l2;
+      long int core_per_thread = threads_l2 > 0 ? (core / threads_l2) : core;
+      shared_per_thread += core_per_thread;
       shared += core;
     }
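
For illustration only, the difference between the two versions of the non-inclusive branch can be sketched as a pair of standalone functions. This is not glibc code: the function names `per_thread_before` and `per_thread_after` are hypothetical, `threads_l2` is assumed to be a plain `int` where a value of 0 or less means "invalid", and the byte sizes used below are made-up inputs rather than measurements.

```c
/* Hypothetical standalone sketch; not part of sysdeps/x86/dl-cacheinfo.h.
   Only the non-inclusive branch touched by the patch is modelled.  */

/* Behavior before this patch: `core` contributes nothing to the
   per-thread size when threads_l2 is invalid (<= 0).  */
long int
per_thread_before (long int shared_per_thread, long int core, int threads_l2)
{
  if (threads_l2 > 0)
    shared_per_thread += core / threads_l2;
  return shared_per_thread;
}

/* Behavior with this patch: fall back to adding the whole of `core`
   when threads_l2 is invalid, otherwise add the per-thread share.  */
long int
per_thread_after (long int shared_per_thread, long int core, int threads_l2)
{
  long int core_per_thread = threads_l2 > 0 ? (core / threads_l2) : core;
  return shared_per_thread + core_per_thread;
}
```

With an assumed starting `shared_per_thread` of 2097152 bytes and a `core` of 1048576 bytes, both functions return 2621440 when `threads_l2` is 2. They differ only when `threads_l2` is 0: the first returns 2097152 and the second returns 3145728, which is the case this patch is meant to restore.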