Message ID | 20240214180735.702533-1-philip.cox@canonical.com |
---|---|
Headers | show |
Series | performance: mm/percpu-internal.h: Re-layout pcpu_chunk to mitigate false sharing | expand |
On 14.02.24 19:07, Philip Cox wrote: > BugLink: https://bugs.launchpad.net/bugs/2053152 > > SRU Justification: > > [Impact] > Having 'base_addr' in the same cache line as 'free_bytes', 'chunk_md', and > 'base_addr' causes a performance impact on x86_64 systems which can be > observed while running UnixBench/Excel throughput. > > [Fix] > In current pcpu_chunk layout, `base_addr' is in the same cache line with > `free_bytes' and `chunk_md', and `base_addr' is at the last 8 bytes. > This patch moves `bound_map' up to `base_addr', to let `base_addr' locate > in a new cacheline. > > [Test Plan] > I have tested this patch, as has Intel. This patch has been upstreamed since 6.5 as well. > > [Where problems could occur] > The risk for regression is minimal as there should be no logic change to the code, and just > a struct is being ordered differently. This change has been upstreamed for a while with no issues. > Changed main task of bug report to fix-released since proposed patch is upstream as of v6.5-rc1 (which also means included in Mantic). Acked-by: Stefan Bader <stefan.bader@canonical.com>
On 24/02/14 01:07PM, Philip Cox wrote: > BugLink: https://bugs.launchpad.net/bugs/2053152 > > SRU Justification: > > [Impact] > Having 'base_addr' in the same cache line as 'free_bytes', 'chunk_md', and > 'base_addr' causes a performance impact on x86_64 systems which can be > observed while running UnixBench/Excel throughput. > > [Fix] > In current pcpu_chunk layout, `base_addr' is in the same cache line with > `free_bytes' and `chunk_md', and `base_addr' is at the last 8 bytes. > This patch moves `bound_map' up to `base_addr', to let `base_addr' locate > in a new cacheline. > > [Test Plan] > I have tested this patch, as has Intel. This patch has been upstreamed since 6.5 as well. > > [Where problems could occur] > The risk for regression is minimal as there should be no logic change to the code, and just > a struct is being ordered differently. This change has been upstreamed for a while with no issues. > > -- > > > Yu Ma (1): > percpu-internal/pcpu_chunk: re-layout pcpu_chunk structure to reduce > false sharing > > mm/percpu-internal.h | 11 +++++++++-- > 1 file changed, 9 insertions(+), 2 deletions(-) Acked-by: Andrei Gherzan <andrei.gherzan@canonical.com>
On 14.02.24 19:07, Philip Cox wrote: > BugLink: https://bugs.launchpad.net/bugs/2053152 > > SRU Justification: > > [Impact] > Having 'base_addr' in the same cache line as 'free_bytes', 'chunk_md', and > 'base_addr' causes a performance impact on x86_64 systems which can be > observed while running UnixBench/Excel throughput. > > [Fix] > In current pcpu_chunk layout, `base_addr' is in the same cache line with > `free_bytes' and `chunk_md', and `base_addr' is at the last 8 bytes. > This patch moves `bound_map' up to `base_addr', to let `base_addr' locate > in a new cacheline. > > [Test Plan] > I have tested this patch, as has Intel. This patch has been upstreamed since 6.5 as well. > > [Where problems could occur] > The risk for regression is minimal as there should be no logic change to the code, and just > a struct is being ordered differently. This change has been upstreamed for a while with no issues. > Applied to jammy:linux/master-next. Thanks. -Stefan
On Wed, 2024-02-14 at 13:07 -0500, Philip Cox wrote: > BugLink: https://bugs.launchpad.net/bugs/2053152 > > SRU Justification: > > [Impact] > Having 'base_addr' in the same cache line as 'free_bytes', > 'chunk_md', and > 'base_addr' causes a performance impact on x86_64 systems which can > be > observed while running UnixBench/Excel throughput. > > [Fix] > In current pcpu_chunk layout, `base_addr' is in the same cache line > with > `free_bytes' and `chunk_md', and `base_addr' is at the last 8 bytes. > This patch moves `bound_map' up to `base_addr', to let `base_addr' > locate > in a new cacheline. > > [Test Plan] > I have tested this patch, as has Intel. This patch has been > upstreamed since 6.5 as well. > > [Where problems could occur] > The risk for regression is minimal as there should be no logic change > to the code, and just > a struct is being ordered differently. This change has been > upstreamed for a while with no issues. > > -- > > > Yu Ma (1): > percpu-internal/pcpu_chunk: re-layout pcpu_chunk structure to > reduce > false sharing > > mm/percpu-internal.h | 11 +++++++++-- > 1 file changed, 9 insertions(+), 2 deletions(-) >