[SRU,m/n,0/2] arm64: TCP memory leak, slow network

Message ID 20240524152740.176176-1-philip.cox@canonical.com

Message

Philip Cox May 24, 2024, 3:27 p.m. UTC
BugLink: BugLink: https://bugs.launchpad.net/bugs/2045560


SRU Justification:

[Impact]
There is a performance regression in the arm64 TCP code path: the network becomes slow, and memory appears to leak slowly as well.

This was caused by amd64 and arm64 behaving differently in these code paths with respect to IRQ safety.

The regression was introduced in kernel v6.0, and the fix is in v6.9.


[Fix]
The fix is to bring the arm64 behavior in line with the amd64 behavior, and to remove the preempt_disable()/preempt_enable() pairs.

This fix requires two patches from the 6.8.y stable branch:
e830c804e267     net: make SK_MEMORY_PCPU_RESERV tunable
d2fa3493811ec     net: fix sk_memory_allocated_{add|sub} vs softirqs
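
For readers unfamiliar with the pattern, here is a rough userspace sketch of the idea described above. It is not the kernel code. It assumes C11 atomics as a stand-in for IRQ-safe per-CPU operations, models only one CPU's reserve, and every name in it (pcpu_fw_alloc, mem_pcpu_rsv, memory_allocated_add, and so on) is illustrative. The point is that each update to the local reserve is a single indivisible operation, so there is no window for an interrupt to split it and no need for a preempt_disable()/preempt_enable() pair around it. The flush threshold is a plain variable, standing in for a tunable reserve size.

```c
#include <stdatomic.h>

/* Hypothetical userspace stand-ins; not the kernel's data structures. */
static atomic_long memory_allocated;     /* shared, global counter */
static atomic_int  pcpu_fw_alloc;        /* one CPU's local reserve */
static int         mem_pcpu_rsv = 256;   /* assumed tunable threshold, in pages */

/* Move the whole local reserve into the global counter with one exchange,
 * so an update landing between "read" and "clear" cannot be lost. */
static void pcpu_drain(void)
{
	int val = atomic_exchange(&pcpu_fw_alloc, 0);

	if (val)
		atomic_fetch_add(&memory_allocated, val);
}

/* Account val pages: a single atomic add-and-return replaces a separate
 * load/modify/store sequence that an interrupt could otherwise split. */
static void memory_allocated_add(int val)
{
	int now = atomic_fetch_add(&pcpu_fw_alloc, val) + val;

	if (now >= mem_pcpu_rsv)
		pcpu_drain();
}

/* Release val pages, flushing once the local debt reaches the threshold. */
static void memory_allocated_sub(int val)
{
	int now = atomic_fetch_sub(&pcpu_fw_alloc, val) - val;

	if (now <= -mem_pcpu_rsv)
		pcpu_drain();
}

/* Exact total = global counter + whatever is still held locally. */
static long memory_allocated_total(void)
{
	return atomic_load(&memory_allocated) + atomic_load(&pcpu_fw_alloc);
}
```

If the add or sub were instead a plain read, modify, write on the local reserve, an interrupt arriving between the read and the write could have its own update overwritten. The total would then drift over time, which is consistent with the slow leak described in [Impact].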

[Test Plan]
I have tested this patch, as has AWS.

[Where problems could occur]
Because IRQ safety and preemption are now handled differently, future code may conflict with the newly introduced helpers.


--


Adam Li (1):
  net: make SK_MEMORY_PCPU_RESERV tunable

Eric Dumazet (1):
  net: fix sk_memory_allocated_{add|sub} vs softirqs

 Documentation/admin-guide/sysctl/net.rst |  5 +++
 include/net/sock.h                       | 39 +++++++++++++-----------
 net/core/sock.c                          |  1 +
 net/core/sysctl_net_core.c               |  9 ++++++
 4 files changed, 36 insertions(+), 18 deletions(-)

Comments

Tim Gardner May 24, 2024, 4:09 p.m. UTC | #1
Acked-by: Tim Gardner <tim.gardner@canonical.com>

Thibault Ferrante May 24, 2024, 5:29 p.m. UTC | #2
On 24-05-2024 17:27, Philip Cox wrote:
> 
> BugLink: BugLink: https://bugs.launchpad.net/bugs/2045560
'BugLink:' is duplicated here and in the patches; it should be removed before applying.

Acked-by: Thibault Ferrante <thibault.ferrante@canonical.com>

--
Thibault

Stefan Bader May 31, 2024, 1:17 p.m. UTC | #3

Applied to noble,mantic:linux/master-next. Thanks.

-Stefan