[0/1,N,linux-aws] Backport x86/kaslr fix impacting ML/HPC workloads

Message ID 20240911180952.21076-1-philip.cox@canonical.com

Message

Philip Cox Sept. 11, 2024, 6:09 p.m. UTC
BugLink: https://bugs.launchpad.net/bugs/2080414

SRU Justification:

[Impact]

EC2 team discovered a Linux bug during ML/HPC workload testing on P5 instances. The issue is described at https://lore.kernel.org/all/87ed6soy3z.ffs@tglx/ ("iounmap() on x86 occasionally fails to unmap because the provided valid ioremap address is not below high_memory.").

[Fix]

This is fixed by upstream commit ea72ce5da22806d5713f3ffb39a6d5ae73841f93:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ea72ce5da22806d5713f3ffb39a6d5ae73841f93

[Test Plan]

AWS tested

[Where problems could occur]

If there was code that made incorrect assumptions on how memory was laid out and used, it might break the functionality, but seeing as this is fixing an error, the chances of that should be fairly low.

[Other Info]
SF #00392654

Comments

Thibault Ferrante Sept. 13, 2024, 12:04 p.m. UTC | #1
Acked-by: Thibault Ferrante <thibault.ferrante@canonical.com>


Nit: This submission and your other one (https://lists.ubuntu.com/archives/kernel-team/2024-September/153679.html)
don't have a consistent subject across the patchset (casing and spacing differ) and lack `[SRU]` in the subject.
As such, they aren't tracked automatically.

Cengiz Can Oct. 4, 2024, 8:36 p.m. UTC | #2
On 11-09-24 14:09:51, Philip Cox wrote:
> -- 
> 
> 
> Thomas Gleixner (1):
>   x86/kaslr: Expose and use the end of the physical memory address space

Acked-by: Cengiz Can <cengiz.can@canonical.com>

> 
>  arch/x86/include/asm/page_64.h          |  1 +
>  arch/x86/include/asm/pgtable_64_types.h |  4 ++++
>  arch/x86/mm/init_64.c                   |  4 ++++
>  arch/x86/mm/kaslr.c                     | 32 ++++++++++++++++++++-----
>  include/linux/mm.h                      |  4 ++++
>  kernel/resource.c                       |  6 ++---
>  mm/memory_hotplug.c                     |  2 +-
>  mm/sparse.c                             |  2 +-
>  8 files changed, 43 insertions(+), 12 deletions(-)
> 
> -- 
> 2.34.1
> 
> 
> -- 
> kernel-team mailing list
> kernel-team@lists.ubuntu.com
> https://lists.ubuntu.com/mailman/listinfo/kernel-team
Philip Cox Oct. 7, 2024, 5:50 p.m. UTC | #3
On 2024-09-11 2:09 p.m., Philip Cox wrote:
> BugLink: https://bugs.launchpad.net/bugs/2080414
>
> SRU Justification:
>
> [Impact]
>
> EC2 team discovered a Linux bug during ML/HPC workload testing on P5 instances. The issue is described at https://lore.kernel.org/all/87ed6soy3z.ffs@tglx/ ("iounmap() on x86 occasionally fails to unmap because the provided valid ioremap address is not below high_memory.").
>
> [Fix]
>
> This is fixed by upstream commit ea72ce5da22806d5713f3ffb39a6d5ae73841f93:
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ea72ce5da22806d5713f3ffb39a6d5ae73841f93
>
> [Test Plan]
>
> AWS tested
>
> [Where problems could occur]
>
> If there was code that made incorrect assumptions on how memory was laid out and used, it might break the functionality, but seeing as this is fixing an error, the chances of that should be fairly low.
>
> [Other Info]
> SF #00392654
>