[SRU,Jammy,0/1] turbostat fails with too many open files on large systems

Message ID 20240827004228.16253-1-matthew.ruffell@canonical.com

Matthew Ruffell Aug. 27, 2024, 12:42 a.m. UTC
BugLink: https://bugs.launchpad.net/bugs/2069961

[Impact]

On large systems, e.g. with 512 CPUs or more, turbostat fails to run because it
exceeds the rlimit on the number of open files. 512 CPUs require 1028 file
descriptors, but the current limit is 999.

$ lscpu
...
CPU(s):                  512
  On-line CPU(s) list:   0-511
...

$ sudo turbostat
...
turbostat: /sys/devices/system/cpu/cpu477/cpuidle/state0/usage: open failed: Too many open files

There is no workaround, other than possibly using powerstat instead.

[Fix]

The fix raises the file descriptor rlimit so that turbostat can open up to 2^15
files, which should be plenty for some time to come.

commit 3ac1d14d0583a2de75d49a5234d767e2590384dd
Author: Wyes Karny <wyes.karny@amd.com>
Date:   Tue Oct 3 05:07:51 2023 +0000
Subject: tools/power turbostat: Increase the limit for fd opened
Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3ac1d14d0583a2de75d49a5234d767e2590384dd

This landed in 6.9-rc4. Jammy needs a backport with a minor context adjustment
in the first hunk. Noble has already been fixed through upstream stable.

[Testcase]

Deploy a bare metal system with 512 or more CPUs.

Install linux-tools:

$ sudo apt install linux-tools-$(uname -r)

Run turbostat. Without the fix, it fails:

$ sudo turbostat
...
turbostat: /sys/devices/system/cpu/cpu477/cpuidle/state0/usage: open failed: Too many open files

Test kernels are available in the following PPA:

https://launchpad.net/~mruffell/+archive/ubuntu/sf388491-test

With them installed, turbostat should print normal output for all CPUs in the
system.

[Where problems can occur]

We are simply increasing the rlimit for file descriptors that turbostat can
open. This should have no impact on any existing systems.

If a regression occurs, turbostat might fail to run. Users could use powerstat
instead as a workaround until it is fixed.

Wyes Karny (1):
  tools/power turbostat: Increase the limit for fd opened

 tools/power/x86/turbostat/turbostat.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

Comments

Stefan Bader Sept. 6, 2024, 9:17 a.m. UTC | #1
On 27.08.24 02:42, Matthew Ruffell wrote:
> [...]

Acked-by: Stefan Bader <stefan.bader@canonical.com>
Manuel Diewald Sept. 6, 2024, 10:02 a.m. UTC | #2
On Tue, Aug 27, 2024 at 12:42:05PM +1200, Matthew Ruffell wrote:
> [...]

Acked-by: Manuel Diewald <manuel.diewald@canonical.com>
Stefan Bader Sept. 6, 2024, 3:29 p.m. UTC | #3
On 27.08.24 02:42, Matthew Ruffell wrote:
> [...]

Applied to jammy:linux/master-next. Thanks.

-Stefan