Message ID | 20240617162520.4045016-1-cleger@rivosinc.com |
---|---|
State | New |
Headers | show |
Series | tap: use close_range() when forking scripts and helper | expand |
On Mon, 17 Jun 2024 at 17:25, Clément Léger <cleger@rivosinc.com> wrote: > > Since commit 03e471c41d8b ("qemu_init: increase NOFILE soft limit on > POSIX"), the maximum number of file descriptors that can be opened are > raised to nofile.rlim_max. On recent debian distro, this yield a maximum > of 1073741816 file descriptors. Now, when forking to start > qemu-bridge-helper, this actually calls close() on the full possible file > descriptor range (more precisely [3 - sysconf(_SC_OPEN_MAX)]) which > takes a considerable amount of time. Use close_range() which only > requires to be called twice and factorize it in a separate function for > both call sites. > > Signed-off-by: Clément Léger <cleger@rivosinc.com> > --- > net/tap.c | 25 +++++++++++++------------ > 1 file changed, 13 insertions(+), 12 deletions(-) > > diff --git a/net/tap.c b/net/tap.c > index 51f7aec39d..6f5bf06bb5 100644 > --- a/net/tap.c > +++ b/net/tap.c > @@ -385,6 +385,17 @@ static TAPState *net_tap_fd_init(NetClientState *peer, > return s; > } > > +static void fork_close_all_fds_except(int fd) > +{ > + int open_max = sysconf(_SC_OPEN_MAX); > + > + if (fd > 3) > + close_range(3, fd - 1, 0); > + > + if (fd < open_max) > + close_range(fd + 1, open_max, 0); > +} We can't rely on close_range() being available, I think -- this should be ifdeffed with CONFIG_CLOSE_RANGE, with a fallback if it's not there. Also, QEMU coding style requires braces on all if() statements, even single-line ones. thanks -- PMM
On 17/06/2024 18:36, Peter Maydell wrote: > On Mon, 17 Jun 2024 at 17:25, Clément Léger <cleger@rivosinc.com> wrote: >> >> Since commit 03e471c41d8b ("qemu_init: increase NOFILE soft limit on >> POSIX"), the maximum number of file descriptors that can be opened are >> raised to nofile.rlim_max. On recent debian distro, this yield a maximum >> of 1073741816 file descriptors. Now, when forking to start >> qemu-bridge-helper, this actually calls close() on the full possible file >> descriptor range (more precisely [3 - sysconf(_SC_OPEN_MAX)]) which >> takes a considerable amount of time. Use close_range() which only >> requires to be called twice and factorize it in a separate function for >> both call sites. >> >> Signed-off-by: Clément Léger <cleger@rivosinc.com> >> --- >> net/tap.c | 25 +++++++++++++------------ >> 1 file changed, 13 insertions(+), 12 deletions(-) >> >> diff --git a/net/tap.c b/net/tap.c >> index 51f7aec39d..6f5bf06bb5 100644 >> --- a/net/tap.c >> +++ b/net/tap.c >> @@ -385,6 +385,17 @@ static TAPState *net_tap_fd_init(NetClientState *peer, >> return s; >> } >> >> +static void fork_close_all_fds_except(int fd) >> +{ >> + int open_max = sysconf(_SC_OPEN_MAX); >> + >> + if (fd > 3) >> + close_range(3, fd - 1, 0); >> + >> + if (fd < open_max) >> + close_range(fd + 1, open_max, 0); >> +} > > We can't rely on close_range() being available, I think -- > this should be ifdeffed with CONFIG_CLOSE_RANGE, with a > fallback if it's not there. Ok, makes sense. > > Also, QEMU coding style requires braces on all if() statements, > even single-line ones. Sorry for that, noticed it after running checkpatch. I'll send a V2 with these fixes. Thanks, Clément > > thanks > -- PMM
On Mon, Jun 17, 2024 at 06:25:18PM +0200, Clément Léger wrote: > Since commit 03e471c41d8b ("qemu_init: increase NOFILE soft limit on > POSIX"), the maximum number of file descriptors that can be opened are > raised to nofile.rlim_max. On recent debian distro, this yield a maximum > of 1073741816 file descriptors. Now, when forking to start > qemu-bridge-helper, this actually calls close() on the full possible file > descriptor range (more precisely [3 - sysconf(_SC_OPEN_MAX)]) which > takes a considerable amount of time. Use close_range() which only > requires to be called twice and factorize it in a separate function for > both call sites. > > Signed-off-by: Clément Léger <cleger@rivosinc.com> > --- > net/tap.c | 25 +++++++++++++------------ > 1 file changed, 13 insertions(+), 12 deletions(-) > > diff --git a/net/tap.c b/net/tap.c > index 51f7aec39d..6f5bf06bb5 100644 > --- a/net/tap.c > +++ b/net/tap.c > @@ -385,6 +385,17 @@ static TAPState *net_tap_fd_init(NetClientState *peer, > return s; > } > > +static void fork_close_all_fds_except(int fd) > +{ > + int open_max = sysconf(_SC_OPEN_MAX); > + > + if (fd > 3) > + close_range(3, fd - 1, 0); > + > + if (fd < open_max) > + close_range(fd + 1, open_max, 0); > +} We can't assume that 'close_range' exists on all platforms/versions that QEMU targets. In system/async-teardown.c there is close_all_open_fd() that has a fallback path to iterating over /proc, which gives good fallback for Linux. That code doesn't have to deal with non-Linux though. I'd suggest that util/osdep.c needs to have a 'close_all_open_fd()' method that accepts an array of FDs to skip closing of, rather than assuming we always skip STDIO + 1 extra FD. eg int close_all_open_fd(int *skip, int nskip); Could either declare that 'skip' must be sorted, or we can explicitly run qsort() on it. Try native close_range first. If unavailable, then on Linux try /proc, otherwise the simple for() loop. Then use this common helper from both tap.c and asynct-teardown.c > + > static void launch_script(const char *setup_script, const char *ifname, > int fd, Error **errp) > { > @@ -400,13 +411,8 @@ static void launch_script(const char *setup_script, const char *ifname, > return; > } > if (pid == 0) { > - int open_max = sysconf(_SC_OPEN_MAX), i; > + fork_close_all_fds_except(fd); > > - for (i = 3; i < open_max; i++) { > - if (i != fd) { > - close(i); > - } > - } > parg = args; > *parg++ = (char *)setup_script; > *parg++ = (char *)ifname; > @@ -490,16 +496,11 @@ static int net_bridge_run_helper(const char *helper, const char *bridge, > return -1; > } > if (pid == 0) { > - int open_max = sysconf(_SC_OPEN_MAX), i; > char *fd_buf = NULL; > char *br_buf = NULL; > char *helper_cmd = NULL; > > - for (i = 3; i < open_max; i++) { > - if (i != sv[1]) { > - close(i); > - } > - } > + fork_close_all_fds_except(sv[1]); > > fd_buf = g_strdup_printf("%s%d", "--fd=", sv[1]); > > -- > 2.45.2 > > With regards, Daniel
diff --git a/net/tap.c b/net/tap.c index 51f7aec39d..6f5bf06bb5 100644 --- a/net/tap.c +++ b/net/tap.c @@ -385,6 +385,17 @@ static TAPState *net_tap_fd_init(NetClientState *peer, return s; } +static void fork_close_all_fds_except(int fd) +{ + int open_max = sysconf(_SC_OPEN_MAX); + + if (fd > 3) + close_range(3, fd - 1, 0); + + if (fd < open_max) + close_range(fd + 1, open_max, 0); +} + static void launch_script(const char *setup_script, const char *ifname, int fd, Error **errp) { @@ -400,13 +411,8 @@ static void launch_script(const char *setup_script, const char *ifname, return; } if (pid == 0) { - int open_max = sysconf(_SC_OPEN_MAX), i; + fork_close_all_fds_except(fd); - for (i = 3; i < open_max; i++) { - if (i != fd) { - close(i); - } - } parg = args; *parg++ = (char *)setup_script; *parg++ = (char *)ifname; @@ -490,16 +496,11 @@ static int net_bridge_run_helper(const char *helper, const char *bridge, return -1; } if (pid == 0) { - int open_max = sysconf(_SC_OPEN_MAX), i; char *fd_buf = NULL; char *br_buf = NULL; char *helper_cmd = NULL; - for (i = 3; i < open_max; i++) { - if (i != sv[1]) { - close(i); - } - } + fork_close_all_fds_except(sv[1]); fd_buf = g_strdup_printf("%s%d", "--fd=", sv[1]);
Since commit 03e471c41d8b ("qemu_init: increase NOFILE soft limit on POSIX"), the maximum number of file descriptors that can be opened are raised to nofile.rlim_max. On recent debian distro, this yield a maximum of 1073741816 file descriptors. Now, when forking to start qemu-bridge-helper, this actually calls close() on the full possible file descriptor range (more precisely [3 - sysconf(_SC_OPEN_MAX)]) which takes a considerable amount of time. Use close_range() which only requires to be called twice and factorize it in a separate function for both call sites. Signed-off-by: Clément Léger <cleger@rivosinc.com> --- net/tap.c | 25 +++++++++++++------------ 1 file changed, 13 insertions(+), 12 deletions(-)