Message ID | 7bd09b7b-bbd4-ac32-02d5-687ec5e01986@redhat.com |
---|---|
State | New |
Headers | show |
Series | stdio-common: Add test for vfscanf with matches longer than INT_MAX [BZ #27650] | expand |
* Maciej W. Rozycki: > +/* Produce a stream of more than MAX_INT characters to stdout of which > + none is the new line character. This is executed as a subprocess > + and the caller wants a void callee, upon the return from which the > + process will terminate successfully, so in the case of a failure we > + need to explicitly call exit with the failure status. */ > + > +static void > +do_write (void *arg) > +{ > + static const char s[] = { [0 ... 4095] = 'a' }; > + size_t i; > + > + for (i = 0; i <= INT_MAX / sizeof (s); i++) > + if (fwrite (s, 1, sizeof (s), stdout) != sizeof (s)) > + { > + int err = errno; > + > + /* Close our stdout so that there's no risk for us to block > + while `fscanf' is waiting on our stdout in `do_read' and > + nothing checking our stderr. If closing has failed, then > + refrain from reporting anything, for the same reason. */ > + if (fclose (stdout) == 0) > + error (0, err, "%s: fwrite: output error", __func__); > + exit (EXIT_FAILURE); > + } Is there a reason for using stdout or stream I/O for writing the file? You could use create_temp_file from <support/temp_file.h> and write directly to the file descriptor. It may be faster to use <support/blob_repeat.h> to create a string in memory because it uses alias mappings, and parse that string using sscanf or fmemopen+vfscanf. Thanks, Florian
On Fri, 7 Jun 2024, Florian Weimer wrote: > * Maciej W. Rozycki: > > > +/* Produce a stream of more than MAX_INT characters to stdout of which > > + none is the new line character. This is executed as a subprocess > > + and the caller wants a void callee, upon the return from which the > > + process will terminate successfully, so in the case of a failure we > > + need to explicitly call exit with the failure status. */ > > + > > +static void > > +do_write (void *arg) > > +{ > > + static const char s[] = { [0 ... 4095] = 'a' }; > > + size_t i; > > + > > + for (i = 0; i <= INT_MAX / sizeof (s); i++) > > + if (fwrite (s, 1, sizeof (s), stdout) != sizeof (s)) > > + { > > + int err = errno; > > + > > + /* Close our stdout so that there's no risk for us to block > > + while `fscanf' is waiting on our stdout in `do_read' and > > + nothing checking our stderr. If closing has failed, then > > + refrain from reporting anything, for the same reason. */ > > + if (fclose (stdout) == 0) > > + error (0, err, "%s: fwrite: output error", __func__); > > + exit (EXIT_FAILURE); > > + } > > Is there a reason for using stdout or stream I/O for writing the file? > You could use create_temp_file from <support/temp_file.h> and write > directly to the file descriptor. A pipe is used because over 2GiB of data has to be transferred. It could be a bit of a stress for the target board if such a large amount was to be actually stored in a filesystem (even in the presence of LFS). There may be limited storage available too. My choice to use the `fwrite' interface over raw `write' was mostly symmetry with the rest of code and also less hassle in handling, as with `fwrite' you don't have to take care of partial writes. Besides, `stdout' is readily available, there's no need for an extra library call (and an error to handle) to extract the underlying file descriptor. Not a big deal overall, just a matter of style. > It may be faster to use <support/blob_repeat.h> to create a string in > memory because it uses alias mappings, and parse that string using > sscanf or fmemopen+vfscanf. Neat! However with 32-bit targets the size of the allocation required may exceed the limit of the user VM supported by hardware or the OS (some targets give room for manoeuvre to the OS as to the user/kernel VM split while other ones have it hardwired). As a quick check I have run stdlib/tst-strtod-overflow with my 74Kf target and, lo and behold: UNSUPPORTED: stdlib/tst-strtod-overflow original exit status 77 warning: memory allocation failed, cannot test for overflow And I think we do want to have coverage for scanning INT_MAX+ characters especially with 32-bit targets, where it may be hitting more than just the limit of the `int' type and the specific issue reported with BZ #27650. Do my answers address your concerns? Maciej
On Fri, 7 Jun 2024, Maciej W. Rozycki wrote: > Complement commit b03e4d7bd25b ("stdio: fix vfscanf with matches longer > than INT_MAX (bug 27650)") and add a test case for the issue, inspired > by the reproducer provided with the bug report. Ping for: <https://sourceware.org/pipermail/libc-alpha/2024-June/157283.html>, <https://patchwork.sourceware.org/project/glibc/patch/7bd09b7b-bbd4-ac32-02d5-687ec5e01986@redhat.com/>. Maciej
On Fri, 7 Jun 2024, Maciej W. Rozycki wrote: > Complement commit b03e4d7bd25b ("stdio: fix vfscanf with matches longer > than INT_MAX (bug 27650)") and add a test case for the issue, inspired > by the reproducer provided with the bug report. Ping for: <https://sourceware.org/pipermail/libc-alpha/2024-June/157283.html>, <https://patchwork.sourceware.org/project/glibc/patch/7bd09b7b-bbd4-ac32-02d5-687ec5e01986@redhat.com/>. Maciej
* Maciej W. Rozycki: >> It may be faster to use <support/blob_repeat.h> to create a string in >> memory because it uses alias mappings, and parse that string using >> sscanf or fmemopen+vfscanf. > > Neat! However with 32-bit targets the size of the allocation required > may exceed the limit of the user VM supported by hardware or the OS (some > targets give room for manoeuvre to the OS as to the user/kernel VM split > while other ones have it hardwired). As a quick check I have run > stdlib/tst-strtod-overflow with my 74Kf target and, lo and behold: > > UNSUPPORTED: stdlib/tst-strtod-overflow > original exit status 77 > warning: memory allocation failed, cannot test for overflow > > And I think we do want to have coverage for scanning INT_MAX+ characters > especially with 32-bit targets, where it may be hitting more than just the > limit of the `int' type and the specific issue reported with BZ #27650. Fair enough. > Do my answers address your concerns? I think the test case is still unnecessarily complicated. You can use fopencookie to generate the data, or at least use threads to simplify the error handling considerably. And please do not reuse stdout for data generation. Given how our test error reporting works (we write to stdout), fclose (stdout) is a bit of a debugger trap. Thanks, Florian
On Mon, 1 Jul 2024, Florian Weimer wrote: > > Do my answers address your concerns? > > I think the test case is still unnecessarily complicated. You can use > fopencookie to generate the data, or at least use threads to simplify > the error handling considerably. And please do not reuse stdout for > data generation. Given how our test error reporting works (we write to > stdout), fclose (stdout) is a bit of a debugger trap. But that `fclose (stdout)' is called in the child only, for the stream associated with the pipe connecting with the parent, which in turn is a consequence of how our own `support_subprocess' has been implemented -- not my choice. This is how unnamed pipes work and this stream or the underlying pipe do not communicate with the test harness and neither have anything to do with the parent's stdout (and likewise stderr). And I don't think using threads would make the test case simpler. In fact threading is complex by itself, so making the test case threaded would only complicate things. However the idea of using `fopencookie' (what a weird name!), which is a GNU extension I have not come across before, seems attractive to me. It will remove the need to use a subprocess, likely improve performance a little, as there'll be no need to cross the kernel boundary anymore, and will greatly simplify handling overall. Thanks for the suggestion, I'll be posting v2 shortly. Thank you for your review. Maciej
Index: glibc/stdio-common/Makefile =================================================================== --- glibc.orig/stdio-common/Makefile +++ glibc/stdio-common/Makefile @@ -244,6 +244,7 @@ tests := \ tst-scanf-binary-c23 \ tst-scanf-binary-gnu11 \ tst-scanf-binary-gnu89 \ + tst-scanf-bz27650 \ tst-scanf-intn \ tst-scanf-round \ tst-scanf-to_inpunct \ @@ -314,6 +315,7 @@ generated += \ tst-printf-fp-free.mtrace \ tst-printf-fp-leak-mem.out \ tst-printf-fp-leak.mtrace \ + tst-scanf-bz27650.mtrace \ tst-vfprintf-width-prec-mem.out \ tst-vfprintf-width-prec.mtrace \ # generated Index: glibc/stdio-common/tst-scanf-bz27650.c =================================================================== --- /dev/null +++ glibc/stdio-common/tst-scanf-bz27650.c @@ -0,0 +1,216 @@ +/* Test for BZ #27650, formatted input matching beyond MAX_INT. + Copyright (C) 2024 Free Software Foundation, Inc. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <https://www.gnu.org/licenses/>. */ + +#include <array_length.h> +#include <error.h> +#include <errno.h> +#include <fcntl.h> +#include <limits.h> +#include <mcheck.h> +#include <poll.h> +#include <stdbool.h> +#include <stdio.h> +#include <stdlib.h> + +#include <support/subprocess.h> +#include <support/test-driver.h> + +/* Produce a stream of more than MAX_INT characters to stdout of which + none is the new line character. This is executed as a subprocess + and the caller wants a void callee, upon the return from which the + process will terminate successfully, so in the case of a failure we + need to explicitly call exit with the failure status. */ + +static void +do_write (void *arg) +{ + static const char s[] = { [0 ... 4095] = 'a' }; + size_t i; + + for (i = 0; i <= INT_MAX / sizeof (s); i++) + if (fwrite (s, 1, sizeof (s), stdout) != sizeof (s)) + { + int err = errno; + + /* Close our stdout so that there's no risk for us to block + while `fscanf' is waiting on our stdout in `do_read' and + nothing checking our stderr. If closing has failed, then + refrain from reporting anything, for the same reason. */ + if (fclose (stdout) == 0) + error (0, err, "%s: fwrite: output error", __func__); + exit (EXIT_FAILURE); + } +} + +/* Consume a stream of more than MAX_INT characters from IN of which + none is the new line character. The call to fscanf is supposed + to complete upon the EOF condition on IN, however in the presence + of BZ #27650 it will terminate prematurely with characters still + outstanding in IN. Diagnose the condition and return status + accordingly. */ + +static int +do_read (FILE *in) +{ + int v; + + v = fscanf (in, "%*[^\n]"); + if (v == EOF || errno != 0) + { + error (0, errno, "%s: fscanf: input failure", __func__); + return EXIT_FAILURE; + } + + if (!feof (in)) + { + v = fgetc (in); + if (v == EOF) + error (0, errno, "%s: fgetc: input failure", __func__); + else if (v == '\n') + error (0, 0, "%s: unexpected new line character received", __func__); + else + error (0, 0, + "%s: character received after end of file expected: \\x%02x", + __func__, v); + return EXIT_FAILURE; + } + + return EXIT_SUCCESS; +} + +/* Run do_write in a subprocess and communicate its output produced to + stdout via a pipe to do_read. Upon completion of do_read consume + any outstanding input from do_write and report any issues. Return + success or failure based on the status of the subprocess and ours. */ + +int +do_test (void) +{ + struct support_subprocess target; + FILE *chdout, *chderr; + int chdstatus; + int status; + + mtrace (); + + target = support_subprocess (do_write, NULL); + chdout = fdopen (target.stdout_pipe[0], "r"); + if (chdout == NULL) + { + error (0, errno, "fdopen"); + status = EXIT_FAILURE; + } + else + status = do_read (chdout); + + /* Switch the pipes to the non-blocking mode to make sure do_write + does not lock up waiting output and consume any outstanding input + received. Discard any output from do_write's stdout and pass any + output from do_write's stderr along to our stderr. */ + if (fcntl (target.stdout_pipe[0], F_SETFL, O_NONBLOCK) == -1 + || fcntl (target.stderr_pipe[0], F_SETFL, O_NONBLOCK) == -1) + { + error (0, errno, "fcntl (F_SETFL)"); + status = EXIT_FAILURE; + } + else if (chdout != NULL) + { + chderr = fdopen (target.stderr_pipe[0], "r"); + if (chderr != NULL) + { + struct pollfd fds[] = + { { .fd = target.stderr_pipe[0], .events = POLLIN }, + { .fd = target.stdout_pipe[0], .events = POLLIN } }; + FILE *ss[array_length (fds)][2] = + { { chderr, stderr }, { chdout } }; + int pollstatus; + size_t i; + + while ((pollstatus = poll (fds, array_length (fds), -1)) >= 0) + { + bool stop; + + stop = false; + for (i = 0; i < array_length (fds); i++) + { + char buf[1024]; + char *s; + + if (fds[i].revents & POLLERR) + fds[i].fd = -1; + else if (fds[i].revents & POLLIN) + do + { + s = fgets (buf, sizeof (buf), ss[i][0]); + if (s != NULL) + { + if (ss[i][1] != NULL + && fputs (buf, ss[i][1]) == EOF) + { + error (0, errno, "fputs"); + status = EXIT_FAILURE; + stop = true; + } + } + else if (errno == EAGAIN) + clearerr (chderr); + else + { + error (0, errno, "fgets"); + status = EXIT_FAILURE; + stop = true; + } + } + while (s != NULL); + else if (fds[i].revents & POLLHUP) + fds[i].fd = -1; + } + if (stop) + break; + + stop = true; + for (i = 0; i < array_length (fds); i++) + if (fds[i].fd >= 0) + { + stop = false; + break; + } + if (stop) + break; + } + if (pollstatus < 0) + { + error (0, errno, "poll"); + status = EXIT_FAILURE; + } + } + } + + /* Combine our subprocess's status and intended ours. Only succeed + if both are good. */ + chdstatus = support_process_wait (&target); + if (status == EXIT_SUCCESS && WIFEXITED (chdstatus)) + return WEXITSTATUS (chdstatus); + else if (status != EXIT_SUCCESS) + return status; + else + return EXIT_FAILURE; +} + +#define TIMEOUT (DEFAULT_TIMEOUT * 8) +#include <support/test-driver.c>