Message ID | 20240814233534.1469084-3-siddhesh@sourceware.org |
---|---|
State | New |
Headers | show |
Series | ungetc fixes for BZ #27821 | expand |
On Aug 14 2024, Siddhesh Poyarekar wrote: > When ungetc is called on an unused stream, the backup buffer is > allocated without the main get area being present. This results in two > every subsequent ungetc (as the stream remains in the backup area) I think you want to remove "two" here.
On 2024-08-15 03:15, Andreas Schwab wrote: > On Aug 14 2024, Siddhesh Poyarekar wrote: > >> When ungetc is called on an unused stream, the backup buffer is >> allocated without the main get area being present. This results in two >> every subsequent ungetc (as the stream remains in the backup area) > > I think you want to remove "two" here. > Fixed commit message, is the change OK otherwise? Thanks, Sid
On 8/14/24 7:35 PM, Siddhesh Poyarekar wrote: > When ungetc is called on an unused stream, the backup buffer is > allocated without the main get area being present. This results in two > every subsequent ungetc (as the stream remains in the backup area) > checking uninitialized memory in the backup buffer when trying to put a > character back into the stream. > > Avoid comparing the input character with buffer contents when in backup > to avoid this uninitialized read. The uninitialized read is harmless in > this context since the location is promptly overwritten with the input > character, thus fulfilling ungetc functionality. > > Also adjust wording in the manual to drop the paragraph that says glibc > cannot do multiple ungetc back to back since with this change, ungetc > can actually do this. > > Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org> LGTM. Reviewed-by: Carlos O'Donell <carlos@redhat.com> > --- > libio/genops.c | 2 +- > manual/stdio.texi | 8 +++----- > stdio-common/tst-ungetc.c | 2 ++ > 3 files changed, 6 insertions(+), 6 deletions(-) > > diff --git a/libio/genops.c b/libio/genops.c > index 99f5e80f20..b012fa33d2 100644 > --- a/libio/genops.c > +++ b/libio/genops.c > @@ -662,7 +662,7 @@ _IO_sputbackc (FILE *fp, int c) > { > int result; > > - if (fp->_IO_read_ptr > fp->_IO_read_base > + if (fp->_IO_read_ptr > fp->_IO_read_base && !_IO_in_backup (fp) OK. This now *MATCHES* the conditional in _IO_default_pbackfail. We avoid reading from _IO_read_ptr[-1] if _IO_in_backup (fp). If _IO_in_backup (fp) then we should be looking at writing the character directly into the backup area rather than reading from it. > && (unsigned char)fp->_IO_read_ptr[-1] == (unsigned char)c) > { > fp->_IO_read_ptr--; > diff --git a/manual/stdio.texi b/manual/stdio.texi > index 8517653507..92614775fa 100644 > --- a/manual/stdio.texi > +++ b/manual/stdio.texi > @@ -1467,11 +1467,9 @@ program; usually @code{ungetc} is used only to unread a character that > was just read from the same stream. @Theglibc{} supports this > even on files opened in binary mode, but other systems might not. > > -@Theglibc{} only supports one character of pushback---in other > -words, it does not work to call @code{ungetc} twice without doing input > -in between. Other systems might let you push back multiple characters; > -then reading from the stream retrieves the characters in the reverse > -order that they were pushed. > +@Theglibc{} supports pushing back multiple characters; subsequently > +reading from the stream retrieves the characters in the reverse order > +that they were pushed. > > Pushing back characters doesn't alter the file; only the internal > buffering for the stream is affected. If a file positioning function > diff --git a/stdio-common/tst-ungetc.c b/stdio-common/tst-ungetc.c > index 5c808f0734..388b202493 100644 > --- a/stdio-common/tst-ungetc.c > +++ b/stdio-common/tst-ungetc.c > @@ -48,6 +48,8 @@ do_test (void) > TEST_VERIFY_EXIT (getc (fp) == 'b'); > TEST_VERIFY_EXIT (getc (fp) == 'l'); > TEST_VERIFY_EXIT (ungetc ('m', fp) == 'm'); > + TEST_VERIFY_EXIT (ungetc ('n', fp) == 'n'); > + TEST_VERIFY_EXIT (getc (fp) == 'n'); > TEST_VERIFY_EXIT (getc (fp) == 'm'); OK. Adds test case for 2 character ungetc and read back via getc. > TEST_VERIFY_EXIT ((c = getc (fp)) == 'a'); > TEST_VERIFY_EXIT (getc (fp) == EOF);
diff --git a/libio/genops.c b/libio/genops.c index 99f5e80f20..b012fa33d2 100644 --- a/libio/genops.c +++ b/libio/genops.c @@ -662,7 +662,7 @@ _IO_sputbackc (FILE *fp, int c) { int result; - if (fp->_IO_read_ptr > fp->_IO_read_base + if (fp->_IO_read_ptr > fp->_IO_read_base && !_IO_in_backup (fp) && (unsigned char)fp->_IO_read_ptr[-1] == (unsigned char)c) { fp->_IO_read_ptr--; diff --git a/manual/stdio.texi b/manual/stdio.texi index 8517653507..92614775fa 100644 --- a/manual/stdio.texi +++ b/manual/stdio.texi @@ -1467,11 +1467,9 @@ program; usually @code{ungetc} is used only to unread a character that was just read from the same stream. @Theglibc{} supports this even on files opened in binary mode, but other systems might not. -@Theglibc{} only supports one character of pushback---in other -words, it does not work to call @code{ungetc} twice without doing input -in between. Other systems might let you push back multiple characters; -then reading from the stream retrieves the characters in the reverse -order that they were pushed. +@Theglibc{} supports pushing back multiple characters; subsequently +reading from the stream retrieves the characters in the reverse order +that they were pushed. Pushing back characters doesn't alter the file; only the internal buffering for the stream is affected. If a file positioning function diff --git a/stdio-common/tst-ungetc.c b/stdio-common/tst-ungetc.c index 5c808f0734..388b202493 100644 --- a/stdio-common/tst-ungetc.c +++ b/stdio-common/tst-ungetc.c @@ -48,6 +48,8 @@ do_test (void) TEST_VERIFY_EXIT (getc (fp) == 'b'); TEST_VERIFY_EXIT (getc (fp) == 'l'); TEST_VERIFY_EXIT (ungetc ('m', fp) == 'm'); + TEST_VERIFY_EXIT (ungetc ('n', fp) == 'n'); + TEST_VERIFY_EXIT (getc (fp) == 'n'); TEST_VERIFY_EXIT (getc (fp) == 'm'); TEST_VERIFY_EXIT ((c = getc (fp)) == 'a'); TEST_VERIFY_EXIT (getc (fp) == EOF);
When ungetc is called on an unused stream, the backup buffer is allocated without the main get area being present. This results in two every subsequent ungetc (as the stream remains in the backup area) checking uninitialized memory in the backup buffer when trying to put a character back into the stream. Avoid comparing the input character with buffer contents when in backup to avoid this uninitialized read. The uninitialized read is harmless in this context since the location is promptly overwritten with the input character, thus fulfilling ungetc functionality. Also adjust wording in the manual to drop the paragraph that says glibc cannot do multiple ungetc back to back since with this change, ungetc can actually do this. Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org> --- libio/genops.c | 2 +- manual/stdio.texi | 8 +++----- stdio-common/tst-ungetc.c | 2 ++ 3 files changed, 6 insertions(+), 6 deletions(-)