Message ID | 20241210123004.2073202-1-siddhesh@sourceware.org |
---|---|
State | New |
Headers | show |
Series | [v4] ungetc: Guarantee single char pushback | expand |
On 12/10/24 06:30, Siddhesh Poyarekar wrote: > - int _flags2; > + int _flags2:24; > + char _short_backupbuf[1]; This doesn't create a 3-byte flags2, if that's what you're trying. The underlying storage will still be int-sized, now with 8 unallocated bits. r~
On 2024-12-10 09:03, Richard Henderson wrote: > On 12/10/24 06:30, Siddhesh Poyarekar wrote: >> - int _flags2; >> + int _flags2:24; >> + char _short_backupbuf[1]; > > This doesn't create a 3-byte flags2, if that's what you're trying. > The underlying storage will still be int-sized, now with 8 unallocated > bits. gcc and clang both appear to pack in the struct just fine, which effectively gives flags2 3-bytes storage and _short_backupbuf right next to it. Maybe I've misunderstood the issue you're trying to point out, could you please elaborate? Thanks, Sid
On Wed, 11 Dec 2024, Siddhesh Poyarekar wrote: > > > - int _flags2; > > > + int _flags2:24; > > > + char _short_backupbuf[1]; > > > > This doesn't create a 3-byte flags2, if that's what you're trying. > > The underlying storage will still be int-sized, now with 8 unallocated bits. > > gcc and clang both appear to pack in the struct just fine, which effectively > gives flags2 3-bytes storage and _short_backupbuf right next to it. Maybe > I've misunderstood the issue you're trying to point out, could you please > elaborate? Indeed a bit-field does get packed with an eligible adjacent non-bitfield member of the structure, although ISO C defers storage unit allocation to the implementation and therefore I do believe it is down to the individual platform-specific ABI. Then e.g. the o32 MIPS ABI has this clause[1]: "* Bit-fields can share a storage unit with other struct/union members, including members that are not bit-fields. Of course, struct members occupy different parts of the storage unit." and the updated structure is laid out accordingly: struct _IO_FILE { int _flags; /* 0 4 */ char * _IO_read_ptr; /* 4 4 */ char * _IO_read_end; /* 8 4 */ char * _IO_read_base; /* 12 4 */ char * _IO_write_base; /* 16 4 */ char * _IO_write_ptr; /* 20 4 */ char * _IO_write_end; /* 24 4 */ char * _IO_buf_base; /* 28 4 */ char * _IO_buf_end; /* 32 4 */ char * _IO_save_base; /* 36 4 */ char * _IO_backup_base; /* 40 4 */ char * _IO_save_end; /* 44 4 */ struct _IO_marker * _markers; /* 48 4 */ struct _IO_FILE * _chain; /* 52 4 */ int _fileno; /* 56 4 */ /* XXX 3 bytes hole, try to pack */ /* Bitfield combined with previous fields */ static int _flags2 /* 0: 0 0 */ char _short_backupbuf[1]; /* 63 1 */ __off_t _old_offset; /* 64 4 */ short unsigned int _cur_column; /* 68 2 */ signed char _vtable_offset; /* 70 1 */ char _shortbuf[1]; /* 71 1 */ _IO_lock_t * _lock; /* 72 4 */ /* size: 76, cachelines: 1, members: 21, static members: 1 */ /* sum members: 85, holes: 1, sum holes: 3 */ /* last cacheline: 76 bytes */ /* BRAIN FART ALERT! 76 != 85 + 3(holes), diff = -12 */ }; (although the tool does seem confused a little here). AFAICT from `place_field' in gcc/stor-layout.cc it is the same across all the !TARGET_MS_BITFIELD_LAYOUT_P ABIs GCC supports, there's no provision for a per-target variation. Maciej
On 2024-12-12 07:16, Maciej W. Rozycki wrote: > On Wed, 11 Dec 2024, Siddhesh Poyarekar wrote: > >>>> - int _flags2; >>>> + int _flags2:24; >>>> + char _short_backupbuf[1]; >>> >>> This doesn't create a 3-byte flags2, if that's what you're trying. >>> The underlying storage will still be int-sized, now with 8 unallocated bits. >> >> gcc and clang both appear to pack in the struct just fine, which effectively >> gives flags2 3-bytes storage and _short_backupbuf right next to it. Maybe >> I've misunderstood the issue you're trying to point out, could you please >> elaborate? > > Indeed a bit-field does get packed with an eligible adjacent non-bitfield > member of the structure, although ISO C defers storage unit allocation to > the implementation and therefore I do believe it is down to the individual > platform-specific ABI. > > Then e.g. the o32 MIPS ABI has this clause[1]: > > "* Bit-fields can share a storage unit with other struct/union members, > including members that are not bit-fields. Of course, struct members > occupy different parts of the storage unit." That sounds like it's saying that distinct members cannot share the smallest addressable unit, i.e. a byte, which is fine. > and the updated structure is laid out accordingly: > > struct _IO_FILE { > int _flags; /* 0 4 */ > char * _IO_read_ptr; /* 4 4 */ > char * _IO_read_end; /* 8 4 */ > char * _IO_read_base; /* 12 4 */ > char * _IO_write_base; /* 16 4 */ > char * _IO_write_ptr; /* 20 4 */ > char * _IO_write_end; /* 24 4 */ > char * _IO_buf_base; /* 28 4 */ > char * _IO_buf_end; /* 32 4 */ > char * _IO_save_base; /* 36 4 */ > char * _IO_backup_base; /* 40 4 */ > char * _IO_save_end; /* 44 4 */ > struct _IO_marker * _markers; /* 48 4 */ > struct _IO_FILE * _chain; /* 52 4 */ > int _fileno; /* 56 4 */ > > /* XXX 3 bytes hole, try to pack */ > /* Bitfield combined with previous fields */ > > static int _flags2 /* 0: 0 0 */ Is the "static" a typo that may have confused the tool? Sid
On Thu, 12 Dec 2024, Siddhesh Poyarekar wrote: > > > gcc and clang both appear to pack in the struct just fine, which > > > effectively > > > gives flags2 3-bytes storage and _short_backupbuf right next to it. Maybe > > > I've misunderstood the issue you're trying to point out, could you please > > > elaborate? > > > > Indeed a bit-field does get packed with an eligible adjacent non-bitfield > > member of the structure, although ISO C defers storage unit allocation to > > the implementation and therefore I do believe it is down to the individual > > platform-specific ABI. > > > > Then e.g. the o32 MIPS ABI has this clause[1]: > > > > "* Bit-fields can share a storage unit with other struct/union members, > > including members that are not bit-fields. Of course, struct members > > occupy different parts of the storage unit." > > That sounds like it's saying that distinct members cannot share the smallest > addressable unit, i.e. a byte, which is fine. Obviously struct members cannot overlap, but more importantly it does say non-bitfield members can share a storage unit with bit-fields, e.g. one used for the "int" type, just as bit-field members do between themselves. NB below there is the document reference I forgot to add. > > and the updated structure is laid out accordingly: > > > > struct _IO_FILE { > > int _flags; /* 0 4 */ > > char * _IO_read_ptr; /* 4 4 */ > > char * _IO_read_end; /* 8 4 */ > > char * _IO_read_base; /* 12 4 */ > > char * _IO_write_base; /* 16 4 */ > > char * _IO_write_ptr; /* 20 4 */ > > char * _IO_write_end; /* 24 4 */ > > char * _IO_buf_base; /* 28 4 */ > > char * _IO_buf_end; /* 32 4 */ > > char * _IO_save_base; /* 36 4 */ > > char * _IO_backup_base; /* 40 4 */ > > char * _IO_save_end; /* 44 4 */ > > struct _IO_marker * _markers; /* 48 4 */ > > struct _IO_FILE * _chain; /* 52 4 */ > > int _fileno; /* 56 4 */ > > > > /* XXX 3 bytes hole, try to pack */ > > /* Bitfield combined with previous fields */ > > > > static int _flags2 /* 0: 0 0 */ > > Is the "static" a typo that may have confused the tool? This is all output from the tool used over a glibc object compiled with your patch applied and it's not clear to me why the took has added the "static" keyword to the declaration. It seems related to the offset/width reported on the right hand side all being zero. The 64-bit POWER/LE variant is similar except for another hole reported at the top and also further down, but then for the Alpha there's no hole reported here, the offset/width are all nonzero and there's no "static" keyword: struct _IO_FILE { int _flags; /* 0 4 */ /* XXX 4 bytes hole, try to pack */ char * _IO_read_ptr; /* 8 8 */ char * _IO_read_end; /* 16 8 */ char * _IO_read_base; /* 24 8 */ char * _IO_write_base; /* 32 8 */ char * _IO_write_ptr; /* 40 8 */ char * _IO_write_end; /* 48 8 */ char * _IO_buf_base; /* 56 8 */ char * _IO_buf_end; /* 64 8 */ char * _IO_save_base; /* 72 8 */ char * _IO_backup_base; /* 80 8 */ char * _IO_save_end; /* 88 8 */ struct _IO_marker * _markers; /* 96 8 */ struct _IO_FILE * _chain; /* 104 8 */ int _fileno; /* 112 4 */ int _flags2:24; /* 116: 8 4 */ /* Bitfield combined with next fields */ char _short_backupbuf[1]; /* 119 1 */ __off_t _old_offset; /* 120 8 */ /* --- cacheline 1 boundary (128 bytes) --- */ short unsigned int _cur_column; /* 128 2 */ signed char _vtable_offset; /* 130 1 */ char _shortbuf[1]; /* 131 1 */ /* XXX 4 bytes hole, try to pack */ _IO_lock_t * _lock; /* 136 8 */ /* size: 144, cachelines: 2, members: 22 */ /* sum members: 136, holes: 2, sum holes: 8 */ /* last cacheline: 16 bytes */ }; I take it it comes from slight variations between DWARF information produced -- I have DWARFv2 forced for my Alpha configuration, to satisfy ancient native GDB I have installed on the target machine (there's no `gdbserver' port available yet for Alpha/Linux, so I cannot use a modern cross-debugger instead). I suspect it is a bug/limitation in the tool, which seems a bit picky as well and refuses to produce output for some objects. References: [1] "SYSTEM V APPLICATION BINARY INTERFACE, MIPS RISC Processor Supplement, 3rd Edition", Section "Bit-Fields", p. 3-8 Maciej
On 2024-12-12 08:00, Maciej W. Rozycki wrote: > On Thu, 12 Dec 2024, Siddhesh Poyarekar wrote: > >>>> gcc and clang both appear to pack in the struct just fine, which >>>> effectively >>>> gives flags2 3-bytes storage and _short_backupbuf right next to it. Maybe >>>> I've misunderstood the issue you're trying to point out, could you please >>>> elaborate? >>> >>> Indeed a bit-field does get packed with an eligible adjacent non-bitfield >>> member of the structure, although ISO C defers storage unit allocation to >>> the implementation and therefore I do believe it is down to the individual >>> platform-specific ABI. >>> >>> Then e.g. the o32 MIPS ABI has this clause[1]: >>> >>> "* Bit-fields can share a storage unit with other struct/union members, >>> including members that are not bit-fields. Of course, struct members >>> occupy different parts of the storage unit." >> >> That sounds like it's saying that distinct members cannot share the smallest >> addressable unit, i.e. a byte, which is fine. > > Obviously struct members cannot overlap, but more importantly it does say > non-bitfield members can share a storage unit with bit-fields, e.g. one > used for the "int" type, just as bit-field members do between themselves. Right, which is why I reckon the layout I'm proposing is OK; would you agree? Thanks, Sid
On Thu, 12 Dec 2024, Siddhesh Poyarekar wrote: > > > > "* Bit-fields can share a storage unit with other struct/union members, > > > > including members that are not bit-fields. Of course, struct > > > > members > > > > occupy different parts of the storage unit." > > > > > > That sounds like it's saying that distinct members cannot share the > > > smallest > > > addressable unit, i.e. a byte, which is fine. > > > > Obviously struct members cannot overlap, but more importantly it does say > > non-bitfield members can share a storage unit with bit-fields, e.g. one > > used for the "int" type, just as bit-field members do between themselves. > > Right, which is why I reckon the layout I'm proposing is OK; would you agree? I do and actually I like it; I'll go through your patch in detail soon. Maciej
Hi Sid, Thanks for your effort, this is looking mostly good to me. My main concern is `_IO_free_backup_buf', which I think will perform better as a static inline function. I have some questions as well as to the test functions including a request to add introductory comments for them with the answers. Plus a couple of small nits, as all detailed below. > The C standard requires that ungetc guarantees at least one pushback, so > put a single byte pushback buffer in the FILE struct to enable that. Please mention here why this single byte pushback buffer is needed to fulfil the C standard's requirement (i.e. that we fail to fulfil it now because we use `malloc', which can fail). > Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org> Hmm, please clarify your copyright status. > diff --git a/libio/bits/types/struct_FILE.h b/libio/bits/types/struct_FILE.h > index d8d26639d1..a5e0679de3 100644 > --- a/libio/bits/types/struct_FILE.h > +++ b/libio/bits/types/struct_FILE.h > @@ -1,4 +1,5 @@ > /* Copyright (C) 1991-2024 Free Software Foundation, Inc. > + Copyright The GNU Toolchain Authors. OK. Consistent with DCO (but see above). > @@ -70,7 +71,8 @@ struct _IO_FILE > struct _IO_FILE *_chain; > > int _fileno; > - int _flags2; > + int _flags2:24; > + char _short_backupbuf[1]; OK. Taking advantage of a char member sharing the storage unit with the preceding bit-field, so there's no change in the size of the structure or member offsets. Please add a short description of the new member, just as with most of the existing ones (all should have one IMO). > diff --git a/libio/fileops.c b/libio/fileops.c > index 759d737ec7..d49e489f55 100644 > --- a/libio/fileops.c > +++ b/libio/fileops.c > @@ -1,4 +1,5 @@ > /* Copyright (C) 1993-2024 Free Software Foundation, Inc. > + Copyright The GNU Toolchain Authors. OK. Consistent with DCO (but see above). > @@ -480,7 +481,7 @@ _IO_new_file_underflow (FILE *fp) > /* Maybe we already have a push back pointer. */ > if (fp->_IO_save_base != NULL) > { > - free (fp->_IO_save_base); > + _IO_free_backup_buf (fp, fp->_IO_save_base); OK. Mechanical update. > @@ -932,7 +933,7 @@ _IO_new_file_seekoff (FILE *fp, off64_t offset, int dir, int mode) > /* It could be that we already have a pushback buffer. */ > if (fp->_IO_read_base != NULL) > { > - free (fp->_IO_read_base); > + _IO_free_backup_buf (fp, fp->_IO_read_base); OK. Mechanical update. > @@ -1282,7 +1283,7 @@ _IO_file_xsgetn (FILE *fp, void *data, size_t n) > /* Maybe we already have a push back pointer. */ > if (fp->_IO_save_base != NULL) > { > - free (fp->_IO_save_base); > + _IO_free_backup_buf (fp, fp->_IO_save_base); OK. Mechanical update. > diff --git a/libio/genops.c b/libio/genops.c > index d7e35e67d5..dddd420ee2 100644 > --- a/libio/genops.c > +++ b/libio/genops.c > @@ -1,4 +1,5 @@ > /* Copyright (C) 1993-2024 Free Software Foundation, Inc. > + Copyright The GNU Toolchain Authors. OK. Consistent with DCO (but see above). > @@ -48,6 +49,13 @@ flush_cleanup (void *not_used) > } > #endif > > +void > +_IO_free_backup_buf (FILE *fp, char *ptr) > +{ > + if (ptr != fp->_short_backupbuf) > + free (ptr); > +} > + OK, this replaces explicit calls to `free', taking care of the special case of the backup buffer. But is there a need for this to be an external function? ISTM there could be a performance benefit from making it static inline: an arrangement for making calls here is likely not to be cheaper in terms of instruction size/count or execution time than making the comparison and branching around `free', even for simplistic predictors that predict all forward branches untaken. From the look of the code I infer we normally expect `ptr' not to point at the backup buffer as that will only happen in the case of a `malloc' failure, so firstly such a forward branch usually won't indeed be taken, making it virtually free for the fall-through case, and secondly please annotate the expression with `__glibc_unlikely' accordingly. > @@ -212,7 +220,7 @@ _IO_free_backup_area (FILE *fp) > { > if (_IO_in_backup (fp)) > _IO_switch_to_main_get_area (fp); /* Just in case. */ > - free (fp->_IO_save_base); > + _IO_free_backup_buf (fp, fp->_IO_save_base); OK. Mechanical update. > @@ -260,7 +268,7 @@ save_for_backup (FILE *fp, char *end_p) > memcpy (new_buffer + avail, > fp->_IO_read_base + least_mark, > needed_size); > - free (fp->_IO_save_base); > + _IO_free_backup_buf (fp, fp->_IO_save_base); OK. Mechanical update. > @@ -636,7 +644,7 @@ _IO_default_finish (FILE *fp, int dummy) > > if (fp->_IO_save_base) > { > - free (fp->_IO_save_base); > + _IO_free_backup_buf (fp, fp->_IO_save_base); OK. Mechanical update. > @@ -998,11 +1006,14 @@ _IO_default_pbackfail (FILE *fp, int c) > else if (!_IO_have_backup (fp)) > { > /* No backup buffer: allocate one. */ > - /* Use nshort buffer, if unused? (probably not) FIXME */ > int backup_size = 128; > char *bbuf = (char *) malloc (backup_size); > if (bbuf == NULL) > - return EOF; > + { > + /* Guarantee a 1-char pushback. */ > + bbuf = fp->_short_backupbuf; > + backup_size = 1; > + } OK. In the unlikely case of a `malloc' failure we'll resort to the single-character backup buffer, avoiding an unsuccessful return. Thanks for discarding a comment that's no longer relevant. From observation such bits are too easily missed. > @@ -1022,7 +1033,7 @@ _IO_default_pbackfail (FILE *fp, int c) > return EOF; > memcpy (new_buf + (new_size - old_size), fp->_IO_read_base, > old_size); > - free (fp->_IO_read_base); > + _IO_free_backup_buf (fp, fp->_IO_read_base); OK. Mechanical update. > diff --git a/libio/libioP.h b/libio/libioP.h > index 34bf91fcd8..287caf8664 100644 > --- a/libio/libioP.h > +++ b/libio/libioP.h > @@ -1,4 +1,5 @@ > /* Copyright (C) 1993-2024 Free Software Foundation, Inc. > + Copyright The GNU Toolchain Authors. OK. Consistent with DCO (but see above). > @@ -357,6 +358,8 @@ typedef FILE *_IO_ITER; > > /* Generic functions */ > > +extern void _IO_free_backup_buf (FILE *, char *); > +libc_hidden_proto (_IO_free_backup_buf) OK. But this won't be needed with a static inline function. > @@ -911,13 +914,13 @@ extern int _IO_vscanf (const char *, va_list) __THROW; > # define FILEBUF_LITERAL(CHAIN, FLAGS, FD, WDP) \ > { _IO_MAGIC+_IO_LINKED+_IO_IS_FILEBUF+FLAGS, \ > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, (FILE *) CHAIN, FD, \ > - 0, _IO_pos_BAD, 0, 0, { 0 }, &_IO_stdfile_##FD##_lock } > + 0, { 0 }, _IO_pos_BAD, 0, 0, { 0 }, &_IO_stdfile_##FD##_lock } OK. New member initialised. > # else > # define FILEBUF_LITERAL(CHAIN, FLAGS, FD, WDP) \ > { _IO_MAGIC+_IO_LINKED+_IO_IS_FILEBUF+FLAGS, \ > NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, \ > - NULL, NULL, (FILE *) CHAIN, FD, \ > - 0, _IO_pos_BAD, 0, 0, { 0 }, &_IO_stdfile_##FD##_lock, _IO_pos_BAD,\ > + NULL, NULL, (FILE *) CHAIN, FD, 0, { 0 }, \ > + _IO_pos_BAD, 0, 0, { 0 }, &_IO_stdfile_##FD##_lock, _IO_pos_BAD, \ OK. New member initialised. I think it will make sense to keep the line breaks between the same members across all the four FILEBUF_LITERAL definitions so as to make it easier to people to match the variants against each other. Please coordinate with Alejandro Colomar (CC'd) on cleaning up these definitions, which went out of sync; cf. <https://inbox.sourceware.org/libc-alpha/042e25d3-1b02-c448-1f8c-84f52549f5b5@redhat.com/>. > @@ -925,12 +928,12 @@ extern int _IO_vscanf (const char *, va_list) __THROW; > # define FILEBUF_LITERAL(CHAIN, FLAGS, FD, WDP) \ > { _IO_MAGIC+_IO_LINKED+_IO_IS_FILEBUF+FLAGS, \ > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, (FILE *) CHAIN, FD, \ > - 0, _IO_pos_BAD } > + 0, { 0 }, _IO_pos_BAD } OK. New member initialised. > # else > # define FILEBUF_LITERAL(CHAIN, FLAGS, FD, WDP) \ > { _IO_MAGIC+_IO_LINKED+_IO_IS_FILEBUF+FLAGS, \ > 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, (FILE *) CHAIN, FD, \ > - 0, _IO_pos_BAD, 0, 0, { 0 }, 0, _IO_pos_BAD, \ > + 0, { 0 }, _IO_pos_BAD, 0, 0, { 0 }, 0, _IO_pos_BAD, \ OK. New member initialised. > diff --git a/libio/oldfileops.c b/libio/oldfileops.c > index 8f775c9094..03f4d76a57 100644 > --- a/libio/oldfileops.c > +++ b/libio/oldfileops.c > @@ -1,4 +1,5 @@ > /* Copyright (C) 1993-2024 Free Software Foundation, Inc. > + Copyright The GNU Toolchain Authors. OK. Consistent with DCO (but see above). > @@ -311,7 +312,7 @@ _IO_old_file_underflow (FILE *fp) > /* Maybe we already have a push back pointer. */ > if (fp->_IO_save_base != NULL) > { > - free (fp->_IO_save_base); > + _IO_free_backup_buf (fp, fp->_IO_save_base); OK. Mechanical update. > @@ -464,7 +465,7 @@ _IO_old_file_seekoff (FILE *fp, off64_t offset, int dir, int mode) > /* It could be that we already have a pushback buffer. */ > if (fp->_IO_read_base != NULL) > { > - free (fp->_IO_read_base); > + _IO_free_backup_buf (fp, fp->_IO_read_base); OK. Mechanical update. > diff --git a/libio/wfileops.c b/libio/wfileops.c > index 16beab1f3a..a96bfa589b 100644 > --- a/libio/wfileops.c > +++ b/libio/wfileops.c > @@ -1,4 +1,5 @@ > /* Copyright (C) 1993-2024 Free Software Foundation, Inc. > + Copyright The GNU Toolchain Authors. OK. Consistent with DCO (but see above). > @@ -175,7 +176,7 @@ _IO_wfile_underflow (FILE *fp) > /* Maybe we already have a push back pointer. */ > if (fp->_IO_save_base != NULL) > { > - free (fp->_IO_save_base); > + _IO_free_backup_buf (fp, fp->_IO_save_base); OK. Mechanical update. > diff --git a/stdio-common/Makefile b/stdio-common/Makefile > index e76e40e587..b1a04fd064 100644 > --- a/stdio-common/Makefile > +++ b/stdio-common/Makefile > @@ -1,4 +1,5 @@ > # Copyright (C) 1991-2024 Free Software Foundation, Inc. > +# Copyright The GNU Toolchain Authors. OK. Consistent with DCO (but see above). > @@ -303,6 +304,7 @@ tests := \ > tst-tmpnam \ > tst-ungetc \ > tst-ungetc-leak \ > + tst-ungetc-nomem \ OK. New test added. > diff --git a/stdio-common/tst-ungetc-nomem.c b/stdio-common/tst-ungetc-nomem.c > new file mode 100644 > index 0000000000..49db33fff3 > --- /dev/null > +++ b/stdio-common/tst-ungetc-nomem.c > @@ -0,0 +1,115 @@ > +/* Test ungetc behavior with malloc failures. > + Copyright The GNU Toolchain Authors. > + This file is part of the GNU C Library. > + > + The GNU C Library is free software; you can redistribute it and/or > + modify it under the terms of the GNU Lesser General Public > + License as published by the Free Software Foundation; either > + version 2.1 of the License, or (at your option) any later version. > + > + The GNU C Library is distributed in the hope that it will be useful, > + but WITHOUT ANY WARRANTY; without even the implied warranty of > + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU > + Lesser General Public License for more details. > + > + You should have received a copy of the GNU Lesser General Public > + License along with the GNU C Library; if not, see > + <https://www.gnu.org/licenses/>. */ > + > +#include <stdio.h> > +#include <string.h> > +#include <support/check.h> > +#include <support/support.h> > +#include <support/temp_file.h> > +#include <support/xstdio.h> OK, alphabetic order. > + > +extern void *__libc_malloc (size_t) > + __attribute__ ((malloc)) __attribute__ ((alloc_size (1))); Please list both attributes together, also avoiding the question as to how much to indent here (as the prototype will fit in one line then). > + > +static volatile bool fail = false; OK, marked as `volatile' to prevent the compiler from interfering, as previously advised. I note this could skip the initialiser so as to place it in BSS, but I'm fine with the current arrangement if you consider it desirable to make it explicit. > + > +void * > +malloc (size_t sz) > +{ > + if (fail) > + return NULL; > + > + return __libc_malloc (sz); > +} OK, this interposes `malloc' so as to conditionally induce a failure and refers to `__libc_malloc' if the condition does not stand. A bit hackish IMO, but we're in control here, so let it be. I think this function deserves an introductory comment, even if a single terse line. > + > +static int > +do_test (void) > +{ > + char *filename = NULL; > + struct stat props = {}; As nice as empty initialisers are they're a C23-ism, so please rewrite using older syntax. > + size_t bufsz = 0; > + > + create_temp_file ("tst-ungetc-nomem.", &filename); > + if (stat (filename, &props) != 0) > + FAIL_EXIT1 ("Could not get file status: %m\n"); > + > + FILE *fp = fopen (filename, "w"); > + > + /* The libio buffer sizes are the same as block size. */ > + bufsz = props.st_blksize + 2; Why do we want the file to be the size of the libio buffer plus 2? The answer seems like a good candidate for the function's introductory comment. > + > + char *buf = xmalloc (bufsz); > + memset (buf, 'a', bufsz); > + > + if (fwrite (buf, sizeof (char), bufsz, fp) != bufsz) > + FAIL_EXIT1 ("fwrite failed: %m\n"); > + xfclose (fp); OK, we make a test file made up of "a" letters. > + > + /* Begin test. */ > + fp = xfopen (filename, "r"); > + > + while (!feof (fp)) > + { > + /* Reset the pushback buffer state. */ > + fseek (fp, 0, SEEK_CUR); > + > + fail = true; > + /* 1: First ungetc should always succeed, as the standard requires. */ > + TEST_COMPARE (ungetc ('y', fp), 'y'); > + > + /* 2: This will result in resizing, which should fail. */ > + TEST_COMPARE (ungetc ('w', fp), EOF); > + > + /* 3: Now allow the resizing, which should immediately fill up the buffer > + too, since this allocates only double the current buffer, i.e. > + 2-bytes. */ > + fail = false; > + TEST_COMPARE (ungetc ('x', fp), 'x'); This does verify new semantics, and I take it it's intentional that after a `malloc' failure for the initial buffer we don't go back to the minimum of 128 bytes for the buffer, but instead start from 2 up in a hope for a smaller allocation to succeed where a somewhat larger one might not. But I think this new semantics should be mentioned in the change description. > + > + /* 4: And fail again because this again forces an alloc, which fails. */ > + fail = true; > + TEST_COMPARE (ungetc ('x', fp), EOF); > + > + /* 5: Enable allocations again so that we now get a 4-byte buffer. Now > + both calls should work. */ > + fail = false; > + TEST_COMPARE (ungetc ('x', fp), 'x'); > + fail = true; > + TEST_COMPARE (ungetc ('x', fp), 'x'); > + > + /* Drain out the x's. */ > + TEST_COMPARE (fgetc (fp), 'x'); > + TEST_COMPARE (fgetc (fp), 'x'); > + TEST_COMPARE (fgetc (fp), 'x'); Shouldn't the `ungetc' calls use different characters each, so that we have an additional check that rejected characters do not come back and that the accepted ones come back in the correct order? > + > + /* Finally, drain out the first char we had pushed back, followed by one more char > + from the stream, if present. */ Please wrap the comment, cf: <https://sourceware.org/glibc/wiki/Style_and_Conventions#A79-Column_Lines>. > + TEST_COMPARE (fgetc (fp), 'y'); > + char c = fgetc (fp); > + if (!feof (fp)) > + TEST_COMPARE (c, 'a'); > + } So this loop, if successful, runs libio buffer size plus 2 times. Please state in the function's introductory comment why this specific iteration count has been chosen. > + > + /* Final sanity check before we're done. */ > + TEST_COMPARE (ferror (fp), 0); > + xfclose (fp); OK. Checking for no error and closing the test file. File removed automagically by test support clean-up. > + > + return 0; > +} > + > +#include <support/test-driver.c> OK. Please resend with the updates applied (but I note further clarification might be needed first). Maciej
Hi Maciej, On Mon, Dec 16, 2024 at 02:52:55AM +0000, Maciej W. Rozycki wrote: > > # else > > # define FILEBUF_LITERAL(CHAIN, FLAGS, FD, WDP) \ > > { _IO_MAGIC+_IO_LINKED+_IO_IS_FILEBUF+FLAGS, \ > > NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, \ > > - NULL, NULL, (FILE *) CHAIN, FD, \ > > - 0, _IO_pos_BAD, 0, 0, { 0 }, &_IO_stdfile_##FD##_lock, _IO_pos_BAD,\ > > + NULL, NULL, (FILE *) CHAIN, FD, 0, { 0 }, \ > > + _IO_pos_BAD, 0, 0, { 0 }, &_IO_stdfile_##FD##_lock, _IO_pos_BAD, \ > > OK. New member initialised. > > I think it will make sense to keep the line breaks between the same > members across all the four FILEBUF_LITERAL definitions so as to make it > easier to people to match the variants against each other. > > Please coordinate with Alejandro Colomar (CC'd) on cleaning up these > definitions, which went out of sync; cf. > <https://inbox.sourceware.org/libc-alpha/042e25d3-1b02-c448-1f8c-84f52549f5b5@redhat.com/>. Ack. (I've also checked the other email.) > > + > > +static int > > +do_test (void) > > +{ > > + char *filename = NULL; > > + struct stat props = {}; > > As nice as empty initialisers are they're a C23-ism, so please rewrite > using older syntax. They are also an older GNU extension, which I think is OK in .c files (unlike in headers). No? Have a lovely day! Alex
On 2024-12-16 05:05, Alejandro Colomar wrote: > Hi Maciej, > > On Mon, Dec 16, 2024 at 02:52:55AM +0000, Maciej W. Rozycki wrote: >>> # else >>> # define FILEBUF_LITERAL(CHAIN, FLAGS, FD, WDP) \ >>> { _IO_MAGIC+_IO_LINKED+_IO_IS_FILEBUF+FLAGS, \ >>> NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, \ >>> - NULL, NULL, (FILE *) CHAIN, FD, \ >>> - 0, _IO_pos_BAD, 0, 0, { 0 }, &_IO_stdfile_##FD##_lock, _IO_pos_BAD,\ >>> + NULL, NULL, (FILE *) CHAIN, FD, 0, { 0 }, \ >>> + _IO_pos_BAD, 0, 0, { 0 }, &_IO_stdfile_##FD##_lock, _IO_pos_BAD, \ >> >> OK. New member initialised. >> >> I think it will make sense to keep the line breaks between the same >> members across all the four FILEBUF_LITERAL definitions so as to make it >> easier to people to match the variants against each other. >> >> Please coordinate with Alejandro Colomar (CC'd) on cleaning up these >> definitions, which went out of sync; cf. >> <https://inbox.sourceware.org/libc-alpha/042e25d3-1b02-c448-1f8c-84f52549f5b5@redhat.com/>. > > Ack. (I've also checked the other email.) Do you want to send a patch isolated to this file, which I can rebase on top of and push whenever my patch is acked? Thanks, Sid
Hi Sid, On Mon, Dec 16, 2024 at 07:38:24AM -0500, Siddhesh Poyarekar wrote: > On 2024-12-16 05:05, Alejandro Colomar wrote: > > Hi Maciej, > > > > On Mon, Dec 16, 2024 at 02:52:55AM +0000, Maciej W. Rozycki wrote: > > > > # else > > > > # define FILEBUF_LITERAL(CHAIN, FLAGS, FD, WDP) \ > > > > { _IO_MAGIC+_IO_LINKED+_IO_IS_FILEBUF+FLAGS, \ > > > > NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, \ > > > > - NULL, NULL, (FILE *) CHAIN, FD, \ > > > > - 0, _IO_pos_BAD, 0, 0, { 0 }, &_IO_stdfile_##FD##_lock, _IO_pos_BAD,\ > > > > + NULL, NULL, (FILE *) CHAIN, FD, 0, { 0 }, \ > > > > + _IO_pos_BAD, 0, 0, { 0 }, &_IO_stdfile_##FD##_lock, _IO_pos_BAD, \ > > > > > > OK. New member initialised. > > > > > > I think it will make sense to keep the line breaks between the same > > > members across all the four FILEBUF_LITERAL definitions so as to make it > > > easier to people to match the variants against each other. > > > > > > Please coordinate with Alejandro Colomar (CC'd) on cleaning up these > > > definitions, which went out of sync; cf. > > > <https://inbox.sourceware.org/libc-alpha/042e25d3-1b02-c448-1f8c-84f52549f5b5@redhat.com/>. > > > > Ack. (I've also checked the other email.) > > Do you want to send a patch isolated to this file, which I can rebase on top > of and push whenever my patch is acked? Yep. Cheers, Alex
On 2024-12-15 21:52, Maciej W. Rozycki wrote: > Hi Sid, > > Thanks for your effort, this is looking mostly good to me. > > My main concern is `_IO_free_backup_buf', which I think will perform > better as a static inline function. I have some questions as well as to > the test functions including a request to add introductory comments for > them with the answers. Plus a couple of small nits, as all detailed > below. > >> The C standard requires that ungetc guarantees at least one pushback, so >> put a single byte pushback buffer in the FILE struct to enable that. > > Please mention here why this single byte pushback buffer is needed to > fulfil the C standard's requirement (i.e. that we fail to fulfil it now > because we use `malloc', which can fail). Ack, I'll update the comment. > >> Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org> > > Hmm, please clarify your copyright status. I had my personal copyright assignment to the FSF revoked in ~2019. My employer (Red Hat) too has disclaimed copyright to my contributions to GNU projects. As author, I am the owner of the copyright to code I write for GNU projects, hence the Signed-off-by. >> diff --git a/libio/bits/types/struct_FILE.h b/libio/bits/types/struct_FILE.h >> index d8d26639d1..a5e0679de3 100644 >> --- a/libio/bits/types/struct_FILE.h >> +++ b/libio/bits/types/struct_FILE.h >> @@ -1,4 +1,5 @@ >> /* Copyright (C) 1991-2024 Free Software Foundation, Inc. >> + Copyright The GNU Toolchain Authors. > > OK. Consistent with DCO (but see above). > >> @@ -70,7 +71,8 @@ struct _IO_FILE >> struct _IO_FILE *_chain; >> >> int _fileno; >> - int _flags2; >> + int _flags2:24; >> + char _short_backupbuf[1]; > > OK. Taking advantage of a char member sharing the storage unit with the > preceding bit-field, so there's no change in the size of the structure or > member offsets. Please add a short description of the new member, just as > with most of the existing ones (all should have one IMO). OK. >> diff --git a/libio/fileops.c b/libio/fileops.c >> index 759d737ec7..d49e489f55 100644 >> --- a/libio/fileops.c >> +++ b/libio/fileops.c >> @@ -1,4 +1,5 @@ >> /* Copyright (C) 1993-2024 Free Software Foundation, Inc. >> + Copyright The GNU Toolchain Authors. > > OK. Consistent with DCO (but see above). > >> @@ -480,7 +481,7 @@ _IO_new_file_underflow (FILE *fp) >> /* Maybe we already have a push back pointer. */ >> if (fp->_IO_save_base != NULL) >> { >> - free (fp->_IO_save_base); >> + _IO_free_backup_buf (fp, fp->_IO_save_base); > > OK. Mechanical update. > >> @@ -932,7 +933,7 @@ _IO_new_file_seekoff (FILE *fp, off64_t offset, int dir, int mode) >> /* It could be that we already have a pushback buffer. */ >> if (fp->_IO_read_base != NULL) >> { >> - free (fp->_IO_read_base); >> + _IO_free_backup_buf (fp, fp->_IO_read_base); > > OK. Mechanical update. > >> @@ -1282,7 +1283,7 @@ _IO_file_xsgetn (FILE *fp, void *data, size_t n) >> /* Maybe we already have a push back pointer. */ >> if (fp->_IO_save_base != NULL) >> { >> - free (fp->_IO_save_base); >> + _IO_free_backup_buf (fp, fp->_IO_save_base); > > OK. Mechanical update. > >> diff --git a/libio/genops.c b/libio/genops.c >> index d7e35e67d5..dddd420ee2 100644 >> --- a/libio/genops.c >> +++ b/libio/genops.c >> @@ -1,4 +1,5 @@ >> /* Copyright (C) 1993-2024 Free Software Foundation, Inc. >> + Copyright The GNU Toolchain Authors. > > OK. Consistent with DCO (but see above). > >> @@ -48,6 +49,13 @@ flush_cleanup (void *not_used) >> } >> #endif >> >> +void >> +_IO_free_backup_buf (FILE *fp, char *ptr) >> +{ >> + if (ptr != fp->_short_backupbuf) >> + free (ptr); >> +} >> + > > OK, this replaces explicit calls to `free', taking care of the special > case of the backup buffer. > > But is there a need for this to be an external function? > > ISTM there could be a performance benefit from making it static inline: > an arrangement for making calls here is likely not to be cheaper in terms > of instruction size/count or execution time than making the comparison and > branching around `free', even for simplistic predictors that predict all > forward branches untaken. > > From the look of the code I infer we normally expect `ptr' not to point > at the backup buffer as that will only happen in the case of a `malloc' > failure, so firstly such a forward branch usually won't indeed be taken, > making it virtually free for the fall-through case, and secondly please > annotate the expression with `__glibc_unlikely' accordingly. That's a good point, I'll move it to libioP.h. >> @@ -212,7 +220,7 @@ _IO_free_backup_area (FILE *fp) >> { >> if (_IO_in_backup (fp)) >> _IO_switch_to_main_get_area (fp); /* Just in case. */ >> - free (fp->_IO_save_base); >> + _IO_free_backup_buf (fp, fp->_IO_save_base); > > OK. Mechanical update. > >> @@ -260,7 +268,7 @@ save_for_backup (FILE *fp, char *end_p) >> memcpy (new_buffer + avail, >> fp->_IO_read_base + least_mark, >> needed_size); >> - free (fp->_IO_save_base); >> + _IO_free_backup_buf (fp, fp->_IO_save_base); > > OK. Mechanical update. > >> @@ -636,7 +644,7 @@ _IO_default_finish (FILE *fp, int dummy) >> >> if (fp->_IO_save_base) >> { >> - free (fp->_IO_save_base); >> + _IO_free_backup_buf (fp, fp->_IO_save_base); > > OK. Mechanical update. > >> @@ -998,11 +1006,14 @@ _IO_default_pbackfail (FILE *fp, int c) >> else if (!_IO_have_backup (fp)) >> { >> /* No backup buffer: allocate one. */ >> - /* Use nshort buffer, if unused? (probably not) FIXME */ >> int backup_size = 128; >> char *bbuf = (char *) malloc (backup_size); >> if (bbuf == NULL) >> - return EOF; >> + { >> + /* Guarantee a 1-char pushback. */ >> + bbuf = fp->_short_backupbuf; >> + backup_size = 1; >> + } > > OK. In the unlikely case of a `malloc' failure we'll resort to the > single-character backup buffer, avoiding an unsuccessful return. > > Thanks for discarding a comment that's no longer relevant. From > observation such bits are too easily missed. > >> @@ -1022,7 +1033,7 @@ _IO_default_pbackfail (FILE *fp, int c) >> return EOF; >> memcpy (new_buf + (new_size - old_size), fp->_IO_read_base, >> old_size); >> - free (fp->_IO_read_base); >> + _IO_free_backup_buf (fp, fp->_IO_read_base); > > OK. Mechanical update. > >> diff --git a/libio/libioP.h b/libio/libioP.h >> index 34bf91fcd8..287caf8664 100644 >> --- a/libio/libioP.h >> +++ b/libio/libioP.h >> @@ -1,4 +1,5 @@ >> /* Copyright (C) 1993-2024 Free Software Foundation, Inc. >> + Copyright The GNU Toolchain Authors. > > OK. Consistent with DCO (but see above). > >> @@ -357,6 +358,8 @@ typedef FILE *_IO_ITER; >> >> /* Generic functions */ >> >> +extern void _IO_free_backup_buf (FILE *, char *); >> +libc_hidden_proto (_IO_free_backup_buf) > > OK. But this won't be needed with a static inline function. > >> @@ -911,13 +914,13 @@ extern int _IO_vscanf (const char *, va_list) __THROW; >> # define FILEBUF_LITERAL(CHAIN, FLAGS, FD, WDP) \ >> { _IO_MAGIC+_IO_LINKED+_IO_IS_FILEBUF+FLAGS, \ >> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, (FILE *) CHAIN, FD, \ >> - 0, _IO_pos_BAD, 0, 0, { 0 }, &_IO_stdfile_##FD##_lock } >> + 0, { 0 }, _IO_pos_BAD, 0, 0, { 0 }, &_IO_stdfile_##FD##_lock } > > OK. New member initialised. > >> # else >> # define FILEBUF_LITERAL(CHAIN, FLAGS, FD, WDP) \ >> { _IO_MAGIC+_IO_LINKED+_IO_IS_FILEBUF+FLAGS, \ >> NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, \ >> - NULL, NULL, (FILE *) CHAIN, FD, \ >> - 0, _IO_pos_BAD, 0, 0, { 0 }, &_IO_stdfile_##FD##_lock, _IO_pos_BAD,\ >> + NULL, NULL, (FILE *) CHAIN, FD, 0, { 0 }, \ >> + _IO_pos_BAD, 0, 0, { 0 }, &_IO_stdfile_##FD##_lock, _IO_pos_BAD, \ > > OK. New member initialised. > > I think it will make sense to keep the line breaks between the same > members across all the four FILEBUF_LITERAL definitions so as to make it > easier to people to match the variants against each other. > > Please coordinate with Alejandro Colomar (CC'd) on cleaning up these > definitions, which went out of sync; cf. > <https://inbox.sourceware.org/libc-alpha/042e25d3-1b02-c448-1f8c-84f52549f5b5@redhat.com/>. I'll rebase on top of whatever he pushes, or alternatively I've asked if he could send a separate patch for this file which I can include in my series and push whenever my patch is ready. >> @@ -925,12 +928,12 @@ extern int _IO_vscanf (const char *, va_list) __THROW; >> # define FILEBUF_LITERAL(CHAIN, FLAGS, FD, WDP) \ >> { _IO_MAGIC+_IO_LINKED+_IO_IS_FILEBUF+FLAGS, \ >> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, (FILE *) CHAIN, FD, \ >> - 0, _IO_pos_BAD } >> + 0, { 0 }, _IO_pos_BAD } > > OK. New member initialised. > >> # else >> # define FILEBUF_LITERAL(CHAIN, FLAGS, FD, WDP) \ >> { _IO_MAGIC+_IO_LINKED+_IO_IS_FILEBUF+FLAGS, \ >> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, (FILE *) CHAIN, FD, \ >> - 0, _IO_pos_BAD, 0, 0, { 0 }, 0, _IO_pos_BAD, \ >> + 0, { 0 }, _IO_pos_BAD, 0, 0, { 0 }, 0, _IO_pos_BAD, \ > > OK. New member initialised. > >> diff --git a/libio/oldfileops.c b/libio/oldfileops.c >> index 8f775c9094..03f4d76a57 100644 >> --- a/libio/oldfileops.c >> +++ b/libio/oldfileops.c >> @@ -1,4 +1,5 @@ >> /* Copyright (C) 1993-2024 Free Software Foundation, Inc. >> + Copyright The GNU Toolchain Authors. > > OK. Consistent with DCO (but see above). > >> @@ -311,7 +312,7 @@ _IO_old_file_underflow (FILE *fp) >> /* Maybe we already have a push back pointer. */ >> if (fp->_IO_save_base != NULL) >> { >> - free (fp->_IO_save_base); >> + _IO_free_backup_buf (fp, fp->_IO_save_base); > > OK. Mechanical update. > >> @@ -464,7 +465,7 @@ _IO_old_file_seekoff (FILE *fp, off64_t offset, int dir, int mode) >> /* It could be that we already have a pushback buffer. */ >> if (fp->_IO_read_base != NULL) >> { >> - free (fp->_IO_read_base); >> + _IO_free_backup_buf (fp, fp->_IO_read_base); > > OK. Mechanical update. > >> diff --git a/libio/wfileops.c b/libio/wfileops.c >> index 16beab1f3a..a96bfa589b 100644 >> --- a/libio/wfileops.c >> +++ b/libio/wfileops.c >> @@ -1,4 +1,5 @@ >> /* Copyright (C) 1993-2024 Free Software Foundation, Inc. >> + Copyright The GNU Toolchain Authors. > > OK. Consistent with DCO (but see above). > >> @@ -175,7 +176,7 @@ _IO_wfile_underflow (FILE *fp) >> /* Maybe we already have a push back pointer. */ >> if (fp->_IO_save_base != NULL) >> { >> - free (fp->_IO_save_base); >> + _IO_free_backup_buf (fp, fp->_IO_save_base); > > OK. Mechanical update. > >> diff --git a/stdio-common/Makefile b/stdio-common/Makefile >> index e76e40e587..b1a04fd064 100644 >> --- a/stdio-common/Makefile >> +++ b/stdio-common/Makefile >> @@ -1,4 +1,5 @@ >> # Copyright (C) 1991-2024 Free Software Foundation, Inc. >> +# Copyright The GNU Toolchain Authors. > > OK. Consistent with DCO (but see above). > >> @@ -303,6 +304,7 @@ tests := \ >> tst-tmpnam \ >> tst-ungetc \ >> tst-ungetc-leak \ >> + tst-ungetc-nomem \ > > OK. New test added. > >> diff --git a/stdio-common/tst-ungetc-nomem.c b/stdio-common/tst-ungetc-nomem.c >> new file mode 100644 >> index 0000000000..49db33fff3 >> --- /dev/null >> +++ b/stdio-common/tst-ungetc-nomem.c >> @@ -0,0 +1,115 @@ >> +/* Test ungetc behavior with malloc failures. >> + Copyright The GNU Toolchain Authors. >> + This file is part of the GNU C Library. >> + >> + The GNU C Library is free software; you can redistribute it and/or >> + modify it under the terms of the GNU Lesser General Public >> + License as published by the Free Software Foundation; either >> + version 2.1 of the License, or (at your option) any later version. >> + >> + The GNU C Library is distributed in the hope that it will be useful, >> + but WITHOUT ANY WARRANTY; without even the implied warranty of >> + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU >> + Lesser General Public License for more details. >> + >> + You should have received a copy of the GNU Lesser General Public >> + License along with the GNU C Library; if not, see >> + <https://www.gnu.org/licenses/>. */ >> + >> +#include <stdio.h> >> +#include <string.h> >> +#include <support/check.h> >> +#include <support/support.h> >> +#include <support/temp_file.h> >> +#include <support/xstdio.h> > > OK, alphabetic order. > >> + >> +extern void *__libc_malloc (size_t) >> + __attribute__ ((malloc)) __attribute__ ((alloc_size (1))); > > Please list both attributes together, also avoiding the question as to > how much to indent here (as the prototype will fit in one line then). OK. >> + >> +static volatile bool fail = false; > > OK, marked as `volatile' to prevent the compiler from interfering, as > previously advised. I note this could skip the initialiser so as to place > it in BSS, but I'm fine with the current arrangement if you consider it > desirable to make it explicit. > >> + >> +void * >> +malloc (size_t sz) >> +{ >> + if (fail) >> + return NULL; >> + >> + return __libc_malloc (sz); >> +} > > OK, this interposes `malloc' so as to conditionally induce a failure and > refers to `__libc_malloc' if the condition does not stand. A bit hackish > IMO, but we're in control here, so let it be. > > I think this function deserves an introductory comment, even if a single > terse line. OK. >> + >> +static int >> +do_test (void) >> +{ >> + char *filename = NULL; >> + struct stat props = {}; > > As nice as empty initialisers are they're a C23-ism, so please rewrite > using older syntax. We use this quite extensively across glibc, so I'm inclined to keep this unless you have a strong objection; it just looks so clean :) Besides, as Alejandro pointed out, it's been a GNU extension since before that. >> + size_t bufsz = 0; >> + >> + create_temp_file ("tst-ungetc-nomem.", &filename); >> + if (stat (filename, &props) != 0) >> + FAIL_EXIT1 ("Could not get file status: %m\n"); >> + >> + FILE *fp = fopen (filename, "w"); >> + >> + /* The libio buffer sizes are the same as block size. */ >> + bufsz = props.st_blksize + 2; > > Why do we want the file to be the size of the libio buffer plus 2? The > answer seems like a good candidate for the function's introductory > comment. Ack, it's basically to test to the point of running out of buffer space so that we test at the read underflow border. >> + >> + char *buf = xmalloc (bufsz); >> + memset (buf, 'a', bufsz); >> + >> + if (fwrite (buf, sizeof (char), bufsz, fp) != bufsz) >> + FAIL_EXIT1 ("fwrite failed: %m\n"); >> + xfclose (fp); > > OK, we make a test file made up of "a" letters. > >> + >> + /* Begin test. */ >> + fp = xfopen (filename, "r"); >> + >> + while (!feof (fp)) >> + { >> + /* Reset the pushback buffer state. */ >> + fseek (fp, 0, SEEK_CUR); >> + >> + fail = true; >> + /* 1: First ungetc should always succeed, as the standard requires. */ >> + TEST_COMPARE (ungetc ('y', fp), 'y'); >> + >> + /* 2: This will result in resizing, which should fail. */ >> + TEST_COMPARE (ungetc ('w', fp), EOF); >> + >> + /* 3: Now allow the resizing, which should immediately fill up the buffer >> + too, since this allocates only double the current buffer, i.e. >> + 2-bytes. */ >> + fail = false; >> + TEST_COMPARE (ungetc ('x', fp), 'x'); > > This does verify new semantics, and I take it it's intentional that after > a `malloc' failure for the initial buffer we don't go back to the minimum > of 128 bytes for the buffer, but instead start from 2 up in a hope for a > smaller allocation to succeed where a somewhat larger one might not. But > I think this new semantics should be mentioned in the change description. OK, I'll add a comment in the pushback code change. > >> + >> + /* 4: And fail again because this again forces an alloc, which fails. */ >> + fail = true; >> + TEST_COMPARE (ungetc ('x', fp), EOF); >> + >> + /* 5: Enable allocations again so that we now get a 4-byte buffer. Now >> + both calls should work. */ >> + fail = false; >> + TEST_COMPARE (ungetc ('x', fp), 'x'); >> + fail = true; >> + TEST_COMPARE (ungetc ('x', fp), 'x'); >> + >> + /* Drain out the x's. */ >> + TEST_COMPARE (fgetc (fp), 'x'); >> + TEST_COMPARE (fgetc (fp), 'x'); >> + TEST_COMPARE (fgetc (fp), 'x'); > > Shouldn't the `ungetc' calls use different characters each, so that we > have an additional check that rejected characters do not come back and > that the accepted ones come back in the correct order? I've grouped them, but sure, I could make each call unget/get a distinct char. >> + >> + /* Finally, drain out the first char we had pushed back, followed by one more char >> + from the stream, if present. */ > > Please wrap the comment, cf: > <https://sourceware.org/glibc/wiki/Style_and_Conventions#A79-Column_Lines>. Oops. >> + TEST_COMPARE (fgetc (fp), 'y'); >> + char c = fgetc (fp); >> + if (!feof (fp)) >> + TEST_COMPARE (c, 'a'); >> + } > > So this loop, if successful, runs libio buffer size plus 2 times. Please > state in the function's introductory comment why this specific iteration > count has been chosen. Same as mentioned above, to test at the read underflow boundary, which happens at libio buffer size. >> + >> + /* Final sanity check before we're done. */ >> + TEST_COMPARE (ferror (fp), 0); >> + xfclose (fp); > > OK. Checking for no error and closing the test file. File removed > automagically by test support clean-up. > >> + >> + return 0; >> +} >> + >> +#include <support/test-driver.c> > > OK. > > Please resend with the updates applied (but I note further clarification > might be needed first). Thanks, Sid
On 2024-12-16 07:58, Siddhesh Poyarekar wrote: >>> + >>> +void * >>> +malloc (size_t sz) >>> +{ >>> + if (fail) >>> + return NULL; >>> + >>> + return __libc_malloc (sz); >>> +} >> >> OK, this interposes `malloc' so as to conditionally induce a failure >> and >> refers to `__libc_malloc' if the condition does not stand. A bit hackish >> IMO, but we're in control here, so let it be. >> >> I think this function deserves an introductory comment, even if a >> single >> terse line. > > OK. > I just realized while adding the comment that the reason why I was using __libc_malloc (which was to avoid dlsym) was a flimsy one and will in fact end up skipping any interposed malloc implementations. It's not an immediate problem since I don't think anybody runs the testsuite with interposed malloc today, but I'll use dlsym anyway to be future-proof. Sid
On Mon, 16 Dec 2024, Siddhesh Poyarekar wrote: > > > > + > > > > +void * > > > > +malloc (size_t sz) > > > > +{ > > > > + if (fail) > > > > + return NULL; > > > > + > > > > + return __libc_malloc (sz); > > > > +} > > > > > > OK, this interposes `malloc' so as to conditionally induce a failure and > > > refers to `__libc_malloc' if the condition does not stand. A bit hackish > > > IMO, but we're in control here, so let it be. > > > > > > I think this function deserves an introductory comment, even if a single > > > terse line. > > > > OK. > > > > I just realized while adding the comment that the reason why I was using > __libc_malloc (which was to avoid dlsym) was a flimsy one and will in fact end > up skipping any interposed malloc implementations. It's not an immediate > problem since I don't think anybody runs the testsuite with interposed malloc > today, but I'll use dlsym anyway to be future-proof. But is using `dlsym' going to actually work? AFAIK a symbol in the main executable pre-empts any other ones of the same name coming from the loader's initial namespace (i.e. including any pulled early via LD_PRELOAD, but not those loaded via `dlopen' into a new namespace). NB we have a preexisting use of `__libc_malloc' in another test case, so if coming up with an alternative arrangement, I suggest to cover it too. Maciej
On 2024-12-16 11:16, Maciej W. Rozycki wrote: >> I just realized while adding the comment that the reason why I was using >> __libc_malloc (which was to avoid dlsym) was a flimsy one and will in fact end >> up skipping any interposed malloc implementations. It's not an immediate >> problem since I don't think anybody runs the testsuite with interposed malloc >> today, but I'll use dlsym anyway to be future-proof. > > But is using `dlsym' going to actually work? > > AFAIK a symbol in the main executable pre-empts any other ones of the > same name coming from the loader's initial namespace (i.e. including any > pulled early via LD_PRELOAD, but not those loaded via `dlopen' into a new > namespace). The idea is to use RTLD_NEXT from within the malloc in the executable to get the next available 'malloc'. This would find the glibc malloc under normal circumstances, but also be able to use any interposed malloc if that's a future need. I've submitted v5 with that change. Thanks, Sid
diff --git a/libio/bits/types/struct_FILE.h b/libio/bits/types/struct_FILE.h index d8d26639d1..a5e0679de3 100644 --- a/libio/bits/types/struct_FILE.h +++ b/libio/bits/types/struct_FILE.h @@ -1,4 +1,5 @@ /* Copyright (C) 1991-2024 Free Software Foundation, Inc. + Copyright The GNU Toolchain Authors. This file is part of the GNU C Library. The GNU C Library is free software; you can redistribute it and/or @@ -70,7 +71,8 @@ struct _IO_FILE struct _IO_FILE *_chain; int _fileno; - int _flags2; + int _flags2:24; + char _short_backupbuf[1]; __off_t _old_offset; /* This used to be _offset but it's too small. */ /* 1+column number of pbase(); 0 is unknown. */ diff --git a/libio/fileops.c b/libio/fileops.c index 759d737ec7..d49e489f55 100644 --- a/libio/fileops.c +++ b/libio/fileops.c @@ -1,4 +1,5 @@ /* Copyright (C) 1993-2024 Free Software Foundation, Inc. + Copyright The GNU Toolchain Authors. This file is part of the GNU C Library. The GNU C Library is free software; you can redistribute it and/or @@ -480,7 +481,7 @@ _IO_new_file_underflow (FILE *fp) /* Maybe we already have a push back pointer. */ if (fp->_IO_save_base != NULL) { - free (fp->_IO_save_base); + _IO_free_backup_buf (fp, fp->_IO_save_base); fp->_flags &= ~_IO_IN_BACKUP; } _IO_doallocbuf (fp); @@ -932,7 +933,7 @@ _IO_new_file_seekoff (FILE *fp, off64_t offset, int dir, int mode) /* It could be that we already have a pushback buffer. */ if (fp->_IO_read_base != NULL) { - free (fp->_IO_read_base); + _IO_free_backup_buf (fp, fp->_IO_read_base); fp->_flags &= ~_IO_IN_BACKUP; } _IO_doallocbuf (fp); @@ -1282,7 +1283,7 @@ _IO_file_xsgetn (FILE *fp, void *data, size_t n) /* Maybe we already have a push back pointer. */ if (fp->_IO_save_base != NULL) { - free (fp->_IO_save_base); + _IO_free_backup_buf (fp, fp->_IO_save_base); fp->_flags &= ~_IO_IN_BACKUP; } _IO_doallocbuf (fp); diff --git a/libio/genops.c b/libio/genops.c index d7e35e67d5..dddd420ee2 100644 --- a/libio/genops.c +++ b/libio/genops.c @@ -1,4 +1,5 @@ /* Copyright (C) 1993-2024 Free Software Foundation, Inc. + Copyright The GNU Toolchain Authors. This file is part of the GNU C Library. The GNU C Library is free software; you can redistribute it and/or @@ -48,6 +49,13 @@ flush_cleanup (void *not_used) } #endif +void +_IO_free_backup_buf (FILE *fp, char *ptr) +{ + if (ptr != fp->_short_backupbuf) + free (ptr); +} + /* Fields in struct _IO_FILE after the _lock field are internal to glibc and opaque to applications. We can change them as long as the size of struct _IO_FILE is unchanged, which is checked as the @@ -212,7 +220,7 @@ _IO_free_backup_area (FILE *fp) { if (_IO_in_backup (fp)) _IO_switch_to_main_get_area (fp); /* Just in case. */ - free (fp->_IO_save_base); + _IO_free_backup_buf (fp, fp->_IO_save_base); fp->_IO_save_base = NULL; fp->_IO_save_end = NULL; fp->_IO_backup_base = NULL; @@ -260,7 +268,7 @@ save_for_backup (FILE *fp, char *end_p) memcpy (new_buffer + avail, fp->_IO_read_base + least_mark, needed_size); - free (fp->_IO_save_base); + _IO_free_backup_buf (fp, fp->_IO_save_base); fp->_IO_save_base = new_buffer; fp->_IO_save_end = new_buffer + avail + needed_size; } @@ -636,7 +644,7 @@ _IO_default_finish (FILE *fp, int dummy) if (fp->_IO_save_base) { - free (fp->_IO_save_base); + _IO_free_backup_buf (fp, fp->_IO_save_base); fp->_IO_save_base = NULL; } @@ -998,11 +1006,14 @@ _IO_default_pbackfail (FILE *fp, int c) else if (!_IO_have_backup (fp)) { /* No backup buffer: allocate one. */ - /* Use nshort buffer, if unused? (probably not) FIXME */ int backup_size = 128; char *bbuf = (char *) malloc (backup_size); if (bbuf == NULL) - return EOF; + { + /* Guarantee a 1-char pushback. */ + bbuf = fp->_short_backupbuf; + backup_size = 1; + } fp->_IO_save_base = bbuf; fp->_IO_save_end = fp->_IO_save_base + backup_size; fp->_IO_backup_base = fp->_IO_save_end; @@ -1022,7 +1033,7 @@ _IO_default_pbackfail (FILE *fp, int c) return EOF; memcpy (new_buf + (new_size - old_size), fp->_IO_read_base, old_size); - free (fp->_IO_read_base); + _IO_free_backup_buf (fp, fp->_IO_read_base); _IO_setg (fp, new_buf, new_buf + (new_size - old_size), new_buf + new_size); fp->_IO_backup_base = fp->_IO_read_ptr; diff --git a/libio/libioP.h b/libio/libioP.h index 34bf91fcd8..287caf8664 100644 --- a/libio/libioP.h +++ b/libio/libioP.h @@ -1,4 +1,5 @@ /* Copyright (C) 1993-2024 Free Software Foundation, Inc. + Copyright The GNU Toolchain Authors. This file is part of the GNU C Library. The GNU C Library is free software; you can redistribute it and/or @@ -357,6 +358,8 @@ typedef FILE *_IO_ITER; /* Generic functions */ +extern void _IO_free_backup_buf (FILE *, char *); +libc_hidden_proto (_IO_free_backup_buf) extern void _IO_switch_to_main_get_area (FILE *) __THROW; extern void _IO_switch_to_backup_area (FILE *) __THROW; extern int _IO_switch_to_get_mode (FILE *); @@ -911,13 +914,13 @@ extern int _IO_vscanf (const char *, va_list) __THROW; # define FILEBUF_LITERAL(CHAIN, FLAGS, FD, WDP) \ { _IO_MAGIC+_IO_LINKED+_IO_IS_FILEBUF+FLAGS, \ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, (FILE *) CHAIN, FD, \ - 0, _IO_pos_BAD, 0, 0, { 0 }, &_IO_stdfile_##FD##_lock } + 0, { 0 }, _IO_pos_BAD, 0, 0, { 0 }, &_IO_stdfile_##FD##_lock } # else # define FILEBUF_LITERAL(CHAIN, FLAGS, FD, WDP) \ { _IO_MAGIC+_IO_LINKED+_IO_IS_FILEBUF+FLAGS, \ NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, NULL, \ - NULL, NULL, (FILE *) CHAIN, FD, \ - 0, _IO_pos_BAD, 0, 0, { 0 }, &_IO_stdfile_##FD##_lock, _IO_pos_BAD,\ + NULL, NULL, (FILE *) CHAIN, FD, 0, { 0 }, \ + _IO_pos_BAD, 0, 0, { 0 }, &_IO_stdfile_##FD##_lock, _IO_pos_BAD, \ NULL, WDP, NULL } # endif #else @@ -925,12 +928,12 @@ extern int _IO_vscanf (const char *, va_list) __THROW; # define FILEBUF_LITERAL(CHAIN, FLAGS, FD, WDP) \ { _IO_MAGIC+_IO_LINKED+_IO_IS_FILEBUF+FLAGS, \ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, (FILE *) CHAIN, FD, \ - 0, _IO_pos_BAD } + 0, { 0 }, _IO_pos_BAD } # else # define FILEBUF_LITERAL(CHAIN, FLAGS, FD, WDP) \ { _IO_MAGIC+_IO_LINKED+_IO_IS_FILEBUF+FLAGS, \ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, (FILE *) CHAIN, FD, \ - 0, _IO_pos_BAD, 0, 0, { 0 }, 0, _IO_pos_BAD, \ + 0, { 0 }, _IO_pos_BAD, 0, 0, { 0 }, 0, _IO_pos_BAD, \ NULL, WDP, 0 } # endif #endif diff --git a/libio/oldfileops.c b/libio/oldfileops.c index 8f775c9094..03f4d76a57 100644 --- a/libio/oldfileops.c +++ b/libio/oldfileops.c @@ -1,4 +1,5 @@ /* Copyright (C) 1993-2024 Free Software Foundation, Inc. + Copyright The GNU Toolchain Authors. This file is part of the GNU C Library. The GNU C Library is free software; you can redistribute it and/or @@ -311,7 +312,7 @@ _IO_old_file_underflow (FILE *fp) /* Maybe we already have a push back pointer. */ if (fp->_IO_save_base != NULL) { - free (fp->_IO_save_base); + _IO_free_backup_buf (fp, fp->_IO_save_base); fp->_flags &= ~_IO_IN_BACKUP; } _IO_doallocbuf (fp); @@ -464,7 +465,7 @@ _IO_old_file_seekoff (FILE *fp, off64_t offset, int dir, int mode) /* It could be that we already have a pushback buffer. */ if (fp->_IO_read_base != NULL) { - free (fp->_IO_read_base); + _IO_free_backup_buf (fp, fp->_IO_read_base); fp->_flags &= ~_IO_IN_BACKUP; } _IO_doallocbuf (fp); diff --git a/libio/wfileops.c b/libio/wfileops.c index 16beab1f3a..a96bfa589b 100644 --- a/libio/wfileops.c +++ b/libio/wfileops.c @@ -1,4 +1,5 @@ /* Copyright (C) 1993-2024 Free Software Foundation, Inc. + Copyright The GNU Toolchain Authors. This file is part of the GNU C Library. The GNU C Library is free software; you can redistribute it and/or @@ -175,7 +176,7 @@ _IO_wfile_underflow (FILE *fp) /* Maybe we already have a push back pointer. */ if (fp->_IO_save_base != NULL) { - free (fp->_IO_save_base); + _IO_free_backup_buf (fp, fp->_IO_save_base); fp->_flags &= ~_IO_IN_BACKUP; } _IO_doallocbuf (fp); diff --git a/stdio-common/Makefile b/stdio-common/Makefile index e76e40e587..b1a04fd064 100644 --- a/stdio-common/Makefile +++ b/stdio-common/Makefile @@ -1,4 +1,5 @@ # Copyright (C) 1991-2024 Free Software Foundation, Inc. +# Copyright The GNU Toolchain Authors. # This file is part of the GNU C Library. # The GNU C Library is free software; you can redistribute it and/or @@ -303,6 +304,7 @@ tests := \ tst-tmpnam \ tst-ungetc \ tst-ungetc-leak \ + tst-ungetc-nomem \ tst-unlockedio \ tst-vfprintf-mbs-prec \ tst-vfprintf-user-type \ diff --git a/stdio-common/tst-ungetc-nomem.c b/stdio-common/tst-ungetc-nomem.c new file mode 100644 index 0000000000..49db33fff3 --- /dev/null +++ b/stdio-common/tst-ungetc-nomem.c @@ -0,0 +1,115 @@ +/* Test ungetc behavior with malloc failures. + Copyright The GNU Toolchain Authors. + This file is part of the GNU C Library. + + The GNU C Library is free software; you can redistribute it and/or + modify it under the terms of the GNU Lesser General Public + License as published by the Free Software Foundation; either + version 2.1 of the License, or (at your option) any later version. + + The GNU C Library is distributed in the hope that it will be useful, + but WITHOUT ANY WARRANTY; without even the implied warranty of + MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU + Lesser General Public License for more details. + + You should have received a copy of the GNU Lesser General Public + License along with the GNU C Library; if not, see + <https://www.gnu.org/licenses/>. */ + +#include <stdio.h> +#include <string.h> +#include <support/check.h> +#include <support/support.h> +#include <support/temp_file.h> +#include <support/xstdio.h> + +extern void *__libc_malloc (size_t) + __attribute__ ((malloc)) __attribute__ ((alloc_size (1))); + +static volatile bool fail = false; + +void * +malloc (size_t sz) +{ + if (fail) + return NULL; + + return __libc_malloc (sz); +} + +static int +do_test (void) +{ + char *filename = NULL; + struct stat props = {}; + size_t bufsz = 0; + + create_temp_file ("tst-ungetc-nomem.", &filename); + if (stat (filename, &props) != 0) + FAIL_EXIT1 ("Could not get file status: %m\n"); + + FILE *fp = fopen (filename, "w"); + + /* The libio buffer sizes are the same as block size. */ + bufsz = props.st_blksize + 2; + + char *buf = xmalloc (bufsz); + memset (buf, 'a', bufsz); + + if (fwrite (buf, sizeof (char), bufsz, fp) != bufsz) + FAIL_EXIT1 ("fwrite failed: %m\n"); + xfclose (fp); + + /* Begin test. */ + fp = xfopen (filename, "r"); + + while (!feof (fp)) + { + /* Reset the pushback buffer state. */ + fseek (fp, 0, SEEK_CUR); + + fail = true; + /* 1: First ungetc should always succeed, as the standard requires. */ + TEST_COMPARE (ungetc ('y', fp), 'y'); + + /* 2: This will result in resizing, which should fail. */ + TEST_COMPARE (ungetc ('w', fp), EOF); + + /* 3: Now allow the resizing, which should immediately fill up the buffer + too, since this allocates only double the current buffer, i.e. + 2-bytes. */ + fail = false; + TEST_COMPARE (ungetc ('x', fp), 'x'); + + /* 4: And fail again because this again forces an alloc, which fails. */ + fail = true; + TEST_COMPARE (ungetc ('x', fp), EOF); + + /* 5: Enable allocations again so that we now get a 4-byte buffer. Now + both calls should work. */ + fail = false; + TEST_COMPARE (ungetc ('x', fp), 'x'); + fail = true; + TEST_COMPARE (ungetc ('x', fp), 'x'); + + /* Drain out the x's. */ + TEST_COMPARE (fgetc (fp), 'x'); + TEST_COMPARE (fgetc (fp), 'x'); + TEST_COMPARE (fgetc (fp), 'x'); + + /* Finally, drain out the first char we had pushed back, followed by one more char + from the stream, if present. */ + TEST_COMPARE (fgetc (fp), 'y'); + char c = fgetc (fp); + if (!feof (fp)) + TEST_COMPARE (c, 'a'); + } + + /* Final sanity check before we're done. */ + TEST_COMPARE (ferror (fp), 0); + xfclose (fp); + + return 0; +} + +#include <support/test-driver.c>
The C standard requires that ungetc guarantees at least one pushback, so put a single byte pushback buffer in the FILE struct to enable that. Signed-off-by: Siddhesh Poyarekar <siddhesh@sourceware.org> --- Changes from v3: - Shrunk _flags2 and moved _short_backupbuf to the old FILE struct. Changes from v2: - Fixed nits - Used _IO_free_backup_buf in oldfileops and wfileops as well. - Enhanced test to try some more cases Changes from v1: - Drop ungetwc from scope of the patchset - Fixed nits - Retain old behaviour for legacy applications - Minimize changes to fileops - Namespace-ize free_backup_buf - Add a test to verify that the subsequent malloc failure results in ungetc failure too - Add GNU Toolchain Authors copyright notice. libio/bits/types/struct_FILE.h | 4 +- libio/fileops.c | 7 +- libio/genops.c | 23 +++++-- libio/libioP.h | 13 ++-- libio/oldfileops.c | 5 +- libio/wfileops.c | 3 +- stdio-common/Makefile | 2 + stdio-common/tst-ungetc-nomem.c | 115 ++++++++++++++++++++++++++++++++ 8 files changed, 154 insertions(+), 18 deletions(-) create mode 100644 stdio-common/tst-ungetc-nomem.c