Message ID | 20140317174452.GA10644@domone |
---|---|
State | New |
Headers | show |
ping On Mon, Mar 17, 2014 at 06:44:52PM +0100, Ondřej Bílka wrote: > On Tue, Mar 11, 2014 at 06:51:55AM -0400, Mike Frysinger wrote: > > On Wed 05 Mar 2014 10:13:31 Ondřej Bílka wrote: > > > On Thu, Feb 27, 2014 at 09:45:37AM -0800, Paul Eggert wrote: > > > > Squashing an inode that way has a small chance of introducing what > > > > could be a serious bug. If glibc is going to squash them, it should > > > > do so reliably, by maintaining a table of all the inodes it's ever > > > > seen and making sure there are no collisions. > > > > > > > > Why bother to squash them at all, though? Programs that care about > > > > files should be compiled with _FILE_OFFSET_BITS defined to 64. If > > > > we're worried about programs that don't define _FILE_OFFSET_BITS, we > > > > could change glibc to default to _FILE_OFFSET_BITS=64; that's a > > > > better long-term solution anyway. > > > > > > Changing default would be better. I dig while how to do it and can not > > > find a nonugly solution. How should we do this? > > > > isn't it simply: > > > > --- a/include/features.h > > +++ b/include/features.h > > @@ -303,7 +303,7 @@ > > # define __USE_LARGEFILE64 1 > > #endif > > > > -#if defined _FILE_OFFSET_BITS && _FILE_OFFSET_BITS == 64 > > +#if !defined _FILE_OFFSET_BITS || _FILE_OFFSET_BITS == 64 > > # define __USE_FILE_OFFSET64 1 > > #endif > > > > > > the glibc source files themselves seem to expect default of > > _FILE_OFFSET_BITS=32 though, so we might have to add that to default CPPFLAGS > > when building. > > > > i kind of feel like deploying this on Gentoo systems to see what breaks ;). > > -mike > > Yes, it is. Could we in this release at least add this warning? > I tried to restrict warning only when _ino_t is used but warning > attribute does not apply to types and using __attribute__ ((deprecated)) > would confuse users. > > * dirent/dirent.h [defined __USE_XOPEN && defined __ino_t_defined > && ! defined __USE_FILE_OFFSET64]: Add warning. > > diff --git a/dirent/dirent.h b/dirent/dirent.h > index f6db6c6..35c0a93 100644 > --- a/dirent/dirent.h > +++ b/dirent/dirent.h > @@ -31,6 +31,7 @@ __BEGIN_DECLS > #ifdef __USE_XOPEN > # ifndef __ino_t_defined > # ifndef __USE_FILE_OFFSET64 > +# warning Please use -D_FILE_OFFSET_BITS=64 to support long files. > typedef __ino_t ino_t; > # else > typedef __ino64_t ino_t;
On Fri, 21 Mar 2014, Ondrej Bilka wrote:
> ping
Scattergun patches changing a random header with a random deprecation (not
even the header most likely to be used to get the type in question, which
is sys/types.h not dirent.h) make no sense here.
The following kinds of patches make sense here as incremental improvements
(they are not necessarily needed for a change in the default, just
somewhat vaguely related, but they also have self-contained justification
independent of any such change or deprecation): (this is not an exhaustive
list)
* Adding *64 versions of interfaces without them (at least fts).
* Refining conditions used in headers (at least fts.h) to reject use in a
particular _FILE_OFFSET_BITS mode. The conditions need to be
conservatively safe. The relevant conditions for fts.h are, I think, that
ino_t and struct stat match between the two interfaces. I don't think any
form of static assertion is needed there; bits/typesizes.h has a macro
__INO_T_MATCHES_INO64_T, and a header could be added that defines a macro
for whether stat and stat64 match (probably with a generic version
defining it if __LP64__, and a MIPS GNU/Linux version that never defines
it, but anyone implementing this should check all systems carefully to
confirm this). Of course, if the previous point were implemented, this
would become irrelevant.
* Fixing the ISO C / POSIX namespace issues for _FILE_OFFSET_BITS=64,
using existing __*64 symbols where already exported at a public symbol
version or adding new ones at a new symbol version if not (bug 14106).
* Stopping _FILE_OFFSET_BITS=64 affecting the C++ ABI on 64-bit
architectures (bug 15766) - although inevitably that could break things
for some existing C++ binaries (but sometimes such mangling breakage seems
unavoidable, e.g. the removal of the "siginfo" tag from siginfo_t).
* Where code internally in glibc and the miscellaneous programs it
installs uses non-large-files interfaces such as fopen or __open, making
it use the versions supporting large files instead.
But for any change to the default or deprecation, we need an in-depth
analysis showing what the effects would be in practice. For all the
library headers in a large GNU/Linux distribution, how many use any
affected type in an ABI, and what settings are those libraries built with?
For all the program and library binaries in a large distribution, how many
use the _FILE_OFFSET_BITS=32 interfaces, how many use the
_FILE_OFFSET_BITS=64 interfaces, how many use interfaces such as fseek and
ftell that are implicitly unsafe in the _FILE_OFFSET_BITS=64 context and
need changing to fseeko / ftello to work reliably in that context? Post
the scripts used so other distributions can adapt them to tell what's
affected in their distributions. If you rebuild a distribution with a
changed default, what builds break?
In principle I think a change to the default is likely to be a good idea,
but first we need a clear understanding of the extent of its impact on
ABIs of non-glibc libraries, and tools that distributors can use to work
out what libraries in their distributions are affected and plan ABI
transitions as needed.
(I'm not asking for a detailed analysis of each affected library, but
lists like "these 50 library packages are affected out of 1000 in
distribution X" would help indicate the scale of the issue. And some
examples for individual cases from distributors, "this is how we'd handle
the transition for library Y so that binaries users built with the old
version of library Y continue to work correctly", would help as well -
such a change means a large whole-system transition that distributors need
to be on-board with to avoid breaking things for their users, or breaking
cross-distribution binary compatibility as far as possible. Details
agreed between distributors of how to handle particular libraries might
then go on a wiki page somewhere in the interests of cross-distribution
binary compatibility.)
On 03/21/2014 06:11 AM, Joseph S. Myers wrote: > * Adding *64 versions of interfaces without them (at least fts). This is a messy situation. GNU applications typically use Gnulib's version of fts, and Gnulib's interface has diverged from libc's. The Gnulib interface and implementation support features that glibc doesn't, e.g., checking for directory loops, and these are used by programs like 'rm -r' and 'chmod -R'. One possible way forward would be to deprecate the current fts API, and switch to the gnulib API (with room for future extensions), retaining binary support for old applications that dynamically link to the new glibc. This would be some work, though. > * Refining conditions used in headers (at least fts.h) to reject use in a > particular _FILE_OFFSET_BITS mode. I looked into this a bit for fts.h, and expect that it's more trouble than it's worth due to glitches similar to what you mentioned for MIPS n64. For example, on 64-bit hosts sysdeps/mach/hurd/bits/stat.h appends slightly different padding at struct stat's end when _FILE_OFFSET_BITS is 64 (why, I don't know), making struct stat incompatible with struct stat64. In practice, people building 64-bit applications rarely define _FILE_OFFSET_BITS to be 64 (because Autoconf and Gnulib don't do that), so it's probably not worth our while to improve support for that combination. It is odd that libc supports two different 64-bit ABIs for 'stat' etc. on all 64-bit platforms, one hardly ever used. It might make sense to deprecate the rarely-used ABI, i.e., have _FILE_OFFSET_BITS=64 be a no-op on 64-bit platforms. This should fix the C++ ABI problem you mentioned (bug #15766). > we need an in-depth > analysis showing what the effects would be in practice. For all the > library headers in a large GNU/Linux distribution, how many use any > affected type in an ABI, and what settings are those libraries built with? It'd take quite some time to generate such an analysis. I grabbed all the Fedora 20 -devel packages that I could (i.e., all those that didn't conflict with something already installed), and ran the attached script while in /usr/include, and this generated 2847 lines of output. Not every line needs to be looked at in detail, but enough do. I can tell you the results of looking at the first 45 lines of output of "file_offset *", though. I found four categories of libraries: 1. These libraries are compiled with -D_FILE_OFFSET_BITS=64, so theycurrently should have problems unless applications are also built with-D_FILE_OFFSET_BITS=64. Changing the default will fix these problems. flac fltk GraphicsMagick hdf5 ImageMagick 2. These libraries use a type like off_t in macros or static or inlinefunctions, but it's not part of the binary API. Changing the defaultto 64-bit will therefore not affect the library's API. Programs recompiled with the new default will have larger off_t etc., just asthey will with glibc. libeplplot InsightToolkit 3. These libraries use a type like off_t in defining a data type that isnot used anywhere (either as part of the API, or internally), so changingthe default shouldn't affect them. GeoIP 4. These libraries use a type like off_t in redeclaring standardfunctions like 'stat', which should continue to work if the defaultis changed. ice From looking at the above, it appears that changing the default _FILE_OFFSET_BITS to 64 will fix more bugs than it'll cause. Of course this is just one small sample. I stopped at 45 lines of grep output because the next lines were big batches from Qt and from X11; these will be a pain to analyze, and it's not clear to me what the results will be. One other data point: on 32-bit platforms Perl and Python both assume _FILE_OFFSET_BITS=64, so I suspect that libraries linked with these programs can have problems if they are compiled without _FILE_OFFSET_BITS=64. #! /bin/sh LC_ALL=C export LC_ALL pat='RLIM_INFINITY blkcnt_t ino_t off_t rlim_t struct rlimit struct dirent struct statfs struct stat struct statvfs fsblkcnt_t fsfilcnt_t fpos_t struct flock struct aiocb fts.h' exec find "${@-.}" -type f -exec grep -nwHF "$pat" {} +
> On 03/21/2014 06:11 AM, Joseph S. Myers wrote: > > * Adding *64 versions of interfaces without them (at least fts). > > This is a messy situation. GNU applications typically use Gnulib's > version of fts, and Gnulib's interface has diverged from libc's. The > Gnulib interface and implementation support features that glibc doesn't, > e.g., checking for directory loops, and these are used by programs like > 'rm -r' and 'chmod -R'. > > One possible way forward would be to deprecate the current fts API, and > switch to the gnulib API (with room for future extensions), retaining > binary support for old applications that dynamically link to the new > glibc. This would be some work, though. I don't think it's so hard. But the first question is if perhaps we shouldn't just deprecate fts entirely and not support a new one in libc. If there is no great benefit to having this in libc vs just everyone building their own from gnulib, that might be simpler. Before deciding where to go, we should see what the current versions of 4.4BSD-derived systems are doing for fts. Compatibility with 4.4BSD libc was the original reason we (I) added fts. > It is odd that libc supports two different 64-bit ABIs for 'stat' etc. > on all 64-bit platforms, one hardly ever used. It might make sense to > deprecate the rarely-used ABI, i.e., have _FILE_OFFSET_BITS=64 be a > no-op on 64-bit platforms. This should fix the C++ ABI problem you > mentioned (bug #15766). I think that was always the intent. Perhaps we should also consider what we ought to be doing better in API/ABI checking that could be made to ensure that we maintain the desired invariant in the future. > 1. These libraries are compiled with -D_FILE_OFFSET_BITS=64, so > theycurrently should have problems unless applications are also built > with-D_FILE_OFFSET_BITS=64. Changing the default will fix these problems. > > flac > fltk > GraphicsMagick > hdf5 > ImageMagick You didn't list elfutils ({-libelf,{-devel}}). That is built with -D_FILE_OFFSET_BITS=64 but uses off64_t et al in its public API headers, so it should not have a problem. I don't know if your method of analysis groks those cases or not. Thanks, Roland
Roland McGrath wrote: > the first question is if perhaps we shouldn't just deprecate fts entirely and not support a new one in libc. That would be easier, yes. > we should see what the current versions of 4.4BSD-derived > systems are doing for fts. It's in the latest versions of FreeBSD (10.0), NetBSD (6.1.3), and OpenBSD (5.4). FreeBSD has some minor extensions that the others (including glibc) lack. > You didn't list elfutils ({-libelf,{-devel}}). That package is free of the problem; my script gave its include files a clean bill of health. In fact I'm puzzled why you mentioned it, as its include files don't use off_t, off64_t, or any of the other types (with or without "64") affected by the problem. Here's a list of elfutils-devel's include files on Fedora 20: dwarf.h elfutils/elf-knowledge.h elfutils/libasm.h elfutils/libdwfl.h elfutils/libdw.h elfutils/libebl.h elfutils/version.h
> That package is free of the problem; my script gave its include files a > clean bill of health. In fact I'm puzzled why you mentioned it, as its > include files don't use off_t, off64_t, or any of the other types (with > or without "64") affected by the problem. I brought it up because it's a case of a library (libelf) that did have an API whose ABI was (accidentally) dependent on the _FILE_OFFSET_BITS setting. The library itself was compiled with -D_FILE_OFFSET_BITS=64 and so on 32-bit platforms it required that its users be compiled that way too. We fixed it back in 2007. I misremembered and thought we'd used off64_t. In fact it uses loff_t. The log entry I wrote at the time said that only loff_t (not off64_t) was defined by <sys/types.h> in all circumstances. Thanks, Roland
diff --git a/dirent/dirent.h b/dirent/dirent.h index f6db6c6..35c0a93 100644 --- a/dirent/dirent.h +++ b/dirent/dirent.h @@ -31,6 +31,7 @@ __BEGIN_DECLS #ifdef __USE_XOPEN # ifndef __ino_t_defined # ifndef __USE_FILE_OFFSET64 +# warning Please use -D_FILE_OFFSET_BITS=64 to support long files. typedef __ino_t ino_t; # else typedef __ino64_t ino_t;