Message ID | 56E17C8E.1070209@redhat.com |
---|---|
State | New |
On 03/11/2016 10:52 PM, Roland McGrath wrote:
> Justify with clear rationale.

It fixes bug 4099. We need an arbitrary limit for that.

The libstdc++ buffer size is 8192 (or 8191), so this makes buffering
more consistent across the system.

The PostgreSQL people did extensive benchmarks to determine their
block/page size, and settled for a 8192 (but they do not use stdio
streams, for obvious reasons).

<stdio.h> documents BUFSIZ as the default buffer size. The new
implementation matches that.

Additional memory consumption is limited because file descriptors are a
scarce resource.

I can do some benchmarking, but I don't expect any compelling results.

Florian
On 03/14/2016 07:08 AM, Florian Weimer wrote:
> On 03/11/2016 10:52 PM, Roland McGrath wrote:
>> Justify with clear rationale.
>
> It fixes bug 4099. We need an arbitrary limit for that.
>
> The libstdc++ buffer size is 8192 (or 8191), so this makes buffering
> more consistent across the system.
>
> The PostgreSQL people did extensive benchmarks to determine their
> block/page size, and settled for a 8192 (but they do not use stdio
> streams, for obvious reasons).
>
> <stdio.h> documents BUFSIZ as the default buffer size. The new
> implementation matches that.
>
> Additional memory consumption is limited because file descriptors are a
> scarce resource.
>
> I can do some benchmarking, but I don't expect any compelling results.

I don't know that benchmarking is required, Roland just asked for clear
rationale. However, it would be wonderful if you added a microbenchmark
just to make sure we don't actually cause any unforeseen problems. This
way people can run such a benchmark again on their remote filesystems
and give us results.

Your answer seems clear enough to me. I agree with it too.

The advertised st_blksize is useful only in the abstract. The runtime
has to pick something which works well with the current implementation
as a whole.

The only objection I might see is that this is actually a Linux-specific
tuning that you've done. Nobody knows if this tuning has any impact on
Hurd or not.

I would consider this OK to check in only if you provide a detailed
comment that talks about the tradeoffs being made here and why
_IO_BUFSIZ was chosen.

In summary:
- Add comment just above setting _IO_BUFSIZ about tradeoff [Required]
- Add microbenchmark to avoid surprises [Optional]
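
For anyone who wants to try the optional microbenchmark idea on their own filesystem, a minimal sketch might look like the following. It is not part of the patch or of glibc's benchtests; the function name time_buffered_write is hypothetical, and it sets the buffer size explicitly with setvbuf rather than measuring the library default, so it only approximates the effect of the change.

```c
#define _POSIX_C_SOURCE 200809L  /* For clock_gettime.  */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: write TOTAL bytes to PATH one byte at a time
   through a fully buffered stream whose buffer is BUFSIZE bytes.
   Return the elapsed wall-clock seconds, or a negative value on error.  */
double
time_buffered_write (const char *path, size_t bufsize, size_t total)
{
  FILE *fp = fopen (path, "w");
  if (fp == NULL)
    return -1.0;

  /* The buffer must stay valid until the stream is closed.  */
  char *buf = malloc (bufsize);
  if (buf == NULL || setvbuf (fp, buf, _IOFBF, bufsize) != 0)
    {
      fclose (fp);
      free (buf);
      return -1.0;
    }

  struct timespec start, end;
  clock_gettime (CLOCK_MONOTONIC, &start);
  for (size_t i = 0; i < total; i++)
    putc ('x', fp);
  /* fclose flushes whatever is still sitting in the buffer.  */
  int failed = ferror (fp) != 0;
  failed |= fclose (fp) != 0;
  clock_gettime (CLOCK_MONOTONIC, &end);
  free (buf);

  if (failed)
    return -1.0;
  return (double) (end.tv_sec - start.tv_sec)
         + (double) (end.tv_nsec - start.tv_nsec) / 1e9;
}
```

Calling it in a loop over sizes such as 4096, 8192 and a large st_blksize-like value, with the output file placed on the filesystem of interest, would give the kind of comparison asked for here.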
> On 03/11/2016 10:52 PM, Roland McGrath wrote:
> > Justify with clear rationale.
>
> It fixes bug 4099. We need an arbitrary limit for that.

That is justification for imposing an arbitrary maximum on the
automatically-chosen size. Similar logic on the other side of the coin is
justification for imposing an arbitrary minimum on the automatically-chosen
size. Neither is justification for always using a single fixed size.

> The libstdc++ buffer size is 8192 (or 8191), so this makes buffering
> more consistent across the system.

That's an internal implementation choice in libstdc++. There is no reason
to expect it to stay the same, nor special reason to think that just because
libstdc++ chose it that it's ideal.

> The PostgreSQL people did extensive benchmarks to determine their
> block/page size, and settled for a 8192 (but they do not use stdio
> streams, for obvious reasons).

That's lovely. They can inform the implementors of whatever filesystem(s)
they were using in their benchmarks that st_blksize=8192 is what they should
be reporting.

> <stdio.h> documents BUFSIZ as the default buffer size. The new
> implementation matches that.

It's the default in the sense that it's what setbuf uses. So it's a
permanent part of the ABI and therefore can't be changed easily regardless
of whether it's a desirable value. If the comments or other documentation
are unclear as to the true (very tiny) significance of BUFSIZ, they should
be fixed.

> Additional memory consumption is limited because file descriptors are a
> scarce resource.

There is no reason to consider file descriptors scarce. The per-process
limit is fungible.

> I can do some benchmarking, but I don't expect any compelling results.

Whatever the results, they would not IMHO be relevant here.

POSIX specifies that st_blksize is the "preferred I/O block size for this
object". It's the kernel's responsibility to give userland good advice
through this channel. If there are common buggy kernels that give bad
advice, that is a reason to apply upper and lower limits to the advice from
the kernel. But the expectation should be that the kernel gets fixed to
give good advice, and the optimal thing to do with a good kernel is to
follow its advice.

Since the recommended use of st_blksize in this way is a standard user
feature and not just what stdio's implementation happens to do, there is an
argument to be made that the limiting of the value should be done in the
*stat functions reported st_blksize values rather than in stdio's use of
them. (I'm ambivalent about this point.)

Thanks,
Roland
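
To see what advice the kernel actually gives for a particular file, something along these lines can be used. This is only an illustrative sketch: reported_blksize and print_sizes are hypothetical names, and the values printed depend entirely on the kernel and filesystem in use.

```c
#include <stdio.h>
#include <sys/stat.h>

/* Hypothetical helper: return the st_blksize that stat reports for
   PATH, or -1 if stat fails.  */
long
reported_blksize (const char *path)
{
  struct stat st;
  if (stat (path, &st) != 0)
    return -1;
  return (long) st.st_blksize;
}

/* Print the kernel's preferred I/O block size for PATH next to the
   BUFSIZ constant from <stdio.h>.  */
void
print_sizes (const char *path)
{
  printf ("%s: st_blksize=%ld BUFSIZ=%d\n",
          path, reported_blksize (path), (int) BUFSIZ);
}
```

Running print_sizes on files from different filesystems (local disk, NFS, and so on) shows how far the reported values can differ from BUFSIZ.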
On 03/18/2016 11:52 PM, Roland McGrath wrote:
> Whatever the results, they would not IMHO be relevant here.
>
> POSIX specifies that st_blksize is the "preferred I/O block size for this
> object". It's the kernel's responsibility to give userland good advice
> through this channel. If there are common buggy kernels that give bad
> advice, that is a reason to apply upper and lower limits to the advice from
> the kernel. But the expectation should be that the kernel gets fixed to
> give good advice, and the optimal thing to do with a good kernel is to
> follow its advice.
>
> Since the recommended use of st_blksize in this way is a standard user
> feature and not just what stdio's implementation happens to do, there is an
> argument to be made that the limiting of the value should be done in the
> *stat functions reported st_blksize values rather than in stdio's use of
> them. (I'm ambivalent about this point.)

That's a good point. I'll try to get feedback from kernel file system
developers on this matter.

Thanks,
Florian
On Fri, Mar 18, 2016 at 03:52:58PM -0700, Roland McGrath wrote:
> > I can do some benchmarking, but I don't expect any compelling results.
>
> Whatever the results, they would not IMHO be relevant here.
>
> POSIX specifies that st_blksize is the "preferred I/O block size for this
> object". It's the kernel's responsibility to give userland good advice
> through this channel. If there are common buggy kernels that give bad
> advice, that is a reason to apply upper and lower limits to the advice from
> the kernel. But the expectation should be that the kernel gets fixed to
> give good advice, and the optimal thing to do with a good kernel is to
> follow its advice.
>
> Since the recommended use of st_blksize in this way is a standard user
> feature and not just what stdio's implementation happens to do, there is an
> argument to be made that the limiting of the value should be done in the
> *stat functions reported st_blksize values rather than in stdio's use of
> them. (I'm ambivalent about this point.)

Regardless of st_blksize being "the preferred size", it's not suitable
for stdio, at least not for read purposes, because sparse/random
access reads are a valid application usage for stdio. Reading an
unboundedly large "optimal" block size, only to use one byte and throw
the rest away, is unacceptably pessimistic behavior and is the whole
point behind bug 4099.

If you insist on keeping unboundedly large buffers honoring
st_blksize, one option would be to only use the full buffer for
writing, and limit it to 4k or 8k for reading. But I think it's best
to just ignore st_blksize and use a reasonable buffer size all the
time.

Rich
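
The sparse-read pattern described above can be illustrated from the application side. Each seek followed by a one-byte read may refill the whole stream buffer, so N scattered reads can transfer up to N times the buffer size. The sketch below shows how an application could cap that cost itself with setvbuf, independent of what the library picks by default. The names open_for_sparse_reads and read_byte_at are hypothetical. Because the C standard leaves it to the implementation how a size passed together with a null buffer is used, the sketch passes a caller-owned buffer.

```c
#include <stdio.h>

/* Hypothetical helper: open PATH for reading and make the stream use
   the caller-supplied buffer BUF of SIZE bytes.  BUF must remain valid
   until the stream is closed.  Return NULL on failure.  */
FILE *
open_for_sparse_reads (const char *path, char *buf, size_t size)
{
  FILE *fp = fopen (path, "rb");
  if (fp == NULL)
    return NULL;
  /* setvbuf has to be called before any other operation on the stream.  */
  if (setvbuf (fp, buf, _IOFBF, size) != 0)
    {
      fclose (fp);
      return NULL;
    }
  return fp;
}

/* Read the single byte at OFFSET.  Return it as an unsigned char
   converted to int, or EOF on error or end of file.  At most one
   buffer's worth of data needs to be fetched for the read.  */
int
read_byte_at (FILE *fp, long offset)
{
  if (fseek (fp, offset, SEEK_SET) != 0)
    return EOF;
  return getc (fp);
}
```

With a 4096-byte buffer, a thousand scattered one-byte reads fetch at most about 4 MB; with a buffer sized from a 2 MB st_blksize, the same reads could fetch up to about 2 GB.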
On 04/01/2016 02:19 PM, Rich Felker wrote:
> On Fri, Mar 18, 2016 at 03:52:58PM -0700, Roland McGrath wrote:
>>> I can do some benchmarking, but I don't expect any compelling results.
>>
>> Whatever the results, they would not IMHO be relevant here.
>>
>> POSIX specifies that st_blksize is the "preferred I/O block size for this
>> object". It's the kernel's responsibility to give userland good advice
>> through this channel. If there are common buggy kernels that give bad
>> advice, that is a reason to apply upper and lower limits to the advice from
>> the kernel. But the expectation should be that the kernel gets fixed to
>> give good advice, and the optimal thing to do with a good kernel is to
>> follow its advice.
>>
>> Since the recommended use of st_blksize in this way is a standard user
>> feature and not just what stdio's implementation happens to do, there is an
>> argument to be made that the limiting of the value should be done in the
>> *stat functions reported st_blksize values rather than in stdio's use of
>> them. (I'm ambivalent about this point.)
>
> Regardless of st_blksize being "the preferred size", it's not suitable
> for stdio, at least not for read purposes, because sparse/random
> access reads are a valid application usage for stdio. Reading an
> unboundedly large "optimal" block size, only to use one byte and throw
> the rest away, is unacceptably pessimistic behavior and is the whole
> point behind bug 4099.
>
> If you insist on keeping unboundedly large buffers honoring
> st_blksize, one option would be to only use the full buffer for
> writing, and limit it to 4k or 8k for reading. But I think it's best
> to just ignore st_blksize and use a reasonable buffer size all the
> time.

I think Roland has a good point though, if the kernel is going to buffer
for you using filesystem-based knowledge, then why doesn't it just report
an st_blksize that's small, say 8192 bytes, given the implementation?
What purpose does it serve to set st_blksize to 2MB?
On Fri, Apr 01, 2016 at 09:15:21PM -0400, Carlos O'Donell wrote:
> On 04/01/2016 02:19 PM, Rich Felker wrote:
> > On Fri, Mar 18, 2016 at 03:52:58PM -0700, Roland McGrath wrote:
> >>> I can do some benchmarking, but I don't expect any compelling results.
> >>
> >> Whatever the results, they would not IMHO be relevant here.
> >>
> >> POSIX specifies that st_blksize is the "preferred I/O block size for this
> >> object". It's the kernel's responsibility to give userland good advice
> >> through this channel. If there are common buggy kernels that give bad
> >> advice, that is a reason to apply upper and lower limits to the advice from
> >> the kernel. But the expectation should be that the kernel gets fixed to
> >> give good advice, and the optimal thing to do with a good kernel is to
> >> follow its advice.
> >>
> >> Since the recommended use of st_blksize in this way is a standard user
> >> feature and not just what stdio's implementation happens to do, there is an
> >> argument to be made that the limiting of the value should be done in the
> >> *stat functions reported st_blksize values rather than in stdio's use of
> >> them. (I'm ambivalent about this point.)
> >
> > Regardless of st_blksize being "the preferred size", it's not suitable
> > for stdio, at least not for read purposes, because sparse/random
> > access reads are a valid application usage for stdio. Reading an
> > unboundedly large "optimal" block size, only to use one byte and throw
> > the rest away, is unacceptably pessimistic behavior and is the whole
> > point behind bug 4099.
> >
> > If you insist on keeping unboundedly large buffers honoring
> > st_blksize, one option would be to only use the full buffer for
> > writing, and limit it to 4k or 8k for reading. But I think it's best
> > to just ignore st_blksize and use a reasonable buffer size all the
> > time.
>
> I think Roland has a good point though, if the kernel is going to buffer
> for you using filesystem-based knowledge, then why doesn't it just report
> an st_blksize that's small, say 8192 bytes, given the implementation?
> What purpose does it serve to set st_blksize to 2MB?

Oh, I totally agree that the kernel is being stupid here. But it's
also stupid for glibc to honor values that obviously do not make sense
for the usage case at hand (reading when you can't know whether you'll
throw most of the result away).

Rich
> Oh, I totally agree that the kernel is being stupid here. But it's
> also stupid for glibc to honor values that obviously do not make sense
> for the usage case at hand (reading when you can't know whether you'll
> throw most of the result away).

Hence the only actual suggestion I made: apply fixed lower and upper bounds
to the st_blksize value.
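
The bounding suggestion could be sketched roughly as follows. The thread does not settle on concrete limits, so BLKSIZE_MIN and BLKSIZE_MAX below are placeholder values chosen purely for illustration, and clamp_blksize is a hypothetical name rather than anything in glibc.

```c
#include <stddef.h>

/* Placeholder bounds, assumed only for this illustration.  */
#define BLKSIZE_MIN 4096
#define BLKSIZE_MAX 65536

/* Hypothetical helper: take the st_blksize value reported by the
   kernel and force it into [BLKSIZE_MIN, BLKSIZE_MAX].  Zero or
   negative values are treated as bad advice and raised to the
   minimum.  */
size_t
clamp_blksize (long blksize)
{
  if (blksize < BLKSIZE_MIN)
    return BLKSIZE_MIN;
  if (blksize > BLKSIZE_MAX)
    return BLKSIZE_MAX;
  return (size_t) blksize;
}
```

Under these assumed limits, a reported 8192 is used unchanged, while a reported 2 MB is cut down to 65536. The kernel's advice is still followed whenever it falls inside the accepted range.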
2016-03-08  Florian Weimer  <fweimer@redhat.com>

	[BZ #4099]
	* libio/filedoalloc.c (_IO_file_doallocate): Always use _IO_BUFSIZ
	as the buffer size.

diff --git a/libio/filedoalloc.c b/libio/filedoalloc.c
index 4f9d738..74ff79b 100644
--- a/libio/filedoalloc.c
+++ b/libio/filedoalloc.c
@@ -56,8 +56,6 @@
 /* Modified for GNU iostream by Per Bothner 1991, 1992. */
 
 #include "libioP.h"
-#include <device-nrs.h>
-#include <sys/stat.h>
 #include <stdlib.h>
 #include <unistd.h>
 
@@ -72,36 +70,17 @@ local_isatty (int fd)
 }
 
 /* Allocate a file buffer, or switch to unbuffered I/O. Streams for
-   TTY devices default to line buffered. */
+ * TTY devices default to line buffered. */
 int
 _IO_file_doallocate (_IO_FILE *fp)
 {
-  _IO_size_t size;
-  char *p;
-  struct stat64 st;
-
-  size = _IO_BUFSIZ;
-  if (fp->_fileno >= 0 && __builtin_expect (_IO_SYSSTAT (fp, &st), 0) >= 0)
-    {
-      if (S_ISCHR (st.st_mode))
-        {
-          /* Possibly a tty. */
-          if (
-#ifdef DEV_TTY_P
-              DEV_TTY_P (&st) ||
-#endif
-              local_isatty (fp->_fileno))
-            fp->_flags |= _IO_LINE_BUF;
-        }
-#if _IO_HAVE_ST_BLKSIZE
-      if (st.st_blksize > 0)
-        size = st.st_blksize;
-#endif
-    }
-  p = malloc (size);
+  /* Switch to line buffering for TTYs. */
+  if (fp->_fileno >= 0 && local_isatty (fp->_fileno))
+    fp->_flags |= _IO_LINE_BUF;
+  char *p = malloc (_IO_BUFSIZ);
   if (__glibc_unlikely (p == NULL))
     return EOF;
-  _IO_setb (fp, p, p + size, 1);
+  _IO_setb (fp, p, p + _IO_BUFSIZ, 1);
   return 1;
 }
 libc_hidden_def (_IO_file_doallocate)
-- 
2.4.3