Message ID | 4b6f06e1-e1ad-9489-5311-5be78b1ae1dd@netcologne.de |
---|---|
State | New |
Headers | show |
Series | [libfortran] Adjust block size for libgfortran for unformatted reads | expand |
Am 07.07.19 um 22:13 schrieb Thomas Koenig: > Hello world, > > the attached patch sets the I/O block size for unformatted files to > 2**17 and makes this, and the block size for formatted files, > adjustable via environment variables. > > The main reason is that -fconvert=big-endian was quite slow on > some HPC systems. A bigger buffer should eliminate that. Also, > People who use unformatted files are likely to write large amounts > of data, so this seems like a good fit. Finally, some benchmarking > showed that 131072 seemed like a good value to use. Thanks to Jerry > for support. > > I didn't change the value for formatted files because, frankly, we are > using a lot of CPU for converting numbers there, so any gain > negligible (unless somebody comes up with a benchmark which says > otherwise). formatted write: Did you try writing to an USB stick or similar? I guess for flash based devices anything below 64k will slow down operation. Computers like Raspberry Pi and the like often have flash based storage, and it is not extremely unrealistic to run fortran programs on them. Cheers. Manfred > > As this is a change in behavior / new feature, I don't think that > backporting is indicated, but if somebody feels otherwise, please > speak up. > > Regression-tested. OK for trunk? > > Regards > > Thomas > > 2019-07-07 Thomas König <tkoenig@gcc.gnu.org> > > PR libfortran/91030 > * gfortran.texi (GFORTRAN_BUFFER_SIZE_FORMATTED): Document > (GFORTRAN_BUFFER_SIZE_FORMATTED): Likewise. > > 2019-07-07 Thomas König <tkoenig@gcc.gnu.org> > > PR libfortran/91030 > * io/unix.c (BUFFER_SIZE): Delete. > (BUFFER_SIZE_FORMATTED_DEFAULT): New variable. > (BUFFER_SIZE_UNFORMATTED_DEFAULT): New variable. > (unix_stream): Add buffer_size. > (buf_read): Use s->buffer_size instead of BUFFER_SIZE. > (buf_write): Likewise. > (buf_init): Add argument unformatted. Handle block sizes > for unformatted vs. formatted, using defaults if provided. > (fd_to_stream): Add argument unformatted in call to buf_init. > * libgfortran.h (options_t): Add buffer_size_formatted and > buffer_size_unformatted. > * runtime/environ.c (variable_table): Add > GFORTRAN_BUFFER_SIZE_UNFORMATTED and GFORTRAN_BUFFER_SIZE_FORMATTED. >
On Mon, Jul 8, 2019 at 11:18 AM Manfred Schwarb <manfred99@gmx.ch> wrote: > > Am 07.07.19 um 22:13 schrieb Thomas Koenig: > > Hello world, > > > > the attached patch sets the I/O block size for unformatted files to > > 2**17 and makes this, and the block size for formatted files, > > adjustable via environment variables. > > > > The main reason is that -fconvert=big-endian was quite slow on > > some HPC systems. A bigger buffer should eliminate that. Also, > > People who use unformatted files are likely to write large amounts > > of data, so this seems like a good fit. Finally, some benchmarking > > showed that 131072 seemed like a good value to use. Thanks to Jerry > > for support. > > > > I didn't change the value for formatted files because, frankly, we are > > using a lot of CPU for converting numbers there, so any gain > > negligible (unless somebody comes up with a benchmark which says > > otherwise). > > formatted write: Did you try writing to an USB stick or similar? I guess > for flash based devices anything below 64k will slow down operation. > Computers like Raspberry Pi and the like often have flash based storage, > and it is not extremely unrealistic to run fortran programs on them. Good point. If you happen to have a USB stick handy, can you try the simple C benchmark program at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91030#c38 ? (the kernel will coalesce IO's by itself, so the granularity of IO syscalls is not necessarily the same as the actual IO to devices. Network filesystems like NFS/Lustre/GPFS may have less latitude here due to coherency requirements etc.) -- Janne Blomqvist
On Sun, Jul 7, 2019 at 11:13 PM Thomas Koenig <tkoenig@netcologne.de> wrote: > > Hello world, > > the attached patch sets the I/O block size for unformatted files to > 2**17 and makes this, and the block size for formatted files, > adjustable via environment variables. Should the default be affected by the page size (sysconf(_SC_PAGESIZE)) and/or the IO blocksize (we already fstat() a file when we open it, so we could get st_blksize for free)? Though one could of course argue those aren't particularly useful; except for power, everything uses a 4k page size (?), and the IO blocksize varies from too small (4k, ext4) to too large (1 MB, NFS (or whatever rsize/wsize is set to), or 4 MB for Lustre). > 2019-07-07 Thomas König <tkoenig@gcc.gnu.org> > > PR libfortran/91030 > * gfortran.texi (GFORTRAN_BUFFER_SIZE_FORMATTED): Document > (GFORTRAN_BUFFER_SIZE_FORMATTED): Likewise. One of these should be _UNFORMATTED?
On Mon, Jul 08, 2019 at 04:02:17PM +0300, Janne Blomqvist wrote: > > Good point. If you happen to have a USB stick handy, can you try the > simple C benchmark program at > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91030#c38 ? > > (the kernel will coalesce IO's by itself, so the granularity of IO > syscalls is not necessarily the same as the actual IO to devices. > Network filesystems like NFS/Lustre/GPFS may have less latitude here > due to coherency requirements etc.) > Writing to USB is constrained be the speed of the bus. kernel: ugen6.3: <Kingston DataTraveler 3.0> at usbus6 kernel: umass1 on uhub0 kernel: umass1: <Kingston DataTraveler 3.0, class 0/0, rev 2.10/0.01, addr 3> on usbus6 kernel: da1 at umass-sim1 bus 1 scbus5 target 0 lun 0 kernel: da1: <Kingston DataTraveler 3.0 > Removable Direct Access SPC-4 SCSI device kernel: da1: Serial Number 0C9D927F0A77F330A89D1210 kernel: da1: 40.000MB/s transfers kernel: da1: 29510MB (60437492 512 byte sectors) kernel: da1: quirks=0x2<NO_6_BYTE> Mounting the thumb drive as a MSDOSFS on FreeBSD. Test using 100000000 bytes Block size of file system: 16384 bs = 1024, 8.86 MiB/s bs = 2048, 7.52 MiB/s bs = 4096, 7.32 MiB/s bs = 8192, 7.32 MiB/s bs = 16384, 7.40 MiB/s bs = 32768, 7.40 MiB/s bs = 65536, 5.57 MiB/s bs = 131072, 7.43 MiB/s bs = 262144, 5.55 MiB/s bs = 524288, 7.43 MiB/s bs = 1048576, 5.56 MiB/s bs = 2097152, 7.39 MiB/s bs = 4194304, 7.62 MiB/s bs = 8388608, 5.58 MiB/s bs = 16777216, 7.42 MiB/s bs = 33554432, 5.59 MiB/s bs = 67108864, 4.77 MiB/s For the same binary, writing to a UFS2 on a SATAII SSD Test using 100000000 bytes Block size of file system: 32768 bs = 1024, 123.09 MiB/s bs = 2048, 210.30 MiB/s bs = 4096, 184.75 MiB/s bs = 8192, 237.28 MiB/s bs = 16384, 244.42 MiB/s bs = 32768, 243.20 MiB/s bs = 65536, 253.77 MiB/s bs = 131072, 243.32 MiB/s bs = 262144, 238.81 MiB/s bs = 524288, 243.87 MiB/s bs = 1048576, 246.55 MiB/s bs = 2097152, 242.39 MiB/s bs = 4194304, 243.68 MiB/s bs = 8388608, 243.88 MiB/s bs = 16777216, 242.21 MiB/s bs = 33554432, 253.19 MiB/s bs = 67108864, 178.29 MiB/s
On Mon, 8 Jul 2019 09:05:04 -0700 Steve Kargl <sgk@troutmask.apl.washington.edu> wrote: > On Mon, Jul 08, 2019 at 04:02:17PM +0300, Janne Blomqvist wrote: > > > > Good point. If you happen to have a USB stick handy, can you try the > > simple C benchmark program at > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91030#c38 ? > > > > (the kernel will coalesce IO's by itself, so the granularity of IO > > syscalls is not necessarily the same as the actual IO to devices. > > Network filesystems like NFS/Lustre/GPFS may have less latitude here > > due to coherency requirements etc.) GFORTRAN_BUFFER_SIZE_FORMATTED sounds a bit odd, maybe GFORTRAN_(UN)FORMATTED_BUFFER_SIZE would read more natural. And let me note 2 flaws in the benchmark: > left = N; > w = p; > t1 = walltime(); > while (left > 0) > { > if (left >= blocksize) > to_write = blocksize; > else > to_write = left; > > write (fd, w, blocksize); 1) this should write to_write, not blocksize i assume. 2) you don't catch short writes > w += to_write; > left -= to_write; > } So, short of using iozone, it should probably be more like (modulo typos): left = N; w = p; t1 = walltime(); while (left > 0) { if (left >= blocksize) to_write = blocksize; else to_write = left; while (to_write > 0) { errno = 0; ssize_t wrote = write (fd, w, to_write); if (wrote < 0 && errno != EINTR) /* retry EINTR or bail */ break; w += wrote; left -= wrote; to_write -= wrote; } } thanks,
OK, so here is a new version. I think the discussion has shown that enlaring the buffer makes sense, and that the buffer size for unformatted seems to be too bad. I've reversed the names of the environment variables according to Behnard's suggestion. So, OK for trunk? Also, what should we do about gcc-9? I have now come to think that we should add the environment variables to set the buffer lengths, but leave the old default (8192). What do you think? Regards Thomas 2019-07-14 Thomas König <tkoenig@gcc.gnu.org> PR libfortran/91030 * gfortran.texi (GFORTRAN_FORMATTED_BUFFER_SIZE): Document (GFORTRAN_UNFORMATTED_BUFFER_SIZE): Likewise. 2019-07-14 Thomas König <tkoenig@gcc.gnu.org> PR libfortran/91030 * io/unix.c (BUFFER_SIZE): Delete. (BUFFER_FORMATTED_SIZE_DEFAULT): New variable. (BUFFER_UNFORMATTED_SIZE_DEFAULT): New variable. (unix_stream): Add buffer_size. (buf_read): Use s->buffer_size instead of BUFFER_SIZE. (buf_write): Likewise. (buf_init): Add argument unformatted. Handle block sizes for unformatted vs. formatted, using defaults if provided. (fd_to_stream): Add argument unformatted in call to buf_init. * libgfortran.h (options_t): Add buffer_size_formatted and buffer_size_unformatted. * runtime/environ.c (variable_table): Add GFORTRAN_UNFORMATTED_BUFFER_SIZE and GFORTRAN_FORMATTED_BUFFER_SIZE.
... of course, better with the actual patch.
On Sun, Jul 14, 2019 at 12:07:58PM +0200, Thomas Koenig wrote: > OK, so here is a new version. > > I think the discussion has shown that enlaring the buffer makes sense, > and that the buffer size for unformatted seems to be too bad. > > I've reversed the names of the environment variables according to > Behnard's suggestion. > > So, OK for trunk? > > Also, what should we do about gcc-9? I have now come to think > that we should add the environment variables to set the buffer lengths, > but leave the old default (8192). > > What do you think? > If you are inclined to back port a portion of the patch to 9-branch, then bumping up the old default would seem to be the most important part. As dje noted, users seem to have an aversion to reading the documentation, so finding the environment variables may not happen. Isn't 8192 an internal implementation detail for libgfortran? Can bumping it to larger value in 9-branch cause an issue for a normal user?
Hi Steve, > On Sun, Jul 14, 2019 at 12:07:58PM +0200, Thomas Koenig wrote: >> OK, so here is a new version. >> >> I think the discussion has shown that enlaring the buffer makes sense, >> and that the buffer size for unformatted seems to be too bad. >> >> I've reversed the names of the environment variables according to >> Behnard's suggestion. >> >> So, OK for trunk? >> >> Also, what should we do about gcc-9? I have now come to think >> that we should add the environment variables to set the buffer lengths, >> but leave the old default (8192). >> >> What do you think? >> > > If you are inclined to back port a portion of the patch to 9-branch, > then bumping up the old default would seem to be the most important > part. As dje noted, users seem to have an aversion to reading the > documentation, so finding the environment variables may not happen. > > Isn't 8192 an internal implementation detail for libgfortran? Can > bumping it to larger value in 9-branch cause an issue for a normal > user? Well, it allocates a bigger memory block, that's all. Upon reconsideration, I think your point about people not reading the docs is valid :-| So, I will commit the patch to trunk over the weekend and to 9.2 a few days afterwards, unless somebody objects. Regards Thomas
Index: gcc/fortran/gfortran.texi =================================================================== --- gcc/fortran/gfortran.texi (Revision 273183) +++ gcc/fortran/gfortran.texi (Arbeitskopie) @@ -611,6 +611,8 @@ Malformed environment variables are silently ignor * GFORTRAN_LIST_SEPARATOR:: Separator for list output * GFORTRAN_CONVERT_UNIT:: Set endianness for unformatted I/O * GFORTRAN_ERROR_BACKTRACE:: Show backtrace on run-time errors +* GFORTRAN_BUFFER_SIZE_FORMATTED:: Buffer size for formatted files. +* GFORTRAN_BUFFER_SIZE_UNFORMATTED:: Buffer size for unformatted files. @end menu @node TMPDIR @@ -782,6 +784,20 @@ the backtracing, set the variable to @samp{n}, @sa Default is to print a backtrace unless the @option{-fno-backtrace} compile option was used. +@node GFORTRAN_BUFFER_SIZE_FORMATTED +@section @env{GFORTRAN_BUFFER_SIZE_FORMATTED}---Set buffer size for formatted I/O + +The @env{GFORTRAN_BUFFER_SIZE_FORMATTED} environment variable +specifies buffer size in bytes to be used for formatted output. +The default value is 8192. + +@node GFORTRAN_BUFFER_SIZE_UNFORMATTED +@section @env{GFORTRAN_BUFFER_SIZE_UNFORMATTED}---Set buffer size for unformatted I/O + +The @env{GFORTRAN_BUFFER_SIZE_UNFORMATTED} environment variable +specifies buffer size in bytes to be used for unformatted output. +The default value is 131072. + @c ===================================================================== @c PART II: LANGUAGE REFERENCE @c ===================================================================== Index: libgfortran/io/unix.c =================================================================== --- libgfortran/io/unix.c (Revision 273183) +++ libgfortran/io/unix.c (Arbeitskopie) @@ -193,7 +193,8 @@ fallback_access (const char *path, int mode) /* Unix and internal stream I/O module */ -static const int BUFFER_SIZE = 8192; +static const int BUFFER_SIZE_FORMATTED_DEFAULT = 8192; +static const int BUFFER_SIZE_UNFORMATTED_DEFAULT = 128*1024; typedef struct { @@ -205,6 +206,7 @@ typedef struct gfc_offset file_length; /* Length of the file. */ char *buffer; /* Pointer to the buffer. */ + ssize_t buffer_size; /* Length of the buffer. */ int fd; /* The POSIX file descriptor. */ int active; /* Length of valid bytes in the buffer */ @@ -592,9 +594,9 @@ buf_read (unix_stream *s, void *buf, ssize_t nbyte && raw_seek (s, new_logical, SEEK_SET) < 0) return -1; s->buffer_offset = s->physical_offset = new_logical; - if (to_read <= BUFFER_SIZE/2) + if (to_read <= s->buffer_size/2) { - did_read = raw_read (s, s->buffer, BUFFER_SIZE); + did_read = raw_read (s, s->buffer, s->buffer_size); if (likely (did_read >= 0)) { s->physical_offset += did_read; @@ -632,11 +634,11 @@ buf_write (unix_stream *s, const void *buf, ssize_ s->buffer_offset = s->logical_offset; /* Does the data fit into the buffer? As a special case, if the - buffer is empty and the request is bigger than BUFFER_SIZE/2, + buffer is empty and the request is bigger than s->buffer_size/2, write directly. This avoids the case where the buffer would have to be flushed at every write. */ - if (!(s->ndirty == 0 && nbyte > BUFFER_SIZE/2) - && s->logical_offset + nbyte <= s->buffer_offset + BUFFER_SIZE + if (!(s->ndirty == 0 && nbyte > s->buffer_size/2) + && s->logical_offset + nbyte <= s->buffer_offset + s->buffer_size && s->buffer_offset <= s->logical_offset && s->buffer_offset + s->ndirty >= s->logical_offset) { @@ -651,7 +653,7 @@ buf_write (unix_stream *s, const void *buf, ssize_ the request is bigger than the buffer size, write directly bypassing the buffer. */ buf_flush (s); - if (nbyte <= BUFFER_SIZE/2) + if (nbyte <= s->buffer_size/2) { memcpy (s->buffer, buf, nbyte); s->buffer_offset = s->logical_offset; @@ -688,7 +690,7 @@ buf_write (unix_stream *s, const void *buf, ssize_ static int buf_markeor (unix_stream *s) { - if (s->unbuffered || s->ndirty >= BUFFER_SIZE / 2) + if (s->unbuffered || s->ndirty >= s->buffer_size / 2) return buf_flush (s); return 0; } @@ -765,11 +767,32 @@ static const struct stream_vtable buf_vtable = { }; static int -buf_init (unix_stream *s) +buf_init (unix_stream *s, bool unformatted) { s->st.vptr = &buf_vtable; - s->buffer = xmalloc (BUFFER_SIZE); + /* Try to guess a good value for the buffer size. For formatted + I/O, we use so many CPU cycles converting the data that there is + more sense in converving memory and especially cache. For + unformatted, a bigger block can have a large impact in some + environments. */ + + if (unformatted) + { + if (options.buffer_size_unformatted > 0) + s->buffer_size = options.buffer_size_unformatted; + else + s->buffer_size = BUFFER_SIZE_UNFORMATTED_DEFAULT; + } + else + { + if (options.buffer_size_formatted > 0) + s->buffer_size = options.buffer_size_formatted; + else + s->buffer_size = BUFFER_SIZE_FORMATTED_DEFAULT; + } + + s->buffer = xmalloc (s->buffer_size); return 0; } @@ -1120,13 +1143,13 @@ fd_to_stream (int fd, bool unformatted) (s->fd == STDIN_FILENO || s->fd == STDOUT_FILENO || s->fd == STDERR_FILENO))) - buf_init (s); + buf_init (s, unformatted); else { if (unformatted) { s->unbuffered = true; - buf_init (s); + buf_init (s, unformatted); } else raw_init (s); Index: libgfortran/libgfortran.h =================================================================== --- libgfortran/libgfortran.h (Revision 273183) +++ libgfortran/libgfortran.h (Arbeitskopie) @@ -540,6 +540,7 @@ typedef struct int all_unbuffered, unbuffered_preconnected; int fpe, backtrace; + int buffer_size_unformatted, buffer_size_formatted; } options_t; Index: libgfortran/runtime/environ.c =================================================================== --- libgfortran/runtime/environ.c (Revision 273183) +++ libgfortran/runtime/environ.c (Arbeitskopie) @@ -198,6 +198,14 @@ static variable variable_table[] = { /* Print out a backtrace if possible on runtime error */ { "GFORTRAN_ERROR_BACKTRACE", -1, &options.backtrace, init_boolean }, + /* Buffer size for unformatted files. */ + { "GFORTRAN_BUFFER_SIZE_UNFORMATTED", 0, &options.buffer_size_unformatted, + init_integer }, + + /* Buffer size for formatted files. */ + { "GFORTRAN_BUFFER_SIZE_FORMATTED", 0, &options.buffer_size_formatted, + init_integer }, + { NULL, 0, NULL, NULL } };