Message ID | 1419931250-19259-7-git-send-email-den@openvz.org |
---|---|
State | New |
On Tue, 12/30 12:20, Denis V. Lunev wrote:
> This sequence works efficiently if FALLOC_FL_ZERO_RANGE is not supported.
>
> Simple fallocate(0) will extend file with zeroes when appropriate in the
> middle of the file if there is a hole there and at the end of the file.
> Unfortunately fallocate(0) does not drop the content of the file if
> there is a data on this offset. Therefore to make the situation consistent
> we should drop the data beforehand. This is done using FALLOC_FL_PUNCH_HOLE
>
> This should increase the performance a bit for not-so-modern kernels or for
> filesystems which do not support FALLOC_FL_ZERO_RANGE.
>
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Kevin Wolf <kwolf@redhat.com>
> CC: Stefan Hajnoczi <stefanha@redhat.com>
> CC: Peter Lieven <pl@kamp.de>
> ---
>  block/raw-posix.c | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
>
> diff --git a/block/raw-posix.c b/block/raw-posix.c
> index 7866d31..96a8678 100644
> --- a/block/raw-posix.c
> +++ b/block/raw-posix.c
> @@ -968,6 +968,23 @@ static ssize_t handle_aiocb_write_zeroes(RawPosixAIOData *aiocb)
>  #endif
>
>      s->has_write_zeroes = false;
> +
> +#ifdef CONFIG_FALLOCATE_PUNCH_HOLE
> +    if (s->has_discard) {
> +        int ret;
> +        ret = do_fallocate(s->fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
> +                           aiocb->aio_offset, aiocb->aio_nbytes);
> +        if (ret < 0) {
> +            if (ret == -ENOTSUP) {
> +                s->has_discard = false;
> +            }
> +            return ret;
> +        }
> +        return do_fallocate(s->fd, 0, aiocb->aio_offset, aiocb->aio_nbytes);

Why is fallocate(0) necessary here? The manpage says:

    Deallocating file space
        Specifying the FALLOC_FL_PUNCH_HOLE flag (available since Linux 2.6.38)
        in mode deallocates space (i.e., creates a hole) in the byte range
        starting at offset and continuing for len bytes. Within the specified
        range, partial file system blocks are zeroed, and whole file system
        blocks are removed from the file. After a successful call, subsequent
        reads from this range will return zeroes.

So the data are already zeroes after FALLOC_FL_PUNCH_HOLE.

Fam

> +    }
> +#endif
> +
> +    s->has_discard = false;
>      return -ENOTSUP;
>  }
>
> --
> 1.9.1
>
>
On 05/01/15 10:02, Fam Zheng wrote:
> On Tue, 12/30 12:20, Denis V. Lunev wrote:
>> This sequence works efficiently if FALLOC_FL_ZERO_RANGE is not supported.
>>
>> Simple fallocate(0) will extend file with zeroes when appropriate in the
>> middle of the file if there is a hole there and at the end of the file.
>> Unfortunately fallocate(0) does not drop the content of the file if
>> there is a data on this offset. Therefore to make the situation consistent
>> we should drop the data beforehand. This is done using FALLOC_FL_PUNCH_HOLE
>>
>> This should increase the performance a bit for not-so-modern kernels or for
>> filesystems which do not support FALLOC_FL_ZERO_RANGE.
>>
>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>> CC: Kevin Wolf <kwolf@redhat.com>
>> CC: Stefan Hajnoczi <stefanha@redhat.com>
>> CC: Peter Lieven <pl@kamp.de>
>> ---
>>  block/raw-posix.c | 17 +++++++++++++++++
>>  1 file changed, 17 insertions(+)
>>
>> diff --git a/block/raw-posix.c b/block/raw-posix.c
>> index 7866d31..96a8678 100644
>> --- a/block/raw-posix.c
>> +++ b/block/raw-posix.c
>> @@ -968,6 +968,23 @@ static ssize_t handle_aiocb_write_zeroes(RawPosixAIOData *aiocb)
>>  #endif
>>
>>      s->has_write_zeroes = false;
>> +
>> +#ifdef CONFIG_FALLOCATE_PUNCH_HOLE
>> +    if (s->has_discard) {
>> +        int ret;
>> +        ret = do_fallocate(s->fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
>> +                           aiocb->aio_offset, aiocb->aio_nbytes);
>> +        if (ret < 0) {
>> +            if (ret == -ENOTSUP) {
>> +                s->has_discard = false;
>> +            }
>> +            return ret;
>> +        }
>> +        return do_fallocate(s->fd, 0, aiocb->aio_offset, aiocb->aio_nbytes);
> Why is fallocate(0) necessary here? The manpage says:
>
>     Deallocating file space
>         Specifying the FALLOC_FL_PUNCH_HOLE flag (available since Linux 2.6.38)
>         in mode deallocates space (i.e., creates a hole) in the byte range
>         starting at offset and continuing for len bytes. Within the specified
>         range, partial file system blocks are zeroed, and whole file system
>         blocks are removed from the file. After a successful call, subsequent
>         reads from this range will return zeroes.
>
> So the data are already zeroes after FALLOC_FL_PUNCH_HOLE.
>
> Fam

These zeroes will have different properties. FALLOC_FL_PUNCH_HOLE
deallocates the disk space in that range, so this call would behave
differently from the other zero-writing methods, which leave the range
allocated. This does not look good to me. The function should leave the
file in the same state whichever internal implementation is used. If the
caller wants FALLOC_FL_PUNCH_HOLE alone, it should call the
handle_aiocb_discard method.
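The allocation difference described in this reply can be observed from user space through st_blocks. The following is only a rough sketch, not QEMU code: measure_allocation(), struct alloc_report and the temporary file path are hypothetical names chosen for illustration, and the numbers reported depend on the filesystem (some filesystems may not support one or both fallocate modes, in which case a negative errno is returned).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical report type: st_blocks (512-byte units) at three points. */
struct alloc_report {
    long long after_write;   /* after writing real data            */
    long long after_punch;   /* after FALLOC_FL_PUNCH_HOLE         */
    long long after_realloc; /* after the follow-up fallocate(0)   */
};

/* Hypothetical helper: returns 0 on success or a negative errno. */
int measure_allocation(struct alloc_report *r)
{
    char path[] = "/tmp/alloc-demo-XXXXXX"; /* assumed scratch location */
    static char buf[65536];
    struct stat st;
    int fd, ret = -EIO;

    assert(r != NULL);
    fd = mkstemp(path);
    if (fd < 0) {
        return -errno;
    }
    unlink(path); /* the file disappears once fd is closed */

    /* Write non-zero data and flush so that blocks are really allocated. */
    memset(buf, 0xAB, sizeof(buf));
    if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf) ||
        fsync(fd) < 0 || fstat(fd, &st) < 0) {
        goto out;
    }
    r->after_write = (long long)st.st_blocks;

    /* Punch a hole: reads return zeroes, but the space is given back. */
    if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                  0, sizeof(buf)) < 0) {
        ret = -errno;
        goto out;
    }
    if (fstat(fd, &st) < 0) {
        goto out;
    }
    r->after_punch = (long long)st.st_blocks;

    /* Allocate the range again: still zeroes, now backed by disk space. */
    if (fallocate(fd, 0, 0, sizeof(buf)) < 0) {
        ret = -errno;
        goto out;
    }
    if (fstat(fd, &st) < 0) {
        goto out;
    }
    r->after_realloc = (long long)st.st_blocks;
    ret = 0;

out:
    close(fd);
    return ret;
}
```

On a filesystem that supports both modes, after_punch would typically be lower than after_write and after_realloc higher than after_punch, which is the state difference the reply objects to when FALLOC_FL_PUNCH_HOLE is used alone.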
diff --git a/block/raw-posix.c b/block/raw-posix.c
index 7866d31..96a8678 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -968,6 +968,23 @@ static ssize_t handle_aiocb_write_zeroes(RawPosixAIOData *aiocb)
 #endif
 
     s->has_write_zeroes = false;
+
+#ifdef CONFIG_FALLOCATE_PUNCH_HOLE
+    if (s->has_discard) {
+        int ret;
+        ret = do_fallocate(s->fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
+                           aiocb->aio_offset, aiocb->aio_nbytes);
+        if (ret < 0) {
+            if (ret == -ENOTSUP) {
+                s->has_discard = false;
+            }
+            return ret;
+        }
+        return do_fallocate(s->fd, 0, aiocb->aio_offset, aiocb->aio_nbytes);
+    }
+#endif
+
+    s->has_discard = false;
     return -ENOTSUP;
 }
This sequence works efficiently if FALLOC_FL_ZERO_RANGE is not supported.

A plain fallocate(0) fills holes in the middle of the file with zeroes and
extends the file with zeroes past its end. Unfortunately, fallocate(0) does
not drop the content of the file if there is already data at that offset.
Therefore, to make the result consistent, we should drop the data
beforehand. This is done using FALLOC_FL_PUNCH_HOLE.

This should increase the performance a bit for not-so-modern kernels or for
filesystems which do not support FALLOC_FL_ZERO_RANGE.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Peter Lieven <pl@kamp.de>
---
 block/raw-posix.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)
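For readers outside the QEMU tree, the two-step sequence from the commit message can be sketched as a standalone program fragment. This is an illustration under assumptions, not the patch itself: it calls fallocate(2) directly instead of QEMU's do_fallocate wrapper, and zero_range_keep_allocated() and demo_zero_range() are hypothetical names.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/falloc.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not QEMU code): zero [offset, offset + len) while
 * leaving the range allocated. Returns 0 or a negative errno.
 */
int zero_range_keep_allocated(int fd, off_t offset, off_t len)
{
    assert(offset >= 0 && len > 0);

    /* Step 1: drop any existing data; the range becomes a hole. */
    if (fallocate(fd, FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE,
                  offset, len) < 0) {
        return -errno;
    }
    /* Step 2: allocate the range again; it still reads back as zeroes. */
    if (fallocate(fd, 0, offset, len) < 0) {
        return -errno;
    }
    return 0;
}

/*
 * Hypothetical self-check: returns 1 if the middle 4 KiB of a 12 KiB file
 * was zeroed and its neighbours preserved, 0 if the filesystem lacks
 * support for the calls, -1 on any other failure.
 */
int demo_zero_range(void)
{
    char path[] = "/tmp/zero-range-XXXXXX"; /* assumed scratch location */
    char buf[12288];
    int fd, ret, result = -1;
    size_t i;

    fd = mkstemp(path);
    if (fd < 0) {
        return -1;
    }
    unlink(path);

    memset(buf, 0xAB, sizeof(buf));
    if (write(fd, buf, sizeof(buf)) != (ssize_t)sizeof(buf)) {
        goto out;
    }

    ret = zero_range_keep_allocated(fd, 4096, 4096);
    if (ret == -EOPNOTSUPP || ret == -ENOSYS) {
        result = 0; /* not supported here; nothing to verify */
        goto out;
    }
    if (ret < 0) {
        goto out;
    }

    if (pread(fd, buf, sizeof(buf), 0) != (ssize_t)sizeof(buf)) {
        goto out;
    }
    for (i = 0; i < sizeof(buf); i++) {
        unsigned char expected = (i >= 4096 && i < 8192) ? 0x00 : 0xAB;
        if ((unsigned char)buf[i] != expected) {
            goto out;
        }
    }
    result = 1;

out:
    close(fd);
    return result;
}
```

The helper reports an unsupported operation to the caller instead of falling back itself, so a caller can remember the failure and choose another method, in the way the patch clears has_discard on -ENOTSUP.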