Message ID | 1422607337-25335-6-git-send-email-den@openvz.org |
---|---|
State | New |
On 2015-01-30 at 03:42, Denis V. Lunev wrote:
> There is a possibility that we are extending our image and thus writing
> zeroes beyond the end of the file. In this case we do not need to care
> about the hole to make sure that there is no data in the file under
> this offset (pre-condition to fallocate(0) to work). We could simply call
> fallocate(0).
>
> This improves the performance of writing zeroes even on really old
> platforms which do not have even FALLOC_FL_PUNCH_HOLE.
>
> Before the patch do_fallocate was used when either
> CONFIG_FALLOCATE_PUNCH_HOLE or CONFIG_FALLOCATE_ZERO_RANGE are defined.
> Now the story is different. CONFIG_FALLOCATE is defined when Linux
> fallocate is defined, posix_fallocate is a completely different story
> (CONFIG_POSIX_FALLOCATE). CONFIG_FALLOCATE is a mandatory prerequisite
> for both CONFIG_FALLOCATE_PUNCH_HOLE and CONFIG_FALLOCATE_ZERO_RANGE
> thus we are on the safe side.
>
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Max Reitz <mreitz@redhat.com>
> CC: Kevin Wolf <kwolf@redhat.com>
> CC: Stefan Hajnoczi <stefanha@redhat.com>
> CC: Peter Lieven <pl@kamp.de>
> CC: Fam Zheng <famz@redhat.com>
> ---
>  block/raw-posix.c | 14 +++++++++++++-
>  1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/block/raw-posix.c b/block/raw-posix.c
> index 5a777e7..1c88ad8 100644
> --- a/block/raw-posix.c
> +++ b/block/raw-posix.c
> @@ -147,6 +147,7 @@ typedef struct BDRVRawState {
>      bool has_discard:1;
>      bool has_write_zeroes:1;
>      bool discard_zeroes:1;
> +    bool has_fallocate;
>      bool needs_alignment;
>  } BDRVRawState;
>
> @@ -452,6 +453,7 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
>      }
>      if (S_ISREG(st.st_mode)) {
>          s->discard_zeroes = true;
> +        s->has_fallocate = true;

This could be moved upwards where has_discard and has_write_zeroes are
initialized; but it won't matter in practice, I hope. Thus:

Reviewed-by: Max Reitz <mreitz@redhat.com>

>      }
>      if (S_ISBLK(st.st_mode)) {
>  #ifdef BLKDISCARDZEROES
> @@ -902,7 +904,7 @@ static int translate_err(int err)
>      return err;
>  }
>
> -#if defined(CONFIG_FALLOCATE_PUNCH_HOLE) || defined(CONFIG_FALLOCATE_ZERO_RANGE)
> +#ifdef CONFIG_FALLOCATE
>  static int do_fallocate(int fd, int mode, off_t offset, off_t len)
>  {
>      do {
> @@ -965,6 +967,16 @@ static ssize_t handle_aiocb_write_zeroes(RawPosixAIOData *aiocb)
>      }
>  #endif
>
> +#ifdef CONFIG_FALLOCATE
> +    if (s->has_fallocate && aiocb->aio_offset >= bdrv_getlength(aiocb->bs)) {
> +        int ret = do_fallocate(s->fd, 0, aiocb->aio_offset, aiocb->aio_nbytes);
> +        if (ret == 0 || ret != -ENOTSUP) {
> +            return ret;
> +        }
> +        s->has_fallocate = false;
> +    }
> +#endif
> +
>      return -ENOTSUP;
>  }
>

On 30/01/15 17:58, Max Reitz wrote:
> On 2015-01-30 at 03:42, Denis V. Lunev wrote:
>> There is a possibility that we are extending our image and thus writing
>> zeroes beyond the end of the file. In this case we do not need to care
>> about the hole to make sure that there is no data in the file under
>> this offset (pre-condition to fallocate(0) to work). We could simply call
>> fallocate(0).
>>
>> This improves the performance of writing zeroes even on really old
>> platforms which do not have even FALLOC_FL_PUNCH_HOLE.
>>
>> Before the patch do_fallocate was used when either
>> CONFIG_FALLOCATE_PUNCH_HOLE or CONFIG_FALLOCATE_ZERO_RANGE are defined.
>> Now the story is different. CONFIG_FALLOCATE is defined when Linux
>> fallocate is defined, posix_fallocate is a completely different story
>> (CONFIG_POSIX_FALLOCATE). CONFIG_FALLOCATE is a mandatory prerequisite
>> for both CONFIG_FALLOCATE_PUNCH_HOLE and CONFIG_FALLOCATE_ZERO_RANGE
>> thus we are on the safe side.
>>
>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>> CC: Max Reitz <mreitz@redhat.com>
>> CC: Kevin Wolf <kwolf@redhat.com>
>> CC: Stefan Hajnoczi <stefanha@redhat.com>
>> CC: Peter Lieven <pl@kamp.de>
>> CC: Fam Zheng <famz@redhat.com>
>> ---
>>  block/raw-posix.c | 14 +++++++++++++-
>>  1 file changed, 13 insertions(+), 1 deletion(-)
>>
>> diff --git a/block/raw-posix.c b/block/raw-posix.c
>> index 5a777e7..1c88ad8 100644
>> --- a/block/raw-posix.c
>> +++ b/block/raw-posix.c
>> @@ -147,6 +147,7 @@ typedef struct BDRVRawState {
>>      bool has_discard:1;
>>      bool has_write_zeroes:1;
>>      bool discard_zeroes:1;
>> +    bool has_fallocate;
>>      bool needs_alignment;
>>  } BDRVRawState;
>>
>> @@ -452,6 +453,7 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
>>      }
>>      if (S_ISREG(st.st_mode)) {
>>          s->discard_zeroes = true;
>> +        s->has_fallocate = true;
>
> This could be moved upwards where has_discard and has_write_zeroes are
> initialized; but it won't matter in practice, I hope. Thus:
>
> Reviewed-by: Max Reitz <mreitz@redhat.com>

This does matter as has_discard and has_write_zeroes are bit fields
thus I can not insert something useful into the middle of those fields.

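One possible reading of the bit-field remark, shown as a minimal sketch. The struct names `flags_packed` and `flags_split` and the two helper functions are hypothetical; only the field names come from the patch. Bit-field layout is implementation-defined, so the actual sizes depend on the compiler and ABI, and no particular numbers are claimed here.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the patched layout: the three 1-bit flags stay
 * adjacent, and the new plain bool follows them. */
struct flags_packed {
    bool has_discard:1;
    bool has_write_zeroes:1;
    bool discard_zeroes:1;
    bool has_fallocate;
    bool needs_alignment;
};

/* Hypothetical alternative: a plain bool placed inside the run of
 * bit-fields, so the three flags are no longer adjacent. */
struct flags_split {
    bool has_discard:1;
    bool has_fallocate;
    bool has_write_zeroes:1;
    bool discard_zeroes:1;
    bool needs_alignment;
};

/* Report both sizes so the layouts can be compared on a given compiler. */
size_t packed_size(void) { return sizeof(struct flags_packed); }
size_t split_size(void)  { return sizeof(struct flags_split); }
```

On typical compilers the split layout is expected to be no smaller than the packed one, which would be a reason to keep the plain `bool` after the bit-fields rather than between them.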
On 2015-01-30 at 10:41, Denis V. Lunev wrote:
> On 30/01/15 17:58, Max Reitz wrote:
>> On 2015-01-30 at 03:42, Denis V. Lunev wrote:
>>> There is a possibility that we are extending our image and thus writing
>>> zeroes beyond the end of the file. In this case we do not need to care
>>> about the hole to make sure that there is no data in the file under
>>> this offset (pre-condition to fallocate(0) to work). We could simply call
>>> fallocate(0).
>>>
>>> This improves the performance of writing zeroes even on really old
>>> platforms which do not have even FALLOC_FL_PUNCH_HOLE.
>>>
>>> Before the patch do_fallocate was used when either
>>> CONFIG_FALLOCATE_PUNCH_HOLE or CONFIG_FALLOCATE_ZERO_RANGE are defined.
>>> Now the story is different. CONFIG_FALLOCATE is defined when Linux
>>> fallocate is defined, posix_fallocate is a completely different story
>>> (CONFIG_POSIX_FALLOCATE). CONFIG_FALLOCATE is a mandatory prerequisite
>>> for both CONFIG_FALLOCATE_PUNCH_HOLE and CONFIG_FALLOCATE_ZERO_RANGE
>>> thus we are on the safe side.
>>>
>>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>>> CC: Max Reitz <mreitz@redhat.com>
>>> CC: Kevin Wolf <kwolf@redhat.com>
>>> CC: Stefan Hajnoczi <stefanha@redhat.com>
>>> CC: Peter Lieven <pl@kamp.de>
>>> CC: Fam Zheng <famz@redhat.com>
>>> ---
>>>  block/raw-posix.c | 14 +++++++++++++-
>>>  1 file changed, 13 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/block/raw-posix.c b/block/raw-posix.c
>>> index 5a777e7..1c88ad8 100644
>>> --- a/block/raw-posix.c
>>> +++ b/block/raw-posix.c
>>> @@ -147,6 +147,7 @@ typedef struct BDRVRawState {
>>>      bool has_discard:1;
>>>      bool has_write_zeroes:1;
>>>      bool discard_zeroes:1;
>>> +    bool has_fallocate;
>>>      bool needs_alignment;
>>>  } BDRVRawState;
>>>
>>> @@ -452,6 +453,7 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
>>>      }
>>>      if (S_ISREG(st.st_mode)) {
>>>          s->discard_zeroes = true;
>>> +        s->has_fallocate = true;
>>
>> This could be moved upwards where has_discard and has_write_zeroes
>> are initialized; but it won't matter in practice, I hope. Thus:
>>
>> Reviewed-by: Max Reitz <mreitz@redhat.com>
>
> This does matter as has_discard and has_write_zeroes are bit fields
> thus I can not insert something useful into the middle of those
> fields.

Right, but I did not mean the placement inside of the structure but the
placement of the initialization statement (s->has_fallocate = true) in
raw_open_common().

Max

On 30/01/15 18:42, Max Reitz wrote:
> On 2015-01-30 at 10:41, Denis V. Lunev wrote:
>> On 30/01/15 17:58, Max Reitz wrote:
>>> On 2015-01-30 at 03:42, Denis V. Lunev wrote:
>>>> There is a possibility that we are extending our image and thus writing
>>>> zeroes beyond the end of the file. In this case we do not need to care
>>>> about the hole to make sure that there is no data in the file under
>>>> this offset (pre-condition to fallocate(0) to work). We could simply call
>>>> fallocate(0).
>>>>
>>>> This improves the performance of writing zeroes even on really old
>>>> platforms which do not have even FALLOC_FL_PUNCH_HOLE.
>>>>
>>>> Before the patch do_fallocate was used when either
>>>> CONFIG_FALLOCATE_PUNCH_HOLE or CONFIG_FALLOCATE_ZERO_RANGE are defined.
>>>> Now the story is different. CONFIG_FALLOCATE is defined when Linux
>>>> fallocate is defined, posix_fallocate is a completely different story
>>>> (CONFIG_POSIX_FALLOCATE). CONFIG_FALLOCATE is a mandatory prerequisite
>>>> for both CONFIG_FALLOCATE_PUNCH_HOLE and CONFIG_FALLOCATE_ZERO_RANGE
>>>> thus we are on the safe side.
>>>>
>>>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>>>> CC: Max Reitz <mreitz@redhat.com>
>>>> CC: Kevin Wolf <kwolf@redhat.com>
>>>> CC: Stefan Hajnoczi <stefanha@redhat.com>
>>>> CC: Peter Lieven <pl@kamp.de>
>>>> CC: Fam Zheng <famz@redhat.com>
>>>> ---
>>>>  block/raw-posix.c | 14 +++++++++++++-
>>>>  1 file changed, 13 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/block/raw-posix.c b/block/raw-posix.c
>>>> index 5a777e7..1c88ad8 100644
>>>> --- a/block/raw-posix.c
>>>> +++ b/block/raw-posix.c
>>>> @@ -147,6 +147,7 @@ typedef struct BDRVRawState {
>>>>      bool has_discard:1;
>>>>      bool has_write_zeroes:1;
>>>>      bool discard_zeroes:1;
>>>> +    bool has_fallocate;
>>>>      bool needs_alignment;
>>>>  } BDRVRawState;
>>>>
>>>> @@ -452,6 +453,7 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
>>>>      }
>>>>      if (S_ISREG(st.st_mode)) {
>>>>          s->discard_zeroes = true;
>>>> +        s->has_fallocate = true;
>>>
>>> This could be moved upwards where has_discard and has_write_zeroes
>>> are initialized; but it won't matter in practice, I hope. Thus:
>>>
>>> Reviewed-by: Max Reitz <mreitz@redhat.com>
>>
>> This does matter as has_discard and has_write_zeroes are bit fields
>> thus I can not insert something useful into the middle of those
>> fields.
>
> Right, but I did not mean the placement inside of the structure but
> the placement of the initialization statement (s->has_fallocate =
> true) in raw_open_common().
>
> Max

Hmm, you are right. This is possible, but I don't want to have this bit
set for block/character etc. devices, even if they are not using this
bit/code. With my approach the assignment is placed so that it shows
where the flag applies.

Thank you for the review :) It is somewhat difficult to obtain feedback
here in comparison with the Linux kernel.

diff --git a/block/raw-posix.c b/block/raw-posix.c
index 5a777e7..1c88ad8 100644
--- a/block/raw-posix.c
+++ b/block/raw-posix.c
@@ -147,6 +147,7 @@ typedef struct BDRVRawState {
     bool has_discard:1;
     bool has_write_zeroes:1;
     bool discard_zeroes:1;
+    bool has_fallocate;
     bool needs_alignment;
 } BDRVRawState;
 
@@ -452,6 +453,7 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
     }
     if (S_ISREG(st.st_mode)) {
         s->discard_zeroes = true;
+        s->has_fallocate = true;
     }
     if (S_ISBLK(st.st_mode)) {
 #ifdef BLKDISCARDZEROES
@@ -902,7 +904,7 @@ static int translate_err(int err)
     return err;
 }
 
-#if defined(CONFIG_FALLOCATE_PUNCH_HOLE) || defined(CONFIG_FALLOCATE_ZERO_RANGE)
+#ifdef CONFIG_FALLOCATE
 static int do_fallocate(int fd, int mode, off_t offset, off_t len)
 {
     do {
@@ -965,6 +967,16 @@ static ssize_t handle_aiocb_write_zeroes(RawPosixAIOData *aiocb)
     }
 #endif
 
+#ifdef CONFIG_FALLOCATE
+    if (s->has_fallocate && aiocb->aio_offset >= bdrv_getlength(aiocb->bs)) {
+        int ret = do_fallocate(s->fd, 0, aiocb->aio_offset, aiocb->aio_nbytes);
+        if (ret == 0 || ret != -ENOTSUP) {
+            return ret;
+        }
+        s->has_fallocate = false;
+    }
+#endif
+
     return -ENOTSUP;
 }

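The guard change in the third hunk relies on CONFIG_FALLOCATE being defined whenever either of the two narrower macros is. A minimal sketch of that relationship follows. The CONFIG_* macros are defined by hand here as an assumption (in QEMU they come from the build configuration), and the `#error` check and the two helper functions are illustrative additions, not part of the patch.

```c
/* Assumption: defined by hand here; a real build gets these from configure. */
#define CONFIG_FALLOCATE 1
#define CONFIG_FALLOCATE_PUNCH_HOLE 1

/* Fail the build if a narrower feature is enabled without the base one. */
#if (defined(CONFIG_FALLOCATE_PUNCH_HOLE) || \
     defined(CONFIG_FALLOCATE_ZERO_RANGE)) && !defined(CONFIG_FALLOCATE)
#error "CONFIG_FALLOCATE must be defined when PUNCH_HOLE or ZERO_RANGE is"
#endif

/* Old guard: the helper exists only when a narrower feature is present. */
#if defined(CONFIG_FALLOCATE_PUNCH_HOLE) || defined(CONFIG_FALLOCATE_ZERO_RANGE)
#define OLD_GUARD_ENABLED 1
#else
#define OLD_GUARD_ENABLED 0
#endif

/* New guard: the helper exists whenever Linux fallocate() is available. */
#ifdef CONFIG_FALLOCATE
#define NEW_GUARD_ENABLED 1
#else
#define NEW_GUARD_ENABLED 0
#endif

int old_guard_enabled(void) { return OLD_GUARD_ENABLED; }
int new_guard_enabled(void) { return NEW_GUARD_ENABLED; }
```

As long as the prerequisite holds, every configuration that enabled the old guard also enables the new one, so widening the `#if` should not remove `do_fallocate()` from any build that had it before.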
There is a possibility that we are extending our image and thus writing
zeroes beyond the end of the file. In this case we do not need to care
about the hole to make sure that there is no data in the file under
this offset (pre-condition to fallocate(0) to work). We could simply call
fallocate(0).

This improves the performance of writing zeroes even on really old
platforms which do not have even FALLOC_FL_PUNCH_HOLE.

Before the patch do_fallocate was used when either
CONFIG_FALLOCATE_PUNCH_HOLE or CONFIG_FALLOCATE_ZERO_RANGE are defined.
Now the story is different. CONFIG_FALLOCATE is defined when Linux
fallocate is defined, posix_fallocate is a completely different story
(CONFIG_POSIX_FALLOCATE). CONFIG_FALLOCATE is a mandatory prerequisite
for both CONFIG_FALLOCATE_PUNCH_HOLE and CONFIG_FALLOCATE_ZERO_RANGE
thus we are on the safe side.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Max Reitz <mreitz@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Peter Lieven <pl@kamp.de>
CC: Fam Zheng <famz@redhat.com>
---
 block/raw-posix.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)
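The control flow the commit message describes (try mode-0 allocation only when the request starts at or past the end of the file, and stop trying once it is reported as unsupported) can be sketched in isolation. Everything named here is hypothetical: `struct zero_state` is a trimmed-down stand-in for `BDRVRawState`, `file_length` stands in for the `bdrv_getlength()` call, and the `alloc_fn` callback stands in for `do_fallocate()`. The two stub allocators exist only so the sketch runs without touching a real file.

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical stand-in for do_fallocate(): returns 0 or a negative errno. */
typedef int (*alloc_fn)(int fd, int mode, off_t offset, off_t len);

/* Hypothetical, trimmed-down counterpart of BDRVRawState. */
struct zero_state {
    int fd;
    bool has_fallocate;  /* cleared after the first -ENOTSUP */
    off_t file_length;   /* stand-in for bdrv_getlength() */
};

/* Zero a range that starts at or past EOF by allocating it with mode 0.
 * Returns -ENOTSUP when this path does not apply or is unsupported, so the
 * caller can fall back to another method. */
int write_zeroes_past_eof(struct zero_state *s, alloc_fn do_alloc,
                          off_t offset, off_t len)
{
    if (s->has_fallocate && offset >= s->file_length) {
        int ret = do_alloc(s->fd, 0, offset, len);
        if (ret != -ENOTSUP) {
            return ret;  /* success, or a real error to report */
        }
        s->has_fallocate = false;  /* do not retry on later requests */
    }
    return -ENOTSUP;
}

/* Illustrative stubs: one always succeeds, one reports "not supported". */
int stub_alloc_ok(int fd, int mode, off_t offset, off_t len)
{
    (void)fd; (void)mode; (void)offset; (void)len;
    return 0;
}

int stub_alloc_unsupported(int fd, int mode, off_t offset, off_t len)
{
    (void)fd; (void)mode; (void)offset; (void)len;
    return -ENOTSUP;
}
```

The condition `ret == 0 || ret != -ENOTSUP` in the patch is logically the same as the `ret != -ENOTSUP` used here. With a real file, the callback would presumably wrap Linux `fallocate(fd, 0, offset, len)` and convert failures to negative errno values.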