Message ID: 1466435958-308-1-git-send-email-den@openvz.org
State: New
On 06/20/2016 09:19 AM, Denis V. Lunev wrote:
> From: Evgeny Yakovlev <eyakovlev@virtuozzo.com>
>
> Some guests (win2008 server for example) do a lot of unnecessary
> flushing when underlying media has not changed. This adds additional
> overhead on host when calling fsync/fdatasync.
>
> This change introduces a dirty flag in BlockDriverState which is set
> in bdrv_set_dirty and is checked in bdrv_co_flush. This allows us to
> avoid unnessesary flushing when storage is clean.

s/unnessesary/unnecessary/

> The problem with excessive flushing was found by a performance test
> which does parallel directory tree creation (from 2 processes).
> Results improved from 0.424 loops/sec to 0.432 loops/sec.
> Each loop creates 10^3 directories with 10 files in each.
>
> Signed-off-by: Evgeny Yakovlev <eyakovlev@virtuozzo.com>
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Kevin Wolf <kwolf@redhat.com>
> CC: Max Reitz <mreitz@redhat.com>
> CC: Stefan Hajnoczi <stefanha@redhat.com>
> CC: Fam Zheng <famz@redhat.com>
> ---
>  block.c                   | 1 +
>  block/dirty-bitmap.c      | 3 +++
>  block/io.c                | 6 ++++++
>  include/block/block_int.h | 1 +
>  4 files changed, 11 insertions(+)

Otherwise seems reasonable, but I'll let others with more experience on flush semantics chime in.
On 20/06/2016 17:19, Denis V. Lunev wrote:
> +    /* Check if storage is actually dirty before flushing to disk */
> +    if (!bs->dirty) {
> +        goto flush_parent;
> +    }
> +    bs->dirty = false;
> +

This should be cleared after the flush is complete. If you have

    write begin
    write end
    flush #1 begin
    flush #2 begin

Then the second flush must only return after the first has finished.

Paolo
On 06/21/2016 10:32 AM, Paolo Bonzini wrote:
>
> On 20/06/2016 17:19, Denis V. Lunev wrote:
>> +    /* Check if storage is actually dirty before flushing to disk */
>> +    if (!bs->dirty) {
>> +        goto flush_parent;
>> +    }
>> +    bs->dirty = false;
>> +
> This should be cleared after the flush is complete. If you have
>
>     write begin
>     write end
>     flush #1 begin
>     flush #2 begin
>
> Then the second flush must only return after the first has finished.
>
> Paolo

Really interesting point, I had missed it. Though this case is exactly the one we want to optimize: the 2nd flush is unnecessary and should not be sent, BUT you are perfectly correct that it must still return to the guest only after the first one has finished. Have to rework. Nice catch!

Den
Am 21.06.2016 um 09:32 hat Paolo Bonzini geschrieben:
>
> On 20/06/2016 17:19, Denis V. Lunev wrote:
> > +    /* Check if storage is actually dirty before flushing to disk */
> > +    if (!bs->dirty) {
> > +        goto flush_parent;
> > +    }
> > +    bs->dirty = false;
> > +
>
> This should be cleared after the flush is complete. If you have
>
>     write begin
>     write end
>     flush #1 begin
>     flush #2 begin
>
> Then the second flush must only return after the first has finished.

I think clearing bs->dirty after the flush completion wouldn't necessarily be right either if there are concurrent writes in flight, as only completed writes are guaranteed to be flushed by it.

Kevin
On 06/21/2016 10:45 AM, Kevin Wolf wrote:
> Am 21.06.2016 um 09:32 hat Paolo Bonzini geschrieben:
>>
>> On 20/06/2016 17:19, Denis V. Lunev wrote:
>>> +    /* Check if storage is actually dirty before flushing to disk */
>>> +    if (!bs->dirty) {
>>> +        goto flush_parent;
>>> +    }
>>> +    bs->dirty = false;
>>> +
>> This should be cleared after the flush is complete. If you have
>>
>>     write begin
>>     write end
>>     flush #1 begin
>>     flush #2 begin
>>
>> Then the second flush must only return after the first has finished.
> I think clearing bs->dirty after the flush completion wouldn't
> necessarily be right either if there are concurrent writes in flight, as
> only completed writes are guaranteed to be flushed by it.
>
> Kevin

This is not a problem if flush #2 returns only after flush #1: that would mean that all writes prior to both flushes have landed on the disk. Keeping this in mind, dirty should be cleared before the flush operation starts.

Den
Am 21.06.2016 um 10:17 hat Denis V. Lunev geschrieben:
> On 06/21/2016 10:45 AM, Kevin Wolf wrote:
> > Am 21.06.2016 um 09:32 hat Paolo Bonzini geschrieben:
> > >
> > > On 20/06/2016 17:19, Denis V. Lunev wrote:
> > > > +    /* Check if storage is actually dirty before flushing to disk */
> > > > +    if (!bs->dirty) {
> > > > +        goto flush_parent;
> > > > +    }
> > > > +    bs->dirty = false;
> > > > +
> > > This should be cleared after the flush is complete. If you have
> > >
> > >     write begin
> > >     write end
> > >     flush #1 begin
> > >     flush #2 begin
> > >
> > > Then the second flush must only return after the first has finished.
> > I think clearing bs->dirty after the flush completion wouldn't
> > necessarily be right either if there are concurrent writes in flight, as
> > only completed writes are guaranteed to be flushed by it.
> >
> > Kevin
> this is not a problem if flush 2 will return after flush 1.
> This will mean that all writes prior to both flushes
> will land to the disk.

We need to be careful in cases like this:

    start + complete write #1
    start flush #1
    start + complete write #2
    start flush #2
    complete flush #1
    complete flush #2

Here letting flush #2 wait for flush #1 isn't enough, because that one only guarantees to flush write #1, but not write #2. However, flush #2 is required to flush write #2, too.

> Keeping this in mind dirty should be cleared before
> flush operation start.

Yes, if you do it like this, so that flush #2 ends up being an actual flush instead of just waiting for flush #1, we're good.

Kevin
diff --git a/block.c b/block.c
index b331eb9..1a4e154 100644
--- a/block.c
+++ b/block.c
@@ -2591,6 +2591,7 @@ int bdrv_truncate(BlockDriverState *bs, int64_t offset)
         bdrv_dirty_bitmap_truncate(bs);
         bdrv_parent_cb_resize(bs);
     }
+    bs->dirty = true; /* file node sync is needed after truncate */
     return ret;
 }

diff --git a/block/dirty-bitmap.c b/block/dirty-bitmap.c
index 4902ca5..ad208ad 100644
--- a/block/dirty-bitmap.c
+++ b/block/dirty-bitmap.c
@@ -370,6 +370,9 @@ void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector,
         }
         hbitmap_set(bitmap->bitmap, cur_sector, nr_sectors);
     }
+
+    /* Set global block driver dirty flag */
+    bs->dirty = true;
 }

 /**
diff --git a/block/io.c b/block/io.c
index ebdb9d8..ef372d2 100644
--- a/block/io.c
+++ b/block/io.c
@@ -2234,6 +2234,12 @@ int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
         }
     }

+    /* Check if storage is actually dirty before flushing to disk */
+    if (!bs->dirty) {
+        goto flush_parent;
+    }
+    bs->dirty = false;
+
     /* But don't actually force it to the disk with cache=unsafe */
     if (bs->open_flags & BDRV_O_NO_FLUSH) {
         goto flush_parent;
diff --git a/include/block/block_int.h b/include/block/block_int.h
index 688c6be..a62943b 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -417,6 +417,7 @@ struct BlockDriverState {
     int sg;     /* if true, the device is a /dev/sg* */
     int copy_on_read; /* if true, copy read backing sectors into image
                          note this is a reference count */
+    bool dirty;
     bool probed;
     BlockDriver *drv; /* NULL means no media */