Message ID | 20240801123516.4498-13-farosas@suse.de
---|---
State | New
Series | migration/multifd: Remove multifd_send_state->pages
On Thu, Aug 01, 2024 at 09:35:14AM -0300, Fabiano Rosas wrote:
> Separate the multifd sync from flushing the client data to the
> channels. These two operations are closely related but not strictly
> necessary to be executed together.
>
> The multifd sync is intrinsic to how multifd works. The multiple
> channels operate independently and may finish IO out of order in
> relation to each other. This applies also between the source and
> destination QEMU.
>
> Flushing the data that is left in the client-owned data structures
> (e.g. MultiFDPages_t) prior to sync is usually the right thing to do,
> but that is particular to how the ram migration is implemented with
> several passes over dirty data.
>
> Make these two routines separate, allowing future code to call the
> sync by itself if needed. This also allows the usage of
> multifd_ram_send to be isolated to ram code.

What's the usage of sync but not flush here?
On Thu, Aug 22, 2024 at 12:03:47PM -0400, Peter Xu wrote:
> On Thu, Aug 01, 2024 at 09:35:14AM -0300, Fabiano Rosas wrote:
> > Separate the multifd sync from flushing the client data to the
> > channels. These two operations are closely related but not strictly
> > necessary to be executed together.
> >
> > [...]
> >
> > Make these two routines separate, allowing future code to call the
> > sync by itself if needed. This also allows the usage of
> > multifd_ram_send to be isolated to ram code.
>
> What's the usage of sync but not flush here?

Oh I think I see your point.. I think flush+sync is always needed, it's
just that RAM may not always be the one to flush, but something else.
Makes sense then.

If you want, you may touch up the commit message to clarify that. E.g. I
still don't see any use case that we want to sync without a flush, that
part might be a bit ambiguous.

If my understanding is correct, take this:

Reviewed-by: Peter Xu <peterx@redhat.com>
Peter Xu <peterx@redhat.com> writes:

> On Thu, Aug 22, 2024 at 12:03:47PM -0400, Peter Xu wrote:
>> On Thu, Aug 01, 2024 at 09:35:14AM -0300, Fabiano Rosas wrote:
>> > [...]
>> >
>> > Make these two routines separate, allowing future code to call the
>> > sync by itself if needed. This also allows the usage of
>> > multifd_ram_send to be isolated to ram code.
>>
>> What's the usage of sync but not flush here?
>
> Oh I think I see your point.. I think flush+sync is always needed, it's
> just that RAM may not always be the one to flush, but something else.
> Makes sense then.

I'm thinking of "flush" here as a last multifd_send() before sync. We
need multiple multifd_send() along the migration to send the data, but
we might not need this extra flush. It could be that there's nothing to
flush and the code guarantees it:

  <populate MultiFDSendData>
  multifd_send()
  sync

Where RAM currently does:

  multifd_queue_page()
  multifd_queue_page()
  multifd_queue_page()
  ...
  multifd_queue_page()
  multifd_send()
  sync

Today there is a multifd_send() inside multifd_queue_page() and the
amount sent depends on the ram.c code. At the time sync gets called,
there could be data queued but not yet sent. Another client (not ram)
could just produce data in a deterministic manner and match that with
calls to multifd_send().

> If you want, you may touch up the commit message to clarify that. E.g. I
> still don't see any use case that we want to sync without a flush, that
> part might be a bit ambiguous.
>
> If my understanding is correct, take this:
>
> Reviewed-by: Peter Xu <peterx@redhat.com>
On Thu, Aug 22, 2024 at 02:05:30PM -0300, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
>
> > [...]
>
> I'm thinking of "flush" here as a last multifd_send() before sync. We
> need multiple multifd_send() along the migration to send the data, but
> we might not need this extra flush. It could be that there's nothing to
> flush and the code guarantees it:
>
>   <populate MultiFDSendData>
>   multifd_send()
>   sync
>
> Where RAM currently does:
>
>   multifd_queue_page()
>   multifd_queue_page()
>   multifd_queue_page()
>   ...
>   multifd_queue_page()
>   multifd_send()
>   sync
>
> Today there is a multifd_send() inside multifd_queue_page() and the
> amount sent depends on the ram.c code. At the time sync gets called,
> there could be data queued but not yet sent. Another client (not ram)
> could just produce data in a deterministic manner and match that with
> calls to multifd_send().

I hope I read it alright.. I suppose you meant we have chance to do:

  ram_send()
  vfio_send()
  flush()

Instead of:

  ram_send()
  flush()
  vfio_send()
  flush()

Am I right?
Peter Xu <peterx@redhat.com> writes:

> On Thu, Aug 22, 2024 at 02:05:30PM -0300, Fabiano Rosas wrote:
>
> > [...]
>
> I hope I read it alright.. I suppose you meant we have chance to do:
>
>   ram_send()
>   vfio_send()
>   flush()
>
> Instead of:
>
>   ram_send()
>   flush()
>   vfio_send()
>   flush()
>
> Am I right?

Not really. I'm saying that RAM doesn't always send the data, that's why
it needs a final flush before sync:

multifd_queue_page()
    if (multifd_queue_empty(pages)) {
        multifd_enqueue(pages, offset);
    }

    if (multifd_queue_full(pages)) {
        multifd_send_pages()  <-- this might not happen
    }
    multifd_enqueue()

multifd_send_sync_main()
    if (pages->num) {         <-- data left unsent
        multifd_send()        <-- flush
    }
    <sync routine>
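To make the leftover-pages problem concrete, here is a compilable toy model of the two call shapes discussed in this thread. Every name in it (toy_*, QUEUE_CAPACITY, queued) is an invented stand-in, not the real QEMU migration API:

/* Toy model of flush-before-sync. All names are invented stand-ins,
 * not the real QEMU migration API. */
#include <stdio.h>
#include <stddef.h>

#define QUEUE_CAPACITY 4

static size_t queued;       /* pages enqueued but not yet sent */

static void toy_send(void)  /* hand the queued pages to a channel */
{
    printf("send %zu page(s)\n", queued);
    queued = 0;
}

/* RAM-style producer: a send fires only when the queue fills up, so
 * leftover pages remain whenever the total is not a multiple of
 * QUEUE_CAPACITY. */
static void toy_queue_page(void)
{
    queued++;
    if (queued == QUEUE_CAPACITY) {
        toy_send();         /* "this might not happen" for a given page */
    }
}

static void toy_flush_and_sync(void)
{
    if (queued) {           /* data left unsent... */
        toy_send();         /* ...flush it before syncing */
    }
    printf("sync channels\n");
}

int main(void)
{
    /* RAM-style: 10 pages against a capacity of 4 leaves 2 unsent. */
    for (int i = 0; i < 10; i++) {
        toy_queue_page();
    }
    toy_flush_and_sync();

    /* Deterministic client: every populate is paired with a send, so
     * the queue is provably empty and sync alone suffices. */
    queued = QUEUE_CAPACITY;    /* <populate MultiFDSendData> */
    toy_send();
    printf("sync channels\n");
    return 0;
}

The loop shows why RAM needs the conditional flush (10 pages with a capacity of 4 leaves 2 queued at sync time), while the paired populate/send shape at the end can go straight to sync.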
On Thu, Aug 22, 2024 at 03:07:55PM -0300, Fabiano Rosas wrote:
> Peter Xu <peterx@redhat.com> writes:
>
> > [...]
>
> Not really. I'm saying that RAM doesn't always send the data, that's why
> it needs a final flush before sync:
>
> multifd_queue_page()
>     if (multifd_queue_empty(pages)) {
>         multifd_enqueue(pages, offset);
>     }
>
>     if (multifd_queue_full(pages)) {
>         multifd_send_pages()  <-- this might not happen
>     }
>     multifd_enqueue()
>
> multifd_send_sync_main()
>     if (pages->num) {         <-- data left unsent
>         multifd_send()        <-- flush
>     }
>     <sync routine>

I see now. At least for now I'd expect VFIO doesn't need to use sync at
all, not before the kernel ABI changes. It's just that when that comes
we may need to move that final sync() out of ram code then, but before a
migration completes, to cover both.

It's ok then, my R-b holds.
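As for the future shape Peter alludes to, where the final flush-and-sync would move out of ram code so that one sync covers every client, a hypothetical completion path might look like the sketch below. All helpers are local stubs or invented names; only multifd_send_sync_main() exists in the tree, and it is stubbed here to keep the sketch self-contained:

/* Hypothetical future completion path. The *_flush_unsent() helpers
 * are invented for illustration; multifd_send_sync_main() is real in
 * QEMU but stubbed here so the sketch compiles on its own. */
#include <stdio.h>

static void ram_flush_unsent(void)
{
    printf("ram: flush leftover MultiFDPages_t\n");
}

static void vfio_flush_unsent(void)
{
    printf("vfio: flush leftover device state\n");
}

static int multifd_send_sync_main(void)   /* stub of the real sync */
{
    printf("sync all channels\n");
    return 0;
}

/* Each client drains its own queue, then one shared sync covers both. */
static int migration_complete_multifd(void)
{
    ram_flush_unsent();
    vfio_flush_unsent();
    return multifd_send_sync_main();
}

int main(void)
{
    return migration_complete_multifd();
}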
diff --git a/migration/multifd.c b/migration/multifd.c
index 1c16879495..c25ab4924c 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -921,11 +921,8 @@ static int multifd_zero_copy_flush(QIOChannel *c)
     return ret;
 }
 
-int multifd_send_sync_main(void)
+int multifd_ram_flush_and_sync(void)
 {
-    int i;
-    bool flush_zero_copy;
-
     if (!migrate_multifd()) {
         return 0;
     }
@@ -937,6 +934,14 @@ int multifd_send_sync_main(void)
         }
     }
 
+    return multifd_send_sync_main();
+}
+
+int multifd_send_sync_main(void)
+{
+    int i;
+    bool flush_zero_copy;
+
     flush_zero_copy = migrate_zero_copy_send();
 
     for (i = 0; i < migrate_multifd_channels(); i++) {
diff --git a/migration/multifd.h b/migration/multifd.h
index e3d0120837..19a43d46c0 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -278,4 +278,5 @@ static inline uint32_t multifd_ram_page_count(void)
 
 void multifd_ram_save_setup(void);
 void multifd_ram_save_cleanup(void);
+int multifd_ram_flush_and_sync(void);
 #endif
diff --git a/migration/ram.c b/migration/ram.c
index 1815b2557b..67ca3d5d51 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -1326,7 +1326,7 @@ static int find_dirty_block(RAMState *rs, PageSearchStatus *pss)
             (!migrate_multifd_flush_after_each_section() ||
              migrate_mapped_ram())) {
             QEMUFile *f = rs->pss[RAM_CHANNEL_PRECOPY].pss_channel;
-            int ret = multifd_send_sync_main();
+            int ret = multifd_ram_flush_and_sync();
             if (ret < 0) {
                 return ret;
             }
@@ -3066,7 +3066,7 @@ static int ram_save_setup(QEMUFile *f, void *opaque, Error **errp)
     }
     bql_unlock();
 
-    ret = multifd_send_sync_main();
+    ret = multifd_ram_flush_and_sync();
     bql_lock();
     if (ret < 0) {
         error_setg(errp, "%s: multifd synchronization failed", __func__);
@@ -3213,7 +3213,7 @@ out:
         && migration_is_setup_or_active()) {
         if (migrate_multifd() && migrate_multifd_flush_after_each_section() &&
             !migrate_mapped_ram()) {
-            ret = multifd_send_sync_main();
+            ret = multifd_ram_flush_and_sync();
             if (ret < 0) {
                 return ret;
             }
@@ -3285,7 +3285,7 @@ static int ram_save_complete(QEMUFile *f, void *opaque)
         }
     }
 
-    ret = multifd_send_sync_main();
+    ret = multifd_ram_flush_and_sync();
     if (ret < 0) {
         return ret;
     }
Separate the multifd sync from flushing the client data to the
channels. These two operations are closely related but not strictly
necessary to be executed together.

The multifd sync is intrinsic to how multifd works. The multiple
channels operate independently and may finish IO out of order in
relation to each other. This applies also between the source and
destination QEMU.

Flushing the data that is left in the client-owned data structures
(e.g. MultiFDPages_t) prior to sync is usually the right thing to do,
but that is particular to how the ram migration is implemented with
several passes over dirty data.

Make these two routines separate, allowing future code to call the
sync by itself if needed. This also allows the usage of
multifd_ram_send to be isolated to ram code.

Signed-off-by: Fabiano Rosas <farosas@suse.de>
---
 migration/multifd.c | 13 +++++++++----
 migration/multifd.h |  1 +
 migration/ram.c     |  8 ++++----
 3 files changed, 14 insertions(+), 8 deletions(-)