Message ID | 20240426142042.14573-6-farosas@suse.de |
---|---|
State | New |
Headers | show |
Series | migration/mapped-ram: Add direct-io support | expand |
On Fri, Apr 26, 2024 at 11:20:38AM -0300, Fabiano Rosas wrote: > When multifd is used along with mapped-ram, we can take benefit of a > filesystem that supports the O_DIRECT flag and perform direct I/O in > the multifd threads. This brings a significant performance improvement > because direct-io writes bypass the page cache which would otherwise > be thrashed by the multifd data which is unlikely to be needed again > in a short period of time. > > To be able to use a multifd channel opened with O_DIRECT, we must > ensure that a certain aligment is used. Filesystems usually require a > block-size alignment for direct I/O. The way to achieve this is by > enabling the mapped-ram feature, which already aligns its I/O properly > (see MAPPED_RAM_FILE_OFFSET_ALIGNMENT at ram.c). > > By setting O_DIRECT on the multifd channels, all writes to the same > file descriptor need to be aligned as well, even the ones that come > from outside multifd, such as the QEMUFile I/O from the main migration > code. This makes it impossible to use the same file descriptor for the > QEMUFile and for the multifd channels. The various flags and metadata > written by the main migration code will always be unaligned by virtue > of their small size. To workaround this issue, we'll require a second > file descriptor to be used exclusively for direct I/O. > > The second file descriptor can be obtained by QEMU by re-opening the > migration file (already possible), or by being provided by the user or > management application (support to be added in future patches). > > Signed-off-by: Fabiano Rosas <farosas@suse.de> > --- > migration/file.c | 22 +++++++++++++++++++--- > migration/migration.c | 23 +++++++++++++++++++++++ > 2 files changed, 42 insertions(+), 3 deletions(-) > > diff --git a/migration/file.c b/migration/file.c > index 8f30999400..b9265b14dd 100644 > --- a/migration/file.c > +++ b/migration/file.c > @@ -83,17 +83,33 @@ void file_cleanup_outgoing_migration(void) > > bool file_send_channel_create(gpointer opaque, Error **errp) > { > - QIOChannelFile *ioc; > + QIOChannelFile *ioc = NULL; > int flags = O_WRONLY; > - bool ret = true; > + bool ret = false; > + > + if (migrate_direct_io()) { > +#ifdef O_DIRECT > + /* > + * Enable O_DIRECT for the secondary channels. These are used > + * for sending ram pages and writes should be guaranteed to be > + * aligned to at least page size. > + */ > + flags |= O_DIRECT; > +#else > + error_setg(errp, "System does not support O_DIRECT"); > + error_append_hint(errp, > + "Try disabling direct-io migration capability\n"); > + goto out; > +#endif Hopefully if we can fail migrate-set-parameters correctly always, we will never trigger this error. I know Linux used some trick like this to even avoid such ifdefs: if (qemu_has_direct_io() && migrate_direct_io()) { // reference O_DIRECT } So as long as qemu_has_direct_io() can return a constant "false" when O_DIRECT not defined, the compiler is smart enough to ignore the O_DIRECT inside the block. Even if it won't work, we can still avoid that error (and rely on the set-parameter failure): #ifdef O_DIRECT if (migrate_direct_io()) { // reference O_DIRECT } #endif Then it should run the same, just to try making ifdefs as light as possible.. > + } > > ioc = qio_channel_file_new_path(outgoing_args.fname, flags, 0, errp); > if (!ioc) { > - ret = false; > goto out; > } > > multifd_channel_connect(opaque, QIO_CHANNEL(ioc)); > + ret = true; > > out: > /* > diff --git a/migration/migration.c b/migration/migration.c > index b5af6b5105..cb923a3f62 100644 > --- a/migration/migration.c > +++ b/migration/migration.c > @@ -155,6 +155,16 @@ static bool migration_needs_seekable_channel(void) > return migrate_mapped_ram(); > } > > +static bool migration_needs_multiple_fds(void) If I suggest to rename this, would you agree? :) I'd try with "migrate_needs_extra_fd()" or "migrate_needs_two_fds()", or... just to avoid "multi" + "fd" used altogether, perhaps. Other than that looks all good. Thanks, > +{ > + /* > + * When doing direct-io, multifd requires two different, > + * non-duplicated file descriptors so we can use one of them for > + * unaligned IO. > + */ > + return migrate_multifd() && migrate_direct_io(); > +} > + > static bool transport_supports_seeking(MigrationAddress *addr) > { > if (addr->transport == MIGRATION_ADDRESS_TYPE_FILE) { > @@ -164,6 +174,12 @@ static bool transport_supports_seeking(MigrationAddress *addr) > return false; > } > > +static bool transport_supports_multiple_fds(MigrationAddress *addr) > +{ > + /* file: works because QEMU can open it multiple times */ > + return addr->transport == MIGRATION_ADDRESS_TYPE_FILE; > +} > + > static bool > migration_channels_and_transport_compatible(MigrationAddress *addr, > Error **errp) > @@ -180,6 +196,13 @@ migration_channels_and_transport_compatible(MigrationAddress *addr, > return false; > } > > + if (migration_needs_multiple_fds() && > + !transport_supports_multiple_fds(addr)) { > + error_setg(errp, > + "Migration requires a transport that allows for multiple fds (e.g. file)"); > + return false; > + } > + > return true; > } > > -- > 2.35.3 >
Peter Xu <peterx@redhat.com> writes: > On Fri, Apr 26, 2024 at 11:20:38AM -0300, Fabiano Rosas wrote: >> When multifd is used along with mapped-ram, we can take benefit of a >> filesystem that supports the O_DIRECT flag and perform direct I/O in >> the multifd threads. This brings a significant performance improvement >> because direct-io writes bypass the page cache which would otherwise >> be thrashed by the multifd data which is unlikely to be needed again >> in a short period of time. >> >> To be able to use a multifd channel opened with O_DIRECT, we must >> ensure that a certain aligment is used. Filesystems usually require a >> block-size alignment for direct I/O. The way to achieve this is by >> enabling the mapped-ram feature, which already aligns its I/O properly >> (see MAPPED_RAM_FILE_OFFSET_ALIGNMENT at ram.c). >> >> By setting O_DIRECT on the multifd channels, all writes to the same >> file descriptor need to be aligned as well, even the ones that come >> from outside multifd, such as the QEMUFile I/O from the main migration >> code. This makes it impossible to use the same file descriptor for the >> QEMUFile and for the multifd channels. The various flags and metadata >> written by the main migration code will always be unaligned by virtue >> of their small size. To workaround this issue, we'll require a second >> file descriptor to be used exclusively for direct I/O. >> >> The second file descriptor can be obtained by QEMU by re-opening the >> migration file (already possible), or by being provided by the user or >> management application (support to be added in future patches). >> >> Signed-off-by: Fabiano Rosas <farosas@suse.de> >> --- >> migration/file.c | 22 +++++++++++++++++++--- >> migration/migration.c | 23 +++++++++++++++++++++++ >> 2 files changed, 42 insertions(+), 3 deletions(-) >> >> diff --git a/migration/file.c b/migration/file.c >> index 8f30999400..b9265b14dd 100644 >> --- a/migration/file.c >> +++ b/migration/file.c >> @@ -83,17 +83,33 @@ void file_cleanup_outgoing_migration(void) >> >> bool file_send_channel_create(gpointer opaque, Error **errp) >> { >> - QIOChannelFile *ioc; >> + QIOChannelFile *ioc = NULL; >> int flags = O_WRONLY; >> - bool ret = true; >> + bool ret = false; >> + >> + if (migrate_direct_io()) { >> +#ifdef O_DIRECT >> + /* >> + * Enable O_DIRECT for the secondary channels. These are used >> + * for sending ram pages and writes should be guaranteed to be >> + * aligned to at least page size. >> + */ >> + flags |= O_DIRECT; >> +#else >> + error_setg(errp, "System does not support O_DIRECT"); >> + error_append_hint(errp, >> + "Try disabling direct-io migration capability\n"); >> + goto out; >> +#endif > > Hopefully if we can fail migrate-set-parameters correctly always, we will > never trigger this error. > > I know Linux used some trick like this to even avoid such ifdefs: > > if (qemu_has_direct_io() && migrate_direct_io()) { > // reference O_DIRECT > } > > So as long as qemu_has_direct_io() can return a constant "false" when > O_DIRECT not defined, the compiler is smart enough to ignore the O_DIRECT > inside the block. > > Even if it won't work, we can still avoid that error (and rely on the > set-parameter failure): > > #ifdef O_DIRECT > if (migrate_direct_io()) { > // reference O_DIRECT > } > #endif > > Then it should run the same, just to try making ifdefs as light as > possible.. Ok. Just FYI, in v2 I'm adding direct-io to migration incoming side as well, so I put this logic into a helper: static bool file_enable_direct_io(int *flags, Error **errp) { if (migrate_direct_io()) { #ifdef O_DIRECT *flags |= O_DIRECT; #else error_setg(errp, "System does not support O_DIRECT"); error_append_hint(errp, "Try disabling direct-io migration capability\n"); return false; #endif } return true; } But I'll apply your suggestions nonetheless. > >> + } >> >> ioc = qio_channel_file_new_path(outgoing_args.fname, flags, 0, errp); >> if (!ioc) { >> - ret = false; >> goto out; >> } >> >> multifd_channel_connect(opaque, QIO_CHANNEL(ioc)); >> + ret = true; >> >> out: >> /* >> diff --git a/migration/migration.c b/migration/migration.c >> index b5af6b5105..cb923a3f62 100644 >> --- a/migration/migration.c >> +++ b/migration/migration.c >> @@ -155,6 +155,16 @@ static bool migration_needs_seekable_channel(void) >> return migrate_mapped_ram(); >> } >> >> +static bool migration_needs_multiple_fds(void) > > If I suggest to rename this, would you agree? :) > Sure, although this is the more accurate usage than "multifd" hehe. > I'd try with "migrate_needs_extra_fd()" or "migrate_needs_two_fds()", > or... just to avoid "multi" + "fd" used altogether, perhaps. > > Other than that looks all good. > > Thanks, > >> +{ >> + /* >> + * When doing direct-io, multifd requires two different, >> + * non-duplicated file descriptors so we can use one of them for >> + * unaligned IO. >> + */ >> + return migrate_multifd() && migrate_direct_io(); >> +} >> + >> static bool transport_supports_seeking(MigrationAddress *addr) >> { >> if (addr->transport == MIGRATION_ADDRESS_TYPE_FILE) { >> @@ -164,6 +174,12 @@ static bool transport_supports_seeking(MigrationAddress *addr) >> return false; >> } >> >> +static bool transport_supports_multiple_fds(MigrationAddress *addr) >> +{ >> + /* file: works because QEMU can open it multiple times */ >> + return addr->transport == MIGRATION_ADDRESS_TYPE_FILE; >> +} >> + >> static bool >> migration_channels_and_transport_compatible(MigrationAddress *addr, >> Error **errp) >> @@ -180,6 +196,13 @@ migration_channels_and_transport_compatible(MigrationAddress *addr, >> return false; >> } >> >> + if (migration_needs_multiple_fds() && >> + !transport_supports_multiple_fds(addr)) { >> + error_setg(errp, >> + "Migration requires a transport that allows for multiple fds (e.g. file)"); >> + return false; >> + } >> + >> return true; >> } >> >> -- >> 2.35.3 >>
On Fri, May 03, 2024 at 05:54:28PM -0300, Fabiano Rosas wrote: > Peter Xu <peterx@redhat.com> writes: > > > On Fri, Apr 26, 2024 at 11:20:38AM -0300, Fabiano Rosas wrote: > >> When multifd is used along with mapped-ram, we can take benefit of a > >> filesystem that supports the O_DIRECT flag and perform direct I/O in > >> the multifd threads. This brings a significant performance improvement > >> because direct-io writes bypass the page cache which would otherwise > >> be thrashed by the multifd data which is unlikely to be needed again > >> in a short period of time. > >> > >> To be able to use a multifd channel opened with O_DIRECT, we must > >> ensure that a certain aligment is used. Filesystems usually require a > >> block-size alignment for direct I/O. The way to achieve this is by > >> enabling the mapped-ram feature, which already aligns its I/O properly > >> (see MAPPED_RAM_FILE_OFFSET_ALIGNMENT at ram.c). > >> > >> By setting O_DIRECT on the multifd channels, all writes to the same > >> file descriptor need to be aligned as well, even the ones that come > >> from outside multifd, such as the QEMUFile I/O from the main migration > >> code. This makes it impossible to use the same file descriptor for the > >> QEMUFile and for the multifd channels. The various flags and metadata > >> written by the main migration code will always be unaligned by virtue > >> of their small size. To workaround this issue, we'll require a second > >> file descriptor to be used exclusively for direct I/O. > >> > >> The second file descriptor can be obtained by QEMU by re-opening the > >> migration file (already possible), or by being provided by the user or > >> management application (support to be added in future patches). > >> > >> Signed-off-by: Fabiano Rosas <farosas@suse.de> > >> --- > >> migration/file.c | 22 +++++++++++++++++++--- > >> migration/migration.c | 23 +++++++++++++++++++++++ > >> 2 files changed, 42 insertions(+), 3 deletions(-) > >> > >> diff --git a/migration/file.c b/migration/file.c > >> index 8f30999400..b9265b14dd 100644 > >> --- a/migration/file.c > >> +++ b/migration/file.c > >> @@ -83,17 +83,33 @@ void file_cleanup_outgoing_migration(void) > >> > >> bool file_send_channel_create(gpointer opaque, Error **errp) > >> { > >> - QIOChannelFile *ioc; > >> + QIOChannelFile *ioc = NULL; > >> int flags = O_WRONLY; > >> - bool ret = true; > >> + bool ret = false; > >> + > >> + if (migrate_direct_io()) { > >> +#ifdef O_DIRECT > >> + /* > >> + * Enable O_DIRECT for the secondary channels. These are used > >> + * for sending ram pages and writes should be guaranteed to be > >> + * aligned to at least page size. > >> + */ > >> + flags |= O_DIRECT; > >> +#else > >> + error_setg(errp, "System does not support O_DIRECT"); > >> + error_append_hint(errp, > >> + "Try disabling direct-io migration capability\n"); > >> + goto out; > >> +#endif > > > > Hopefully if we can fail migrate-set-parameters correctly always, we will > > never trigger this error. > > > > I know Linux used some trick like this to even avoid such ifdefs: > > > > if (qemu_has_direct_io() && migrate_direct_io()) { > > // reference O_DIRECT > > } > > > > So as long as qemu_has_direct_io() can return a constant "false" when > > O_DIRECT not defined, the compiler is smart enough to ignore the O_DIRECT > > inside the block. > > > > Even if it won't work, we can still avoid that error (and rely on the > > set-parameter failure): > > > > #ifdef O_DIRECT > > if (migrate_direct_io()) { > > // reference O_DIRECT > > } > > #endif > > > > Then it should run the same, just to try making ifdefs as light as > > possible.. > > Ok. > > Just FYI, in v2 I'm adding direct-io to migration incoming side as well, > so I put this logic into a helper: > > static bool file_enable_direct_io(int *flags, Error **errp) > { > if (migrate_direct_io()) { > #ifdef O_DIRECT > *flags |= O_DIRECT; > #else > error_setg(errp, "System does not support O_DIRECT"); > error_append_hint(errp, > "Try disabling direct-io migration capability\n"); > return false; > #endif > } > > return true; > } > > But I'll apply your suggestions nonetheless. Thanks, please give it a shot, I hope it will work with either way. One thing to mention is, if you want to play with the qemu_has_direct_io() approach with no "#ifdefs", you can't keep qemu_has_direct_io() in osdep.c, but you must define it in osdep.h as static inline functions. Otherwise I think osdep.o is forced to include it as a function so that trick won't work. Just try compile without O_DIRECT should see.
On Fri, Apr 26, 2024 at 11:20:38AM -0300, Fabiano Rosas wrote: > When multifd is used along with mapped-ram, we can take benefit of a > filesystem that supports the O_DIRECT flag and perform direct I/O in > the multifd threads. This brings a significant performance improvement > because direct-io writes bypass the page cache which would otherwise > be thrashed by the multifd data which is unlikely to be needed again > in a short period of time. > > To be able to use a multifd channel opened with O_DIRECT, we must > ensure that a certain aligment is used. Filesystems usually require a > block-size alignment for direct I/O. The way to achieve this is by > enabling the mapped-ram feature, which already aligns its I/O properly > (see MAPPED_RAM_FILE_OFFSET_ALIGNMENT at ram.c). > > By setting O_DIRECT on the multifd channels, all writes to the same > file descriptor need to be aligned as well, even the ones that come > from outside multifd, such as the QEMUFile I/O from the main migration > code. This makes it impossible to use the same file descriptor for the > QEMUFile and for the multifd channels. The various flags and metadata > written by the main migration code will always be unaligned by virtue > of their small size. To workaround this issue, we'll require a second > file descriptor to be used exclusively for direct I/O. > > The second file descriptor can be obtained by QEMU by re-opening the > migration file (already possible), or by being provided by the user or > management application (support to be added in future patches). > > Signed-off-by: Fabiano Rosas <farosas@suse.de> > --- > migration/file.c | 22 +++++++++++++++++++--- > migration/migration.c | 23 +++++++++++++++++++++++ > 2 files changed, 42 insertions(+), 3 deletions(-) > > diff --git a/migration/file.c b/migration/file.c > index 8f30999400..b9265b14dd 100644 > --- a/migration/file.c > +++ b/migration/file.c > @@ -83,17 +83,33 @@ void file_cleanup_outgoing_migration(void) > > bool file_send_channel_create(gpointer opaque, Error **errp) > { > - QIOChannelFile *ioc; > + QIOChannelFile *ioc = NULL; > int flags = O_WRONLY; > - bool ret = true; > + bool ret = false; > + > + if (migrate_direct_io()) { > +#ifdef O_DIRECT > + /* > + * Enable O_DIRECT for the secondary channels. These are used > + * for sending ram pages and writes should be guaranteed to be > + * aligned to at least page size. > + */ > + flags |= O_DIRECT; > +#else > + error_setg(errp, "System does not support O_DIRECT"); > + error_append_hint(errp, > + "Try disabling direct-io migration capability\n"); > + goto out; > +#endif If we conditionalize existance of 'direct-io' feature in the QAPI schema, then the '#else' clause no longer needs to even exist as it will be unreachable - it could be a g_assert_not_reached(); > + } > > ioc = qio_channel_file_new_path(outgoing_args.fname, flags, 0, errp); > if (!ioc) { > - ret = false; > goto out; > } > > multifd_channel_connect(opaque, QIO_CHANNEL(ioc)); > + ret = true; > > out: > /* > diff --git a/migration/migration.c b/migration/migration.c > index b5af6b5105..cb923a3f62 100644 > --- a/migration/migration.c > +++ b/migration/migration.c > @@ -155,6 +155,16 @@ static bool migration_needs_seekable_channel(void) > return migrate_mapped_ram(); > } > > +static bool migration_needs_multiple_fds(void) > +{ > + /* > + * When doing direct-io, multifd requires two different, > + * non-duplicated file descriptors so we can use one of them for > + * unaligned IO. > + */ > + return migrate_multifd() && migrate_direct_io(); > +} > + > static bool transport_supports_seeking(MigrationAddress *addr) > { > if (addr->transport == MIGRATION_ADDRESS_TYPE_FILE) { > @@ -164,6 +174,12 @@ static bool transport_supports_seeking(MigrationAddress *addr) > return false; > } > > +static bool transport_supports_multiple_fds(MigrationAddress *addr) > +{ > + /* file: works because QEMU can open it multiple times */ > + return addr->transport == MIGRATION_ADDRESS_TYPE_FILE; > +} > + > static bool > migration_channels_and_transport_compatible(MigrationAddress *addr, > Error **errp) > @@ -180,6 +196,13 @@ migration_channels_and_transport_compatible(MigrationAddress *addr, > return false; > } > > + if (migration_needs_multiple_fds() && > + !transport_supports_multiple_fds(addr)) { > + error_setg(errp, > + "Migration requires a transport that allows for multiple fds (e.g. file)"); > + return false; > + } > + > return true; > } > > -- > 2.35.3 > With regards, Daniel
diff --git a/migration/file.c b/migration/file.c index 8f30999400..b9265b14dd 100644 --- a/migration/file.c +++ b/migration/file.c @@ -83,17 +83,33 @@ void file_cleanup_outgoing_migration(void) bool file_send_channel_create(gpointer opaque, Error **errp) { - QIOChannelFile *ioc; + QIOChannelFile *ioc = NULL; int flags = O_WRONLY; - bool ret = true; + bool ret = false; + + if (migrate_direct_io()) { +#ifdef O_DIRECT + /* + * Enable O_DIRECT for the secondary channels. These are used + * for sending ram pages and writes should be guaranteed to be + * aligned to at least page size. + */ + flags |= O_DIRECT; +#else + error_setg(errp, "System does not support O_DIRECT"); + error_append_hint(errp, + "Try disabling direct-io migration capability\n"); + goto out; +#endif + } ioc = qio_channel_file_new_path(outgoing_args.fname, flags, 0, errp); if (!ioc) { - ret = false; goto out; } multifd_channel_connect(opaque, QIO_CHANNEL(ioc)); + ret = true; out: /* diff --git a/migration/migration.c b/migration/migration.c index b5af6b5105..cb923a3f62 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -155,6 +155,16 @@ static bool migration_needs_seekable_channel(void) return migrate_mapped_ram(); } +static bool migration_needs_multiple_fds(void) +{ + /* + * When doing direct-io, multifd requires two different, + * non-duplicated file descriptors so we can use one of them for + * unaligned IO. + */ + return migrate_multifd() && migrate_direct_io(); +} + static bool transport_supports_seeking(MigrationAddress *addr) { if (addr->transport == MIGRATION_ADDRESS_TYPE_FILE) { @@ -164,6 +174,12 @@ static bool transport_supports_seeking(MigrationAddress *addr) return false; } +static bool transport_supports_multiple_fds(MigrationAddress *addr) +{ + /* file: works because QEMU can open it multiple times */ + return addr->transport == MIGRATION_ADDRESS_TYPE_FILE; +} + static bool migration_channels_and_transport_compatible(MigrationAddress *addr, Error **errp) @@ -180,6 +196,13 @@ migration_channels_and_transport_compatible(MigrationAddress *addr, return false; } + if (migration_needs_multiple_fds() && + !transport_supports_multiple_fds(addr)) { + error_setg(errp, + "Migration requires a transport that allows for multiple fds (e.g. file)"); + return false; + } + return true; }
When multifd is used along with mapped-ram, we can take benefit of a filesystem that supports the O_DIRECT flag and perform direct I/O in the multifd threads. This brings a significant performance improvement because direct-io writes bypass the page cache which would otherwise be thrashed by the multifd data which is unlikely to be needed again in a short period of time. To be able to use a multifd channel opened with O_DIRECT, we must ensure that a certain aligment is used. Filesystems usually require a block-size alignment for direct I/O. The way to achieve this is by enabling the mapped-ram feature, which already aligns its I/O properly (see MAPPED_RAM_FILE_OFFSET_ALIGNMENT at ram.c). By setting O_DIRECT on the multifd channels, all writes to the same file descriptor need to be aligned as well, even the ones that come from outside multifd, such as the QEMUFile I/O from the main migration code. This makes it impossible to use the same file descriptor for the QEMUFile and for the multifd channels. The various flags and metadata written by the main migration code will always be unaligned by virtue of their small size. To workaround this issue, we'll require a second file descriptor to be used exclusively for direct I/O. The second file descriptor can be obtained by QEMU by re-opening the migration file (already possible), or by being provided by the user or management application (support to be added in future patches). Signed-off-by: Fabiano Rosas <farosas@suse.de> --- migration/file.c | 22 +++++++++++++++++++--- migration/migration.c | 23 +++++++++++++++++++++++ 2 files changed, 42 insertions(+), 3 deletions(-)