Message ID | 20230926161841.98464-5-joao.m.martins@oracle.com |
---|---|
State | New |
Headers | show |
Series | migration: Downtime observability improvements | expand |
Hi, Joao, On Tue, Sep 26, 2023 at 05:18:40PM +0100, Joao Martins wrote: > Deliver the downtime breakdown also via `query-migrate` > to allow users to understand what their downtime value > represents. I agree downtime is an area we definitely need to improve.. however do we need to make it part of qapi? Or do we need them mostly for debugging purpose? Any introduction of motivation of this work, especially on exporting the values to the mgmt apps? > > Signed-off-by: Joao Martins <joao.m.martins@oracle.com> > --- > qapi/migration.json | 22 ++++++++++++++++++++++ > migration/migration.c | 14 ++++++++++++++ > 2 files changed, 36 insertions(+) > > diff --git a/qapi/migration.json b/qapi/migration.json > index 2d91fbcb22ff..088e1b2bf440 100644 > --- a/qapi/migration.json > +++ b/qapi/migration.json > @@ -217,6 +217,27 @@ > 'data': [ 'start', 'stop', 'precopy-iterable', 'precopy-noniterable', > 'resume-return-path' ] } > > +## > +# @DowntimeStats: > +# > +# Detailed migration downtime statistics. > +# > +# @stop: Time taken to stop the VM during switchover. > +# > +# @precopy: Time taken to save all precopy state during switchover. > +# > +# @precopy-iterable: Time taken to save all precopy iterable state. > +# > +# @precopy-noniterable: Time taken to save all precopy non iterable state. > +# > +# @resume-return-path: Time taken to resume if return path is enabled, > +# otherwise zero. All these fields will more or less duplicate the ones in the other MigrationDowntime just introduced. We suffer from duplicated fields for migration parameters, proof shows that dropping the duplication is normally harder.. I'm trying to dedup the existing Migration*Parameter* objects and still in progress, so even if we want to expose downtime in qapi I hope we can expose only one and single object. IIUC we can already do that by keeping DowntimeStats, keeing an object in MigrationState, and just drop MigrationDowntime? Thanks,
On 04/10/2023 18:10, Peter Xu wrote: > Hi, Joao, > > On Tue, Sep 26, 2023 at 05:18:40PM +0100, Joao Martins wrote: >> Deliver the downtime breakdown also via `query-migrate` >> to allow users to understand what their downtime value >> represents. > > I agree downtime is an area we definitely need to improve.. however do we > need to make it part of qapi? Or do we need them mostly for debugging > purpose? > > Any introduction of motivation of this work, especially on exporting the > values to the mgmt apps? > I added the statistics mainly for observability (e.g. you would grep in the libvirt logs for a non developer and they can understand how downtime is explained). I wasn't specifically thinking about management app using this, just broad access to the metrics. One can get the same level of observability with a BPF/dtrace/systemtap script, albeit in a less obvious way. With respect to motivation: I am doing migration with VFs and sometimes vhost-net, and the downtime/switchover is the only thing that is either non-determinisc or not captured in the migration math. There are some things that aren't accounted (e.g. vhost with enough queues will give you high downtimes), and algorithimally not really possible to account for as one needs to account every possible instruction when we quiesce the guest (or at least that's my understanding). Just having these metrics, help the developer *and* user see why such downtime is high, and maybe open up window for fixes/bug-reports or where to improve. Furthermore, hopefully these tracepoints or stats could be a starting point for developers to understand how much downtime is spent in a particular device in Qemu(as a follow-up to this series), or allow to implement bounds check limits in switchover limits in way that doesn't violate downtime-limit SLAs (I have a small set of patches for this). >> >> Signed-off-by: Joao Martins <joao.m.martins@oracle.com> >> --- >> qapi/migration.json | 22 ++++++++++++++++++++++ >> migration/migration.c | 14 ++++++++++++++ >> 2 files changed, 36 insertions(+) >> >> diff --git a/qapi/migration.json b/qapi/migration.json >> index 2d91fbcb22ff..088e1b2bf440 100644 >> --- a/qapi/migration.json >> +++ b/qapi/migration.json >> @@ -217,6 +217,27 @@ >> 'data': [ 'start', 'stop', 'precopy-iterable', 'precopy-noniterable', >> 'resume-return-path' ] } >> >> +## >> +# @DowntimeStats: >> +# >> +# Detailed migration downtime statistics. >> +# >> +# @stop: Time taken to stop the VM during switchover. >> +# >> +# @precopy: Time taken to save all precopy state during switchover. >> +# >> +# @precopy-iterable: Time taken to save all precopy iterable state. >> +# >> +# @precopy-noniterable: Time taken to save all precopy non iterable state. >> +# >> +# @resume-return-path: Time taken to resume if return path is enabled, >> +# otherwise zero. > > All these fields will more or less duplicate the ones in the other > MigrationDowntime just introduced. > > We suffer from duplicated fields for migration parameters, proof shows that > dropping the duplication is normally harder.. I'm trying to dedup the > existing Migration*Parameter* objects and still in progress, so even if we > want to expose downtime in qapi I hope we can expose only one and single > object. > Thanks for the background; I am now recalling your other series doing this sort of duplication[0] [0] https://lore.kernel.org/qemu-devel/20230905162335.235619-5-peterx@redhat.com/ > IIUC we can already do that by keeping DowntimeStats, keeing an object in > MigrationState, and just drop MigrationDowntime? > I can try that, sounds way cleaner. I didn't like the duplication either.
On Fri, Oct 06, 2023 at 12:37:15PM +0100, Joao Martins wrote: > I added the statistics mainly for observability (e.g. you would grep in the > libvirt logs for a non developer and they can understand how downtime is > explained). I wasn't specifically thinking about management app using this, just > broad access to the metrics. > > One can get the same level of observability with a BPF/dtrace/systemtap script, > albeit in a less obvious way. Makes sense. > > With respect to motivation: I am doing migration with VFs and sometimes > vhost-net, and the downtime/switchover is the only thing that is either > non-determinisc or not captured in the migration math. There are some things > that aren't accounted (e.g. vhost with enough queues will give you high > downtimes), Will this be something relevant to loading of the queues? There used to be a work on greatly reducing downtime especially for virtio scenarios over multiple queues (and iirc even 1 queue also benefits from that), it wasn't merged probably because not enough review: https://lore.kernel.org/r/20230317081904.24389-1-xuchuangxclwt@bytedance.com Though personally I think that's some direction good to keep exploring at least, maybe some slightly enhancement to that series will work for us. > and algorithimally not really possible to account for as one needs > to account every possible instruction when we quiesce the guest (or at least > that's my understanding). > > Just having these metrics, help the developer *and* user see why such downtime > is high, and maybe open up window for fixes/bug-reports or where to improve. > > Furthermore, hopefully these tracepoints or stats could be a starting point for > developers to understand how much downtime is spent in a particular device in > Qemu(as a follow-up to this series), Yes, I was actually expecting that when read the cover letter. :) This also makes sense. One thing worth mention is, the real downtime measured can, IMHO, differ on src/dst due to "pre_save" and "post_load" may not really doing similar things. IIUC it can happen that some device sents fast, but loads slow. I'm not sure whether there's reversed use case. Maybe we want to capture that on both sides on some metrics? > or allow to implement bounds check limits in switchover limits in way > that doesn't violate downtime-limit SLAs (I have a small set of patches > for this). I assume that decision will always be synchronized between src/dst in some way, or guaranteed to be same. But I can wait to read the series first. Thanks,
diff --git a/qapi/migration.json b/qapi/migration.json index 2d91fbcb22ff..088e1b2bf440 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -217,6 +217,27 @@ 'data': [ 'start', 'stop', 'precopy-iterable', 'precopy-noniterable', 'resume-return-path' ] } +## +# @DowntimeStats: +# +# Detailed migration downtime statistics. +# +# @stop: Time taken to stop the VM during switchover. +# +# @precopy: Time taken to save all precopy state during switchover. +# +# @precopy-iterable: Time taken to save all precopy iterable state. +# +# @precopy-noniterable: Time taken to save all precopy non iterable state. +# +# @resume-return-path: Time taken to resume if return path is enabled, +# otherwise zero. +# +# Since: 8.2 +## +{ 'struct': 'DowntimeStats', + 'data': {'stop': 'int64', 'precopy': 'int64', 'precopy-iterable': 'int64', + 'precopy-noniterable': 'int64', 'resume-return-path': 'int64' } } ## # @MigrationInfo: @@ -308,6 +329,7 @@ '*total-time': 'int', '*expected-downtime': 'int', '*downtime': 'int', + '*downtime-stats': 'DowntimeStats', '*setup-time': 'int', '*cpu-throttle-percentage': 'int', '*error-desc': 'str', diff --git a/migration/migration.c b/migration/migration.c index 4b5bed3eb09b..dec6c88fbff9 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -921,6 +921,19 @@ static int64_t migrate_get_downtime_resume_rp(MigrationState *s) return 0; } +static void populate_downtime_info(MigrationInfo *info, MigrationState *s) +{ + DowntimeStats *stats; + + info->downtime_stats = g_malloc0(sizeof(*info->downtime_stats)); + stats = info->downtime_stats; + stats->stop = migrate_get_downtime_stop(s); + stats->precopy_iterable = migrate_get_downtime_precopy_iterable(s); + stats->precopy_noniterable = migrate_get_downtime_precopy_noniterable(s); + stats->precopy = stats->precopy_iterable + stats->precopy_noniterable; + stats->resume_return_path = migrate_get_downtime_resume_rp(s); +} + static void populate_time_info(MigrationInfo *info, MigrationState *s) { info->has_status = true; @@ -939,6 +952,7 @@ static void populate_time_info(MigrationInfo *info, MigrationState *s) if (migrate_show_downtime(s)) { info->has_downtime = true; info->downtime = s->downtime; + populate_downtime_info(info, s); } else { info->has_expected_downtime = true; info->expected_downtime = s->expected_downtime;
Deliver the downtime breakdown also via `query-migrate` to allow users to understand what their downtime value represents. Signed-off-by: Joao Martins <joao.m.martins@oracle.com> --- qapi/migration.json | 22 ++++++++++++++++++++++ migration/migration.c | 14 ++++++++++++++ 2 files changed, 36 insertions(+)