migration: Allow user to specify migration available bandwidth

Message ID 20230724170755.1114519-1-peterx@redhat.com
State New

Commit Message

Peter Xu July 24, 2023, 5:07 p.m. UTC
Migration bandwidth is a very important value for live migration, because
it's one of the major factors on which we decide when to switch over to
the destination in a precopy process.

This value is currently estimated by QEMU throughout the live migration
process by monitoring how fast we are sending the data.  This would be
the most accurate bandwidth in an ideal world, where we are always
feeding unlimited data to the migration channel and throughput is
limited only by the bandwidth that is available.

However in reality it may be very different, e.g., over a 10Gbps network
we can see query-migrate showing a migration bandwidth of only a few tens
of MB/s, just because there are plenty of other things the migration
thread might be doing.  For example, the migration thread can be busy
scanning zero pages, or it can be fetching the dirty bitmap from external
dirty sources (like vhost or KVM).  It means we may not be pushing data
into the migration channel as fast as possible, so the bandwidth
estimated from "how much data we sent over the channel" can sometimes be
dramatically inaccurate, e.g., a few tens of MB/s even though 10Gbps is
available, and the switchover decision is then skewed by this.

With such a wrong estimation of bandwidth, the migration may not even
converge at all within the downtime specified.

The issue is that QEMU itself may not be able to avoid those
uncertainties when measuring the real "available migration bandwidth".
At least I cannot think of a way so far.

One way to fix this is that, when the user is fully aware of the available
bandwidth, we can allow the user to provide an accurate value.

For example, if the user has a dedicated channel of 10Gbps for migrating
this specific VM, the user can specify this bandwidth so QEMU can always
base its calculation on this fact, trusting the user as long as the value
is specified.

When the user wants migration to use only 5Gbps out of that 10Gbps, one
can set max-bandwidth to 5Gbps along with available-bandwidth to 5Gbps,
so it will never use more than 5Gbps either (and the user can keep the
remaining 5Gbps for other things).  So it can be useful even when the
network is not dedicated, as long as the user knows a solid value.
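
For illustration, the scenario above could be configured over QMP roughly
as below -- a hypothetical invocation, using the parameter name proposed
in this patch, with values in bytes per second (5Gbps ~= 625000000):

    { "execute": "migrate-set-parameters",
      "arguments": { "max-bandwidth": 625000000,
                     "available-bandwidth": 625000000 } }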

A new parameter "available-bandwidth" is introduced just for this.  So
when the user specifies this parameter, instead of trusting the value
estimated by QEMU itself (based on the QEMUFile send speed), let's trust
the user more.

This can resolve issues like a migration that never converges, caused by
a hilariously low "migration bandwidth" detected for whatever reason.

Reported-by: Zhiyi Guo <zhguo@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 qapi/migration.json            | 20 +++++++++++++++++++-
 migration/migration.h          |  2 +-
 migration/options.h            |  1 +
 migration/migration-hmp-cmds.c | 14 ++++++++++++++
 migration/migration.c          | 19 +++++++++++++++----
 migration/options.c            | 28 ++++++++++++++++++++++++++++
 migration/trace-events         |  2 +-
 7 files changed, 79 insertions(+), 7 deletions(-)

Comments

Daniel P. Berrangé July 24, 2023, 6:04 p.m. UTC | #1
On Mon, Jul 24, 2023 at 01:07:55PM -0400, Peter Xu wrote:
> Migration bandwidth is a very important value to live migration.  It's
> because it's one of the major factors that we'll make decision on when to
> switchover to destination in a precopy process.

To elaborate on this for those reading along...

QEMU takes the maximum downtime limit and multiplies it by its
estimate of bandwidth. This gives a figure for the amount of data QEMU
thinks it can transfer within the downtime period.

QEMU compares this figure to the amount of data that is still pending
at the end of an iteration.
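
As a rough sketch of that calculation (a hypothetical helper, loosely
mirroring migration_update_counters() in the patch below; QEMU tracks
bandwidth in bytes per millisecond and the downtime limit in
milliseconds):

    /* Hypothetical sketch of the switchover decision described above. */
    static bool should_switchover(uint64_t pending_bytes,
                                  double bandwidth, /* bytes per ms */
                                  uint64_t downtime_limit_ms)
    {
        uint64_t threshold_size = bandwidth * downtime_limit_ms;

        /* Switch over once the remaining data fits within the downtime. */
        return pending_bytes <= threshold_size;
    }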

> This value is currently estimated by QEMU during the whole live migration
> process by monitoring how fast we were sending the data.  This can be the
> most accurate bandwidth if in the ideal world, where we're always feeding
> unlimited data to the migration channel, and then it'll be limited to the
> bandwidth that is available.

The QEMU estimate for available bandwidth will definitely be wrong,
potentially by orders of magnitude, if QEMU has a max bandwidth limit
set, as in that case it is never trying to push the peak rates available
from the NICs/network fabric.

> The issue is QEMU itself may not be able to avoid those uncertainties on
> measuing the real "available migration bandwidth".  At least not something
> I can think of so far.

IIUC, you can query the NIC properties to find the hardware transfer
rate of the NICs. That doesn't imply apps can actually reach that
rate in practice - it has a decent chance of being an over-estimate
of bandwidth, possibly very very much over.
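
(On Linux, for instance, the advertised link speed can be read from
sysfs; a hypothetical C snippet, assuming an "eth0" device and omitting
most error handling:)

    /* Hypothetical: read the NIC link speed (in Mb/s) from sysfs. */
    FILE *f = fopen("/sys/class/net/eth0/speed", "r");
    long mbps = -1;

    if (f) {
        if (fscanf(f, "%ld", &mbps) != 1) {
            mbps = -1;  /* link down or speed unknown */
        }
        fclose(f);
    }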

Is such an over-estimate better or worse than QEMU's current
under-estimate? It depends on the POV.

From the POV of QEMU, over-estimating means it'll not be
throttling as much as it should. That's not a downside for
migration - it makes it more likely for migration to complete :-)

From the POV of non-QEMU apps though, if QEMU over-estimates,
it'll mean other apps get starved of network bandwidth.

Overall I agree, there's no obvious way QEMU can ever come up
with a reliable estimate for bandwidth available.

> One way to fix this is when the user is fully aware of the available
> bandwidth, then we can allow the user to help providing an accurate value.
>
> For example, if the user has a dedicated channel of 10Gbps for migration
> for this specific VM, the user can specify this bandwidth so QEMU can
> always do the calculation based on this fact, trusting the user as long as
> specified.

I can see that in theory, but when considering non-trivial
deployments of QEMU, I wonder if the user can really have any
such certainty of what is truly available. It would need
global awareness of the whole network of hosts & workloads.

> When the user wants to have migration only use 5Gbps out of that 10Gbps,
> one can set max-bandwidth to 5Gbps, along with available-bandwidth to 5Gbps
> so it'll never use over 5Gbps too (so the user can have the rest 5Gbps for
> other things).  So it can be useful even if the network is not dedicated,
> but as long as the user can know a solid value.
> 
> A new parameter "available-bandwidth" is introduced just for this. So when
> the user specified this parameter, instead of trusting the estimated value
> from QEMU itself (based on the QEMUFile send speed), let's trust the user
> more.

I feel like rather than "available-bandwidth", we should call
it "max-convergance-bandwidth".

To me that name would better reflect the fact that this isn't
really required to be a measure of how much NIC bandwidth is
available. It is merely an expression of a different bandwidth
limit to apply during switch over.

IOW

* max-bandwidth: limit during pre-copy main transfer
* max-convergance-bandwidth: limit during pre-copy switch-over
* max-postcopy-bandwidth: limit during post-copy phase



> This can resolve issues like "unconvergence migration" which is caused by
> hilarious low "migration bandwidth" detected for whatever reason.
> 
> Reported-by: Zhiyi Guo <zhguo@redhat.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  qapi/migration.json            | 20 +++++++++++++++++++-
>  migration/migration.h          |  2 +-
>  migration/options.h            |  1 +
>  migration/migration-hmp-cmds.c | 14 ++++++++++++++
>  migration/migration.c          | 19 +++++++++++++++----
>  migration/options.c            | 28 ++++++++++++++++++++++++++++
>  migration/trace-events         |  2 +-
>  7 files changed, 79 insertions(+), 7 deletions(-)
> 
> diff --git a/qapi/migration.json b/qapi/migration.json
> index 47dfef0278..fdc269e0a1 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -730,6 +730,16 @@
>  # @max-bandwidth: to set maximum speed for migration.  maximum speed
>  #     in bytes per second.  (Since 2.8)
>  #
> +# @available-bandwidth: to set available bandwidth for migration.  By
> +#     default, this value is zero, means the user is not aware of the
> +#     available bandwidth that can be used by QEMU migration, so QEMU will
> +#     estimate the bandwidth automatically.  This can be set when the
> +#     estimated value is not accurate, while the user is able to guarantee
> +#     such bandwidth is available for migration purpose during the
> +#     migration procedure.  When specified correctly, this can make the
> +#     switchover decision much more accurate, which will also be based on
> +#     the max downtime specified.  (Since 8.2)
> +#
>  # @downtime-limit: set maximum tolerated downtime for migration.
>  #     maximum downtime in milliseconds (Since 2.8)
>  #
> @@ -803,7 +813,7 @@
>             'cpu-throttle-initial', 'cpu-throttle-increment',
>             'cpu-throttle-tailslow',
>             'tls-creds', 'tls-hostname', 'tls-authz', 'max-bandwidth',
> -           'downtime-limit',
> +           'available-bandwidth', 'downtime-limit',
>             { 'name': 'x-checkpoint-delay', 'features': [ 'unstable' ] },
>             'block-incremental',
>             'multifd-channels',
> @@ -886,6 +896,9 @@
>  # @max-bandwidth: to set maximum speed for migration.  maximum speed
>  #     in bytes per second.  (Since 2.8)
>  #
> +# @available-bandwidth: to set available migration bandwidth.  Please refer
> +#     to comments in MigrationParameter for more information. (Since 8.2)
> +#
>  # @downtime-limit: set maximum tolerated downtime for migration.
>  #     maximum downtime in milliseconds (Since 2.8)
>  #
> @@ -971,6 +984,7 @@
>              '*tls-hostname': 'StrOrNull',
>              '*tls-authz': 'StrOrNull',
>              '*max-bandwidth': 'size',
> +            '*available-bandwidth': 'size',
>              '*downtime-limit': 'uint64',
>              '*x-checkpoint-delay': { 'type': 'uint32',
>                                       'features': [ 'unstable' ] },
> @@ -1078,6 +1092,9 @@
>  # @max-bandwidth: to set maximum speed for migration.  maximum speed
>  #     in bytes per second.  (Since 2.8)
>  #
> +# @available-bandwidth: to set available migration bandwidth.  Please refer
> +#     to comments in MigrationParameter for more information. (Since 8.2)
> +#
>  # @downtime-limit: set maximum tolerated downtime for migration.
>  #     maximum downtime in milliseconds (Since 2.8)
>  #
> @@ -1160,6 +1177,7 @@
>              '*tls-hostname': 'str',
>              '*tls-authz': 'str',
>              '*max-bandwidth': 'size',
> +            '*available-bandwidth': 'size',
>              '*downtime-limit': 'uint64',
>              '*x-checkpoint-delay': { 'type': 'uint32',
>                                       'features': [ 'unstable' ] },
> diff --git a/migration/migration.h b/migration/migration.h
> index b7c8b67542..fadbf64d9d 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -283,7 +283,7 @@ struct MigrationState {
>      /*
>       * The final stage happens when the remaining data is smaller than
>       * this threshold; it's calculated from the requested downtime and
> -     * measured bandwidth
> +     * measured bandwidth, or available-bandwidth if user specified.
>       */
>      int64_t threshold_size;
>  
> diff --git a/migration/options.h b/migration/options.h
> index 9aaf363322..ed2d9c92e7 100644
> --- a/migration/options.h
> +++ b/migration/options.h
> @@ -79,6 +79,7 @@ int migrate_decompress_threads(void);
>  uint64_t migrate_downtime_limit(void);
>  uint8_t migrate_max_cpu_throttle(void);
>  uint64_t migrate_max_bandwidth(void);
> +uint64_t migrate_available_bandwidth(void);
>  uint64_t migrate_max_postcopy_bandwidth(void);
>  int migrate_multifd_channels(void);
>  MultiFDCompression migrate_multifd_compression(void);
> diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c
> index 9885d7c9f7..53fbe1b1af 100644
> --- a/migration/migration-hmp-cmds.c
> +++ b/migration/migration-hmp-cmds.c
> @@ -311,6 +311,10 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
>          monitor_printf(mon, "%s: %" PRIu64 " bytes/second\n",
>              MigrationParameter_str(MIGRATION_PARAMETER_MAX_BANDWIDTH),
>              params->max_bandwidth);
> +        assert(params->has_available_bandwidth);
> +        monitor_printf(mon, "%s: %" PRIu64 " bytes/second\n",
> +            MigrationParameter_str(MIGRATION_PARAMETER_AVAILABLE_BANDWIDTH),
> +            params->available_bandwidth);
>          assert(params->has_downtime_limit);
>          monitor_printf(mon, "%s: %" PRIu64 " ms\n",
>              MigrationParameter_str(MIGRATION_PARAMETER_DOWNTIME_LIMIT),
> @@ -556,6 +560,16 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
>          }
>          p->max_bandwidth = valuebw;
>          break;
> +    case MIGRATION_PARAMETER_AVAILABLE_BANDWIDTH:
> +        p->has_available_bandwidth = true;
> +        ret = qemu_strtosz_MiB(valuestr, NULL, &valuebw);
> +        if (ret < 0 || valuebw > INT64_MAX
> +            || (size_t)valuebw != valuebw) {
> +            error_setg(&err, "Invalid size %s", valuestr);
> +            break;
> +        }
> +        p->available_bandwidth = valuebw;
> +        break;
>      case MIGRATION_PARAMETER_DOWNTIME_LIMIT:
>          p->has_downtime_limit = true;
>          visit_type_size(v, param, &p->downtime_limit, &err);
> diff --git a/migration/migration.c b/migration/migration.c
> index 91bba630a8..391ddfd8ce 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -2671,7 +2671,7 @@ static void migration_update_counters(MigrationState *s,
>  {
>      uint64_t transferred, transferred_pages, time_spent;
>      uint64_t current_bytes; /* bytes transferred since the beginning */
> -    double bandwidth;
> +    double bandwidth, avail_bw;
>  
>      if (current_time < s->iteration_start_time + BUFFER_DELAY) {
>          return;
> @@ -2681,7 +2681,17 @@ static void migration_update_counters(MigrationState *s,
>      transferred = current_bytes - s->iteration_initial_bytes;
>      time_spent = current_time - s->iteration_start_time;
>      bandwidth = (double)transferred / time_spent;
> -    s->threshold_size = bandwidth * migrate_downtime_limit();
> +    if (migrate_available_bandwidth()) {
> +        /*
> +         * If the user specified an available bandwidth, let's trust the
> +         * user so that can be more accurate than what we estimated.
> +         */
> +        avail_bw = migrate_available_bandwidth();
> +    } else {
> +        /* If the user doesn't specify bandwidth, we use the estimated */
> +        avail_bw = bandwidth;
> +    }
> +    s->threshold_size = avail_bw * migrate_downtime_limit();
>  
>      s->mbps = (((double) transferred * 8.0) /
>                 ((double) time_spent / 1000.0)) / 1000.0 / 1000.0;
> @@ -2698,7 +2708,7 @@ static void migration_update_counters(MigrationState *s,
>      if (stat64_get(&mig_stats.dirty_pages_rate) &&
>          transferred > 10000) {
>          s->expected_downtime =
> -            stat64_get(&mig_stats.dirty_bytes_last_sync) / bandwidth;
> +            stat64_get(&mig_stats.dirty_bytes_last_sync) / avail_bw;
>      }
>  
>      migration_rate_reset(s->to_dst_file);
> @@ -2706,7 +2716,8 @@ static void migration_update_counters(MigrationState *s,
>      update_iteration_initial_status(s);
>  
>      trace_migrate_transferred(transferred, time_spent,
> -                              bandwidth, s->threshold_size);
> +                              bandwidth, migrate_available_bandwidth(),
> +                              s->threshold_size);
>  }
>  
>  static bool migration_can_switchover(MigrationState *s)
> diff --git a/migration/options.c b/migration/options.c
> index 5a9505adf7..160faca71a 100644
> --- a/migration/options.c
> +++ b/migration/options.c
> @@ -121,6 +121,8 @@ Property migration_properties[] = {
>                        parameters.cpu_throttle_tailslow, false),
>      DEFINE_PROP_SIZE("x-max-bandwidth", MigrationState,
>                        parameters.max_bandwidth, MAX_THROTTLE),
> +    DEFINE_PROP_SIZE("available-bandwidth", MigrationState,
> +                      parameters.available_bandwidth, 0),
>      DEFINE_PROP_UINT64("x-downtime-limit", MigrationState,
>                        parameters.downtime_limit,
>                        DEFAULT_MIGRATE_SET_DOWNTIME),
> @@ -735,6 +737,13 @@ uint64_t migrate_max_bandwidth(void)
>      return s->parameters.max_bandwidth;
>  }
>  
> +uint64_t migrate_available_bandwidth(void)
> +{
> +    MigrationState *s = migrate_get_current();
> +
> +    return s->parameters.available_bandwidth;
> +}
> +
>  uint64_t migrate_max_postcopy_bandwidth(void)
>  {
>      MigrationState *s = migrate_get_current();
> @@ -872,6 +881,8 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp)
>                                   s->parameters.tls_authz : "");
>      params->has_max_bandwidth = true;
>      params->max_bandwidth = s->parameters.max_bandwidth;
> +    params->has_available_bandwidth = true;
> +    params->available_bandwidth = s->parameters.available_bandwidth;
>      params->has_downtime_limit = true;
>      params->downtime_limit = s->parameters.downtime_limit;
>      params->has_x_checkpoint_delay = true;
> @@ -1004,6 +1015,15 @@ bool migrate_params_check(MigrationParameters *params, Error **errp)
>          return false;
>      }
>  
> +    if (params->has_available_bandwidth &&
> +        (params->available_bandwidth > SIZE_MAX)) {
> +        error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
> +                   "available_bandwidth",
> +                   "an integer in the range of 0 to "stringify(SIZE_MAX)
> +                   " bytes/second");
> +        return false;
> +    }
> +
>      if (params->has_downtime_limit &&
>          (params->downtime_limit > MAX_MIGRATE_DOWNTIME)) {
>          error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
> @@ -1156,6 +1176,10 @@ static void migrate_params_test_apply(MigrateSetParameters *params,
>          dest->max_bandwidth = params->max_bandwidth;
>      }
>  
> +    if (params->has_available_bandwidth) {
> +        dest->available_bandwidth = params->available_bandwidth;
> +    }
> +
>      if (params->has_downtime_limit) {
>          dest->downtime_limit = params->downtime_limit;
>      }
> @@ -1264,6 +1288,10 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp)
>          }
>      }
>  
> +    if (params->has_available_bandwidth) {
> +        s->parameters.available_bandwidth = params->available_bandwidth;
> +    }
> +
>      if (params->has_downtime_limit) {
>          s->parameters.downtime_limit = params->downtime_limit;
>      }
> diff --git a/migration/trace-events b/migration/trace-events
> index 5259c1044b..fffed1f995 100644
> --- a/migration/trace-events
> +++ b/migration/trace-events
> @@ -184,7 +184,7 @@ source_return_path_thread_shut(uint32_t val) "0x%x"
>  source_return_path_thread_resume_ack(uint32_t v) "%"PRIu32
>  source_return_path_thread_switchover_acked(void) ""
>  migration_thread_low_pending(uint64_t pending) "%" PRIu64
> -migrate_transferred(uint64_t tranferred, uint64_t time_spent, uint64_t bandwidth, uint64_t size) "transferred %" PRIu64 " time_spent %" PRIu64 " bandwidth %" PRIu64 " max_size %" PRId64
> +migrate_transferred(uint64_t tranferred, uint64_t time_spent, uint64_t bandwidth, uint64_t avail_bw, uint64_t size) "transferred %" PRIu64 " time_spent %" PRIu64 " bandwidth %" PRIu64 " avail_bw %" PRIu64 " max_size %" PRId64
>  process_incoming_migration_co_end(int ret, int ps) "ret=%d postcopy-state=%d"
>  process_incoming_migration_co_postcopy_end_main(void) ""
>  postcopy_preempt_enabled(bool value) "%d"
> -- 
> 2.41.0
> 

With regards,
Daniel
Peter Maydell July 24, 2023, 6:12 p.m. UTC | #2
On Mon, 24 Jul 2023 at 19:05, Daniel P. Berrangé <berrange@redhat.com> wrote:
>
> I feel like rather than "available-bandwidth", we should call
> it "max-convergance-bandwidth".

"convergence" (I mention only since it's a proposed
user-visible bit of API :-)

-- PMM
Peter Xu July 24, 2023, 7:47 p.m. UTC | #3
On Mon, Jul 24, 2023 at 07:04:29PM +0100, Daniel P. Berrangé wrote:
> On Mon, Jul 24, 2023 at 01:07:55PM -0400, Peter Xu wrote:
> > Migration bandwidth is a very important value to live migration.  It's
> > because it's one of the major factors that we'll make decision on when to
> > switchover to destination in a precopy process.
> 
> To elaborate on this for those reading along...
> 
> QEMU takes maxmimum downtime limit and multiplies by its estimate
> of bandwidth. This gives a figure for the amount of data QEMU thinks
> it can transfer within the downtime period.
> 
> QEMU compares this figure to the amount of data that is still pending
> at the end of an iteration.
> 
> > This value is currently estimated by QEMU during the whole live migration
> > process by monitoring how fast we were sending the data.  This can be the
> > most accurate bandwidth if in the ideal world, where we're always feeding
> > unlimited data to the migration channel, and then it'll be limited to the
> > bandwidth that is available.
> 
> The QEMU estimate for available bandwidth will definitely be wrong,
> potentially by orders of magnitude, if QEMU has a max bandwidth limit
> set, as in that case it is never trying to push the peak rates available
> from the NICs/network fabric.
> 
> > The issue is QEMU itself may not be able to avoid those uncertainties on
> > measuing the real "available migration bandwidth".  At least not something
> > I can think of so far.
> 
> IIUC, you can query the NIC properties to find the hardware transfer
> rate of the NICs. That doesn't imply apps can actually reach that
> rate in practice - it has a decent chance of being a over-estimate
> of bandwidth, possibly very very much over.
> 
> Is such an over estimate better or worse than QEMU's current
> under-estimate ? It depends on the POV.
> 
> From the POV of QEMU, over-estimating means means it'll be not
> be throttling as much as it should. That's not a downside of
> migration - it makes it more likely for migration to complete :-)

Heh. :)

> 
> From the POV of non-QEMU apps though, if QEMU over-estimates,
> it'll mean other apps get starved of network bandwidth.
> 
> Overall I agree, there's no obvious way QEMU can ever come up
> with a reliable estimate for bandwidth available.
> 
> > One way to fix this is when the user is fully aware of the available
> > bandwidth, then we can allow the user to help providing an accurate value.
> >
> > For example, if the user has a dedicated channel of 10Gbps for migration
> > for this specific VM, the user can specify this bandwidth so QEMU can
> > always do the calculation based on this fact, trusting the user as long as
> > specified.
> 
> I can see that in theory, but when considering a non-trivial
> deployments of QEMU, I wonder if the user can really have any
> such certainty of what is truely avaialble. It would need
> global awareness of the whole network of hosts & workloads.

Indeed it may not always be easy.

The good thing about this parameter is that we always fall back to the old
estimation if the user can't specify anything valid, so this is always
optional, not required.

It solves the cases where the user can specify the bandwidth accurately -
our QE team has already verified that it worked for us in GPU tests, where
it used to not be able to migrate at all with any sane downtime specified.
I should have attached a Tested-By from Zhiyi, but since this is not
exactly the patch he was using I didn't.

> 
> > When the user wants to have migration only use 5Gbps out of that 10Gbps,
> > one can set max-bandwidth to 5Gbps, along with available-bandwidth to 5Gbps
> > so it'll never use over 5Gbps too (so the user can have the rest 5Gbps for
> > other things).  So it can be useful even if the network is not dedicated,
> > but as long as the user can know a solid value.
> > 
> > A new parameter "available-bandwidth" is introduced just for this. So when
> > the user specified this parameter, instead of trusting the estimated value
> > from QEMU itself (based on the QEMUFile send speed), let's trust the user
> > more.
> 
> I feel like rather than "available-bandwidth", we should call
> it "max-convergance-bandwidth".
> 
> To me that name would better reflect the fact that this isn't
> really required to be a measure of how much NIC bandwidth is
> available. It is merely an expression of a different bandwidth
> limit to apply during switch over.
> 
> IOW
> 
> * max-bandwidth: limit during pre-copy main transfer
> * max-convergance-bandwidth: limit during pre-copy switch-over
> * max-postcopy-banwidth: limit during post-copy phase

I worry the suggested new name is not straightforward enough at first
glance, even to me as a developer.

"available-bandwidth" doesn't even bind that value to "convergence" at all,
even though it was for solving this specific problem here. I wanted to make
this parameter solely for the admin to answer the question "how much
bandwidth is available to QEMU migration in general?"  That's pretty
straightforward IMHO.  With that, it's pretty sane to consider using all we
have during switchover (aka, unlimited bandwidth, as fast as possible).

Maybe at some point we can even leverage this information for purposes
other than making the migration converge.

Thanks,
Daniel P. Berrangé July 25, 2023, 9:16 a.m. UTC | #4
On Mon, Jul 24, 2023 at 03:47:50PM -0400, Peter Xu wrote:
> On Mon, Jul 24, 2023 at 07:04:29PM +0100, Daniel P. Berrangé wrote:
> > On Mon, Jul 24, 2023 at 01:07:55PM -0400, Peter Xu wrote:
> > > Migration bandwidth is a very important value to live migration.  It's
> > > because it's one of the major factors that we'll make decision on when to
> > > switchover to destination in a precopy process.
> > 
> > To elaborate on this for those reading along...
> > 
> > QEMU takes maxmimum downtime limit and multiplies by its estimate
> > of bandwidth. This gives a figure for the amount of data QEMU thinks
> > it can transfer within the downtime period.
> > 
> > QEMU compares this figure to the amount of data that is still pending
> > at the end of an iteration.
> > 
> > > This value is currently estimated by QEMU during the whole live migration
> > > process by monitoring how fast we were sending the data.  This can be the
> > > most accurate bandwidth if in the ideal world, where we're always feeding
> > > unlimited data to the migration channel, and then it'll be limited to the
> > > bandwidth that is available.
> > 
> > The QEMU estimate for available bandwidth will definitely be wrong,
> > potentially by orders of magnitude, if QEMU has a max bandwidth limit
> > set, as in that case it is never trying to push the peak rates available
> > from the NICs/network fabric.
> > 
> > > The issue is QEMU itself may not be able to avoid those uncertainties on
> > > measuing the real "available migration bandwidth".  At least not something
> > > I can think of so far.
> > 
> > IIUC, you can query the NIC properties to find the hardware transfer
> > rate of the NICs. That doesn't imply apps can actually reach that
> > rate in practice - it has a decent chance of being a over-estimate
> > of bandwidth, possibly very very much over.
> > 
> > Is such an over estimate better or worse than QEMU's current
> > under-estimate ? It depends on the POV.
> > 
> > From the POV of QEMU, over-estimating means means it'll be not
> > be throttling as much as it should. That's not a downside of
> > migration - it makes it more likely for migration to complete :-)
> 
> Heh. :)
> 
> > 
> > From the POV of non-QEMU apps though, if QEMU over-estimates,
> > it'll mean other apps get starved of network bandwidth.
> > 
> > Overall I agree, there's no obvious way QEMU can ever come up
> > with a reliable estimate for bandwidth available.
> > 
> > > One way to fix this is when the user is fully aware of the available
> > > bandwidth, then we can allow the user to help providing an accurate value.
> > >
> > > For example, if the user has a dedicated channel of 10Gbps for migration
> > > for this specific VM, the user can specify this bandwidth so QEMU can
> > > always do the calculation based on this fact, trusting the user as long as
> > > specified.
> > 
> > I can see that in theory, but when considering a non-trivial
> > deployments of QEMU, I wonder if the user can really have any
> > such certainty of what is truely avaialble. It would need
> > global awareness of the whole network of hosts & workloads.
> 
> Indeed it may or may not be easy always.
> 
> The good thing about this parameter is we always use the old estimation if
> the user can't specify anything valid, so this is always optional not
> required.
> 
> It solves the cases where the user can still specify accurately on the bw -
> our QE team has already verified that it worked for us on GPU tests, where
> it used to not be able to migrate at all with any sane downtime specified.
> I should have attached a Tested-By from Zhiyi but since this is not exactly
> the patch he was using I didn't.
> 
> > 
> > > When the user wants to have migration only use 5Gbps out of that 10Gbps,
> > > one can set max-bandwidth to 5Gbps, along with available-bandwidth to 5Gbps
> > > so it'll never use over 5Gbps too (so the user can have the rest 5Gbps for
> > > other things).  So it can be useful even if the network is not dedicated,
> > > but as long as the user can know a solid value.
> > > 
> > > A new parameter "available-bandwidth" is introduced just for this. So when
> > > the user specified this parameter, instead of trusting the estimated value
> > > from QEMU itself (based on the QEMUFile send speed), let's trust the user
> > > more.
> > 
> > I feel like rather than "available-bandwidth", we should call
> > it "max-convergance-bandwidth".
> > 
> > To me that name would better reflect the fact that this isn't
> > really required to be a measure of how much NIC bandwidth is
> > available. It is merely an expression of a different bandwidth
> > limit to apply during switch over.
> > 
> > IOW
> > 
> > * max-bandwidth: limit during pre-copy main transfer
> > * max-convergance-bandwidth: limit during pre-copy switch-over
> > * max-postcopy-banwidth: limit during post-copy phase
> 
> I worry the new name suggested is not straightforward enough at the 1st
> glance, even to me as a developer.
> 
> "available-bandwidth" doesn't even bind that value to "convergence" at all,
> even though it was for solving this specific problem here. I wanted to make
> this parameter sololy for the admin to answer the question "how much
> bandwidth is available to QEMU migration in general?"  That's pretty much
> straightforward IMHO.  With that, it's pretty sane to consider using all we
> have during switchover (aka, unlimited bandwidth, as fast as possible).
> 
> Maybe at some point we can even leverage this information for other purpose
> rather than making the migration converge.

The flipside is that the semantics & limits we want for convergance
are already known to be different from what we wanted for pre-copy
and post-copy. With that existing practice, it is probably more
likely that we would not want to re-use the same setting for different
cases, which makes me think a specifically targeted parameter is
better.

With regards,
Daniel
Markus Armbruster July 25, 2023, 11:10 a.m. UTC | #5
Peter Xu <peterx@redhat.com> writes:

> Migration bandwidth is a very important value to live migration.  It's
> because it's one of the major factors that we'll make decision on when to
> switchover to destination in a precopy process.
>
> This value is currently estimated by QEMU during the whole live migration
> process by monitoring how fast we were sending the data.  This can be the
> most accurate bandwidth if in the ideal world, where we're always feeding
> unlimited data to the migration channel, and then it'll be limited to the
> bandwidth that is available.
>
> However in reality it may be very different, e.g., over a 10Gbps network we
> can see query-migrate showing migration bandwidth of only a few tens of
> MB/s just because there are plenty of other things the migration thread
> might be doing.  For example, the migration thread can be busy scanning
> zero pages, or it can be fetching dirty bitmap from other external dirty
> sources (like vhost or KVM).  It means we may not be pushing data as much
> as possible to migration channel, so the bandwidth estimated from "how many
> data we sent in the channel" can be dramatically inaccurate sometimes,
> e.g., that a few tens of MB/s even if 10Gbps available, and then the
> decision to switchover will be further affected by this.
>
> The migration may not even converge at all with the downtime specified,
> with that wrong estimation of bandwidth.
>
> The issue is QEMU itself may not be able to avoid those uncertainties on
> measuing the real "available migration bandwidth".  At least not something
> I can think of so far.
>
> One way to fix this is when the user is fully aware of the available
> bandwidth, then we can allow the user to help providing an accurate value.
>
> For example, if the user has a dedicated channel of 10Gbps for migration
> for this specific VM, the user can specify this bandwidth so QEMU can
> always do the calculation based on this fact, trusting the user as long as
> specified.
>
> When the user wants to have migration only use 5Gbps out of that 10Gbps,
> one can set max-bandwidth to 5Gbps, along with available-bandwidth to 5Gbps
> so it'll never use over 5Gbps too (so the user can have the rest 5Gbps for
> other things).  So it can be useful even if the network is not dedicated,
> but as long as the user can know a solid value.
>
> A new parameter "available-bandwidth" is introduced just for this. So when
> the user specified this parameter, instead of trusting the estimated value
> from QEMU itself (based on the QEMUFile send speed), let's trust the user
> more.
>
> This can resolve issues like "unconvergence migration" which is caused by
> hilarious low "migration bandwidth" detected for whatever reason.
>
> Reported-by: Zhiyi Guo <zhguo@redhat.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  qapi/migration.json            | 20 +++++++++++++++++++-
>  migration/migration.h          |  2 +-
>  migration/options.h            |  1 +
>  migration/migration-hmp-cmds.c | 14 ++++++++++++++
>  migration/migration.c          | 19 +++++++++++++++----
>  migration/options.c            | 28 ++++++++++++++++++++++++++++
>  migration/trace-events         |  2 +-
>  7 files changed, 79 insertions(+), 7 deletions(-)
>
> diff --git a/qapi/migration.json b/qapi/migration.json
> index 47dfef0278..fdc269e0a1 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -730,6 +730,16 @@
>  # @max-bandwidth: to set maximum speed for migration.  maximum speed
>  #     in bytes per second.  (Since 2.8)
>  #
> +# @available-bandwidth: to set available bandwidth for migration.  By
> +#     default, this value is zero, means the user is not aware of the
> +#     available bandwidth that can be used by QEMU migration, so QEMU will
> +#     estimate the bandwidth automatically.  This can be set when the
> +#     estimated value is not accurate, while the user is able to guarantee
> +#     such bandwidth is available for migration purpose during the
> +#     migration procedure.  When specified correctly, this can make the
> +#     switchover decision much more accurate, which will also be based on
> +#     the max downtime specified.  (Since 8.2)

Humor me: break lines slightly earlier, like

   # @available-bandwidth: to set available bandwidth for migration.  By
   #     default, this value is zero, means the user is not aware of the
   #     available bandwidth that can be used by QEMU migration, so QEMU
   #     will estimate the bandwidth automatically.  This can be set when
   #     the estimated value is not accurate, while the user is able to
   #     guarantee such bandwidth is available for migration purpose
   #     during the migration procedure.  When specified correctly, this
   #     can make the switchover decision much more accurate, which will
   #     also be based on the max downtime specified.  (Since 8.2)

> +#
>  # @downtime-limit: set maximum tolerated downtime for migration.
>  #     maximum downtime in milliseconds (Since 2.8)
>  #
> @@ -803,7 +813,7 @@
>             'cpu-throttle-initial', 'cpu-throttle-increment',
>             'cpu-throttle-tailslow',
>             'tls-creds', 'tls-hostname', 'tls-authz', 'max-bandwidth',
> -           'downtime-limit',
> +           'available-bandwidth', 'downtime-limit',
>             { 'name': 'x-checkpoint-delay', 'features': [ 'unstable' ] },
>             'block-incremental',
>             'multifd-channels',
> @@ -886,6 +896,9 @@
>  # @max-bandwidth: to set maximum speed for migration.  maximum speed
>  #     in bytes per second.  (Since 2.8)
>  #
> +# @available-bandwidth: to set available migration bandwidth.  Please refer
> +#     to comments in MigrationParameter for more information. (Since 8.2)

For better or worse, we duplicate the full documentation between
MigrationParameter, MigrateSetParameters, and MigrationParameters.  This
would be the first instance where we use a reference instead.  I'm not
opposed to using references, but if we do, I want them used consistently.


> +#
>  # @downtime-limit: set maximum tolerated downtime for migration.
>  #     maximum downtime in milliseconds (Since 2.8)
>  #
> @@ -971,6 +984,7 @@
>              '*tls-hostname': 'StrOrNull',
>              '*tls-authz': 'StrOrNull',
>              '*max-bandwidth': 'size',
> +            '*available-bandwidth': 'size',
>              '*downtime-limit': 'uint64',
>              '*x-checkpoint-delay': { 'type': 'uint32',
>                                       'features': [ 'unstable' ] },
> @@ -1078,6 +1092,9 @@
>  # @max-bandwidth: to set maximum speed for migration.  maximum speed
>  #     in bytes per second.  (Since 2.8)
>  #
> +# @available-bandwidth: to set available migration bandwidth.  Please refer
> +#     to comments in MigrationParameter for more information. (Since 8.2)
> +#
>  # @downtime-limit: set maximum tolerated downtime for migration.
>  #     maximum downtime in milliseconds (Since 2.8)
>  #
> @@ -1160,6 +1177,7 @@
>              '*tls-hostname': 'str',
>              '*tls-authz': 'str',
>              '*max-bandwidth': 'size',
> +            '*available-bandwidth': 'size',
>              '*downtime-limit': 'uint64',
>              '*x-checkpoint-delay': { 'type': 'uint32',
>                                       'features': [ 'unstable' ] },
> diff --git a/migration/migration.h b/migration/migration.h
> index b7c8b67542..fadbf64d9d 100644
> --- a/migration/migration.h
> +++ b/migration/migration.h
> @@ -283,7 +283,7 @@ struct MigrationState {
>      /*
>       * The final stage happens when the remaining data is smaller than
>       * this threshold; it's calculated from the requested downtime and
> -     * measured bandwidth
> +     * measured bandwidth, or available-bandwidth if user specified.

Suggest to scratch "user".

>       */
>      int64_t threshold_size;
>  

[...]
Peter Xu July 25, 2023, 3:54 p.m. UTC | #6
On Tue, Jul 25, 2023 at 10:16:52AM +0100, Daniel P. Berrangé wrote:
> On Mon, Jul 24, 2023 at 03:47:50PM -0400, Peter Xu wrote:
> > On Mon, Jul 24, 2023 at 07:04:29PM +0100, Daniel P. Berrangé wrote:
> > > On Mon, Jul 24, 2023 at 01:07:55PM -0400, Peter Xu wrote:
> > > > Migration bandwidth is a very important value to live migration.  It's
> > > > because it's one of the major factors that we'll make decision on when to
> > > > switchover to destination in a precopy process.
> > > 
> > > To elaborate on this for those reading along...
> > > 
> > > QEMU takes maxmimum downtime limit and multiplies by its estimate
> > > of bandwidth. This gives a figure for the amount of data QEMU thinks
> > > it can transfer within the downtime period.
> > > 
> > > QEMU compares this figure to the amount of data that is still pending
> > > at the end of an iteration.
> > > 
> > > > This value is currently estimated by QEMU during the whole live migration
> > > > process by monitoring how fast we were sending the data.  This can be the
> > > > most accurate bandwidth if in the ideal world, where we're always feeding
> > > > unlimited data to the migration channel, and then it'll be limited to the
> > > > bandwidth that is available.
> > > 
> > > The QEMU estimate for available bandwidth will definitely be wrong,
> > > potentially by orders of magnitude, if QEMU has a max bandwidth limit
> > > set, as in that case it is never trying to push the peak rates available
> > > from the NICs/network fabric.
> > > 
> > > > The issue is QEMU itself may not be able to avoid those uncertainties on
> > > > measuing the real "available migration bandwidth".  At least not something
> > > > I can think of so far.
> > > 
> > > IIUC, you can query the NIC properties to find the hardware transfer
> > > rate of the NICs. That doesn't imply apps can actually reach that
> > > rate in practice - it has a decent chance of being a over-estimate
> > > of bandwidth, possibly very very much over.
> > > 
> > > Is such an over estimate better or worse than QEMU's current
> > > under-estimate ? It depends on the POV.
> > > 
> > > From the POV of QEMU, over-estimating means means it'll be not
> > > be throttling as much as it should. That's not a downside of
> > > migration - it makes it more likely for migration to complete :-)
> > 
> > Heh. :)
> > 
> > > 
> > > From the POV of non-QEMU apps though, if QEMU over-estimates,
> > > it'll mean other apps get starved of network bandwidth.
> > > 
> > > Overall I agree, there's no obvious way QEMU can ever come up
> > > with a reliable estimate for bandwidth available.
> > > 
> > > > One way to fix this is when the user is fully aware of the available
> > > > bandwidth, then we can allow the user to help providing an accurate value.
> > > >
> > > > For example, if the user has a dedicated channel of 10Gbps for migration
> > > > for this specific VM, the user can specify this bandwidth so QEMU can
> > > > always do the calculation based on this fact, trusting the user as long as
> > > > specified.
> > > 
> > > I can see that in theory, but when considering a non-trivial
> > > deployments of QEMU, I wonder if the user can really have any
> > > such certainty of what is truely avaialble. It would need
> > > global awareness of the whole network of hosts & workloads.
> > 
> > Indeed it may or may not be easy always.
> > 
> > The good thing about this parameter is we always use the old estimation if
> > the user can't specify anything valid, so this is always optional not
> > required.
> > 
> > It solves the cases where the user can still specify accurately on the bw -
> > our QE team has already verified that it worked for us on GPU tests, where
> > it used to not be able to migrate at all with any sane downtime specified.
> > I should have attached a Tested-By from Zhiyi but since this is not exactly
> > the patch he was using I didn't.
> > 
> > > 
> > > > When the user wants to have migration only use 5Gbps out of that 10Gbps,
> > > > one can set max-bandwidth to 5Gbps, along with available-bandwidth to 5Gbps
> > > > so it'll never use over 5Gbps too (so the user can have the rest 5Gbps for
> > > > other things).  So it can be useful even if the network is not dedicated,
> > > > but as long as the user can know a solid value.
> > > > 
> > > > A new parameter "available-bandwidth" is introduced just for this. So when
> > > > the user specified this parameter, instead of trusting the estimated value
> > > > from QEMU itself (based on the QEMUFile send speed), let's trust the user
> > > > more.
> > > 
> > > I feel like rather than "available-bandwidth", we should call
> > > it "max-convergance-bandwidth".
> > > 
> > > To me that name would better reflect the fact that this isn't
> > > really required to be a measure of how much NIC bandwidth is
> > > available. It is merely an expression of a different bandwidth
> > > limit to apply during switch over.
> > > 
> > > IOW
> > > 
> > > * max-bandwidth: limit during pre-copy main transfer
> > > * max-convergance-bandwidth: limit during pre-copy switch-over
> > > * max-postcopy-banwidth: limit during post-copy phase
> > 
> > I worry the new name suggested is not straightforward enough at the 1st
> > glance, even to me as a developer.
> > 
> > "available-bandwidth" doesn't even bind that value to "convergence" at all,
> > even though it was for solving this specific problem here. I wanted to make
> > this parameter sololy for the admin to answer the question "how much
> > bandwidth is available to QEMU migration in general?"  That's pretty much
> > straightforward IMHO.  With that, it's pretty sane to consider using all we
> > have during switchover (aka, unlimited bandwidth, as fast as possible).
> > 
> > Maybe at some point we can even leverage this information for other purpose
> > rather than making the migration converge.
> 
> The flipside is that the semantics & limits we want for convergance
> are already known to be different from what we wanted for pre-copy
> and post-copy. With that existing practice, it is probably more
> likely that we would not want to re-use the same setting for different
> cases, which makes me think a specifically targetted parameter is
> better.

We can make the semantics specific, no strong opinion here.  I wished it
could be as generic / easy as possible, but maybe I went too far.

Though, is there anything else we can choose from besides
"max-convergence-bandwidth"? Or am I the only one who thinks it's hard to
understand when putting "max" and "convergence" together?

When I take one step back to look at the whole set of "bandwidth"
parameters, I am not sure why we'd even need the "convergence" and
"postcopy" bandwidths to be separate.  With my current understanding of
migration, we may actually need:

  - One bandwidth at which we may want to run the background migration,
    aka, precopy migration, where we don't rush on pushing data.

  - One bandwidth that is whatever maximum we can have; for a dedicated
    NIC that's the line speed.  We should always use this full speed for
    important things.  I'd say postcopy falls into this, and this
    "convergence" calculation should also rely on this.

So another way to do this is to leverage the existing "postcopy-bandwidth"
for the calculation when it is set; it may help us shrink the bandwidth
values down to two, but I'm not sure whether the name would be confusing
too.
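
A purely illustrative sketch of that idea against
migration_update_counters() in this patch (not what the patch currently
does):

    /*
     * Hypothetical: reuse the postcopy bandwidth, when set, as the
     * switchover bandwidth instead of adding a new parameter.
     */
    uint64_t switchover_bw = migrate_max_postcopy_bandwidth();

    avail_bw = switchover_bw ? switchover_bw : bandwidth;
    s->threshold_size = avail_bw * migrate_downtime_limit();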

Thanks,
Daniel P. Berrangé July 25, 2023, 4:09 p.m. UTC | #7
On Tue, Jul 25, 2023 at 11:54:52AM -0400, Peter Xu wrote:
> We can make the semantics specific, no strong opinion here.  I wished it
> can be as generic / easy as possible but maybe I went too far.
> 
> Though, is there anything else we can choose from besides
> "max-convergence-bandwidth"? Or am I the only one that thinks it's hard to
> understand when put "max" and "convergence" together?
> 
> When I take one step back to look at the whole "bandwidth" parameters, I am
> not sure why we'd even need both "convergence" and "postcopy" bandwidth
> being separate.  With my current understanding of migration, we may
> actually need:
> 
>   - One bandwidth that we may want to run the background migration, aka,
>     precopy migration, where we don't rush on pushing data.
> 
>   - One bandwidth that is whatever we can have maximum; for dedicated NIC
>     that's the line speed.  We should always use this full speed for
>     important things.  I'd say postcopy falls into this, and this
>     "convergence" calculation should also rely on this.

I don't think postcopy should be assumed to run at line speed.

At the point where you flip to post-copy mode, there could
conceivably be GBs' worth of data still dirty and
pending transfer.

The migration convergance step is reasonable to put at line
speed, because the max downtime parameter caps how long this
burst will be, generally to some fraction of a second (e.g. a
300ms downtime limit at a 10Gbps line rate caps the burst at
roughly 375MB).

Once in post-copy mode, while the remaining data to transfer
is finite, the wall clock time to complete that transfer may
still be huge. It is unreasonable to assume users want to
run at max line speed for many minutes to finish post-copy,
at least in terms of the background transfer. You could make
a case for the page fault handling to run at a higher bandwidth
cap than the background transfer, but I think it is still probably
not reasonable to run page fault fetches at line speed by default.

IOW, I don't think we can put the same bandwidth limit on the
short convergance operation as on the longer post-copy operation.

With regards,
Daniel
Peter Xu July 25, 2023, 4:38 p.m. UTC | #8
On Tue, Jul 25, 2023 at 05:09:57PM +0100, Daniel P. Berrangé wrote:
> On Tue, Jul 25, 2023 at 11:54:52AM -0400, Peter Xu wrote:
> > We can make the semantics specific, no strong opinion here.  I wished it
> > can be as generic / easy as possible but maybe I went too far.
> > 
> > Though, is there anything else we can choose from besides
> > "max-convergence-bandwidth"? Or am I the only one that thinks it's hard to
> > understand when put "max" and "convergence" together?
> > 
> > When I take one step back to look at the whole "bandwidth" parameters, I am
> > not sure why we'd even need both "convergence" and "postcopy" bandwidth
> > being separate.  With my current understanding of migration, we may
> > actually need:
> > 
> >   - One bandwidth that we may want to run the background migration, aka,
> >     precopy migration, where we don't rush on pushing data.
> > 
> >   - One bandwidth that is whatever we can have maximum; for dedicated NIC
> >     that's the line speed.  We should always use this full speed for
> >     important things.  I'd say postcopy falls into this, and this
> >     "convergence" calculation should also rely on this.
> 
> I don't think postcopy should be assumed to run at line speed.
> 
> At the point where you flip to post-copy mode, there could
> conceivably still be GB's worth of data still dirty and
> pending transfer.
> 
> The migration convergance step is reasonable to put at line
> speed, because the max downtime parameter caps how long this
> burst will be, genrally to some fraction of a second.
> 
> Once in post-copy mode, while the remaining data to transfer
> is finite, the wall clock time to complete that transfer may
> still be huge. It is unreasonable to assume users want to
> run at max linespeed for many minutes to finish post-copy
> at least in terms of the background transfer. You could make
> a  case for the page fault handling to run at a higher bandwidth
> cap than the background transfer, but I think it is still probably
> not reasonable to run page fault fetches at line speed by default.
> 
> IOW, I don't think we can put the same bandwidth limit on the
> short convergance operation, as on the longer post-copy operation.

Postcopy still heavily affects the performance of the VM for its whole
duration, and AFAIU that's so far the major reason (after we fixed
postcopy interruptions with the recovery capability) why postcopy may not
be wanted in many cases.

If I were the admin, I'd want it to run at full speed even for pages that
were not directly requested, just to shrink the duration of postcopy; I'd
just want to make sure requested pages are queued sooner.

But that's okay if any of us still thinks that three values would be
helpful here, because we can simply give the latter two the same value
when we want.  Three is a superset of two anyway.

I see you used "convergance" explicitly even after PeterM's reply; is that
what you prefer over "convergence"?  I do see more occurrences of
"convergence" as a word in the migration context, though.  Is there any
better name you can come up with, before I just go with
"max-convergence-bandwidth"?  (I really cannot come up with anything
better than this or available-bandwidth for now.)

Thanks,
Peter Xu July 25, 2023, 4:42 p.m. UTC | #9
Hi, Markus,

On Tue, Jul 25, 2023 at 01:10:01PM +0200, Markus Armbruster wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > Migration bandwidth is a very important value to live migration.  It's
> > because it's one of the major factors that we'll make decision on when to
> > switchover to destination in a precopy process.
> >
> > This value is currently estimated by QEMU during the whole live migration
> > process by monitoring how fast we were sending the data.  This can be the
> > most accurate bandwidth if in the ideal world, where we're always feeding
> > unlimited data to the migration channel, and then it'll be limited to the
> > bandwidth that is available.
> >
> > However in reality it may be very different, e.g., over a 10Gbps network we
> > can see query-migrate showing migration bandwidth of only a few tens of
> > MB/s just because there are plenty of other things the migration thread
> > might be doing.  For example, the migration thread can be busy scanning
> > zero pages, or it can be fetching dirty bitmap from other external dirty
> > sources (like vhost or KVM).  It means we may not be pushing data as much
> > as possible to migration channel, so the bandwidth estimated from "how many
> > data we sent in the channel" can be dramatically inaccurate sometimes,
> > e.g., that a few tens of MB/s even if 10Gbps available, and then the
> > decision to switchover will be further affected by this.
> >
> > The migration may not even converge at all with the downtime specified,
> > with that wrong estimation of bandwidth.
> >
> > The issue is QEMU itself may not be able to avoid those uncertainties on
> > measuing the real "available migration bandwidth".  At least not something
> > I can think of so far.
> >
> > One way to fix this is when the user is fully aware of the available
> > bandwidth, then we can allow the user to help providing an accurate value.
> >
> > For example, if the user has a dedicated channel of 10Gbps for migration
> > for this specific VM, the user can specify this bandwidth so QEMU can
> > always do the calculation based on this fact, trusting the user as long as
> > specified.
> >
> > When the user wants to have migration only use 5Gbps out of that 10Gbps,
> > one can set max-bandwidth to 5Gbps, along with available-bandwidth to 5Gbps
> > so it'll never use over 5Gbps too (so the user can have the rest 5Gbps for
> > other things).  So it can be useful even if the network is not dedicated,
> > but as long as the user can know a solid value.
> >
> > A new parameter "available-bandwidth" is introduced just for this. So when
> > the user specified this parameter, instead of trusting the estimated value
> > from QEMU itself (based on the QEMUFile send speed), let's trust the user
> > more.
> >
> > This can resolve issues like "unconvergence migration" which is caused by
> > hilarious low "migration bandwidth" detected for whatever reason.
> >
> > Reported-by: Zhiyi Guo <zhguo@redhat.com>
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> >  qapi/migration.json            | 20 +++++++++++++++++++-
> >  migration/migration.h          |  2 +-
> >  migration/options.h            |  1 +
> >  migration/migration-hmp-cmds.c | 14 ++++++++++++++
> >  migration/migration.c          | 19 +++++++++++++++----
> >  migration/options.c            | 28 ++++++++++++++++++++++++++++
> >  migration/trace-events         |  2 +-
> >  7 files changed, 79 insertions(+), 7 deletions(-)
> >
> > diff --git a/qapi/migration.json b/qapi/migration.json
> > index 47dfef0278..fdc269e0a1 100644
> > --- a/qapi/migration.json
> > +++ b/qapi/migration.json
> > @@ -730,6 +730,16 @@
> >  # @max-bandwidth: to set maximum speed for migration.  maximum speed
> >  #     in bytes per second.  (Since 2.8)
> >  #
> > +# @available-bandwidth: to set available bandwidth for migration.  By
> > +#     default, this value is zero, means the user is not aware of the
> > +#     available bandwidth that can be used by QEMU migration, so QEMU will
> > +#     estimate the bandwidth automatically.  This can be set when the
> > +#     estimated value is not accurate, while the user is able to guarantee
> > +#     such bandwidth is available for migration purpose during the
> > +#     migration procedure.  When specified correctly, this can make the
> > +#     switchover decision much more accurate, which will also be based on
> > +#     the max downtime specified.  (Since 8.2)
> 
> Humor me: break lines slightly earlier, like
> 
>    # @available-bandwidth: to set available bandwidth for migration.  By
>    #     default, this value is zero, means the user is not aware of the
>    #     available bandwidth that can be used by QEMU migration, so QEMU
>    #     will estimate the bandwidth automatically.  This can be set when
>    #     the estimated value is not accurate, while the user is able to
>    #     guarantee such bandwidth is available for migration purpose
>    #     during the migration procedure.  When specified correctly, this
>    #     can make the switchover decision much more accurate, which will
>    #     also be based on the max downtime specified.  (Since 8.2)

Sure.

> 
> > +#
> >  # @downtime-limit: set maximum tolerated downtime for migration.
> >  #     maximum downtime in milliseconds (Since 2.8)
> >  #
> > @@ -803,7 +813,7 @@
> >             'cpu-throttle-initial', 'cpu-throttle-increment',
> >             'cpu-throttle-tailslow',
> >             'tls-creds', 'tls-hostname', 'tls-authz', 'max-bandwidth',
> > -           'downtime-limit',
> > +           'available-bandwidth', 'downtime-limit',
> >             { 'name': 'x-checkpoint-delay', 'features': [ 'unstable' ] },
> >             'block-incremental',
> >             'multifd-channels',
> > @@ -886,6 +896,9 @@
> >  # @max-bandwidth: to set maximum speed for migration.  maximum speed
> >  #     in bytes per second.  (Since 2.8)
> >  #
> > +# @available-bandwidth: to set available migration bandwidth.  Please refer
> > +#     to comments in MigrationParameter for more information. (Since 8.2)
> 
> For better or worse, we duplicate full documentation between
> MigrationParameter, MigrateSetParameters, and MigrationParameters.  This
> would be the first instance where we reference instead.  I'm not opposed
> to use references, but if we do, I want them used consistently.

We discussed this over the other "switchover" parameter, but that patchset
just stalled..

Perhaps I could just provide a pre-requisite patch to remove all the
comments in MigrateSetParameters and MigrationParameters, letting them all
point to MigrationParameter?

One thing I should have mentioned much earlier is that this patch is
8.2 material.

> 
> > +#
> >  # @downtime-limit: set maximum tolerated downtime for migration.
> >  #     maximum downtime in milliseconds (Since 2.8)
> >  #
> > @@ -971,6 +984,7 @@
> >              '*tls-hostname': 'StrOrNull',
> >              '*tls-authz': 'StrOrNull',
> >              '*max-bandwidth': 'size',
> > +            '*available-bandwidth': 'size',
> >              '*downtime-limit': 'uint64',
> >              '*x-checkpoint-delay': { 'type': 'uint32',
> >                                       'features': [ 'unstable' ] },
> > @@ -1078,6 +1092,9 @@
> >  # @max-bandwidth: to set maximum speed for migration.  maximum speed
> >  #     in bytes per second.  (Since 2.8)
> >  #
> > +# @available-bandwidth: to set available migration bandwidth.  Please refer
> > +#     to comments in MigrationParameter for more information. (Since 8.2)
> > +#
> >  # @downtime-limit: set maximum tolerated downtime for migration.
> >  #     maximum downtime in milliseconds (Since 2.8)
> >  #
> > @@ -1160,6 +1177,7 @@
> >              '*tls-hostname': 'str',
> >              '*tls-authz': 'str',
> >              '*max-bandwidth': 'size',
> > +            '*available-bandwidth': 'size',
> >              '*downtime-limit': 'uint64',
> >              '*x-checkpoint-delay': { 'type': 'uint32',
> >                                       'features': [ 'unstable' ] },
> > diff --git a/migration/migration.h b/migration/migration.h
> > index b7c8b67542..fadbf64d9d 100644
> > --- a/migration/migration.h
> > +++ b/migration/migration.h
> > @@ -283,7 +283,7 @@ struct MigrationState {
> >      /*
> >       * The final stage happens when the remaining data is smaller than
> >       * this threshold; it's calculated from the requested downtime and
> > -     * measured bandwidth
> > +     * measured bandwidth, or available-bandwidth if user specified.
> 
> Suggest to scratch "user".

Will do, thanks.
Daniel P. Berrangé July 25, 2023, 5:10 p.m. UTC | #10
On Tue, Jul 25, 2023 at 12:38:23PM -0400, Peter Xu wrote:
> I see you used "convergance" explicitly even after PeterM's reply, is that
> what you prefer over "convergence"?  I do see more occurances of
> "convergence" as a word in migration context, though.

Ignore my speling erors :-)

>                                                       Any better name you
> can come up with, before I just go with "max-convergence-bandwidth" (I
> really cannot come up with anything better than this or available-bandwidth
> for now)?

Another idea could be 'max-switchover-bandwidth'?


With regards,
Daniel
Markus Armbruster July 26, 2023, 6:21 a.m. UTC | #11
Peter Xu <peterx@redhat.com> writes:

> Hi, Markus,
>
> On Tue, Jul 25, 2023 at 01:10:01PM +0200, Markus Armbruster wrote:

[...]

>> For better or worse, we duplicate full documentation between
>> MigrationParameter, MigrateSetParameters, and MigrationParameters.  This
>> would be the first instance where we reference instead.  I'm not opposed
>> to use references, but if we do, I want them used consistently.
>
> We discussed this over the other "switchover" parameter, but that patchset
> just stranded..
>
> Perhaps I just provide a pre-requisite patch to remove all the comments in
> MigrateSetParameters and MigrationParameters, letting them all point to
> MigrationParameter?

Simplifies maintaining the doc comments.  But how does it affect the
documentation generated from it?  Better, neutral, or worse?

> One thing I should have mentioned much earlier is that this patch is for
> 8.2 material.

Understood.

[...]
Peter Xu July 26, 2023, 3:12 p.m. UTC | #12
On Wed, Jul 26, 2023 at 08:21:35AM +0200, Markus Armbruster wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > Hi, Markus,
> >
> > On Tue, Jul 25, 2023 at 01:10:01PM +0200, Markus Armbruster wrote:
> 
> [...]
> 
> >> For better or worse, we duplicate full documentation between
> >> MigrationParameter, MigrateSetParameters, and MigrationParameters.  This
> >> would be the first instance where we reference instead.  I'm not opposed
> >> to use references, but if we do, I want them used consistently.
> >
> > We discussed this over the other "switchover" parameter, but that patchset
> > just stranded..
> >
> > Perhaps I just provide a pre-requisite patch to remove all the comments in
> > MigrateSetParameters and MigrationParameters, letting them all point to
> > MigrationParameter?
> 
> Simplifies maintaining the doc commments.  But how does it affect the
> documentation generated from it?  Better, neutral, or worse?

Probably somewhere around neutral.  There are definite benefits: a shorter
man/html page in total, and avoiding accidentally different docs for the
same fields.  E.g., we sometimes have different wordings for different
objects:

       max-cpu-throttle
              maximum cpu throttle percentage.  Defaults to 99.  (Since 3.1)

       max-cpu-throttle: int (optional)
              maximum cpu throttle percentage.  The default value is 99. (Since 3.1)

This one is fine, but it's just very easy to let in something that reads
differently.  It's good to provide coherent documentation for the same
fields over all three objects.

When looking at qemu-qmp-ref.7, it could read like this, where we invite
the reader to read the other section (assuming we only keep the
documentation in MigrationParameter):

   MigrationParameters (Object)

       The object structure to represent a list of migration parameters.
       The optional members aren't actually optional.  For a detailed
       explanation of each field, please refer to the documentation of
       MigrationParameter.

But the problem is we will always generate the Members entry, which will
now be all filled up with "Not documented"..

   Members
       announce-initial: int (optional)
              Not documented

       announce-max: int (optional)
              Not documented

       announce-rounds: int (optional)
              Not documented

       [...]

I think maybe it's better if we just list the members without showing "Not
documented" every time for the other two objects.  Not sure whether there's
an easy way to fix that, or whether it's a major issue.

For developers, deduplicating the comments should always be a win, afaict.

Thanks,
Peter Xu July 26, 2023, 3:19 p.m. UTC | #13
On Tue, Jul 25, 2023 at 06:10:18PM +0100, Daniel P. Berrangé wrote:
> On Tue, Jul 25, 2023 at 12:38:23PM -0400, Peter Xu wrote:
> > I see you used "convergance" explicitly even after PeterM's reply, is that
> > what you prefer over "convergence"?  I do see more occurances of
> > "convergence" as a word in migration context, though.
> 
> Ignore my speling erors :-)

Ohh, so that's not intended. :) Actually "convergance" looked to me like
it could be a real word that applies here, but since I'm not a native
speaker I really can't tell for sure even with a dictionary.

> 
> >                                                       Any better name you
> > can come up with, before I just go with "max-convergence-bandwidth" (I
> > really cannot come up with anything better than this or available-bandwidth
> > for now)?
> 
> Anothre idea could be 'max-switchover-bandwidth'  ?

Yeah, switchover can also be called a phase of migration, so in parallel
with the existing precopy/postcopy it sounds like a good one.  I'll respin
with that if there are no further suggestions.  Thanks.
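
PS: to illustrate with the current naming (the respin may rename the
parameter as discussed above), the commit message scenario of letting
migration use 5Gbps out of a 10Gbps link would be, over QMP:

    { "execute": "migrate-set-parameters",
      "arguments": { "max-bandwidth": 625000000,
                     "available-bandwidth": 625000000 } }

With e.g. a 300ms downtime-limit, the switchover threshold would then be
625000000 bytes/sec * 0.3 sec, i.e. roughly 187MB of pending data.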
Markus Armbruster Aug. 4, 2023, 12:06 p.m. UTC | #14
Peter Xu <peterx@redhat.com> writes:

> On Wed, Jul 26, 2023 at 08:21:35AM +0200, Markus Armbruster wrote:
>> Peter Xu <peterx@redhat.com> writes:
>> 
>> > Hi, Markus,
>> >
>> > On Tue, Jul 25, 2023 at 01:10:01PM +0200, Markus Armbruster wrote:
>> 
>> [...]
>> 
>> >> For better or worse, we duplicate full documentation between
>> >> MigrationParameter, MigrateSetParameters, and MigrationParameters.  This
>> >> would be the first instance where we reference instead.  I'm not opposed
>> >> to use references, but if we do, I want them used consistently.
>> >
>> > We discussed this over the other "switchover" parameter, but that patchset
>> > just stranded..
>> >
>> > Perhaps I just provide a pre-requisite patch to remove all the comments in
>> > MigrateSetParameters and MigrationParameters, letting them all point to
>> > MigrationParameter?
>> 
>> Simplifies maintaining the doc commments.  But how does it affect the
>> documentation generated from it?  Better, neutral, or worse?
>
> Probably somewhere neutral.  There are definitely benefits, shorter
> man/html page in total, and avoid accidentally different docs over the same
> fields.  E.g., we sometimes have different wordings for different objects:
>
>        max-cpu-throttle
>               maximum cpu throttle percentage.  Defaults to 99.  (Since 3.1)
>
>        max-cpu-throttle: int (optional)
>               maximum cpu throttle percentage.  The default value is 99. (Since 3.1)
>
> This one is fine, but it's just very easy to leak in something that shows
> differently.  It's good to provide coherent documentation for the same
> fields over all three objects.

Yes, but we've been doing okay regardless.

The drawback of replacing duplicates by references is that readers need
to follow the references.

Less onerous when the references can be clicked.

If we de-duplicate, which copy do we keep: MigrationParameter,
MigrateSetParameters, or MigrationParameters?  Do we have an idea which
of these users are likely to read first?

> When looking at qemu-qmp-ref.7, it can be like this when we can invite the
> reader to read the other section (assuming we only keep MigrationParameter
> to keep the documentations):
>
>    MigrationParameters (Object)
>
>        The object structure to represent a list of migration parameters.
>        The optional members aren't actually optional.  For detailed
>        explanation for each of the field, please refer to the documentation
>        of MigrationParameter.
>
> But the problem is we always will generate the Members entry, where now
> it'll all filled up with "Not documented"..
>
>    Members
>        announce-initial: int (optional)
>               Not documented
>
>        announce-max: int (optional)
>               Not documented
>
>        announce-rounds: int (optional)
>               Not documented
>
>        [...]
>
> I think maybe it's better we just list the members without showing "Not
> documented" every time for the other two objects.  Not sure whether it's an
> easy way to fix it, or is it a major issue.

The automatic generation of "Not documented" documentation is a
stop-gap.  Leaving a member undocumented should be a hard error.  It
isn't one only because we have 511 instances to fix.

Perhaps a directive to ignore undocumented members could be useful,
i.e. to suppress the automatic "Not documented" documentation now, and
the error later.
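
No such directive exists today; if we were to add one, it might look
something like this (purely hypothetical syntax):

    { 'pragma': { 'documentation-exceptions': [
          'MigrateSetParameters', 'MigrationParameters' ] } }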

We could write things out in longhand instead, like

    # @announce-initial: Same as MigrationParameter member
    #     @announce-initial.

> For developers, dedup the comment should always be a win, afaict.

No argument.
Peter Xu Aug. 4, 2023, 1:28 p.m. UTC | #15
On Fri, Aug 04, 2023 at 02:06:02PM +0200, Markus Armbruster wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Wed, Jul 26, 2023 at 08:21:35AM +0200, Markus Armbruster wrote:
> >> Peter Xu <peterx@redhat.com> writes:
> >> 
> >> > Hi, Markus,
> >> >
> >> > On Tue, Jul 25, 2023 at 01:10:01PM +0200, Markus Armbruster wrote:
> >> 
> >> [...]
> >> 
> >> >> For better or worse, we duplicate full documentation between
> >> >> MigrationParameter, MigrateSetParameters, and MigrationParameters.  This
> >> >> would be the first instance where we reference instead.  I'm not opposed
> >> >> to use references, but if we do, I want them used consistently.
> >> >
> >> > We discussed this over the other "switchover" parameter, but that patchset
> >> > just stranded..
> >> >
> >> > Perhaps I just provide a pre-requisite patch to remove all the comments in
> >> > MigrateSetParameters and MigrationParameters, letting them all point to
> >> > MigrationParameter?
> >> 
> >> Simplifies maintaining the doc commments.  But how does it affect the
> >> documentation generated from it?  Better, neutral, or worse?
> >
> > Probably somewhere neutral.  There are definitely benefits, shorter
> > man/html page in total, and avoid accidentally different docs over the same
> > fields.  E.g., we sometimes have different wordings for different objects:
> >
> >        max-cpu-throttle
> >               maximum cpu throttle percentage.  Defaults to 99.  (Since 3.1)
> >
> >        max-cpu-throttle: int (optional)
> >               maximum cpu throttle percentage.  The default value is 99. (Since 3.1)
> >
> > This one is fine, but it's just very easy to leak in something that shows
> > differently.  It's good to provide coherent documentation for the same
> > fields over all three objects.
> 
> Yes, but we've been doing okay regardless.
> 
> The drawback of replacing duplicates by references is that readers need
> to follow the references.
> 
> Less onerous when the references can be clicked.
> 
> If we de-duplicate, which copy do we keep, MigrationParameter,
> MigrateSetParameters, or MigrationParameter?  Do we have an idea which
> of these users are likely to read first?

I chose MigrationParameter for no particular reason, because I can't find
a good argument to differentiate them.  Please let me know if you have any
suggestion.

> 
> > When looking at qemu-qmp-ref.7, it can be like this when we can invite the
> > reader to read the other section (assuming we only keep MigrationParameter
> > to keep the documentations):
> >
> >    MigrationParameters (Object)
> >
> >        The object structure to represent a list of migration parameters.
> >        The optional members aren't actually optional.  For detailed
> >        explanation for each of the field, please refer to the documentation
> >        of MigrationParameter.
> >
> > But the problem is we always will generate the Members entry, where now
> > it'll all filled up with "Not documented"..
> >
> >    Members
> >        announce-initial: int (optional)
> >               Not documented
> >
> >        announce-max: int (optional)
> >               Not documented
> >
> >        announce-rounds: int (optional)
> >               Not documented
> >
> >        [...]
> >
> > I think maybe it's better we just list the members without showing "Not
> > documented" every time for the other two objects.  Not sure whether it's an
> > easy way to fix it, or is it a major issue.
> 
> The automatic generation of "Not documented" documentation is a
> stop-gap.  Leaving a member undocumented should be a hard error.  It
> isn't only because we have 511 instances to fix.
> 
> Perhaps a directive to ignore undocumented members could be useful.
> I.e. to suppress the automatic "Not documented" documented now, the
> error later.
> 
> We could write things out in longhand instead, like
> 
>     # @announce-initial: Same as MigrationParameter member
>     #     @announce-initial.

Yes I can definitely do this.

Since I don't really know whether the "put a link" approach will work at
all (at least man pages don't really have those, do they?), would this be
the way you suggest we move forward?

Note that I am also always happy to simply duplicate the three paragraphs
just like before; that's not something I must do to solve the migration
problem, so we can essentially decouple the two problems.  But since
we're at it, if you think it worthwhile we may have a chance to get rid of
the duplicated documents here (before the code), and I can try.

> 
> > For developers, dedup the comment should always be a win, afaict.
> 
> No argument.

Let me explain a bit: I meant the patch author, who currently has to write
the duplicated documents and make sure everything matches, and the
reviewers, who have to read the duplicated content and make sure that
everything matches again.  Both efforts can be avoided.  That's all I meant
when I was referring to "developers" in this context..  Not everyone has
the same sense of "developer".

Thanks,
Daniel P. Berrangé Aug. 4, 2023, 1:39 p.m. UTC | #16
On Wed, Jul 26, 2023 at 11:12:31AM -0400, Peter Xu wrote:
> On Wed, Jul 26, 2023 at 08:21:35AM +0200, Markus Armbruster wrote:
> > Peter Xu <peterx@redhat.com> writes:
> > 
> > > Hi, Markus,
> > >
> > > On Tue, Jul 25, 2023 at 01:10:01PM +0200, Markus Armbruster wrote:
> > 
> > [...]
> > 
> > >> For better or worse, we duplicate full documentation between
> > >> MigrationParameter, MigrateSetParameters, and MigrationParameters.  This
> > >> would be the first instance where we reference instead.  I'm not opposed
> > >> to use references, but if we do, I want them used consistently.
> > >
> > > We discussed this over the other "switchover" parameter, but that patchset
> > > just stranded..
> > >
> > > Perhaps I just provide a pre-requisite patch to remove all the comments in
> > > MigrateSetParameters and MigrationParameters, letting them all point to
> > > MigrationParameter?
> > 
> > Simplifies maintaining the doc commments.  But how does it affect the
> > documentation generated from it?  Better, neutral, or worse?
> 
> Probably somewhere neutral.  There are definitely benefits, shorter
> man/html page in total, and avoid accidentally different docs over the same
> fields.  E.g., we sometimes have different wordings for different objects:
> 
>        max-cpu-throttle
>               maximum cpu throttle percentage.  Defaults to 99.  (Since 3.1)
> 
>        max-cpu-throttle: int (optional)
>               maximum cpu throttle percentage.  The default value is 99. (Since 3.1)
> 
> This one is fine, but it's just very easy to leak in something that shows
> differently.  It's good to provide coherent documentation for the same
> fields over all three objects.

Do we have any actual problems though where the difference in
docs is actively harmful?  I agree there's a possibility of the
duplication being problematic, but unless it's actually the
case in reality, it is merely a theoretical downside.

IMHO for someone consuming the docs, this patch is worse, not neutral.

> 
> When looking at qemu-qmp-ref.7, it can be like this when we can invite the
> reader to read the other section (assuming we only keep MigrationParameter
> to keep the documentations):
> 
>    MigrationParameters (Object)
> 
>        The object structure to represent a list of migration parameters.
>        The optional members aren't actually optional.  For detailed
>        explanation for each of the field, please refer to the documentation
>        of MigrationParameter.
> 
> But the problem is we always will generate the Members entry, where now
> it'll all filled up with "Not documented"..
> 
>    Members
>        announce-initial: int (optional)
>               Not documented
> 
>        announce-max: int (optional)
>               Not documented
> 
>        announce-rounds: int (optional)
>               Not documented
> 
>        [...]
> 
> I think maybe it's better we just list the members without showing "Not
> documented" every time for the other two objects.  Not sure whether it's an
> easy way to fix it, or is it a major issue.
> 
> For developers, dedup the comment should always be a win, afaict.

IMHO that's optimizing for the wrong people and isn't a measurable
win anyway. Someone adding a new parameter can just cut+paste the
same docs string in the three places. It is a cost, but it is a
small one-time cost, whereas having "not documented" is an ongoing
cost for consumers of our API.

I don't think the burden on QEMU maintainers adding new migration
parameters is significant enough to justify ripping out our existing
docs.

With regards,
Daniel
Peter Xu Aug. 4, 2023, 2:02 p.m. UTC | #17
On Fri, Aug 04, 2023 at 02:39:15PM +0100, Daniel P. Berrangé wrote:
> On Wed, Jul 26, 2023 at 11:12:31AM -0400, Peter Xu wrote:
> > On Wed, Jul 26, 2023 at 08:21:35AM +0200, Markus Armbruster wrote:
> > > Peter Xu <peterx@redhat.com> writes:
> > > 
> > > > Hi, Markus,
> > > >
> > > > On Tue, Jul 25, 2023 at 01:10:01PM +0200, Markus Armbruster wrote:
> > > 
> > > [...]
> > > 
> > > >> For better or worse, we duplicate full documentation between
> > > >> MigrationParameter, MigrateSetParameters, and MigrationParameters.  This
> > > >> would be the first instance where we reference instead.  I'm not opposed
> > > >> to use references, but if we do, I want them used consistently.
> > > >
> > > > We discussed this over the other "switchover" parameter, but that patchset
> > > > just stranded..
> > > >
> > > > Perhaps I just provide a pre-requisite patch to remove all the comments in
> > > > MigrateSetParameters and MigrationParameters, letting them all point to
> > > > MigrationParameter?
> > > 
> > > Simplifies maintaining the doc commments.  But how does it affect the
> > > documentation generated from it?  Better, neutral, or worse?
> > 
> > Probably somewhere neutral.  There are definitely benefits, shorter
> > man/html page in total, and avoid accidentally different docs over the same
> > fields.  E.g., we sometimes have different wordings for different objects:
> > 
> >        max-cpu-throttle
> >               maximum cpu throttle percentage.  Defaults to 99.  (Since 3.1)
> > 
> >        max-cpu-throttle: int (optional)
> >               maximum cpu throttle percentage.  The default value is 99. (Since 3.1)
> > 
> > This one is fine, but it's just very easy to leak in something that shows
> > differently.  It's good to provide coherent documentation for the same
> > fields over all three objects.
> 
> Do we have any actual problems though where the difference in
> docs is actively harmful ? I agree there's a possbility of the
> duplication being problematic, but unless its actually the
> case in reality, it is merely a theoretical downside.
> 
> IMHO for someone consuming the docs, this patch is worse, not neutral.

I agree.

> 
> > 
> > When looking at qemu-qmp-ref.7, it can be like this when we can invite the
> > reader to read the other section (assuming we only keep MigrationParameter
> > to keep the documentations):
> > 
> >    MigrationParameters (Object)
> > 
> >        The object structure to represent a list of migration parameters.
> >        The optional members aren't actually optional.  For detailed
> >        explanation for each of the field, please refer to the documentation
> >        of MigrationParameter.
> > 
> > But the problem is we always will generate the Members entry, where now
> > it'll all filled up with "Not documented"..
> > 
> >    Members
> >        announce-initial: int (optional)
> >               Not documented
> > 
> >        announce-max: int (optional)
> >               Not documented
> > 
> >        announce-rounds: int (optional)
> >               Not documented
> > 
> >        [...]
> > 
> > I think maybe it's better we just list the members without showing "Not
> > documented" every time for the other two objects.  Not sure whether it's an
> > easy way to fix it, or is it a major issue.
> > 
> > For developers, dedup the comment should always be a win, afaict.
> 
> IMHO that's optimizing for the wrong people and isn't a measurable
> win anyway. Someone adding a new parameter can just cut+paste the
> same docs string in the three places. It is a cost, but it is a
> small one time cost, where having "not documented" is a ongoing
> cost for consumers of our API.
> 
> I don't think the burden on QEMU maintainers adding new migration
> parameters is significant enough to justify ripping out our existing
> docs.

I had that strong feeling mostly because I'm a migration developer and
migration reviewer, so I suffer on both sides. :) I was trying to stand up
for any future author/reviewer from that pov, but I think indeed the
ultimate consumer is the reader.

Fundamentally, to solve the problem maybe we must dedup the objects rather
than only the documents?  (A rough sketch of what I mean is below.)  I'll
try to dig a bit more into this area next week if I have time; any link to
previous context would be welcome (or an obvious reason why we just won't
be able to do that; I only know that it at least shouldn't be trivial to
resolve).
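
E.g., if QAPI struct inheritance ('base') could be used to share the
common members, something like the rough sketch below (made-up type
name, and ignoring that e.g. the tls-* members already differ between
MigrateSetParameters ('StrOrNull') and MigrationParameters ('str')):

    { 'struct': 'MigrationParametersCommon',
      'data': { '*max-bandwidth': 'size',
                '*available-bandwidth': 'size',
                '*downtime-limit': 'uint64' } }

    { 'struct': 'MigrationParameters',
      'base': 'MigrationParametersCommon',
      'data': { '*tls-creds': 'str',
                '*tls-hostname': 'str' } }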

For this one - Markus, let me know what you think, or I'll simply repost
v3 with the duplications (when needed, probably later rather than sooner).

Thanks,
Markus Armbruster Aug. 5, 2023, 8:05 a.m. UTC | #18
Peter Xu <peterx@redhat.com> writes:

> On Fri, Aug 04, 2023 at 02:39:15PM +0100, Daniel P. Berrangé wrote:
>> On Wed, Jul 26, 2023 at 11:12:31AM -0400, Peter Xu wrote:
>> > On Wed, Jul 26, 2023 at 08:21:35AM +0200, Markus Armbruster wrote:
>> > > Peter Xu <peterx@redhat.com> writes:
>> > > 
>> > > > Hi, Markus,
>> > > >
>> > > > On Tue, Jul 25, 2023 at 01:10:01PM +0200, Markus Armbruster wrote:
>> > > 
>> > > [...]
>> > > 
>> > > >> For better or worse, we duplicate full documentation between
>> > > >> MigrationParameter, MigrateSetParameters, and MigrationParameters.  This
>> > > >> would be the first instance where we reference instead.  I'm not opposed
>> > > >> to use references, but if we do, I want them used consistently.
>> > > >
>> > > > We discussed this over the other "switchover" parameter, but that patchset
>> > > > just stranded..
>> > > >
>> > > > Perhaps I just provide a pre-requisite patch to remove all the comments in
>> > > > MigrateSetParameters and MigrationParameters, letting them all point to
>> > > > MigrationParameter?
>> > > 
>> > > Simplifies maintaining the doc commments.  But how does it affect the
>> > > documentation generated from it?  Better, neutral, or worse?
>> > 
>> > Probably somewhere neutral.  There are definitely benefits, shorter
>> > man/html page in total, and avoid accidentally different docs over the same
>> > fields.  E.g., we sometimes have different wordings for different objects:
>> > 
>> >        max-cpu-throttle
>> >               maximum cpu throttle percentage.  Defaults to 99.  (Since 3.1)
>> > 
>> >        max-cpu-throttle: int (optional)
>> >               maximum cpu throttle percentage.  The default value is 99. (Since 3.1)
>> > 
>> > This one is fine, but it's just very easy to leak in something that shows
>> > differently.  It's good to provide coherent documentation for the same
>> > fields over all three objects.
>> 
>> Do we have any actual problems though where the difference in
>> docs is actively harmful ? I agree there's a possbility of the
>> duplication being problematic, but unless its actually the
>> case in reality, it is merely a theoretical downside.
>> 
>> IMHO for someone consuming the docs, this patch is worse, not neutral.
>
> I agree.
>
>> 
>> > 
>> > When looking at qemu-qmp-ref.7, it can be like this when we can invite the
>> > reader to read the other section (assuming we only keep MigrationParameter
>> > to keep the documentations):
>> > 
>> >    MigrationParameters (Object)
>> > 
>> >        The object structure to represent a list of migration parameters.
>> >        The optional members aren't actually optional.  For detailed
>> >        explanation for each of the field, please refer to the documentation
>> >        of MigrationParameter.
>> > 
>> > But the problem is we always will generate the Members entry, where now
>> > it'll all filled up with "Not documented"..
>> > 
>> >    Members
>> >        announce-initial: int (optional)
>> >               Not documented
>> > 
>> >        announce-max: int (optional)
>> >               Not documented
>> > 
>> >        announce-rounds: int (optional)
>> >               Not documented
>> > 
>> >        [...]
>> > 
>> > I think maybe it's better we just list the members without showing "Not
>> > documented" every time for the other two objects.  Not sure whether it's an
>> > easy way to fix it, or is it a major issue.
>> > 
>> > For developers, dedup the comment should always be a win, afaict.
>> 
>> IMHO that's optimizing for the wrong people and isn't a measurable
>> win anyway. Someone adding a new parameter can just cut+paste the
>> same docs string in the three places. It is a cost, but it is a
>> small one time cost, where having "not documented" is a ongoing
>> cost for consumers of our API.
>> 
>> I don't think the burden on QEMU maintainers adding new migration
>> parameters is significant enough to justify ripping out our existing
>> docs.
>
> I had that strong feeling mostly because I'm a migration developer and
> migration reviewer, so I suffer on both sides. :) I was trying to stand out
> for either any future author/reviewer from that pov, but I think indeed the
> ultimate consumer is the reader.
>
> Fundamentally to solve the problem, maybe we must dedup the objects rather
> than the documents only?  I'll try to dig a bit more in this area next week
> if I have time, any link for previous context would be welcomed (or obvious
> reason that we just won't be able to do that; I only know that at least
> shouldn't be trivial to resolve).
>
> For this one - Markus, let me know what do you think, or I'll simply repost
> v3 with the duplications (when needed, probably later not sooner).

Since you want to investigate de-duplicating the objects, do that
*first*.  If you succeed, we don't have to argue about de-duplicating
docs.
Markus Armbruster Aug. 5, 2023, 8:13 a.m. UTC | #19
Peter Xu <peterx@redhat.com> writes:

> On Fri, Aug 04, 2023 at 02:06:02PM +0200, Markus Armbruster wrote:
>> Peter Xu <peterx@redhat.com> writes:
>> 
>> > On Wed, Jul 26, 2023 at 08:21:35AM +0200, Markus Armbruster wrote:
>> >> Peter Xu <peterx@redhat.com> writes:
>> >> 
>> >> > Hi, Markus,
>> >> >
>> >> > On Tue, Jul 25, 2023 at 01:10:01PM +0200, Markus Armbruster wrote:
>> >> 
>> >> [...]
>> >> 
>> >> >> For better or worse, we duplicate full documentation between
>> >> >> MigrationParameter, MigrateSetParameters, and MigrationParameters.  This
>> >> >> would be the first instance where we reference instead.  I'm not opposed
>> >> >> to use references, but if we do, I want them used consistently.
>> >> >
>> >> > We discussed this over the other "switchover" parameter, but that patchset
>> >> > just stranded..
>> >> >
>> >> > Perhaps I just provide a pre-requisite patch to remove all the comments in
>> >> > MigrateSetParameters and MigrationParameters, letting them all point to
>> >> > MigrationParameter?
>> >> 
>> >> Simplifies maintaining the doc commments.  But how does it affect the
>> >> documentation generated from it?  Better, neutral, or worse?
>> >
>> > Probably somewhere neutral.  There are definitely benefits, shorter
>> > man/html page in total, and avoid accidentally different docs over the same
>> > fields.  E.g., we sometimes have different wordings for different objects:
>> >
>> >        max-cpu-throttle
>> >               maximum cpu throttle percentage.  Defaults to 99.  (Since 3.1)
>> >
>> >        max-cpu-throttle: int (optional)
>> >               maximum cpu throttle percentage.  The default value is 99. (Since 3.1)
>> >
>> > This one is fine, but it's just very easy to leak in something that shows
>> > differently.  It's good to provide coherent documentation for the same
>> > fields over all three objects.
>> 
>> Yes, but we've been doing okay regardless.
>> 
>> The drawback of replacing duplicates by references is that readers need
>> to follow the references.
>> 
>> Less onerous when the references can be clicked.
>> 
>> If we de-duplicate, which copy do we keep, MigrationParameter,
>> MigrateSetParameters, or MigrationParameter?  Do we have an idea which
>> of these users are likely to read first?
>
> I chose MigrationParameter for no explicit reason, because I can't find
> good argument to differenciate them.  Please let me know if you have any
> suggestion.
>
>> 
>> > When looking at qemu-qmp-ref.7, it can be like this when we can invite the
>> > reader to read the other section (assuming we only keep MigrationParameter
>> > to keep the documentations):
>> >
>> >    MigrationParameters (Object)
>> >
>> >        The object structure to represent a list of migration parameters.
>> >        The optional members aren't actually optional.  For detailed
>> >        explanation for each of the field, please refer to the documentation
>> >        of MigrationParameter.
>> >
>> > But the problem is we always will generate the Members entry, where now
>> > it'll all filled up with "Not documented"..
>> >
>> >    Members
>> >        announce-initial: int (optional)
>> >               Not documented
>> >
>> >        announce-max: int (optional)
>> >               Not documented
>> >
>> >        announce-rounds: int (optional)
>> >               Not documented
>> >
>> >        [...]
>> >
>> > I think maybe it's better we just list the members without showing "Not
>> > documented" every time for the other two objects.  Not sure whether it's an
>> > easy way to fix it, or is it a major issue.
>> 
>> The automatic generation of "Not documented" documentation is a
>> stop-gap.  Leaving a member undocumented should be a hard error.  It
>> isn't only because we have 511 instances to fix.
>> 
>> Perhaps a directive to ignore undocumented members could be useful.
>> I.e. to suppress the automatic "Not documented" documented now, the
>> error later.
>> 
>> We could write things out in longhand instead, like
>> 
>>     # @announce-initial: Same as MigrationParameter member
>>     #     @announce-initial.
>
> Yes I can definitely do this.
>
> Since I don't really know whether the "put a link" will work at all (at
> least man page doesn't really have those, does it?), would this be the way
> you suggest us forward?

I don't remember offhand how the QAPI doc generation machinery adds
links.  I can look it up for you after my vacation (two weeks, starting
basically now).

> Note that I am also always happy to simply duplicate the three paragraphs
> just like before; that's not something I must do with solving the migration
> problem so far, we can decouple the two problems essentially.  But since
> we're at it, if you think worthwhile we may have a chance get rid of
> duplicated documents here (before code) I can try.
>
>> 
>> > For developers, dedup the comment should always be a win, afaict.
>> 
>> No argument.
>
> Let me explain a bit: I meant the patch author who will reduce writting
> duplicated documents, making sure everything match together.  And reviewers
> who will read the duplicated content, making sure that everything match
> together again.  The two efforts can be avoided.  That's all I meant here
> for when I was referring to as "developers" in this context..  Not everyone
> as a common sense of developer.

By "no argument", I mean we don't need to argue about this.  I actually
agree with you that de-duplication would be net positive for developers.

Patch

diff --git a/qapi/migration.json b/qapi/migration.json
index 47dfef0278..fdc269e0a1 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -730,6 +730,16 @@ 
 # @max-bandwidth: to set maximum speed for migration.  maximum speed
 #     in bytes per second.  (Since 2.8)
 #
+# @available-bandwidth: to set available bandwidth for migration.  By
+#     default, this value is zero, which means the user is not aware of
+#     the available bandwidth that can be used by QEMU migration, so
+#     QEMU will estimate the bandwidth automatically.  This can be set
+#     when the estimated value is not accurate, while the user is able
+#     to guarantee such bandwidth is available for migration during the
+#     migration procedure.  When specified correctly, this can make the
+#     switchover decision much more accurate, which will also be based
+#     on the max downtime specified.  (Since 8.2)
+#
 # @downtime-limit: set maximum tolerated downtime for migration.
 #     maximum downtime in milliseconds (Since 2.8)
 #
@@ -803,7 +813,7 @@ 
            'cpu-throttle-initial', 'cpu-throttle-increment',
            'cpu-throttle-tailslow',
            'tls-creds', 'tls-hostname', 'tls-authz', 'max-bandwidth',
-           'downtime-limit',
+           'available-bandwidth', 'downtime-limit',
            { 'name': 'x-checkpoint-delay', 'features': [ 'unstable' ] },
            'block-incremental',
            'multifd-channels',
@@ -886,6 +896,9 @@ 
 # @max-bandwidth: to set maximum speed for migration.  maximum speed
 #     in bytes per second.  (Since 2.8)
 #
+# @available-bandwidth: to set available migration bandwidth.  Please refer
+#     to comments in MigrationParameter for more information. (Since 8.2)
+#
 # @downtime-limit: set maximum tolerated downtime for migration.
 #     maximum downtime in milliseconds (Since 2.8)
 #
@@ -971,6 +984,7 @@ 
             '*tls-hostname': 'StrOrNull',
             '*tls-authz': 'StrOrNull',
             '*max-bandwidth': 'size',
+            '*available-bandwidth': 'size',
             '*downtime-limit': 'uint64',
             '*x-checkpoint-delay': { 'type': 'uint32',
                                      'features': [ 'unstable' ] },
@@ -1078,6 +1092,9 @@ 
 # @max-bandwidth: to set maximum speed for migration.  maximum speed
 #     in bytes per second.  (Since 2.8)
 #
+# @available-bandwidth: to set available migration bandwidth.  Please refer
+#     to comments in MigrationParameter for more information. (Since 8.2)
+#
 # @downtime-limit: set maximum tolerated downtime for migration.
 #     maximum downtime in milliseconds (Since 2.8)
 #
@@ -1160,6 +1177,7 @@ 
             '*tls-hostname': 'str',
             '*tls-authz': 'str',
             '*max-bandwidth': 'size',
+            '*available-bandwidth': 'size',
             '*downtime-limit': 'uint64',
             '*x-checkpoint-delay': { 'type': 'uint32',
                                      'features': [ 'unstable' ] },
diff --git a/migration/migration.h b/migration/migration.h
index b7c8b67542..fadbf64d9d 100644
--- a/migration/migration.h
+++ b/migration/migration.h
@@ -283,7 +283,7 @@  struct MigrationState {
     /*
      * The final stage happens when the remaining data is smaller than
      * this threshold; it's calculated from the requested downtime and
-     * measured bandwidth
+     * measured bandwidth, or available-bandwidth if specified.
      */
     int64_t threshold_size;
 
diff --git a/migration/options.h b/migration/options.h
index 9aaf363322..ed2d9c92e7 100644
--- a/migration/options.h
+++ b/migration/options.h
@@ -79,6 +79,7 @@  int migrate_decompress_threads(void);
 uint64_t migrate_downtime_limit(void);
 uint8_t migrate_max_cpu_throttle(void);
 uint64_t migrate_max_bandwidth(void);
+uint64_t migrate_available_bandwidth(void);
 uint64_t migrate_max_postcopy_bandwidth(void);
 int migrate_multifd_channels(void);
 MultiFDCompression migrate_multifd_compression(void);
diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c
index 9885d7c9f7..53fbe1b1af 100644
--- a/migration/migration-hmp-cmds.c
+++ b/migration/migration-hmp-cmds.c
@@ -311,6 +311,10 @@  void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
         monitor_printf(mon, "%s: %" PRIu64 " bytes/second\n",
             MigrationParameter_str(MIGRATION_PARAMETER_MAX_BANDWIDTH),
             params->max_bandwidth);
+        assert(params->has_available_bandwidth);
+        monitor_printf(mon, "%s: %" PRIu64 " bytes/second\n",
+            MigrationParameter_str(MIGRATION_PARAMETER_AVAILABLE_BANDWIDTH),
+            params->available_bandwidth);
         assert(params->has_downtime_limit);
         monitor_printf(mon, "%s: %" PRIu64 " ms\n",
             MigrationParameter_str(MIGRATION_PARAMETER_DOWNTIME_LIMIT),
@@ -556,6 +560,16 @@  void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
         }
         p->max_bandwidth = valuebw;
         break;
+    case MIGRATION_PARAMETER_AVAILABLE_BANDWIDTH:
+        p->has_available_bandwidth = true;
+        ret = qemu_strtosz_MiB(valuestr, NULL, &valuebw);
+        if (ret < 0 || valuebw > INT64_MAX
+            || (size_t)valuebw != valuebw) {
+            error_setg(&err, "Invalid size %s", valuestr);
+            break;
+        }
+        p->available_bandwidth = valuebw;
+        break;
     case MIGRATION_PARAMETER_DOWNTIME_LIMIT:
         p->has_downtime_limit = true;
         visit_type_size(v, param, &p->downtime_limit, &err);
diff --git a/migration/migration.c b/migration/migration.c
index 91bba630a8..391ddfd8ce 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -2671,7 +2671,7 @@  static void migration_update_counters(MigrationState *s,
 {
     uint64_t transferred, transferred_pages, time_spent;
     uint64_t current_bytes; /* bytes transferred since the beginning */
-    double bandwidth;
+    double bandwidth, avail_bw; /* bytes per millisecond */
 
     if (current_time < s->iteration_start_time + BUFFER_DELAY) {
         return;
@@ -2681,7 +2681,17 @@  static void migration_update_counters(MigrationState *s,
     transferred = current_bytes - s->iteration_initial_bytes;
     time_spent = current_time - s->iteration_start_time;
     bandwidth = (double)transferred / time_spent;
-    s->threshold_size = bandwidth * migrate_downtime_limit();
+    if (migrate_available_bandwidth()) {
+        /*
+         * If the user specified an available bandwidth, trust it; convert
+         * it from bytes/sec to bytes/ms to match 'bandwidth' above.
+         */
+        avail_bw = migrate_available_bandwidth() / 1000.0;
+    } else {
+        /* If the user didn't specify a bandwidth, use the estimated one */
+        avail_bw = bandwidth;
+    }
+    s->threshold_size = avail_bw * migrate_downtime_limit();
 
     s->mbps = (((double) transferred * 8.0) /
                ((double) time_spent / 1000.0)) / 1000.0 / 1000.0;
@@ -2698,7 +2708,7 @@  static void migration_update_counters(MigrationState *s,
     if (stat64_get(&mig_stats.dirty_pages_rate) &&
         transferred > 10000) {
         s->expected_downtime =
-            stat64_get(&mig_stats.dirty_bytes_last_sync) / bandwidth;
+            stat64_get(&mig_stats.dirty_bytes_last_sync) / avail_bw;
     }
 
     migration_rate_reset(s->to_dst_file);
@@ -2706,7 +2716,8 @@  static void migration_update_counters(MigrationState *s,
     update_iteration_initial_status(s);
 
     trace_migrate_transferred(transferred, time_spent,
-                              bandwidth, s->threshold_size);
+                              bandwidth, migrate_available_bandwidth(),
+                              s->threshold_size);
 }
 
 static bool migration_can_switchover(MigrationState *s)
diff --git a/migration/options.c b/migration/options.c
index 5a9505adf7..160faca71a 100644
--- a/migration/options.c
+++ b/migration/options.c
@@ -121,6 +121,8 @@  Property migration_properties[] = {
                       parameters.cpu_throttle_tailslow, false),
     DEFINE_PROP_SIZE("x-max-bandwidth", MigrationState,
                       parameters.max_bandwidth, MAX_THROTTLE),
+    DEFINE_PROP_SIZE("available-bandwidth", MigrationState,
+                      parameters.available_bandwidth, 0),
     DEFINE_PROP_UINT64("x-downtime-limit", MigrationState,
                       parameters.downtime_limit,
                       DEFAULT_MIGRATE_SET_DOWNTIME),
@@ -735,6 +737,13 @@  uint64_t migrate_max_bandwidth(void)
     return s->parameters.max_bandwidth;
 }
 
+uint64_t migrate_available_bandwidth(void)
+{
+    MigrationState *s = migrate_get_current();
+
+    return s->parameters.available_bandwidth;
+}
+
 uint64_t migrate_max_postcopy_bandwidth(void)
 {
     MigrationState *s = migrate_get_current();
@@ -872,6 +881,8 @@  MigrationParameters *qmp_query_migrate_parameters(Error **errp)
                                  s->parameters.tls_authz : "");
     params->has_max_bandwidth = true;
     params->max_bandwidth = s->parameters.max_bandwidth;
+    params->has_available_bandwidth = true;
+    params->available_bandwidth = s->parameters.available_bandwidth;
     params->has_downtime_limit = true;
     params->downtime_limit = s->parameters.downtime_limit;
     params->has_x_checkpoint_delay = true;
@@ -1004,6 +1015,15 @@  bool migrate_params_check(MigrationParameters *params, Error **errp)
         return false;
     }
 
+    if (params->has_available_bandwidth &&
+        (params->available_bandwidth > SIZE_MAX)) {
+        error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
+                   "available_bandwidth",
+                   "an integer in the range of 0 to "stringify(SIZE_MAX)
+                   " bytes/second");
+        return false;
+    }
+
     if (params->has_downtime_limit &&
         (params->downtime_limit > MAX_MIGRATE_DOWNTIME)) {
         error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
@@ -1156,6 +1176,10 @@  static void migrate_params_test_apply(MigrateSetParameters *params,
         dest->max_bandwidth = params->max_bandwidth;
     }
 
+    if (params->has_available_bandwidth) {
+        dest->available_bandwidth = params->available_bandwidth;
+    }
+
     if (params->has_downtime_limit) {
         dest->downtime_limit = params->downtime_limit;
     }
@@ -1264,6 +1288,10 @@  static void migrate_params_apply(MigrateSetParameters *params, Error **errp)
         }
     }
 
+    if (params->has_available_bandwidth) {
+        s->parameters.available_bandwidth = params->available_bandwidth;
+    }
+
     if (params->has_downtime_limit) {
         s->parameters.downtime_limit = params->downtime_limit;
     }
diff --git a/migration/trace-events b/migration/trace-events
index 5259c1044b..fffed1f995 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -184,7 +184,7 @@  source_return_path_thread_shut(uint32_t val) "0x%x"
 source_return_path_thread_resume_ack(uint32_t v) "%"PRIu32
 source_return_path_thread_switchover_acked(void) ""
 migration_thread_low_pending(uint64_t pending) "%" PRIu64
-migrate_transferred(uint64_t tranferred, uint64_t time_spent, uint64_t bandwidth, uint64_t size) "transferred %" PRIu64 " time_spent %" PRIu64 " bandwidth %" PRIu64 " max_size %" PRId64
+migrate_transferred(uint64_t transferred, uint64_t time_spent, uint64_t bandwidth, uint64_t avail_bw, uint64_t size) "transferred %" PRIu64 " time_spent %" PRIu64 " bandwidth %" PRIu64 " avail_bw %" PRIu64 " max_size %" PRIu64
 process_incoming_migration_co_end(int ret, int ps) "ret=%d postcopy-state=%d"
 process_incoming_migration_co_postcopy_end_main(void) ""
 postcopy_preempt_enabled(bool value) "%d"