diff mbox series

[v3,3/4] migration/qapi: Replace @MigrateSetParameters with @MigrationParameters

Message ID 20230905162335.235619-4-peterx@redhat.com
State New
Headers show
Series qapi/migration: Dedup migration parameter objects and fix tls-authz crash | expand

Commit Message

Peter Xu Sept. 5, 2023, 4:23 p.m. UTC
Quotting from Markus in his replies:

  migrate-set-parameters sets migration parameters, and
  query-migrate-parameters gets them.  Unsurprisingly, the former's
  argument type MigrateSetParameters is quite close to the latter's
  return type MigrationParameters.  The differences are subtle:

  1. Since migrate-set-parameters supports setting selected parameters,
     its arguments must all be optional (so you can omit the ones you
     don't want to change).  query-migrate-parameters results are also
     all optional, but almost all of them are in fact always present.

  2. For parameters @tls_creds, @tls_hostname, @tls_authz,
     migrate-set-parameters interprets special value "" as "reset to
     default".  Works, because "" is semantically invalid.  Not a
     general solution, because a semantically invalid value need not
     exist.  Markus added a general solution in commit 01fa559826
     ("migration: Use JSON null instead of "" to reset parameter to
     default").  This involved changing the type from 'str' to
     'StrOrNull'.

  3. When parameter @block-bitmap-mapping has not been set,
     query-migrate-parameters does not return it (absent optional
     member).  Clean (but undocumented).  When parameters @tls_creds,
     @tls_hostname, @tls_authz have not been set, it returns the
     semantically invalid value "".  Not so clean (and just as
     undocumented).

Here to deduplicate the two objects: keep @MigrationParameters as the name
of object to use in both places, drop @MigrateSetParameters, at the
meantime switch types of @tls* fields from "str" to "StrOrNull" types.

I found that the TLS code wasn't so much relying on tls_* fields being
non-NULL at all.  Actually on the other way round: if we set tls_authz to
an empty string (NOTE: currently, migrate_init() missed initializing
tls_authz; also touched it up in this patch), we can already fail one of
the migration-test (tls/x509/default-host), as qauthz_is_allowed_by_id()
will assume tls_authz set even if tls_auths is an empty string.

It means we're actually relying on tls_* fields being NULL even if it's the
empty string.

Let's just make it a rule to return NULL for empty string on these fields
internally.  For that, when converting a StrOrNull into a char* (where we
introduced a helper here in this patch) we'll also make the empty string to
be NULL, to make it always work.  And it doesn't show any issue either when
applying that logic to both tls_creds and tls_hostname.

With above, we can safely change both migration_tls_client_create() and
migrate_tls() to not check the empty string too finally.. not needed
anymore.

Also, we can drop the hackish conversions in qmp_migrate_set_parameters()
where we want to make sure it's a QSTRING; it's not needed now.

This greatly deduplicates the code not only in qapi/migration.json, but
also in the generic migration code.

Markus helped greatly with this patch.  Besides a better commit
message (where I just "stole" from the reply), debugged and resolved a
double free, but also provided the StrOrNull property implementation to be
used in MigrationState object when switching tls_* fields to StrOrNull.

Co-developed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
 qapi/migration.json            | 191 +---------------------------
 include/hw/qdev-properties.h   |   3 +
 migration/options.h            |   3 +
 hw/core/qdev-properties.c      |  40 ++++++
 migration/migration-hmp-cmds.c |  20 +--
 migration/options.c            | 220 ++++++++++-----------------------
 migration/tls.c                |   3 +-
 7 files changed, 125 insertions(+), 355 deletions(-)

Comments

Markus Armbruster Sept. 26, 2023, 8:40 p.m. UTC | #1
Peter Xu <peterx@redhat.com> writes:

> Quotting from Markus in his replies:

Quoting

Suggest something like "Markus recently wrote:"

>   migrate-set-parameters sets migration parameters, and
>   query-migrate-parameters gets them.  Unsurprisingly, the former's
>   argument type MigrateSetParameters is quite close to the latter's
>   return type MigrationParameters.  The differences are subtle:
>
>   1. Since migrate-set-parameters supports setting selected parameters,
>      its arguments must all be optional (so you can omit the ones you
>      don't want to change).  query-migrate-parameters results are also
>      all optional, but almost all of them are in fact always present.
>
>   2. For parameters @tls_creds, @tls_hostname, @tls_authz,
>      migrate-set-parameters interprets special value "" as "reset to
>      default".  Works, because "" is semantically invalid.  Not a
>      general solution, because a semantically invalid value need not
>      exist.  Markus added a general solution in commit 01fa559826
>      ("migration: Use JSON null instead of "" to reset parameter to
>      default").  This involved changing the type from 'str' to
>      'StrOrNull'.
>
>   3. When parameter @block-bitmap-mapping has not been set,
>      query-migrate-parameters does not return it (absent optional
>      member).  Clean (but undocumented).  When parameters @tls_creds,
>      @tls_hostname, @tls_authz have not been set, it returns the
>      semantically invalid value "".  Not so clean (and just as
>      undocumented).
>
> Here to deduplicate the two objects: keep @MigrationParameters as the name
> of object to use in both places, drop @MigrateSetParameters, at the
> meantime switch types of @tls* fields from "str" to "StrOrNull" types.

Suggest

  Eliminate the duplication follows.

  Change MigrationParameters members @tls_creds, @tls_hostname,
  @tls_authz to StrOrNull.  query-migrate-parameters will of course
  never return null.

  Since these members are qdev properties, we need property machinery
  for StrOrNull: DEFINE_PROP_STR_OR_NULL().

  Switch migrate-set-parameters to MigrationParameters, and delete
  MigrateSetParameters.

Can you make this patch do just this de-duplication, and everything else
(described below) separately?

> I found that the TLS code wasn't so much relying on tls_* fields being
> non-NULL at all.  Actually on the other way round: if we set tls_authz to
> an empty string (NOTE: currently, migrate_init() missed initializing
> tls_authz; also touched it up in this patch), we can already fail one of
> the migration-test (tls/x509/default-host), as qauthz_is_allowed_by_id()
> will assume tls_authz set even if tls_auths is an empty string.
>
> It means we're actually relying on tls_* fields being NULL even if it's the
> empty string.
>
> Let's just make it a rule to return NULL for empty string on these fields
> internally.  For that, when converting a StrOrNull into a char* (where we
> introduced a helper here in this patch) we'll also make the empty string to
> be NULL, to make it always work.  And it doesn't show any issue either when
> applying that logic to both tls_creds and tls_hostname.
>
> With above, we can safely change both migration_tls_client_create() and
> migrate_tls() to not check the empty string too finally.. not needed
> anymore.
>
> Also, we can drop the hackish conversions in qmp_migrate_set_parameters()
> where we want to make sure it's a QSTRING; it's not needed now.
>
> This greatly deduplicates the code not only in qapi/migration.json, but
> also in the generic migration code.
>
> Markus helped greatly with this patch.  Besides a better commit
> message (where I just "stole" from the reply), debugged and resolved a
> double free, but also provided the StrOrNull property implementation to be
> used in MigrationState object when switching tls_* fields to StrOrNull.
>
> Co-developed-by: Markus Armbruster <armbru@redhat.com>
> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  qapi/migration.json            | 191 +---------------------------
>  include/hw/qdev-properties.h   |   3 +
>  migration/options.h            |   3 +
>  hw/core/qdev-properties.c      |  40 ++++++
>  migration/migration-hmp-cmds.c |  20 +--
>  migration/options.c            | 220 ++++++++++-----------------------
>  migration/tls.c                |   3 +-
>  7 files changed, 125 insertions(+), 355 deletions(-)
>
> diff --git a/qapi/migration.json b/qapi/migration.json
> index 8843e74b59..45d69787ae 100644
> --- a/qapi/migration.json
> +++ b/qapi/migration.json
> @@ -851,189 +851,6 @@
>             { 'name': 'x-vcpu-dirty-limit-period', 'features': ['unstable'] },
>             'vcpu-dirty-limit'] }
>  
> -##
> -# @MigrateSetParameters:
> -#
> -# @announce-initial: Initial delay (in milliseconds) before sending
> -#     the first announce (Since 4.0)
> -#
> -# @announce-max: Maximum delay (in milliseconds) between packets in
> -#     the announcement (Since 4.0)
> -#
> -# @announce-rounds: Number of self-announce packets sent after
> -#     migration (Since 4.0)
> -#
> -# @announce-step: Increase in delay (in milliseconds) between
> -#     subsequent packets in the announcement (Since 4.0)
> -#
> -# @compress-level: compression level
> -#
> -# @compress-threads: compression thread count
> -#
> -# @compress-wait-thread: Controls behavior when all compression
> -#     threads are currently busy.  If true (default), wait for a free
> -#     compression thread to become available; otherwise, send the page
> -#     uncompressed.  (Since 3.1)
> -#
> -# @decompress-threads: decompression thread count
> -#
> -# @throttle-trigger-threshold: The ratio of bytes_dirty_period and
> -#     bytes_xfer_period to trigger throttling.  It is expressed as
> -#     percentage.  The default value is 50. (Since 5.0)
> -#
> -# @cpu-throttle-initial: Initial percentage of time guest cpus are
> -#     throttled when migration auto-converge is activated.  The
> -#     default value is 20. (Since 2.7)
> -#
> -# @cpu-throttle-increment: throttle percentage increase each time
> -#     auto-converge detects that migration is not making progress.
> -#     The default value is 10. (Since 2.7)
> -#
> -# @cpu-throttle-tailslow: Make CPU throttling slower at tail stage At
> -#     the tail stage of throttling, the Guest is very sensitive to CPU
> -#     percentage while the @cpu-throttle -increment is excessive
> -#     usually at tail stage.  If this parameter is true, we will
> -#     compute the ideal CPU percentage used by the Guest, which may
> -#     exactly make the dirty rate match the dirty rate threshold.
> -#     Then we will choose a smaller throttle increment between the one
> -#     specified by @cpu-throttle-increment and the one generated by
> -#     ideal CPU percentage.  Therefore, it is compatible to
> -#     traditional throttling, meanwhile the throttle increment won't
> -#     be excessive at tail stage.  The default value is false.  (Since
> -#     5.1)
> -#
> -# @tls-creds: ID of the 'tls-creds' object that provides credentials
> -#     for establishing a TLS connection over the migration data
> -#     channel.  On the outgoing side of the migration, the credentials
> -#     must be for a 'client' endpoint, while for the incoming side the
> -#     credentials must be for a 'server' endpoint.  Setting this to a
> -#     non-empty string enables TLS for all migrations.  An empty
> -#     string means that QEMU will use plain text mode for migration,
> -#     rather than TLS (Since 2.9) Previously (since 2.7), this was
> -#     reported by omitting tls-creds instead.
> -#
> -# @tls-hostname: hostname of the target host for the migration.  This
> -#     is required when using x509 based TLS credentials and the
> -#     migration URI does not already include a hostname.  For example
> -#     if using fd: or exec: based migration, the hostname must be
> -#     provided so that the server's x509 certificate identity can be
> -#     validated.  (Since 2.7) An empty string means that QEMU will use
> -#     the hostname associated with the migration URI, if any.  (Since
> -#     2.9) Previously (since 2.7), this was reported by omitting
> -#     tls-hostname instead.
> -#
> -# @max-bandwidth: to set maximum speed for migration.  maximum speed
> -#     in bytes per second.  (Since 2.8)
> -#
> -# @downtime-limit: set maximum tolerated downtime for migration.
> -#     maximum downtime in milliseconds (Since 2.8)
> -#
> -# @x-checkpoint-delay: the delay time between two COLO checkpoints.
> -#     (Since 2.8)
> -#
> -# @block-incremental: Affects how much storage is migrated when the
> -#     block migration capability is enabled.  When false, the entire
> -#     storage backing chain is migrated into a flattened image at the
> -#     destination; when true, only the active qcow2 layer is migrated
> -#     and the destination must already have access to the same backing
> -#     chain as was used on the source.  (since 2.10)
> -#
> -# @multifd-channels: Number of channels used to migrate data in
> -#     parallel.  This is the same number that the number of sockets
> -#     used for migration.  The default value is 2 (since 4.0)
> -#
> -# @xbzrle-cache-size: cache size to be used by XBZRLE migration.  It
> -#     needs to be a multiple of the target page size and a power of 2
> -#     (Since 2.11)
> -#
> -# @max-postcopy-bandwidth: Background transfer bandwidth during
> -#     postcopy.  Defaults to 0 (unlimited).  In bytes per second.
> -#     (Since 3.0)
> -#
> -# @max-cpu-throttle: maximum cpu throttle percentage.  The default
> -#     value is 99. (Since 3.1)
> -#
> -# @multifd-compression: Which compression method to use.  Defaults to
> -#     none.  (Since 5.0)
> -#
> -# @multifd-zlib-level: Set the compression level to be used in live
> -#     migration, the compression level is an integer between 0 and 9,
> -#     where 0 means no compression, 1 means the best compression
> -#     speed, and 9 means best compression ratio which will consume
> -#     more CPU. Defaults to 1. (Since 5.0)
> -#
> -# @multifd-zstd-level: Set the compression level to be used in live
> -#     migration, the compression level is an integer between 0 and 20,
> -#     where 0 means no compression, 1 means the best compression
> -#     speed, and 20 means best compression ratio which will consume
> -#     more CPU. Defaults to 1. (Since 5.0)
> -#
> -# @block-bitmap-mapping: Maps block nodes and bitmaps on them to
> -#     aliases for the purpose of dirty bitmap migration.  Such aliases
> -#     may for example be the corresponding names on the opposite site.
> -#     The mapping must be one-to-one, but not necessarily complete: On
> -#     the source, unmapped bitmaps and all bitmaps on unmapped nodes
> -#     will be ignored.  On the destination, encountering an unmapped
> -#     alias in the incoming migration stream will result in a report,
> -#     and all further bitmap migration data will then be discarded.
> -#     Note that the destination does not know about bitmaps it does
> -#     not receive, so there is no limitation or requirement regarding
> -#     the number of bitmaps received, or how they are named, or on
> -#     which nodes they are placed.  By default (when this parameter
> -#     has never been set), bitmap names are mapped to themselves.
> -#     Nodes are mapped to their block device name if there is one, and
> -#     to their node name otherwise.  (Since 5.2)
> -#
> -# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty
> -#     limit during live migration.  Should be in the range 1 to 1000ms.
> -#     Defaults to 1000ms.  (Since 8.1)
> -#
> -# @vcpu-dirty-limit: Dirtyrate limit (MB/s) during live migration.
> -#     Defaults to 1.  (Since 8.1)
> -#
> -# Features:
> -#
> -# @unstable: Members @x-checkpoint-delay and @x-vcpu-dirty-limit-period
> -#     are experimental.
> -#
> -# TODO: either fuse back into MigrationParameters, or make
> -#     MigrationParameters members mandatory
> -#
> -# Since: 2.4
> -##
> -{ 'struct': 'MigrateSetParameters',
> -  'data': { '*announce-initial': 'size',
> -            '*announce-max': 'size',
> -            '*announce-rounds': 'size',
> -            '*announce-step': 'size',
> -            '*compress-level': 'uint8',
> -            '*compress-threads': 'uint8',
> -            '*compress-wait-thread': 'bool',
> -            '*decompress-threads': 'uint8',
> -            '*throttle-trigger-threshold': 'uint8',
> -            '*cpu-throttle-initial': 'uint8',
> -            '*cpu-throttle-increment': 'uint8',
> -            '*cpu-throttle-tailslow': 'bool',
> -            '*tls-creds': 'StrOrNull',
> -            '*tls-hostname': 'StrOrNull',
> -            '*tls-authz': 'StrOrNull',
> -            '*max-bandwidth': 'size',
> -            '*downtime-limit': 'uint64',
> -            '*x-checkpoint-delay': { 'type': 'uint32',
> -                                     'features': [ 'unstable' ] },
> -            '*block-incremental': 'bool',
> -            '*multifd-channels': 'uint8',
> -            '*xbzrle-cache-size': 'size',
> -            '*max-postcopy-bandwidth': 'size',
> -            '*max-cpu-throttle': 'uint8',
> -            '*multifd-compression': 'MultiFDCompression',
> -            '*multifd-zlib-level': 'uint8',
> -            '*multifd-zstd-level': 'uint8',
> -            '*block-bitmap-mapping': [ 'BitmapMigrationNodeAlias' ],
> -            '*x-vcpu-dirty-limit-period': { 'type': 'uint64',
> -                                            'features': [ 'unstable' ] },
> -            '*vcpu-dirty-limit': 'uint64'} }
> -
>  ##
>  # @migrate-set-parameters:
>  #
> @@ -1048,7 +865,7 @@
>  # <- { "return": {} }
>  ##
>  { 'command': 'migrate-set-parameters', 'boxed': true,
> -  'data': 'MigrateSetParameters' }
> +  'data': 'MigrationParameters' }
>  
>  ##
>  # @MigrationParameters:
> @@ -1214,9 +1031,9 @@
>              '*cpu-throttle-initial': 'uint8',
>              '*cpu-throttle-increment': 'uint8',
>              '*cpu-throttle-tailslow': 'bool',
> -            '*tls-creds': 'str',
> -            '*tls-hostname': 'str',
> -            '*tls-authz': 'str',
> +            '*tls-creds': 'StrOrNull',
> +            '*tls-hostname': 'StrOrNull',
> +            '*tls-authz': 'StrOrNull',
>              '*max-bandwidth': 'size',
>              '*downtime-limit': 'uint64',
>              '*x-checkpoint-delay': { 'type': 'uint32',
> diff --git a/include/hw/qdev-properties.h b/include/hw/qdev-properties.h
> index e1df08876c..3ae83d8390 100644
> --- a/include/hw/qdev-properties.h
> +++ b/include/hw/qdev-properties.h
> @@ -59,6 +59,7 @@ extern const PropertyInfo qdev_prop_uint64_checkmask;
>  extern const PropertyInfo qdev_prop_int64;
>  extern const PropertyInfo qdev_prop_size;
>  extern const PropertyInfo qdev_prop_string;
> +extern const PropertyInfo qdev_prop_str_or_null;
>  extern const PropertyInfo qdev_prop_on_off_auto;
>  extern const PropertyInfo qdev_prop_size32;
>  extern const PropertyInfo qdev_prop_arraylen;
> @@ -171,6 +172,8 @@ extern const PropertyInfo qdev_prop_link;
>      DEFINE_PROP_UNSIGNED(_n, _s, _f, _d, qdev_prop_size, uint64_t)
>  #define DEFINE_PROP_STRING(_n, _s, _f)             \
>      DEFINE_PROP(_n, _s, _f, qdev_prop_string, char*)
> +#define DEFINE_PROP_STR_OR_NULL(_n, _s, _f)             \
> +    DEFINE_PROP(_n, _s, _f, qdev_prop_str_or_null, StrOrNull *)
>  #define DEFINE_PROP_ON_OFF_AUTO(_n, _s, _f, _d) \
>      DEFINE_PROP_SIGNED(_n, _s, _f, _d, qdev_prop_on_off_auto, OnOffAuto)
>  #define DEFINE_PROP_SIZE32(_n, _s, _f, _d)                       \
> diff --git a/migration/options.h b/migration/options.h
> index 045e2a41a2..124a5d450f 100644
> --- a/migration/options.h
> +++ b/migration/options.h
> @@ -56,6 +56,7 @@ bool migrate_zero_copy_send(void);
>  
>  bool migrate_multifd_flush_after_each_section(void);
>  bool migrate_postcopy(void);
> +/* Check whether TLS is enabled for migration */

Unrelated :)

>  bool migrate_tls(void);
>  
>  /* capabilities helpers */
> @@ -90,6 +91,8 @@ const char *migrate_tls_authz(void);
>  const char *migrate_tls_creds(void);
>  const char *migrate_tls_hostname(void);
>  uint64_t migrate_xbzrle_cache_size(void);
> +StrOrNull *StrOrNull_from_str(const char *str);
> +const char *str_from_StrOrNull(StrOrNull *obj);
>  
>  /* parameters setters */
>  
> diff --git a/hw/core/qdev-properties.c b/hw/core/qdev-properties.c
> index 357b8761b5..b4bbb52ae9 100644
> --- a/hw/core/qdev-properties.c
> +++ b/hw/core/qdev-properties.c
> @@ -2,6 +2,7 @@
>  #include "hw/qdev-properties.h"
>  #include "qapi/error.h"
>  #include "qapi/qapi-types-misc.h"
> +#include "qapi/qapi-visit-misc.h"
>  #include "qapi/qmp/qerror.h"
>  #include "qemu/ctype.h"
>  #include "qemu/error-report.h"
> @@ -490,6 +491,45 @@ const PropertyInfo qdev_prop_string = {
>      .set   = set_string,
>  };
>  
> +/* --- StrOrNull --- */
> +
> +static void release_str_or_null(Object *obj, const char *name, void *opaque)
> +{
> +    Property *prop = opaque;
> +
> +    qapi_free_StrOrNull(*(StrOrNull **)object_field_prop_ptr(obj, prop));
> +}
> +
> +static void get_str_or_null(Object *obj, Visitor *v, const char *name,
> +                            void *opaque, Error **errp)
> +{
> +    Property *prop = opaque;
> +    StrOrNull **ptr = object_field_prop_ptr(obj, prop);
> +
> +    visit_type_StrOrNull(v, name, ptr, errp);
> +}
> +
> +static void set_str_or_null(Object *obj, Visitor *v, const char *name,
> +                            void *opaque, Error **errp)
> +{
> +    Property *prop = opaque;
> +    StrOrNull **ptr = object_field_prop_ptr(obj, prop);
> +    StrOrNull *son;
> +
> +    if (!visit_type_StrOrNull(v, name, &son, errp)) {
> +        return;
> +    }
> +    qapi_free_StrOrNull(*ptr);
> +    *ptr = son;
> +}
> +
> +const PropertyInfo qdev_prop_str_or_null = {
> +    .name  = "StrOrNull",
> +    .release = release_str_or_null,
> +    .get   = get_str_or_null,
> +    .set   = set_str_or_null,
> +};
> +
>  /* --- on/off/auto --- */
>  
>  const PropertyInfo qdev_prop_on_off_auto = {
> diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c
> index c115ef2d23..88a8ccb475 100644
> --- a/migration/migration-hmp-cmds.c
> +++ b/migration/migration-hmp-cmds.c

Missing #include "migration/options.h".  Next patch unbreaks the build.

> @@ -257,6 +257,7 @@ void hmp_info_migrate_capabilities(Monitor *mon, const QDict *qdict)
>  void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
>  {
>      MigrationParameters *params;
> +    const char *str;
>  
>      params = qmp_query_migrate_parameters(NULL);
>  
> @@ -309,14 +310,18 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
>          monitor_printf(mon, "%s: %u\n",
>              MigrationParameter_str(MIGRATION_PARAMETER_MAX_CPU_THROTTLE),
>              params->max_cpu_throttle);
> -        assert(params->tls_creds);
> +        str = str_from_StrOrNull(params->tls_creds);
>          monitor_printf(mon, "%s: '%s'\n",
>              MigrationParameter_str(MIGRATION_PARAMETER_TLS_CREDS),
> -            params->tls_creds);
> -        assert(params->tls_hostname);
> +            str ? str : "");
> +        str = str_from_StrOrNull(params->tls_hostname);
>          monitor_printf(mon, "%s: '%s'\n",
>              MigrationParameter_str(MIGRATION_PARAMETER_TLS_HOSTNAME),
> -            params->tls_hostname);
> +            str ? str : "");
> +        str = str_from_StrOrNull(params->tls_authz);
> +        monitor_printf(mon, "%s: '%s'\n",
> +            MigrationParameter_str(MIGRATION_PARAMETER_TLS_AUTHZ),
> +            str ? str : "");
>          assert(params->has_max_bandwidth);
>          monitor_printf(mon, "%s: %" PRIu64 " bytes/second\n",
>              MigrationParameter_str(MIGRATION_PARAMETER_MAX_BANDWIDTH),
> @@ -345,9 +350,6 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
>          monitor_printf(mon, "%s: %" PRIu64 "\n",
>              MigrationParameter_str(MIGRATION_PARAMETER_MAX_POSTCOPY_BANDWIDTH),
>              params->max_postcopy_bandwidth);
> -        monitor_printf(mon, "%s: '%s'\n",
> -            MigrationParameter_str(MIGRATION_PARAMETER_TLS_AUTHZ),
> -            params->tls_authz);
>  
>          if (params->has_block_bitmap_mapping) {
>              const BitmapMigrationNodeAliasList *bmnal;
> @@ -497,7 +499,7 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
>      const char *param = qdict_get_str(qdict, "parameter");
>      const char *valuestr = qdict_get_str(qdict, "value");
>      Visitor *v = string_input_visitor_new(valuestr);
> -    MigrateSetParameters *p = g_new0(MigrateSetParameters, 1);
> +    MigrationParameters *p = g_new0(MigrationParameters, 1);
>      uint64_t valuebw = 0;
>      uint64_t cache_size;
>      Error *err = NULL;
> @@ -657,7 +659,7 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
>      qmp_migrate_set_parameters(p, &err);
>  
>   cleanup:
> -    qapi_free_MigrateSetParameters(p);
> +    qapi_free_MigrationParameters(p);
>      visit_free(v);
>      hmp_handle_error(mon, err);
>  }
> diff --git a/migration/options.c b/migration/options.c
> index 6bbfd4853d..12e392f68c 100644
> --- a/migration/options.c
> +++ b/migration/options.c
> @@ -164,9 +164,12 @@ Property migration_properties[] = {
>      DEFINE_PROP_SIZE("announce-step", MigrationState,
>                        parameters.announce_step,
>                        DEFAULT_MIGRATE_ANNOUNCE_STEP),
> -    DEFINE_PROP_STRING("tls-creds", MigrationState, parameters.tls_creds),
> -    DEFINE_PROP_STRING("tls-hostname", MigrationState, parameters.tls_hostname),
> -    DEFINE_PROP_STRING("tls-authz", MigrationState, parameters.tls_authz),
> +    DEFINE_PROP_STR_OR_NULL("tls-creds", MigrationState,
> +                            parameters.tls_creds),
> +    DEFINE_PROP_STR_OR_NULL("tls-hostname", MigrationState,
> +                            parameters.tls_hostname),
> +    DEFINE_PROP_STR_OR_NULL("tls-authz", MigrationState,
> +                            parameters.tls_authz),

The qdev machinery now additionally accepts JSON null as property
value.

If that's undesirable, we need to reject it.

If it's desirable, we need to document it, and we should probably make
it a separate patch.

To answer the question whether it's desirable, we need to recall the
purpose of the properties.  They go back to

commit e5cb7e7677010f529d3f0f9dcdb385dea9446f8d
Author: Peter Xu <peterx@redhat.com>
Date:   Tue Jun 27 12:10:13 2017 +0800

    migration: let MigrationState be a qdev
    
    Let the old man "MigrationState" join the object family. Direct benefit
    is that we can start to use all the property features derived from
    current QDev, like: HW_COMPAT_* bits, command line setup for migration
    parameters (so will never need to set them up each time using HMP/QMP,
    this is really, really attractive for test writters), etc.
    
    I see no reason to disallow this happen yet. So let's start from this
    one, to see whether it would be anything good.
    
    Now we init the MigrationState struct statically in main() to make sure
    it's initialized after global properties are applied, since we'll use
    them during creation of the object.
    
    No functional change at all.
    
    Reviewed-by: Juan Quintela <quintela@redhat.com>
    Signed-off-by: Peter Xu <peterx@redhat.com>
    Message-Id: <1498536619-14548-5-git-send-email-peterx@redhat.com>
    Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
    Signed-off-by: Juan Quintela <quintela@redhat.com>

>      DEFINE_PROP_UINT64("x-vcpu-dirty-limit-period", MigrationState,
>                         parameters.x_vcpu_dirty_limit_period,
>                         DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD),
> @@ -201,6 +204,38 @@ Property migration_properties[] = {
>      DEFINE_PROP_END_OF_LIST(),
>  };
>  
> +/*
> + * NOTE: here we have a trick when converting the empty string (""): we
> + * need to make sure the empty string ("") will be converted to NULL, as
> + * some TLS code may rely on that to detect whether something is enabled
> + * (e.g., the tls_authz field).

Suggest

    * Map JSON null and JSON "" to NULL, other JSON strings to the string.
    * The curious special treatment of "" is necessary, because it needs
    * to be interpreted like JSON null for backward compatibility.

> + */
> +const char *str_from_StrOrNull(StrOrNull *obj)
> +{
> +    if (!obj || obj->type == QTYPE_QNULL) {
> +        return NULL;
> +    } else if (obj->type == QTYPE_QSTRING) {
> +        if (obj->u.s[0] == '\0') {
> +            return NULL;
> +        } else {
> +            return obj->u.s;
> +        }
> +    } else {
> +        abort();
> +    }
> +}
> +
> +StrOrNull *StrOrNull_from_str(const char *str)
> +{
> +    StrOrNull *obj = g_new0(StrOrNull, 1);
> +
> +    assert(str);
> +    obj->type = QTYPE_QSTRING;
> +    obj->u.s = g_strdup(str);
> +
> +    return obj;
> +}
> +
>  bool migrate_auto_converge(void)
>  {
>      MigrationState *s = migrate_get_current();
> @@ -378,9 +413,11 @@ bool migrate_postcopy(void)
>  
>  bool migrate_tls(void)
>  {
> -    MigrationState *s = migrate_get_current();
> -
> -    return s->parameters.tls_creds && *s->parameters.tls_creds;
> +    /*
> +     * The whole TLS feature relies on a non-empty tls-creds set first.
> +     * It's disabled otherwise.
> +     */

Suggest

       /* TLS is enabled when @tls-creds is non-null */
  
> +    return migrate_tls_creds();
>  }
>  
>  typedef enum WriteTrackingSupport {
> @@ -827,21 +864,21 @@ const char *migrate_tls_authz(void)
>  {
>      MigrationState *s = migrate_get_current();
>  
> -    return s->parameters.tls_authz;
> +    return str_from_StrOrNull(s->parameters.tls_authz);
>  }

This maps "" to null on use of .tls_authz.  Direct use is wrong.

The alternative is to map it when it's set.  Then setting it must go
through a setter that maps, and direct use is fine.  Observation, not a
demand.

>  
>  const char *migrate_tls_creds(void)
>  {
>      MigrationState *s = migrate_get_current();
>  
> -    return s->parameters.tls_creds;
> +    return str_from_StrOrNull(s->parameters.tls_creds);
>  }
>  
>  const char *migrate_tls_hostname(void)
>  {
>      MigrationState *s = migrate_get_current();
>  
> -    return s->parameters.tls_hostname;
> +    return str_from_StrOrNull(s->parameters.tls_hostname);
>  }

Likewise.

>  
>  uint64_t migrate_xbzrle_cache_size(void)
> @@ -911,10 +948,9 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp)
>      params->cpu_throttle_increment = s->parameters.cpu_throttle_increment;
>      params->has_cpu_throttle_tailslow = true;
>      params->cpu_throttle_tailslow = s->parameters.cpu_throttle_tailslow;
> -    params->tls_creds = g_strdup(s->parameters.tls_creds);
> -    params->tls_hostname = g_strdup(s->parameters.tls_hostname);
> -    params->tls_authz = g_strdup(s->parameters.tls_authz ?
> -                                 s->parameters.tls_authz : "");
> +    params->tls_creds = QAPI_CLONE(StrOrNull, s->parameters.tls_creds);
> +    params->tls_hostname = QAPI_CLONE(StrOrNull, s->parameters.tls_hostname);
> +    params->tls_authz = QAPI_CLONE(StrOrNull, s->parameters.tls_authz);
>      params->has_max_bandwidth = true;
>      params->max_bandwidth = s->parameters.max_bandwidth;
>      params->has_downtime_limit = true;
> @@ -963,8 +999,9 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp)
>  
>  void migrate_params_init(MigrationParameters *params)
>  {
> -    params->tls_hostname = g_strdup("");
> -    params->tls_creds = g_strdup("");
> +    params->tls_hostname = StrOrNull_from_str("");
> +    params->tls_creds = StrOrNull_from_str("");
> +    params->tls_authz = StrOrNull_from_str("");
>  
>      /* Set has_* up only for parameter checks */
>      params->has_compress_level = true;
> @@ -1145,7 +1182,7 @@ bool migrate_params_check(MigrationParameters *params, Error **errp)
>  #ifdef CONFIG_LINUX
>      if (migrate_zero_copy_send() &&
>          ((params->has_multifd_compression && params->multifd_compression) ||
> -         (params->tls_creds && *params->tls_creds))) {
> +         migrate_tls())) {

Correct only if params == current_migration!  Are they equal?

>          error_setg(errp,
>                     "Zero copy only available for non-compressed non-TLS multifd migration");
>          return false;
> @@ -1172,113 +1209,7 @@ bool migrate_params_check(MigrationParameters *params, Error **errp)
>      return true;
>  }
>  
> -static void migrate_params_test_apply(MigrateSetParameters *params,
> -                                      MigrationParameters *dest)
> -{
> -    *dest = migrate_get_current()->parameters;
> -
> -    /* TODO use QAPI_CLONE() instead of duplicating it inline */
> -
> -    if (params->has_compress_level) {
> -        dest->compress_level = params->compress_level;
> -    }
> -
> -    if (params->has_compress_threads) {
> -        dest->compress_threads = params->compress_threads;
> -    }
> -
> -    if (params->has_compress_wait_thread) {
> -        dest->compress_wait_thread = params->compress_wait_thread;
> -    }
> -
> -    if (params->has_decompress_threads) {
> -        dest->decompress_threads = params->decompress_threads;
> -    }
> -
> -    if (params->has_throttle_trigger_threshold) {
> -        dest->throttle_trigger_threshold = params->throttle_trigger_threshold;
> -    }
> -
> -    if (params->has_cpu_throttle_initial) {
> -        dest->cpu_throttle_initial = params->cpu_throttle_initial;
> -    }
> -
> -    if (params->has_cpu_throttle_increment) {
> -        dest->cpu_throttle_increment = params->cpu_throttle_increment;
> -    }
> -
> -    if (params->has_cpu_throttle_tailslow) {
> -        dest->cpu_throttle_tailslow = params->cpu_throttle_tailslow;
> -    }
> -
> -    if (params->tls_creds) {
> -        assert(params->tls_creds->type == QTYPE_QSTRING);
> -        dest->tls_creds = params->tls_creds->u.s;
> -    }
> -
> -    if (params->tls_hostname) {
> -        assert(params->tls_hostname->type == QTYPE_QSTRING);
> -        dest->tls_hostname = params->tls_hostname->u.s;
> -    }
> -
> -    if (params->has_max_bandwidth) {
> -        dest->max_bandwidth = params->max_bandwidth;
> -    }
> -
> -    if (params->has_downtime_limit) {
> -        dest->downtime_limit = params->downtime_limit;
> -    }
> -
> -    if (params->has_x_checkpoint_delay) {
> -        dest->x_checkpoint_delay = params->x_checkpoint_delay;
> -    }
> -
> -    if (params->has_block_incremental) {
> -        dest->block_incremental = params->block_incremental;
> -    }
> -    if (params->has_multifd_channels) {
> -        dest->multifd_channels = params->multifd_channels;
> -    }
> -    if (params->has_multifd_compression) {
> -        dest->multifd_compression = params->multifd_compression;
> -    }
> -    if (params->has_xbzrle_cache_size) {
> -        dest->xbzrle_cache_size = params->xbzrle_cache_size;
> -    }
> -    if (params->has_max_postcopy_bandwidth) {
> -        dest->max_postcopy_bandwidth = params->max_postcopy_bandwidth;
> -    }
> -    if (params->has_max_cpu_throttle) {
> -        dest->max_cpu_throttle = params->max_cpu_throttle;
> -    }
> -    if (params->has_announce_initial) {
> -        dest->announce_initial = params->announce_initial;
> -    }
> -    if (params->has_announce_max) {
> -        dest->announce_max = params->announce_max;
> -    }
> -    if (params->has_announce_rounds) {
> -        dest->announce_rounds = params->announce_rounds;
> -    }
> -    if (params->has_announce_step) {
> -        dest->announce_step = params->announce_step;
> -    }
> -
> -    if (params->has_block_bitmap_mapping) {
> -        dest->has_block_bitmap_mapping = true;
> -        dest->block_bitmap_mapping = params->block_bitmap_mapping;
> -    }
> -
> -    if (params->has_x_vcpu_dirty_limit_period) {
> -        dest->x_vcpu_dirty_limit_period =
> -            params->x_vcpu_dirty_limit_period;
> -    }
> -    if (params->has_vcpu_dirty_limit) {
> -        dest->vcpu_dirty_limit = params->vcpu_dirty_limit;
> -    }
> -}

I believe your commit message tries to explain why this function is no
longer needed.  But it's still not obvious to me.

I suspect this patch is doing too much.

> -
> -static void migrate_params_apply(MigrateSetParameters *params, Error **errp)
> +static void migrate_params_apply(MigrationParameters *params, Error **errp)
>  {
>      MigrationState *s = migrate_get_current();
>  
> @@ -1317,21 +1248,18 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp)
>      }
>  
>      if (params->tls_creds) {
> -        g_free(s->parameters.tls_creds);
> -        assert(params->tls_creds->type == QTYPE_QSTRING);
> -        s->parameters.tls_creds = g_strdup(params->tls_creds->u.s);
> +        qapi_free_StrOrNull(s->parameters.tls_creds);
> +        s->parameters.tls_creds = QAPI_CLONE(StrOrNull, params->tls_creds);
>      }
>  
>      if (params->tls_hostname) {
> -        g_free(s->parameters.tls_hostname);
> -        assert(params->tls_hostname->type == QTYPE_QSTRING);
> -        s->parameters.tls_hostname = g_strdup(params->tls_hostname->u.s);
> +        qapi_free_StrOrNull(s->parameters.tls_hostname);
> +        s->parameters.tls_hostname = QAPI_CLONE(StrOrNull, params->tls_hostname);
>      }
>  
>      if (params->tls_authz) {
> -        g_free(s->parameters.tls_authz);
> -        assert(params->tls_authz->type == QTYPE_QSTRING);
> -        s->parameters.tls_authz = g_strdup(params->tls_authz->u.s);
> +        qapi_free_StrOrNull(s->parameters.tls_authz);
> +        s->parameters.tls_authz = QAPI_CLONE(StrOrNull, params->tls_authz);
>      }
>  
>      if (params->has_max_bandwidth) {
> @@ -1404,33 +1332,9 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp)
>      }
>  }
>  
> -void qmp_migrate_set_parameters(MigrateSetParameters *params, Error **errp)
> +void qmp_migrate_set_parameters(MigrationParameters *params, Error **errp)
>  {
> -    MigrationParameters tmp;
> -
> -    /* TODO Rewrite "" to null instead for all three tls_* parameters */
> -    if (params->tls_creds
> -        && params->tls_creds->type == QTYPE_QNULL) {
> -        qobject_unref(params->tls_creds->u.n);
> -        params->tls_creds->type = QTYPE_QSTRING;
> -        params->tls_creds->u.s = strdup("");
> -    }
> -    if (params->tls_hostname
> -        && params->tls_hostname->type == QTYPE_QNULL) {
> -        qobject_unref(params->tls_hostname->u.n);
> -        params->tls_hostname->type = QTYPE_QSTRING;
> -        params->tls_hostname->u.s = strdup("");
> -    }
> -    if (params->tls_authz
> -        && params->tls_authz->type == QTYPE_QNULL) {
> -        qobject_unref(params->tls_authz->u.n);
> -        params->tls_authz->type = QTYPE_QSTRING;
> -        params->tls_authz->u.s = strdup("");
> -    }
> -
> -    migrate_params_test_apply(params, &tmp);
> -
> -    if (!migrate_params_check(&tmp, errp)) {
> +    if (!migrate_params_check(params, errp)) {
>          /* Invalid parameter */
>          return;
>      }
> diff --git a/migration/tls.c b/migration/tls.c
> index fa03d9136c..c2ed4ff557 100644
> --- a/migration/tls.c
> +++ b/migration/tls.c
> @@ -126,7 +126,8 @@ QIOChannelTLS *migration_tls_client_create(QIOChannel *ioc,
>      }
>  
>      const char *tls_hostname = migrate_tls_hostname();
> -    if (tls_hostname && *tls_hostname) {
> +    /* If tls_hostname, then it must be non-empty string already */

I'm not sure the comment makes sense outside this commit's context.

> +    if (tls_hostname) {
>          hostname = tls_hostname;
>      }
Peter Xu Oct. 2, 2023, 7:52 p.m. UTC | #2
On Tue, Sep 26, 2023 at 10:40:27PM +0200, Markus Armbruster wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > Quotting from Markus in his replies:
> 
> Quoting
> 
> Suggest something like "Markus recently wrote:"

Will do.

> 
> >   migrate-set-parameters sets migration parameters, and
> >   query-migrate-parameters gets them.  Unsurprisingly, the former's
> >   argument type MigrateSetParameters is quite close to the latter's
> >   return type MigrationParameters.  The differences are subtle:
> >
> >   1. Since migrate-set-parameters supports setting selected parameters,
> >      its arguments must all be optional (so you can omit the ones you
> >      don't want to change).  query-migrate-parameters results are also
> >      all optional, but almost all of them are in fact always present.
> >
> >   2. For parameters @tls_creds, @tls_hostname, @tls_authz,
> >      migrate-set-parameters interprets special value "" as "reset to
> >      default".  Works, because "" is semantically invalid.  Not a
> >      general solution, because a semantically invalid value need not
> >      exist.  Markus added a general solution in commit 01fa559826
> >      ("migration: Use JSON null instead of "" to reset parameter to
> >      default").  This involved changing the type from 'str' to
> >      'StrOrNull'.
> >
> >   3. When parameter @block-bitmap-mapping has not been set,
> >      query-migrate-parameters does not return it (absent optional
> >      member).  Clean (but undocumented).  When parameters @tls_creds,
> >      @tls_hostname, @tls_authz have not been set, it returns the
> >      semantically invalid value "".  Not so clean (and just as
> >      undocumented).
> >
> > Here to deduplicate the two objects: keep @MigrationParameters as the name
> > of object to use in both places, drop @MigrateSetParameters, at the
> > meantime switch types of @tls* fields from "str" to "StrOrNull" types.
> 
> Suggest
> 
>   Eliminate the duplication follows.
> 
>   Change MigrationParameters members @tls_creds, @tls_hostname,
>   @tls_authz to StrOrNull.  query-migrate-parameters will of course
>   never return null.
> 
>   Since these members are qdev properties, we need property machinery
>   for StrOrNull: DEFINE_PROP_STR_OR_NULL().
> 
>   Switch migrate-set-parameters to MigrationParameters, and delete
>   MigrateSetParameters.

Will do.

> 
> Can you make this patch do just this de-duplication, and everything else
> (described below) separately?

One thing I can do is to move qdev_prop_str_or_null impl (from you) into a
separate patch.   I can drop some unnecessary changes too when possible,
not yet sure what else I can split, but I can try once there.

> 
> > I found that the TLS code wasn't so much relying on tls_* fields being
> > non-NULL at all.  Actually on the other way round: if we set tls_authz to
> > an empty string (NOTE: currently, migrate_init() missed initializing
> > tls_authz; also touched it up in this patch), we can already fail one of
> > the migration-test (tls/x509/default-host), as qauthz_is_allowed_by_id()
> > will assume tls_authz set even if tls_auths is an empty string.
> >
> > It means we're actually relying on tls_* fields being NULL even if it's the
> > empty string.
> >
> > Let's just make it a rule to return NULL for empty string on these fields
> > internally.  For that, when converting a StrOrNull into a char* (where we
> > introduced a helper here in this patch) we'll also make the empty string to
> > be NULL, to make it always work.  And it doesn't show any issue either when
> > applying that logic to both tls_creds and tls_hostname.
> >
> > With above, we can safely change both migration_tls_client_create() and
> > migrate_tls() to not check the empty string too finally.. not needed
> > anymore.
> >
> > Also, we can drop the hackish conversions in qmp_migrate_set_parameters()
> > where we want to make sure it's a QSTRING; it's not needed now.
> >
> > This greatly deduplicates the code not only in qapi/migration.json, but
> > also in the generic migration code.
> >
> > Markus helped greatly with this patch.  Besides a better commit
> > message (where I just "stole" from the reply), debugged and resolved a
> > double free, but also provided the StrOrNull property implementation to be
> > used in MigrationState object when switching tls_* fields to StrOrNull.
> >
> > Co-developed-by: Markus Armbruster <armbru@redhat.com>
> > Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> >  qapi/migration.json            | 191 +---------------------------
> >  include/hw/qdev-properties.h   |   3 +
> >  migration/options.h            |   3 +
> >  hw/core/qdev-properties.c      |  40 ++++++
> >  migration/migration-hmp-cmds.c |  20 +--
> >  migration/options.c            | 220 ++++++++++-----------------------
> >  migration/tls.c                |   3 +-
> >  7 files changed, 125 insertions(+), 355 deletions(-)
> >
> > diff --git a/qapi/migration.json b/qapi/migration.json
> > index 8843e74b59..45d69787ae 100644
> > --- a/qapi/migration.json
> > +++ b/qapi/migration.json
> > @@ -851,189 +851,6 @@
> >             { 'name': 'x-vcpu-dirty-limit-period', 'features': ['unstable'] },
> >             'vcpu-dirty-limit'] }
> >  
> > -##
> > -# @MigrateSetParameters:
> > -#
> > -# @announce-initial: Initial delay (in milliseconds) before sending
> > -#     the first announce (Since 4.0)
> > -#
> > -# @announce-max: Maximum delay (in milliseconds) between packets in
> > -#     the announcement (Since 4.0)
> > -#
> > -# @announce-rounds: Number of self-announce packets sent after
> > -#     migration (Since 4.0)
> > -#
> > -# @announce-step: Increase in delay (in milliseconds) between
> > -#     subsequent packets in the announcement (Since 4.0)
> > -#
> > -# @compress-level: compression level
> > -#
> > -# @compress-threads: compression thread count
> > -#
> > -# @compress-wait-thread: Controls behavior when all compression
> > -#     threads are currently busy.  If true (default), wait for a free
> > -#     compression thread to become available; otherwise, send the page
> > -#     uncompressed.  (Since 3.1)
> > -#
> > -# @decompress-threads: decompression thread count
> > -#
> > -# @throttle-trigger-threshold: The ratio of bytes_dirty_period and
> > -#     bytes_xfer_period to trigger throttling.  It is expressed as
> > -#     percentage.  The default value is 50. (Since 5.0)
> > -#
> > -# @cpu-throttle-initial: Initial percentage of time guest cpus are
> > -#     throttled when migration auto-converge is activated.  The
> > -#     default value is 20. (Since 2.7)
> > -#
> > -# @cpu-throttle-increment: throttle percentage increase each time
> > -#     auto-converge detects that migration is not making progress.
> > -#     The default value is 10. (Since 2.7)
> > -#
> > -# @cpu-throttle-tailslow: Make CPU throttling slower at tail stage At
> > -#     the tail stage of throttling, the Guest is very sensitive to CPU
> > -#     percentage while the @cpu-throttle -increment is excessive
> > -#     usually at tail stage.  If this parameter is true, we will
> > -#     compute the ideal CPU percentage used by the Guest, which may
> > -#     exactly make the dirty rate match the dirty rate threshold.
> > -#     Then we will choose a smaller throttle increment between the one
> > -#     specified by @cpu-throttle-increment and the one generated by
> > -#     ideal CPU percentage.  Therefore, it is compatible to
> > -#     traditional throttling, meanwhile the throttle increment won't
> > -#     be excessive at tail stage.  The default value is false.  (Since
> > -#     5.1)
> > -#
> > -# @tls-creds: ID of the 'tls-creds' object that provides credentials
> > -#     for establishing a TLS connection over the migration data
> > -#     channel.  On the outgoing side of the migration, the credentials
> > -#     must be for a 'client' endpoint, while for the incoming side the
> > -#     credentials must be for a 'server' endpoint.  Setting this to a
> > -#     non-empty string enables TLS for all migrations.  An empty
> > -#     string means that QEMU will use plain text mode for migration,
> > -#     rather than TLS (Since 2.9) Previously (since 2.7), this was
> > -#     reported by omitting tls-creds instead.
> > -#
> > -# @tls-hostname: hostname of the target host for the migration.  This
> > -#     is required when using x509 based TLS credentials and the
> > -#     migration URI does not already include a hostname.  For example
> > -#     if using fd: or exec: based migration, the hostname must be
> > -#     provided so that the server's x509 certificate identity can be
> > -#     validated.  (Since 2.7) An empty string means that QEMU will use
> > -#     the hostname associated with the migration URI, if any.  (Since
> > -#     2.9) Previously (since 2.7), this was reported by omitting
> > -#     tls-hostname instead.
> > -#
> > -# @max-bandwidth: to set maximum speed for migration.  maximum speed
> > -#     in bytes per second.  (Since 2.8)
> > -#
> > -# @downtime-limit: set maximum tolerated downtime for migration.
> > -#     maximum downtime in milliseconds (Since 2.8)
> > -#
> > -# @x-checkpoint-delay: the delay time between two COLO checkpoints.
> > -#     (Since 2.8)
> > -#
> > -# @block-incremental: Affects how much storage is migrated when the
> > -#     block migration capability is enabled.  When false, the entire
> > -#     storage backing chain is migrated into a flattened image at the
> > -#     destination; when true, only the active qcow2 layer is migrated
> > -#     and the destination must already have access to the same backing
> > -#     chain as was used on the source.  (since 2.10)
> > -#
> > -# @multifd-channels: Number of channels used to migrate data in
> > -#     parallel.  This is the same number that the number of sockets
> > -#     used for migration.  The default value is 2 (since 4.0)
> > -#
> > -# @xbzrle-cache-size: cache size to be used by XBZRLE migration.  It
> > -#     needs to be a multiple of the target page size and a power of 2
> > -#     (Since 2.11)
> > -#
> > -# @max-postcopy-bandwidth: Background transfer bandwidth during
> > -#     postcopy.  Defaults to 0 (unlimited).  In bytes per second.
> > -#     (Since 3.0)
> > -#
> > -# @max-cpu-throttle: maximum cpu throttle percentage.  The default
> > -#     value is 99. (Since 3.1)
> > -#
> > -# @multifd-compression: Which compression method to use.  Defaults to
> > -#     none.  (Since 5.0)
> > -#
> > -# @multifd-zlib-level: Set the compression level to be used in live
> > -#     migration, the compression level is an integer between 0 and 9,
> > -#     where 0 means no compression, 1 means the best compression
> > -#     speed, and 9 means best compression ratio which will consume
> > -#     more CPU. Defaults to 1. (Since 5.0)
> > -#
> > -# @multifd-zstd-level: Set the compression level to be used in live
> > -#     migration, the compression level is an integer between 0 and 20,
> > -#     where 0 means no compression, 1 means the best compression
> > -#     speed, and 20 means best compression ratio which will consume
> > -#     more CPU. Defaults to 1. (Since 5.0)
> > -#
> > -# @block-bitmap-mapping: Maps block nodes and bitmaps on them to
> > -#     aliases for the purpose of dirty bitmap migration.  Such aliases
> > -#     may for example be the corresponding names on the opposite site.
> > -#     The mapping must be one-to-one, but not necessarily complete: On
> > -#     the source, unmapped bitmaps and all bitmaps on unmapped nodes
> > -#     will be ignored.  On the destination, encountering an unmapped
> > -#     alias in the incoming migration stream will result in a report,
> > -#     and all further bitmap migration data will then be discarded.
> > -#     Note that the destination does not know about bitmaps it does
> > -#     not receive, so there is no limitation or requirement regarding
> > -#     the number of bitmaps received, or how they are named, or on
> > -#     which nodes they are placed.  By default (when this parameter
> > -#     has never been set), bitmap names are mapped to themselves.
> > -#     Nodes are mapped to their block device name if there is one, and
> > -#     to their node name otherwise.  (Since 5.2)
> > -#
> > -# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty
> > -#     limit during live migration.  Should be in the range 1 to 1000ms.
> > -#     Defaults to 1000ms.  (Since 8.1)
> > -#
> > -# @vcpu-dirty-limit: Dirtyrate limit (MB/s) during live migration.
> > -#     Defaults to 1.  (Since 8.1)
> > -#
> > -# Features:
> > -#
> > -# @unstable: Members @x-checkpoint-delay and @x-vcpu-dirty-limit-period
> > -#     are experimental.
> > -#
> > -# TODO: either fuse back into MigrationParameters, or make
> > -#     MigrationParameters members mandatory
> > -#
> > -# Since: 2.4
> > -##
> > -{ 'struct': 'MigrateSetParameters',
> > -  'data': { '*announce-initial': 'size',
> > -            '*announce-max': 'size',
> > -            '*announce-rounds': 'size',
> > -            '*announce-step': 'size',
> > -            '*compress-level': 'uint8',
> > -            '*compress-threads': 'uint8',
> > -            '*compress-wait-thread': 'bool',
> > -            '*decompress-threads': 'uint8',
> > -            '*throttle-trigger-threshold': 'uint8',
> > -            '*cpu-throttle-initial': 'uint8',
> > -            '*cpu-throttle-increment': 'uint8',
> > -            '*cpu-throttle-tailslow': 'bool',
> > -            '*tls-creds': 'StrOrNull',
> > -            '*tls-hostname': 'StrOrNull',
> > -            '*tls-authz': 'StrOrNull',
> > -            '*max-bandwidth': 'size',
> > -            '*downtime-limit': 'uint64',
> > -            '*x-checkpoint-delay': { 'type': 'uint32',
> > -                                     'features': [ 'unstable' ] },
> > -            '*block-incremental': 'bool',
> > -            '*multifd-channels': 'uint8',
> > -            '*xbzrle-cache-size': 'size',
> > -            '*max-postcopy-bandwidth': 'size',
> > -            '*max-cpu-throttle': 'uint8',
> > -            '*multifd-compression': 'MultiFDCompression',
> > -            '*multifd-zlib-level': 'uint8',
> > -            '*multifd-zstd-level': 'uint8',
> > -            '*block-bitmap-mapping': [ 'BitmapMigrationNodeAlias' ],
> > -            '*x-vcpu-dirty-limit-period': { 'type': 'uint64',
> > -                                            'features': [ 'unstable' ] },
> > -            '*vcpu-dirty-limit': 'uint64'} }
> > -
> >  ##
> >  # @migrate-set-parameters:
> >  #
> > @@ -1048,7 +865,7 @@
> >  # <- { "return": {} }
> >  ##
> >  { 'command': 'migrate-set-parameters', 'boxed': true,
> > -  'data': 'MigrateSetParameters' }
> > +  'data': 'MigrationParameters' }
> >  
> >  ##
> >  # @MigrationParameters:
> > @@ -1214,9 +1031,9 @@
> >              '*cpu-throttle-initial': 'uint8',
> >              '*cpu-throttle-increment': 'uint8',
> >              '*cpu-throttle-tailslow': 'bool',
> > -            '*tls-creds': 'str',
> > -            '*tls-hostname': 'str',
> > -            '*tls-authz': 'str',
> > +            '*tls-creds': 'StrOrNull',
> > +            '*tls-hostname': 'StrOrNull',
> > +            '*tls-authz': 'StrOrNull',
> >              '*max-bandwidth': 'size',
> >              '*downtime-limit': 'uint64',
> >              '*x-checkpoint-delay': { 'type': 'uint32',
> > diff --git a/include/hw/qdev-properties.h b/include/hw/qdev-properties.h
> > index e1df08876c..3ae83d8390 100644
> > --- a/include/hw/qdev-properties.h
> > +++ b/include/hw/qdev-properties.h
> > @@ -59,6 +59,7 @@ extern const PropertyInfo qdev_prop_uint64_checkmask;
> >  extern const PropertyInfo qdev_prop_int64;
> >  extern const PropertyInfo qdev_prop_size;
> >  extern const PropertyInfo qdev_prop_string;
> > +extern const PropertyInfo qdev_prop_str_or_null;
> >  extern const PropertyInfo qdev_prop_on_off_auto;
> >  extern const PropertyInfo qdev_prop_size32;
> >  extern const PropertyInfo qdev_prop_arraylen;
> > @@ -171,6 +172,8 @@ extern const PropertyInfo qdev_prop_link;
> >      DEFINE_PROP_UNSIGNED(_n, _s, _f, _d, qdev_prop_size, uint64_t)
> >  #define DEFINE_PROP_STRING(_n, _s, _f)             \
> >      DEFINE_PROP(_n, _s, _f, qdev_prop_string, char*)
> > +#define DEFINE_PROP_STR_OR_NULL(_n, _s, _f)             \
> > +    DEFINE_PROP(_n, _s, _f, qdev_prop_str_or_null, StrOrNull *)
> >  #define DEFINE_PROP_ON_OFF_AUTO(_n, _s, _f, _d) \
> >      DEFINE_PROP_SIGNED(_n, _s, _f, _d, qdev_prop_on_off_auto, OnOffAuto)
> >  #define DEFINE_PROP_SIZE32(_n, _s, _f, _d)                       \
> > diff --git a/migration/options.h b/migration/options.h
> > index 045e2a41a2..124a5d450f 100644
> > --- a/migration/options.h
> > +++ b/migration/options.h
> > @@ -56,6 +56,7 @@ bool migrate_zero_copy_send(void);
> >  
> >  bool migrate_multifd_flush_after_each_section(void);
> >  bool migrate_postcopy(void);
> > +/* Check whether TLS is enabled for migration */
> 
> Unrelated :)

Yeah, I'll drop it.

> 
> >  bool migrate_tls(void);
> >  
> >  /* capabilities helpers */
> > @@ -90,6 +91,8 @@ const char *migrate_tls_authz(void);
> >  const char *migrate_tls_creds(void);
> >  const char *migrate_tls_hostname(void);
> >  uint64_t migrate_xbzrle_cache_size(void);
> > +StrOrNull *StrOrNull_from_str(const char *str);
> > +const char *str_from_StrOrNull(StrOrNull *obj);
> >  
> >  /* parameters setters */
> >  
> > diff --git a/hw/core/qdev-properties.c b/hw/core/qdev-properties.c
> > index 357b8761b5..b4bbb52ae9 100644
> > --- a/hw/core/qdev-properties.c
> > +++ b/hw/core/qdev-properties.c
> > @@ -2,6 +2,7 @@
> >  #include "hw/qdev-properties.h"
> >  #include "qapi/error.h"
> >  #include "qapi/qapi-types-misc.h"
> > +#include "qapi/qapi-visit-misc.h"
> >  #include "qapi/qmp/qerror.h"
> >  #include "qemu/ctype.h"
> >  #include "qemu/error-report.h"
> > @@ -490,6 +491,45 @@ const PropertyInfo qdev_prop_string = {
> >      .set   = set_string,
> >  };
> >  
> > +/* --- StrOrNull --- */
> > +
> > +static void release_str_or_null(Object *obj, const char *name, void *opaque)
> > +{
> > +    Property *prop = opaque;
> > +
> > +    qapi_free_StrOrNull(*(StrOrNull **)object_field_prop_ptr(obj, prop));
> > +}
> > +
> > +static void get_str_or_null(Object *obj, Visitor *v, const char *name,
> > +                            void *opaque, Error **errp)
> > +{
> > +    Property *prop = opaque;
> > +    StrOrNull **ptr = object_field_prop_ptr(obj, prop);
> > +
> > +    visit_type_StrOrNull(v, name, ptr, errp);
> > +}
> > +
> > +static void set_str_or_null(Object *obj, Visitor *v, const char *name,
> > +                            void *opaque, Error **errp)
> > +{
> > +    Property *prop = opaque;
> > +    StrOrNull **ptr = object_field_prop_ptr(obj, prop);
> > +    StrOrNull *son;
> > +
> > +    if (!visit_type_StrOrNull(v, name, &son, errp)) {
> > +        return;
> > +    }
> > +    qapi_free_StrOrNull(*ptr);
> > +    *ptr = son;
> > +}
> > +
> > +const PropertyInfo qdev_prop_str_or_null = {
> > +    .name  = "StrOrNull",
> > +    .release = release_str_or_null,
> > +    .get   = get_str_or_null,
> > +    .set   = set_str_or_null,
> > +};
> > +
> >  /* --- on/off/auto --- */
> >  
> >  const PropertyInfo qdev_prop_on_off_auto = {
> > diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c
> > index c115ef2d23..88a8ccb475 100644
> > --- a/migration/migration-hmp-cmds.c
> > +++ b/migration/migration-hmp-cmds.c
> 
> Missing #include "migration/options.h".  Next patch unbreaks the build.

True.. Fixed.

> 
> > @@ -257,6 +257,7 @@ void hmp_info_migrate_capabilities(Monitor *mon, const QDict *qdict)
> >  void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
> >  {
> >      MigrationParameters *params;
> > +    const char *str;
> >  
> >      params = qmp_query_migrate_parameters(NULL);
> >  
> > @@ -309,14 +310,18 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
> >          monitor_printf(mon, "%s: %u\n",
> >              MigrationParameter_str(MIGRATION_PARAMETER_MAX_CPU_THROTTLE),
> >              params->max_cpu_throttle);
> > -        assert(params->tls_creds);
> > +        str = str_from_StrOrNull(params->tls_creds);
> >          monitor_printf(mon, "%s: '%s'\n",
> >              MigrationParameter_str(MIGRATION_PARAMETER_TLS_CREDS),
> > -            params->tls_creds);
> > -        assert(params->tls_hostname);
> > +            str ? str : "");
> > +        str = str_from_StrOrNull(params->tls_hostname);
> >          monitor_printf(mon, "%s: '%s'\n",
> >              MigrationParameter_str(MIGRATION_PARAMETER_TLS_HOSTNAME),
> > -            params->tls_hostname);
> > +            str ? str : "");
> > +        str = str_from_StrOrNull(params->tls_authz);
> > +        monitor_printf(mon, "%s: '%s'\n",
> > +            MigrationParameter_str(MIGRATION_PARAMETER_TLS_AUTHZ),
> > +            str ? str : "");
> >          assert(params->has_max_bandwidth);
> >          monitor_printf(mon, "%s: %" PRIu64 " bytes/second\n",
> >              MigrationParameter_str(MIGRATION_PARAMETER_MAX_BANDWIDTH),
> > @@ -345,9 +350,6 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
> >          monitor_printf(mon, "%s: %" PRIu64 "\n",
> >              MigrationParameter_str(MIGRATION_PARAMETER_MAX_POSTCOPY_BANDWIDTH),
> >              params->max_postcopy_bandwidth);
> > -        monitor_printf(mon, "%s: '%s'\n",
> > -            MigrationParameter_str(MIGRATION_PARAMETER_TLS_AUTHZ),
> > -            params->tls_authz);
> >  
> >          if (params->has_block_bitmap_mapping) {
> >              const BitmapMigrationNodeAliasList *bmnal;
> > @@ -497,7 +499,7 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
> >      const char *param = qdict_get_str(qdict, "parameter");
> >      const char *valuestr = qdict_get_str(qdict, "value");
> >      Visitor *v = string_input_visitor_new(valuestr);
> > -    MigrateSetParameters *p = g_new0(MigrateSetParameters, 1);
> > +    MigrationParameters *p = g_new0(MigrationParameters, 1);
> >      uint64_t valuebw = 0;
> >      uint64_t cache_size;
> >      Error *err = NULL;
> > @@ -657,7 +659,7 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
> >      qmp_migrate_set_parameters(p, &err);
> >  
> >   cleanup:
> > -    qapi_free_MigrateSetParameters(p);
> > +    qapi_free_MigrationParameters(p);
> >      visit_free(v);
> >      hmp_handle_error(mon, err);
> >  }
> > diff --git a/migration/options.c b/migration/options.c
> > index 6bbfd4853d..12e392f68c 100644
> > --- a/migration/options.c
> > +++ b/migration/options.c
> > @@ -164,9 +164,12 @@ Property migration_properties[] = {
> >      DEFINE_PROP_SIZE("announce-step", MigrationState,
> >                        parameters.announce_step,
> >                        DEFAULT_MIGRATE_ANNOUNCE_STEP),
> > -    DEFINE_PROP_STRING("tls-creds", MigrationState, parameters.tls_creds),
> > -    DEFINE_PROP_STRING("tls-hostname", MigrationState, parameters.tls_hostname),
> > -    DEFINE_PROP_STRING("tls-authz", MigrationState, parameters.tls_authz),
> > +    DEFINE_PROP_STR_OR_NULL("tls-creds", MigrationState,
> > +                            parameters.tls_creds),
> > +    DEFINE_PROP_STR_OR_NULL("tls-hostname", MigrationState,
> > +                            parameters.tls_hostname),
> > +    DEFINE_PROP_STR_OR_NULL("tls-authz", MigrationState,
> > +                            parameters.tls_authz),
> 
> The qdev machinery now additionally accepts JSON null as property
> value.
> 
> If that's undesirable, we need to reject it.

I don't think we have a need to pass in null here, not to mention this is
only for debugging purpose.

The real problem here, IMHO, is current patch will crash if someone
specifies -global migration.tls_* fields..

Unfortunately I'm not an expert on qapi.  Markus, does something like this
look like a valid fix to you?

===8<===
commit 19758cbaa99c0bad634babdb6f59f23bf0de7b75
Author: Peter Xu <peterx@redhat.com>
Date:   Mon Oct 2 14:26:23 2023 -0400

    qapi: Allow the string visitor to run on StrOrNull
    
    Unlike most of the existing types, StrOrNull needs to manage its own
    allocations.  Provide string_input_start_alternate() for the string visitor
    so that things like StrOrNull will start to work.
    
    Signed-off-by: Peter Xu <peterx@redhat.com>
---
 qapi/string-input-visitor.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/qapi/string-input-visitor.c b/qapi/string-input-visitor.c
index 197139c1c0..263e26596c 100644
--- a/qapi/string-input-visitor.c
+++ b/qapi/string-input-visitor.c
@@ -387,6 +387,15 @@ static void string_input_free(Visitor *v)
     g_free(siv);
 }
 
+static bool string_input_start_alternate(Visitor *v, const char *name,
+                                         GenericAlternate **obj, size_t size,
+                                         Error **errp)
+{
+    *obj = g_malloc0(size);
+    (*obj)->type = QTYPE_QSTRING;
+    return true;
+}
+
 Visitor *string_input_visitor_new(const char *str)
 {
     StringInputVisitor *v;
@@ -407,6 +416,7 @@ Visitor *string_input_visitor_new(const char *str)
     v->visitor.check_list = check_list;
     v->visitor.end_list = end_list;
     v->visitor.free = string_input_free;
+    v->visitor.start_alternate = string_input_start_alternate;
 
     v->string = str;
     v->lm = LM_NONE;
===8<===

> 
> If it's desirable, we need to document it, and we should probably make
> it a separate patch.
> 
> To answer the question whether it's desirable, we need to recall the
> purpose of the properties.  They go back to
> 
> commit e5cb7e7677010f529d3f0f9dcdb385dea9446f8d
> Author: Peter Xu <peterx@redhat.com>
> Date:   Tue Jun 27 12:10:13 2017 +0800
> 
>     migration: let MigrationState be a qdev
>     
>     Let the old man "MigrationState" join the object family. Direct benefit
>     is that we can start to use all the property features derived from
>     current QDev, like: HW_COMPAT_* bits, command line setup for migration
>     parameters (so will never need to set them up each time using HMP/QMP,
>     this is really, really attractive for test writters), etc.
>     
>     I see no reason to disallow this happen yet. So let's start from this
>     one, to see whether it would be anything good.
>     
>     Now we init the MigrationState struct statically in main() to make sure
>     it's initialized after global properties are applied, since we'll use
>     them during creation of the object.
>     
>     No functional change at all.
>     
>     Reviewed-by: Juan Quintela <quintela@redhat.com>
>     Signed-off-by: Peter Xu <peterx@redhat.com>
>     Message-Id: <1498536619-14548-5-git-send-email-peterx@redhat.com>
>     Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
>     Signed-off-by: Juan Quintela <quintela@redhat.com>
> 
> >      DEFINE_PROP_UINT64("x-vcpu-dirty-limit-period", MigrationState,
> >                         parameters.x_vcpu_dirty_limit_period,
> >                         DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD),
> > @@ -201,6 +204,38 @@ Property migration_properties[] = {
> >      DEFINE_PROP_END_OF_LIST(),
> >  };
> >  
> > +/*
> > + * NOTE: here we have a trick when converting the empty string (""): we
> > + * need to make sure the empty string ("") will be converted to NULL, as
> > + * some TLS code may rely on that to detect whether something is enabled
> > + * (e.g., the tls_authz field).
> 
> Suggest
> 
>     * Map JSON null and JSON "" to NULL, other JSON strings to the string.
>     * The curious special treatment of "" is necessary, because it needs
>     * to be interpreted like JSON null for backward compatibility.

Sure.

> 
> > + */
> > +const char *str_from_StrOrNull(StrOrNull *obj)
> > +{
> > +    if (!obj || obj->type == QTYPE_QNULL) {
> > +        return NULL;
> > +    } else if (obj->type == QTYPE_QSTRING) {
> > +        if (obj->u.s[0] == '\0') {
> > +            return NULL;
> > +        } else {
> > +            return obj->u.s;
> > +        }
> > +    } else {
> > +        abort();
> > +    }
> > +}
> > +
> > +StrOrNull *StrOrNull_from_str(const char *str)
> > +{
> > +    StrOrNull *obj = g_new0(StrOrNull, 1);
> > +
> > +    assert(str);
> > +    obj->type = QTYPE_QSTRING;
> > +    obj->u.s = g_strdup(str);
> > +
> > +    return obj;
> > +}
> > +
> >  bool migrate_auto_converge(void)
> >  {
> >      MigrationState *s = migrate_get_current();
> > @@ -378,9 +413,11 @@ bool migrate_postcopy(void)
> >  
> >  bool migrate_tls(void)
> >  {
> > -    MigrationState *s = migrate_get_current();
> > -
> > -    return s->parameters.tls_creds && *s->parameters.tls_creds;
> > +    /*
> > +     * The whole TLS feature relies on a non-empty tls-creds set first.
> > +     * It's disabled otherwise.
> > +     */
> 
> Suggest
> 
>        /* TLS is enabled when @tls-creds is non-null */

But this is not the fact.. we probably need to say "non-null and non-empty
string".  I'll drop it in the new version directly, as I decided to rely
that on the comment above str_from_StrOrNull().  Let me know if you have
other preference.

>   
> > +    return migrate_tls_creds();
> >  }
> >  
> >  typedef enum WriteTrackingSupport {
> > @@ -827,21 +864,21 @@ const char *migrate_tls_authz(void)
> >  {
> >      MigrationState *s = migrate_get_current();
> >  
> > -    return s->parameters.tls_authz;
> > +    return str_from_StrOrNull(s->parameters.tls_authz);
> >  }
> 
> This maps "" to null on use of .tls_authz.  Direct use is wrong.
> 
> The alternative is to map it when it's set.  Then setting it must go
> through a setter that maps, and direct use is fine.  Observation, not a
> demand.

Right.  I plan to leave it there until later, though, and make sure all
reference to it will be correct (using str_from_StrOrNull if needed, or
migrate_tls_*).

> 
> >  
> >  const char *migrate_tls_creds(void)
> >  {
> >      MigrationState *s = migrate_get_current();
> >  
> > -    return s->parameters.tls_creds;
> > +    return str_from_StrOrNull(s->parameters.tls_creds);
> >  }
> >  
> >  const char *migrate_tls_hostname(void)
> >  {
> >      MigrationState *s = migrate_get_current();
> >  
> > -    return s->parameters.tls_hostname;
> > +    return str_from_StrOrNull(s->parameters.tls_hostname);
> >  }
> 
> Likewise.
> 
> >  
> >  uint64_t migrate_xbzrle_cache_size(void)
> > @@ -911,10 +948,9 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp)
> >      params->cpu_throttle_increment = s->parameters.cpu_throttle_increment;
> >      params->has_cpu_throttle_tailslow = true;
> >      params->cpu_throttle_tailslow = s->parameters.cpu_throttle_tailslow;
> > -    params->tls_creds = g_strdup(s->parameters.tls_creds);
> > -    params->tls_hostname = g_strdup(s->parameters.tls_hostname);
> > -    params->tls_authz = g_strdup(s->parameters.tls_authz ?
> > -                                 s->parameters.tls_authz : "");
> > +    params->tls_creds = QAPI_CLONE(StrOrNull, s->parameters.tls_creds);
> > +    params->tls_hostname = QAPI_CLONE(StrOrNull, s->parameters.tls_hostname);
> > +    params->tls_authz = QAPI_CLONE(StrOrNull, s->parameters.tls_authz);
> >      params->has_max_bandwidth = true;
> >      params->max_bandwidth = s->parameters.max_bandwidth;
> >      params->has_downtime_limit = true;
> > @@ -963,8 +999,9 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp)
> >  
> >  void migrate_params_init(MigrationParameters *params)
> >  {
> > -    params->tls_hostname = g_strdup("");
> > -    params->tls_creds = g_strdup("");
> > +    params->tls_hostname = StrOrNull_from_str("");
> > +    params->tls_creds = StrOrNull_from_str("");
> > +    params->tls_authz = StrOrNull_from_str("");
> >  
> >      /* Set has_* up only for parameter checks */
> >      params->has_compress_level = true;
> > @@ -1145,7 +1182,7 @@ bool migrate_params_check(MigrationParameters *params, Error **errp)
> >  #ifdef CONFIG_LINUX
> >      if (migrate_zero_copy_send() &&
> >          ((params->has_multifd_compression && params->multifd_compression) ||
> > -         (params->tls_creds && *params->tls_creds))) {
> > +         migrate_tls())) {
> 
> Correct only if params == current_migration!  Are they equal?

No!

I'll fix that with str_from_StrOrNull().  Definitely not pretty, but I want
to avoid growing this set so I can go back to the
avail-switchover-bandwidth patch soon.  Let me know if you have other
suggestions.

> 
> >          error_setg(errp,
> >                     "Zero copy only available for non-compressed non-TLS multifd migration");
> >          return false;
> > @@ -1172,113 +1209,7 @@ bool migrate_params_check(MigrationParameters *params, Error **errp)
> >      return true;
> >  }
> >  
> > -static void migrate_params_test_apply(MigrateSetParameters *params,
> > -                                      MigrationParameters *dest)
> > -{
> > -    *dest = migrate_get_current()->parameters;
> > -
> > -    /* TODO use QAPI_CLONE() instead of duplicating it inline */
> > -
> > -    if (params->has_compress_level) {
> > -        dest->compress_level = params->compress_level;
> > -    }
> > -
> > -    if (params->has_compress_threads) {
> > -        dest->compress_threads = params->compress_threads;
> > -    }
> > -
> > -    if (params->has_compress_wait_thread) {
> > -        dest->compress_wait_thread = params->compress_wait_thread;
> > -    }
> > -
> > -    if (params->has_decompress_threads) {
> > -        dest->decompress_threads = params->decompress_threads;
> > -    }
> > -
> > -    if (params->has_throttle_trigger_threshold) {
> > -        dest->throttle_trigger_threshold = params->throttle_trigger_threshold;
> > -    }
> > -
> > -    if (params->has_cpu_throttle_initial) {
> > -        dest->cpu_throttle_initial = params->cpu_throttle_initial;
> > -    }
> > -
> > -    if (params->has_cpu_throttle_increment) {
> > -        dest->cpu_throttle_increment = params->cpu_throttle_increment;
> > -    }
> > -
> > -    if (params->has_cpu_throttle_tailslow) {
> > -        dest->cpu_throttle_tailslow = params->cpu_throttle_tailslow;
> > -    }
> > -
> > -    if (params->tls_creds) {
> > -        assert(params->tls_creds->type == QTYPE_QSTRING);
> > -        dest->tls_creds = params->tls_creds->u.s;
> > -    }
> > -
> > -    if (params->tls_hostname) {
> > -        assert(params->tls_hostname->type == QTYPE_QSTRING);
> > -        dest->tls_hostname = params->tls_hostname->u.s;
> > -    }
> > -
> > -    if (params->has_max_bandwidth) {
> > -        dest->max_bandwidth = params->max_bandwidth;
> > -    }
> > -
> > -    if (params->has_downtime_limit) {
> > -        dest->downtime_limit = params->downtime_limit;
> > -    }
> > -
> > -    if (params->has_x_checkpoint_delay) {
> > -        dest->x_checkpoint_delay = params->x_checkpoint_delay;
> > -    }
> > -
> > -    if (params->has_block_incremental) {
> > -        dest->block_incremental = params->block_incremental;
> > -    }
> > -    if (params->has_multifd_channels) {
> > -        dest->multifd_channels = params->multifd_channels;
> > -    }
> > -    if (params->has_multifd_compression) {
> > -        dest->multifd_compression = params->multifd_compression;
> > -    }
> > -    if (params->has_xbzrle_cache_size) {
> > -        dest->xbzrle_cache_size = params->xbzrle_cache_size;
> > -    }
> > -    if (params->has_max_postcopy_bandwidth) {
> > -        dest->max_postcopy_bandwidth = params->max_postcopy_bandwidth;
> > -    }
> > -    if (params->has_max_cpu_throttle) {
> > -        dest->max_cpu_throttle = params->max_cpu_throttle;
> > -    }
> > -    if (params->has_announce_initial) {
> > -        dest->announce_initial = params->announce_initial;
> > -    }
> > -    if (params->has_announce_max) {
> > -        dest->announce_max = params->announce_max;
> > -    }
> > -    if (params->has_announce_rounds) {
> > -        dest->announce_rounds = params->announce_rounds;
> > -    }
> > -    if (params->has_announce_step) {
> > -        dest->announce_step = params->announce_step;
> > -    }
> > -
> > -    if (params->has_block_bitmap_mapping) {
> > -        dest->has_block_bitmap_mapping = true;
> > -        dest->block_bitmap_mapping = params->block_bitmap_mapping;
> > -    }
> > -
> > -    if (params->has_x_vcpu_dirty_limit_period) {
> > -        dest->x_vcpu_dirty_limit_period =
> > -            params->x_vcpu_dirty_limit_period;
> > -    }
> > -    if (params->has_vcpu_dirty_limit) {
> > -        dest->vcpu_dirty_limit = params->vcpu_dirty_limit;
> > -    }
> > -}
> 
> I believe your commit message tries to explain why this function is no
> longer needed.  But it's still not obvious to me.
> 
> I suspect this patch is doing too much.

Not only too much, but wrong.. thanks for asking this question.

I thought it was only for converting the different objects, but not really;
it also merges existing parameters with the new ones, so that when check we
do it over all the parameters and try to find conflicts.

Only checking on the MigrationParameters* may not always expose issue where
one new parameter conflicts with an old one even if there is an issue.

I'll revive this function (so hopefully also better integrity of this
patch), and I'll start to use QAPI_CLONE() here because now with identical
type of objects we can do that. We still need to keep most of the lines,
though.. to merge two sets of parameters.

> 
> > -
> > -static void migrate_params_apply(MigrateSetParameters *params, Error **errp)
> > +static void migrate_params_apply(MigrationParameters *params, Error **errp)
> >  {
> >      MigrationState *s = migrate_get_current();
> >  
> > @@ -1317,21 +1248,18 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp)
> >      }
> >  
> >      if (params->tls_creds) {
> > -        g_free(s->parameters.tls_creds);
> > -        assert(params->tls_creds->type == QTYPE_QSTRING);
> > -        s->parameters.tls_creds = g_strdup(params->tls_creds->u.s);
> > +        qapi_free_StrOrNull(s->parameters.tls_creds);
> > +        s->parameters.tls_creds = QAPI_CLONE(StrOrNull, params->tls_creds);
> >      }
> >  
> >      if (params->tls_hostname) {
> > -        g_free(s->parameters.tls_hostname);
> > -        assert(params->tls_hostname->type == QTYPE_QSTRING);
> > -        s->parameters.tls_hostname = g_strdup(params->tls_hostname->u.s);
> > +        qapi_free_StrOrNull(s->parameters.tls_hostname);
> > +        s->parameters.tls_hostname = QAPI_CLONE(StrOrNull, params->tls_hostname);
> >      }
> >  
> >      if (params->tls_authz) {
> > -        g_free(s->parameters.tls_authz);
> > -        assert(params->tls_authz->type == QTYPE_QSTRING);
> > -        s->parameters.tls_authz = g_strdup(params->tls_authz->u.s);
> > +        qapi_free_StrOrNull(s->parameters.tls_authz);
> > +        s->parameters.tls_authz = QAPI_CLONE(StrOrNull, params->tls_authz);
> >      }
> >  
> >      if (params->has_max_bandwidth) {
> > @@ -1404,33 +1332,9 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp)
> >      }
> >  }
> >  
> > -void qmp_migrate_set_parameters(MigrateSetParameters *params, Error **errp)
> > +void qmp_migrate_set_parameters(MigrationParameters *params, Error **errp)
> >  {
> > -    MigrationParameters tmp;
> > -
> > -    /* TODO Rewrite "" to null instead for all three tls_* parameters */
> > -    if (params->tls_creds
> > -        && params->tls_creds->type == QTYPE_QNULL) {
> > -        qobject_unref(params->tls_creds->u.n);
> > -        params->tls_creds->type = QTYPE_QSTRING;
> > -        params->tls_creds->u.s = strdup("");
> > -    }
> > -    if (params->tls_hostname
> > -        && params->tls_hostname->type == QTYPE_QNULL) {
> > -        qobject_unref(params->tls_hostname->u.n);
> > -        params->tls_hostname->type = QTYPE_QSTRING;
> > -        params->tls_hostname->u.s = strdup("");
> > -    }
> > -    if (params->tls_authz
> > -        && params->tls_authz->type == QTYPE_QNULL) {
> > -        qobject_unref(params->tls_authz->u.n);
> > -        params->tls_authz->type = QTYPE_QSTRING;
> > -        params->tls_authz->u.s = strdup("");
> > -    }
> > -
> > -    migrate_params_test_apply(params, &tmp);
> > -
> > -    if (!migrate_params_check(&tmp, errp)) {
> > +    if (!migrate_params_check(params, errp)) {
> >          /* Invalid parameter */
> >          return;
> >      }
> > diff --git a/migration/tls.c b/migration/tls.c
> > index fa03d9136c..c2ed4ff557 100644
> > --- a/migration/tls.c
> > +++ b/migration/tls.c
> > @@ -126,7 +126,8 @@ QIOChannelTLS *migration_tls_client_create(QIOChannel *ioc,
> >      }
> >  
> >      const char *tls_hostname = migrate_tls_hostname();
> > -    if (tls_hostname && *tls_hostname) {
> > +    /* If tls_hostname, then it must be non-empty string already */
> 
> I'm not sure the comment makes sense outside this commit's context.

I can drop it.

> 
> > +    if (tls_hostname) {
> >          hostname = tls_hostname;
> >      }
> 

Thanks,
Markus Armbruster Oct. 9, 2023, 12:25 p.m. UTC | #3
I apologize for the late reply.

Peter Xu <peterx@redhat.com> writes:

> On Tue, Sep 26, 2023 at 10:40:27PM +0200, Markus Armbruster wrote:
>> Peter Xu <peterx@redhat.com> writes:
>> 
>> > Quotting from Markus in his replies:
>> 
>> Quoting
>> 
>> Suggest something like "Markus recently wrote:"
>
> Will do.
>
>> 
>> >   migrate-set-parameters sets migration parameters, and
>> >   query-migrate-parameters gets them.  Unsurprisingly, the former's
>> >   argument type MigrateSetParameters is quite close to the latter's
>> >   return type MigrationParameters.  The differences are subtle:
>> >
>> >   1. Since migrate-set-parameters supports setting selected parameters,
>> >      its arguments must all be optional (so you can omit the ones you
>> >      don't want to change).  query-migrate-parameters results are also
>> >      all optional, but almost all of them are in fact always present.
>> >
>> >   2. For parameters @tls_creds, @tls_hostname, @tls_authz,
>> >      migrate-set-parameters interprets special value "" as "reset to
>> >      default".  Works, because "" is semantically invalid.  Not a
>> >      general solution, because a semantically invalid value need not
>> >      exist.  Markus added a general solution in commit 01fa559826
>> >      ("migration: Use JSON null instead of "" to reset parameter to
>> >      default").  This involved changing the type from 'str' to
>> >      'StrOrNull'.
>> >
>> >   3. When parameter @block-bitmap-mapping has not been set,
>> >      query-migrate-parameters does not return it (absent optional
>> >      member).  Clean (but undocumented).  When parameters @tls_creds,
>> >      @tls_hostname, @tls_authz have not been set, it returns the
>> >      semantically invalid value "".  Not so clean (and just as
>> >      undocumented).
>> >
>> > Here to deduplicate the two objects: keep @MigrationParameters as the name
>> > of object to use in both places, drop @MigrateSetParameters, at the
>> > meantime switch types of @tls* fields from "str" to "StrOrNull" types.
>> 
>> Suggest
>> 
>>   Eliminate the duplication follows.
>> 
>>   Change MigrationParameters members @tls_creds, @tls_hostname,
>>   @tls_authz to StrOrNull.  query-migrate-parameters will of course
>>   never return null.
>> 
>>   Since these members are qdev properties, we need property machinery
>>   for StrOrNull: DEFINE_PROP_STR_OR_NULL().
>> 
>>   Switch migrate-set-parameters to MigrationParameters, and delete
>>   MigrateSetParameters.
>
> Will do.
>
>> 
>> Can you make this patch do just this de-duplication, and everything else
>> (described below) separately?
>
> One thing I can do is to move qdev_prop_str_or_null impl (from you) into a
> separate patch.   I can drop some unnecessary changes too when possible,
> not yet sure what else I can split, but I can try once there.

Suggest to give it a quick try, then see whether you like the resulting
split better than what you have now.

>> > I found that the TLS code wasn't so much relying on tls_* fields being
>> > non-NULL at all.  Actually on the other way round: if we set tls_authz to
>> > an empty string (NOTE: currently, migrate_init() missed initializing
>> > tls_authz; also touched it up in this patch), we can already fail one of
>> > the migration-test (tls/x509/default-host), as qauthz_is_allowed_by_id()
>> > will assume tls_authz set even if tls_auths is an empty string.
>> >
>> > It means we're actually relying on tls_* fields being NULL even if it's the
>> > empty string.
>> >
>> > Let's just make it a rule to return NULL for empty string on these fields
>> > internally.  For that, when converting a StrOrNull into a char* (where we
>> > introduced a helper here in this patch) we'll also make the empty string to
>> > be NULL, to make it always work.  And it doesn't show any issue either when
>> > applying that logic to both tls_creds and tls_hostname.
>> >
>> > With above, we can safely change both migration_tls_client_create() and
>> > migrate_tls() to not check the empty string too finally.. not needed
>> > anymore.
>> >
>> > Also, we can drop the hackish conversions in qmp_migrate_set_parameters()
>> > where we want to make sure it's a QSTRING; it's not needed now.
>> >
>> > This greatly deduplicates the code not only in qapi/migration.json, but
>> > also in the generic migration code.
>> >
>> > Markus helped greatly with this patch.  Besides a better commit
>> > message (where I just "stole" from the reply), debugged and resolved a
>> > double free, but also provided the StrOrNull property implementation to be
>> > used in MigrationState object when switching tls_* fields to StrOrNull.
>> >
>> > Co-developed-by: Markus Armbruster <armbru@redhat.com>
>> > Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
>> > Signed-off-by: Peter Xu <peterx@redhat.com>

[...]

>> > diff --git a/migration/options.c b/migration/options.c
>> > index 6bbfd4853d..12e392f68c 100644
>> > --- a/migration/options.c
>> > +++ b/migration/options.c
>> > @@ -164,9 +164,12 @@ Property migration_properties[] = {
>> >      DEFINE_PROP_SIZE("announce-step", MigrationState,
>> >                        parameters.announce_step,
>> >                        DEFAULT_MIGRATE_ANNOUNCE_STEP),
>> > -    DEFINE_PROP_STRING("tls-creds", MigrationState, parameters.tls_creds),
>> > -    DEFINE_PROP_STRING("tls-hostname", MigrationState, parameters.tls_hostname),
>> > -    DEFINE_PROP_STRING("tls-authz", MigrationState, parameters.tls_authz),
>> > +    DEFINE_PROP_STR_OR_NULL("tls-creds", MigrationState,
>> > +                            parameters.tls_creds),
>> > +    DEFINE_PROP_STR_OR_NULL("tls-hostname", MigrationState,
>> > +                            parameters.tls_hostname),
>> > +    DEFINE_PROP_STR_OR_NULL("tls-authz", MigrationState,
>> > +                            parameters.tls_authz),
>> 
>> The qdev machinery now additionally accepts JSON null as property
>> value.
>> 
>> If that's undesirable, we need to reject it.
>
> I don't think we have a need to pass in null here, not to mention this is
> only for debugging purpose.

Not sure I understand the bit about debugging.

The point I was trying to make is this.  Before the patch, we reject
attempts to set the property value to null.  Afterwards, we accept them,
i.e. the patch loses "reject null property value".  If this loss is
undesirable, we better replace it with suitable hand-written code.

> The real problem here, IMHO, is current patch will crash if someone
> specifies -global migration.tls_* fields..

Trips this assertion:

    bool visit_start_alternate(Visitor *v, const char *name,
                               GenericAlternate **obj, size_t size,
                               Error **errp)
    {
        bool ok;

        assert(obj && size >= sizeof(GenericAlternate));
        assert(!(v->type & VISITOR_OUTPUT) || *obj);
        trace_visit_start_alternate(v, name, obj, size);
        if (!v->start_alternate) {
--->        assert(!(v->type & VISITOR_INPUT));
            return true;
        }
        ok = v->start_alternate(v, name, obj, size, errp);
        if (v->type & VISITOR_INPUT) {
            assert(ok != !*obj);
        }
        return ok;
    }

Documented with the start_alternate() method in visitor-impl.h:

        /* Must be set by input and clone visitors to visit alternates */
        bool (*start_alternate)(Visitor *v, const char *name,
                                GenericAlternate **obj, size_t size,
                                Error **errp);

Known restriction of the string input visitor:

    /*
     * The string input visitor does not implement support for visiting
     * QAPI structs, alternates, null, or arbitrary QTypes. Only flat lists
     * of integers (except type "size") are supported.
     */
    Visitor *string_input_visitor_new(const char *str);

A similar restriction is documented for the string output visitor.

> Unfortunately I'm not an expert on qapi.  Markus, does something like this
> look like a valid fix to you?

I'm not sure whether your simple patch is sufficient for lifting the
restriction.  Needs a deeper think, I'm afraid.  Can we make progress on
the remainder of the series in parallel?

I wish we could get rid of the string visitors.

> ===8<===
> commit 19758cbaa99c0bad634babdb6f59f23bf0de7b75
> Author: Peter Xu <peterx@redhat.com>
> Date:   Mon Oct 2 14:26:23 2023 -0400
>
>     qapi: Allow the string visitor to run on StrOrNull
>     
>     Unlike most of the existing types, StrOrNull needs to manage its own
>     allocations.  Provide string_input_start_alternate() for the string visitor
>     so that things like StrOrNull will start to work.
>     
>     Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  qapi/string-input-visitor.c | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> diff --git a/qapi/string-input-visitor.c b/qapi/string-input-visitor.c
> index 197139c1c0..263e26596c 100644
> --- a/qapi/string-input-visitor.c
> +++ b/qapi/string-input-visitor.c
> @@ -387,6 +387,15 @@ static void string_input_free(Visitor *v)
>      g_free(siv);
>  }
>  
> +static bool string_input_start_alternate(Visitor *v, const char *name,
> +                                         GenericAlternate **obj, size_t size,
> +                                         Error **errp)
> +{
> +    *obj = g_malloc0(size);
> +    (*obj)->type = QTYPE_QSTRING;
> +    return true;
> +}
> +
>  Visitor *string_input_visitor_new(const char *str)
>  {
>      StringInputVisitor *v;
> @@ -407,6 +416,7 @@ Visitor *string_input_visitor_new(const char *str)
>      v->visitor.check_list = check_list;
>      v->visitor.end_list = end_list;
>      v->visitor.free = string_input_free;
> +    v->visitor.start_alternate = string_input_start_alternate;
>  
>      v->string = str;
>      v->lm = LM_NONE;
> ===8<===
>
>>
>> If it's desirable, we need to document it, and we should probably make
>> it a separate patch.
>> 
>> To answer the question whether it's desirable, we need to recall the
>> purpose of the properties.  They go back to
>> 
>> commit e5cb7e7677010f529d3f0f9dcdb385dea9446f8d
>> Author: Peter Xu <peterx@redhat.com>
>> Date:   Tue Jun 27 12:10:13 2017 +0800
>> 
>>     migration: let MigrationState be a qdev
>>     
>>     Let the old man "MigrationState" join the object family. Direct benefit
>>     is that we can start to use all the property features derived from
>>     current QDev, like: HW_COMPAT_* bits, command line setup for migration
>>     parameters (so will never need to set them up each time using HMP/QMP,
>>     this is really, really attractive for test writters), etc.
>>     
>>     I see no reason to disallow this happen yet. So let's start from this
>>     one, to see whether it would be anything good.
>>     
>>     Now we init the MigrationState struct statically in main() to make sure
>>     it's initialized after global properties are applied, since we'll use
>>     them during creation of the object.
>>     
>>     No functional change at all.
>>     
>>     Reviewed-by: Juan Quintela <quintela@redhat.com>
>>     Signed-off-by: Peter Xu <peterx@redhat.com>
>>     Message-Id: <1498536619-14548-5-git-send-email-peterx@redhat.com>
>>     Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
>>     Signed-off-by: Juan Quintela <quintela@redhat.com>
>> 
>> >      DEFINE_PROP_UINT64("x-vcpu-dirty-limit-period", MigrationState,
>> >                         parameters.x_vcpu_dirty_limit_period,
>> >                         DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD),
>> > @@ -201,6 +204,38 @@ Property migration_properties[] = {
>> >      DEFINE_PROP_END_OF_LIST(),
>> >  };
>> >  
>> > +/*
>> > + * NOTE: here we have a trick when converting the empty string (""): we
>> > + * need to make sure the empty string ("") will be converted to NULL, as
>> > + * some TLS code may rely on that to detect whether something is enabled
>> > + * (e.g., the tls_authz field).
>> 
>> Suggest
>> 
>>     * Map JSON null and JSON "" to NULL, other JSON strings to the string.
>>     * The curious special treatment of "" is necessary, because it needs
>>     * to be interpreted like JSON null for backward compatibility.
>
> Sure.
>
>> 
>> > + */
>> > +const char *str_from_StrOrNull(StrOrNull *obj)
>> > +{
>> > +    if (!obj || obj->type == QTYPE_QNULL) {
>> > +        return NULL;
>> > +    } else if (obj->type == QTYPE_QSTRING) {
>> > +        if (obj->u.s[0] == '\0') {
>> > +            return NULL;
>> > +        } else {
>> > +            return obj->u.s;
>> > +        }
>> > +    } else {
>> > +        abort();
>> > +    }
>> > +}
>> > +
>> > +StrOrNull *StrOrNull_from_str(const char *str)
>> > +{
>> > +    StrOrNull *obj = g_new0(StrOrNull, 1);
>> > +
>> > +    assert(str);
>> > +    obj->type = QTYPE_QSTRING;
>> > +    obj->u.s = g_strdup(str);
>> > +
>> > +    return obj;
>> > +}
>> > +
>> >  bool migrate_auto_converge(void)
>> >  {
>> >      MigrationState *s = migrate_get_current();
>> > @@ -378,9 +413,11 @@ bool migrate_postcopy(void)
>> >  
>> >  bool migrate_tls(void)
>> >  {
>> > -    MigrationState *s = migrate_get_current();
>> > -
>> > -    return s->parameters.tls_creds && *s->parameters.tls_creds;
>> > +    /*
>> > +     * The whole TLS feature relies on a non-empty tls-creds set first.
>> > +     * It's disabled otherwise.
>> > +     */
>> 
>> Suggest
>> 
>>        /* TLS is enabled when @tls-creds is non-null */
>
> But this is not the fact.. we probably need to say "non-null and non-empty
> string".  I'll drop it in the new version directly, as I decided to rely
> that on the comment above str_from_StrOrNull().  Let me know if you have
> other preference.

I'll check the new version.

>> > +    return migrate_tls_creds();
>> >  }
>> >  
>> >  typedef enum WriteTrackingSupport {
>> > @@ -827,21 +864,21 @@ const char *migrate_tls_authz(void)
>> >  {
>> >      MigrationState *s = migrate_get_current();
>> >  
>> > -    return s->parameters.tls_authz;
>> > +    return str_from_StrOrNull(s->parameters.tls_authz);
>> >  }
>> 
>> This maps "" to null on use of .tls_authz.  Direct use is wrong.
>> 
>> The alternative is to map it when it's set.  Then setting it must go
>> through a setter that maps, and direct use is fine.  Observation, not a
>> demand.
>
> Right.  I plan to leave it there until later, though, and make sure all
> reference to it will be correct (using str_from_StrOrNull if needed, or
> migrate_tls_*).
>
>> 
>> >  
>> >  const char *migrate_tls_creds(void)
>> >  {
>> >      MigrationState *s = migrate_get_current();
>> >  
>> > -    return s->parameters.tls_creds;
>> > +    return str_from_StrOrNull(s->parameters.tls_creds);
>> >  }
>> >  
>> >  const char *migrate_tls_hostname(void)
>> >  {
>> >      MigrationState *s = migrate_get_current();
>> >  
>> > -    return s->parameters.tls_hostname;
>> > +    return str_from_StrOrNull(s->parameters.tls_hostname);
>> >  }
>> 
>> Likewise.
>> 
>> >  
>> >  uint64_t migrate_xbzrle_cache_size(void)
>> > @@ -911,10 +948,9 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp)
>> >      params->cpu_throttle_increment = s->parameters.cpu_throttle_increment;
>> >      params->has_cpu_throttle_tailslow = true;
>> >      params->cpu_throttle_tailslow = s->parameters.cpu_throttle_tailslow;
>> > -    params->tls_creds = g_strdup(s->parameters.tls_creds);
>> > -    params->tls_hostname = g_strdup(s->parameters.tls_hostname);
>> > -    params->tls_authz = g_strdup(s->parameters.tls_authz ?
>> > -                                 s->parameters.tls_authz : "");
>> > +    params->tls_creds = QAPI_CLONE(StrOrNull, s->parameters.tls_creds);
>> > +    params->tls_hostname = QAPI_CLONE(StrOrNull, s->parameters.tls_hostname);
>> > +    params->tls_authz = QAPI_CLONE(StrOrNull, s->parameters.tls_authz);
>> >      params->has_max_bandwidth = true;
>> >      params->max_bandwidth = s->parameters.max_bandwidth;
>> >      params->has_downtime_limit = true;
>> > @@ -963,8 +999,9 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp)
>> >  
>> >  void migrate_params_init(MigrationParameters *params)
>> >  {
>> > -    params->tls_hostname = g_strdup("");
>> > -    params->tls_creds = g_strdup("");
>> > +    params->tls_hostname = StrOrNull_from_str("");
>> > +    params->tls_creds = StrOrNull_from_str("");
>> > +    params->tls_authz = StrOrNull_from_str("");
>> >  
>> >      /* Set has_* up only for parameter checks */
>> >      params->has_compress_level = true;
>> > @@ -1145,7 +1182,7 @@ bool migrate_params_check(MigrationParameters *params, Error **errp)
>> >  #ifdef CONFIG_LINUX
>> >      if (migrate_zero_copy_send() &&
>> >          ((params->has_multifd_compression && params->multifd_compression) ||
>> > -         (params->tls_creds && *params->tls_creds))) {
>> > +         migrate_tls())) {
>> 
>> Correct only if params == current_migration!  Are they equal?
>
> No!
>
> I'll fix that with str_from_StrOrNull().  Definitely not pretty, but I want
> to avoid growing this set so I can go back to the
> avail-switchover-bandwidth patch soon.  Let me know if you have other
> suggestions.

We can accept a bit of ugliness for de-triplicating migration parameters
in the QAPI schema.

[...]
Peter Xu Oct. 10, 2023, 3:05 p.m. UTC | #4
On Mon, Oct 09, 2023 at 02:25:10PM +0200, Markus Armbruster wrote:
> I apologize for the late reply.

Not a problem - one week is not even bad at all. :)

[...]

> > One thing I can do is to move qdev_prop_str_or_null impl (from you) into a
> > separate patch.   I can drop some unnecessary changes too when possible,
> > not yet sure what else I can split, but I can try once there.
> 
> Suggest to give it a quick try, then see whether you like the resulting
> split better than what you have now.

OK.

[...]

> >> > diff --git a/migration/options.c b/migration/options.c
> >> > index 6bbfd4853d..12e392f68c 100644
> >> > --- a/migration/options.c
> >> > +++ b/migration/options.c
> >> > @@ -164,9 +164,12 @@ Property migration_properties[] = {
> >> >      DEFINE_PROP_SIZE("announce-step", MigrationState,
> >> >                        parameters.announce_step,
> >> >                        DEFAULT_MIGRATE_ANNOUNCE_STEP),
> >> > -    DEFINE_PROP_STRING("tls-creds", MigrationState, parameters.tls_creds),
> >> > -    DEFINE_PROP_STRING("tls-hostname", MigrationState, parameters.tls_hostname),
> >> > -    DEFINE_PROP_STRING("tls-authz", MigrationState, parameters.tls_authz),
> >> > +    DEFINE_PROP_STR_OR_NULL("tls-creds", MigrationState,
> >> > +                            parameters.tls_creds),
> >> > +    DEFINE_PROP_STR_OR_NULL("tls-hostname", MigrationState,
> >> > +                            parameters.tls_hostname),
> >> > +    DEFINE_PROP_STR_OR_NULL("tls-authz", MigrationState,
> >> > +                            parameters.tls_authz),
> >> 
> >> The qdev machinery now additionally accepts JSON null as property
> >> value.
> >> 
> >> If that's undesirable, we need to reject it.
> >
> > I don't think we have a need to pass in null here, not to mention this is
> > only for debugging purpose.
> 
> Not sure I understand the bit about debugging.

The migration_properties here is only used by "-global migration.XXX=YYY".
It's not expected for a normal user to use this interface; one should
normally use QMP or even HMP.  So migration_properties as a whole is for
debugging purpose.

> 
> The point I was trying to make is this.  Before the patch, we reject
> attempts to set the property value to null.  Afterwards, we accept them,
> i.e. the patch loses "reject null property value".  If this loss is
> undesirable, we better replace it with suitable hand-written code.

I don't even know how to set it to NULL before.. as it can only be accessed
via cmdline "-global" as mentioned above, which must be a string anyway.
So I assume this is not an issue.

> 
> > The real problem here, IMHO, is current patch will crash if someone
> > specifies -global migration.tls_* fields..
> 
> Trips this assertion:
> 
>     bool visit_start_alternate(Visitor *v, const char *name,
>                                GenericAlternate **obj, size_t size,
>                                Error **errp)
>     {
>         bool ok;
> 
>         assert(obj && size >= sizeof(GenericAlternate));
>         assert(!(v->type & VISITOR_OUTPUT) || *obj);
>         trace_visit_start_alternate(v, name, obj, size);
>         if (!v->start_alternate) {
> --->        assert(!(v->type & VISITOR_INPUT));
>             return true;
>         }
>         ok = v->start_alternate(v, name, obj, size, errp);
>         if (v->type & VISITOR_INPUT) {
>             assert(ok != !*obj);
>         }
>         return ok;
>     }
> 
> Documented with the start_alternate() method in visitor-impl.h:
> 
>         /* Must be set by input and clone visitors to visit alternates */
>         bool (*start_alternate)(Visitor *v, const char *name,
>                                 GenericAlternate **obj, size_t size,
>                                 Error **errp);
> 
> Known restriction of the string input visitor:
> 
>     /*
>      * The string input visitor does not implement support for visiting
>      * QAPI structs, alternates, null, or arbitrary QTypes. Only flat lists
>      * of integers (except type "size") are supported.
>      */
>     Visitor *string_input_visitor_new(const char *str);
> 
> A similar restriction is documented for the string output visitor.
> 
> > Unfortunately I'm not an expert on qapi.  Markus, does something like this
> > look like a valid fix to you?
> 
> I'm not sure whether your simple patch is sufficient for lifting the
> restriction.  Needs a deeper think, I'm afraid.  Can we make progress on
> the remainder of the series in parallel?

We can move back to using string rather than StrOrNull, but I saw there's
another email discussing this.  Let me also read that first, before jumping
back and forth on the solutions.

Note that my goal prior to this series is introducing another migration
parameter (avail-switchover-bandwidth).  What I think I can do is I'll
proceed with that patch rebasing to master rather than this series; doing
the triplication once more and decouple.  Then we can postpone this series.

Thanks,
Markus Armbruster Oct. 10, 2023, 7:18 p.m. UTC | #5
Peter Xu <peterx@redhat.com> writes:

> On Mon, Oct 09, 2023 at 02:25:10PM +0200, Markus Armbruster wrote:
>> I apologize for the late reply.
>
> Not a problem - one week is not even bad at all. :)

I try to keep conversations moving once they started.

> [...]
>
>> > One thing I can do is to move qdev_prop_str_or_null impl (from you) into a
>> > separate patch.   I can drop some unnecessary changes too when possible,
>> > not yet sure what else I can split, but I can try once there.
>> 
>> Suggest to give it a quick try, then see whether you like the resulting
>> split better than what you have now.
>
> OK.
>
> [...]
>
>> >> > diff --git a/migration/options.c b/migration/options.c
>> >> > index 6bbfd4853d..12e392f68c 100644
>> >> > --- a/migration/options.c
>> >> > +++ b/migration/options.c
>> >> > @@ -164,9 +164,12 @@ Property migration_properties[] = {
>> >> >      DEFINE_PROP_SIZE("announce-step", MigrationState,
>> >> >                        parameters.announce_step,
>> >> >                        DEFAULT_MIGRATE_ANNOUNCE_STEP),
>> >> > -    DEFINE_PROP_STRING("tls-creds", MigrationState, parameters.tls_creds),
>> >> > -    DEFINE_PROP_STRING("tls-hostname", MigrationState, parameters.tls_hostname),
>> >> > -    DEFINE_PROP_STRING("tls-authz", MigrationState, parameters.tls_authz),
>> >> > +    DEFINE_PROP_STR_OR_NULL("tls-creds", MigrationState,
>> >> > +                            parameters.tls_creds),
>> >> > +    DEFINE_PROP_STR_OR_NULL("tls-hostname", MigrationState,
>> >> > +                            parameters.tls_hostname),
>> >> > +    DEFINE_PROP_STR_OR_NULL("tls-authz", MigrationState,
>> >> > +                            parameters.tls_authz),
>> >> 
>> >> The qdev machinery now additionally accepts JSON null as property
>> >> value.
>> >> 
>> >> If that's undesirable, we need to reject it.
>> >
>> > I don't think we have a need to pass in null here, not to mention this is
>> > only for debugging purpose.
>> 
>> Not sure I understand the bit about debugging.
>
> The migration_properties here is only used by "-global migration.XXX=YYY".
> It's not expected for a normal user to use this interface; one should
> normally use QMP or even HMP.  So migration_properties as a whole is for
> debugging purpose.
>
>> 
>> The point I was trying to make is this.  Before the patch, we reject
>> attempts to set the property value to null.  Afterwards, we accept them,
>> i.e. the patch loses "reject null property value".  If this loss is
>> undesirable, we better replace it with suitable hand-written code.
>
> I don't even know how to set it to NULL before.. as it can only be accessed
> via cmdline "-global" as mentioned above, which must be a string anyway.
> So I assume this is not an issue.

Something like

    {"execute": "migrate-set-parameters",
     "arguments": {"tls-authz": null}}

Hmm, crashes in migrate_params_apply(), which is a bug.  I'm getting
more and more suspicious about user-facing migration code...

If the migration object is accessible with qom-set, then that's another
way to assign null values.

>> > The real problem here, IMHO, is current patch will crash if someone
>> > specifies -global migration.tls_* fields..
>> 
>> Trips this assertion:
>> 
>>     bool visit_start_alternate(Visitor *v, const char *name,
>>                                GenericAlternate **obj, size_t size,
>>                                Error **errp)
>>     {
>>         bool ok;
>> 
>>         assert(obj && size >= sizeof(GenericAlternate));
>>         assert(!(v->type & VISITOR_OUTPUT) || *obj);
>>         trace_visit_start_alternate(v, name, obj, size);
>>         if (!v->start_alternate) {
>> --->        assert(!(v->type & VISITOR_INPUT));
>>             return true;
>>         }
>>         ok = v->start_alternate(v, name, obj, size, errp);
>>         if (v->type & VISITOR_INPUT) {
>>             assert(ok != !*obj);
>>         }
>>         return ok;
>>     }
>> 
>> Documented with the start_alternate() method in visitor-impl.h:
>> 
>>         /* Must be set by input and clone visitors to visit alternates */
>>         bool (*start_alternate)(Visitor *v, const char *name,
>>                                 GenericAlternate **obj, size_t size,
>>                                 Error **errp);
>> 
>> Known restriction of the string input visitor:
>> 
>>     /*
>>      * The string input visitor does not implement support for visiting
>>      * QAPI structs, alternates, null, or arbitrary QTypes. Only flat lists
>>      * of integers (except type "size") are supported.
>>      */
>>     Visitor *string_input_visitor_new(const char *str);
>> 
>> A similar restriction is documented for the string output visitor.
>> 
>> > Unfortunately I'm not an expert on qapi.  Markus, does something like this
>> > look like a valid fix to you?
>> 
>> I'm not sure whether your simple patch is sufficient for lifting the
>> restriction.  Needs a deeper think, I'm afraid.  Can we make progress on
>> the remainder of the series in parallel?
>
> We can move back to using string rather than StrOrNull, but I saw there's
> another email discussing this.  Let me also read that first, before jumping
> back and forth on the solutions.

In my "QAPI string visitors crashes" memo, I demonstrated that the crash
on funny property type predates your series.  You merely add another
instance.  Moreover, crashing -global is less serious than a crashing
monitor command, because only the latter can take down a running guest.
Can't see why your series needs to wait for a fix of the crash bug.
Makes sense?

> Note that my goal prior to this series is introducing another migration
> parameter (avail-switchover-bandwidth).  What I think I can do is I'll
> proceed with that patch rebasing to master rather than this series; doing
> the triplication once more and decouple.  Then we can postpone this series.
>
> Thanks,
Peter Xu Oct. 10, 2023, 8:09 p.m. UTC | #6
Hi, Markus,

On Tue, Oct 10, 2023 at 09:18:23PM +0200, Markus Armbruster wrote:

[...]

> >> The point I was trying to make is this.  Before the patch, we reject
> >> attempts to set the property value to null.  Afterwards, we accept them,
> >> i.e. the patch loses "reject null property value".  If this loss is
> >> undesirable, we better replace it with suitable hand-written code.
> >
> > I don't even know how to set it to NULL before.. as it can only be accessed
> > via cmdline "-global" as mentioned above, which must be a string anyway.
> > So I assume this is not an issue.
> 
> Something like
> 
>     {"execute": "migrate-set-parameters",
>      "arguments": {"tls-authz": null}}
> 
> Hmm, crashes in migrate_params_apply(), which is a bug.  I'm getting
> more and more suspicious about user-facing migration code...

Did you apply patch 1 of this series?

https://lore.kernel.org/qemu-devel/20230905162335.235619-2-peterx@redhat.com/

QMP "migrate-set-parameters" does not go via migration_properties, so even
if we change handling of migration_properties, it shouldn't yet affect the
QMP interface of that.

> 
> If the migration object is accessible with qom-set, then that's another
> way to assign null values.

I see what you meant.  IMHO we just don't need to worry on breaking that as
I am not aware of anyone using that to set migration parameters, and I
think the whole idea of migration_properties is for debugging.  The only
legal way an user should set migration parameters should be via QMP, afaik.

> In my "QAPI string visitors crashes" memo, I demonstrated that the crash
> on funny property type predates your series.  You merely add another
> instance.  Moreover, crashing -global is less serious than a crashing
> monitor command, because only the latter can take down a running guest.
> Can't see why your series needs to wait for a fix of the crash bug.
> Makes sense?

What's your suggestion to move on with this series without a fix for that
crash bug?

I started this series with making tls_* all strings (rather than StrOrNull)
and that actually worked out, mostly.  We switched to StrOrNull just
because we think it's cleaner and 100% not breaking anyone (even though I
still don't think the other way will).  I don't see how I can proceed this
series without fixing this visitor issue but keep using StrOrNull.

Please don't worry on blocking my work: it won't anymore.  The thing I need is:

https://lore.kernel.org/qemu-devel/20230905193802.250440-1-peterx@redhat.com/

While this whole series is just paving way for it.  If I can't get
immediate results out of this series, I'll just go with the triplications,
leaving all the rest for later.

Thanks,
Markus Armbruster Oct. 11, 2023, 4:28 a.m. UTC | #7
Peter Xu <peterx@redhat.com> writes:

> Hi, Markus,
>
> On Tue, Oct 10, 2023 at 09:18:23PM +0200, Markus Armbruster wrote:
>
> [...]
>
>> >> The point I was trying to make is this.  Before the patch, we reject
>> >> attempts to set the property value to null.  Afterwards, we accept them,
>> >> i.e. the patch loses "reject null property value".  If this loss is
>> >> undesirable, we better replace it with suitable hand-written code.
>> >
>> > I don't even know how to set it to NULL before.. as it can only be accessed
>> > via cmdline "-global" as mentioned above, which must be a string anyway.
>> > So I assume this is not an issue.
>> 
>> Something like
>> 
>>     {"execute": "migrate-set-parameters",
>>      "arguments": {"tls-authz": null}}
>> 
>> Hmm, crashes in migrate_params_apply(), which is a bug.  I'm getting
>> more and more suspicious about user-facing migration code...
>
> Did you apply patch 1 of this series?

Since we're talking about "how to set it to NULL before", I was using
master.

> https://lore.kernel.org/qemu-devel/20230905162335.235619-2-peterx@redhat.com/
>
> QMP "migrate-set-parameters" does not go via migration_properties, so even
> if we change handling of migration_properties, it shouldn't yet affect the
> QMP interface of that.

I see.

I want to understand the impact of the change from 'str' to 'StrOrNull'
on external interfaces.  The first step is to know where exactly the
type is exposed externally.  *Know*, not gut-feel based on intended use.

I'll have another look at the schema change, and how the types are used.

>> If the migration object is accessible with qom-set, then that's another
>> way to assign null values.
>
> I see what you meant.  IMHO we just don't need to worry on breaking that as
> I am not aware of anyone using that to set migration parameters, and I
> think the whole idea of migration_properties is for debugging.  The only
> legal way an user should set migration parameters should be via QMP, afaik.

No matter the intended purpose of an interface, its meaning should be
clear.  If we accept null, what does it mean?

>> In my "QAPI string visitors crashes" memo, I demonstrated that the crash
>> on funny property type predates your series.  You merely add another
>> instance.  Moreover, crashing -global is less serious than a crashing
>> monitor command, because only the latter can take down a running guest.
>> Can't see why your series needs to wait for a fix of the crash bug.
>> Makes sense?
>
> What's your suggestion to move on with this series without a fix for that
> crash bug?
>
> I started this series with making tls_* all strings (rather than StrOrNull)
> and that actually worked out, mostly.  We switched to StrOrNull just
> because we think it's cleaner and 100% not breaking anyone (even though I
> still don't think the other way will).  I don't see how I can proceed this
> series without fixing this visitor issue but keep using StrOrNull.

I forgot it the switch to alternate crashes on normal usage, not just
when you attempt to pass null.

> Please don't worry on blocking my work: it won't anymore.  The thing I need is:
>
> https://lore.kernel.org/qemu-devel/20230905193802.250440-1-peterx@redhat.com/
>
> While this whole series is just paving way for it.  If I can't get
> immediate results out of this series, I'll just go with the triplications,
> leaving all the rest for later.

Okay.
Markus Armbruster Oct. 11, 2023, 2:21 p.m. UTC | #8
Markus Armbruster <armbru@redhat.com> writes:

> Peter Xu <peterx@redhat.com> writes:
>
>> Hi, Markus,
>>
>> On Tue, Oct 10, 2023 at 09:18:23PM +0200, Markus Armbruster wrote:
>>
>> [...]
>>
>>> >> The point I was trying to make is this.  Before the patch, we reject
>>> >> attempts to set the property value to null.  Afterwards, we accept them,
>>> >> i.e. the patch loses "reject null property value".  If this loss is
>>> >> undesirable, we better replace it with suitable hand-written code.
>>> >
>>> > I don't even know how to set it to NULL before.. as it can only be accessed
>>> > via cmdline "-global" as mentioned above, which must be a string anyway.
>>> > So I assume this is not an issue.
>>> 
>>> Something like
>>> 
>>>     {"execute": "migrate-set-parameters",
>>>      "arguments": {"tls-authz": null}}
>>> 
>>> Hmm, crashes in migrate_params_apply(), which is a bug.  I'm getting
>>> more and more suspicious about user-facing migration code...
>>
>> Did you apply patch 1 of this series?
>
> Since we're talking about "how to set it to NULL before", I was using
> master.
>
>> https://lore.kernel.org/qemu-devel/20230905162335.235619-2-peterx@redhat.com/
>>
>> QMP "migrate-set-parameters" does not go via migration_properties, so even
>> if we change handling of migration_properties, it shouldn't yet affect the
>> QMP interface of that.
>
> I see.
>
> I want to understand the impact of the change from 'str' to 'StrOrNull'
> on external interfaces.  The first step is to know where exactly the
> type is exposed externally.  *Know*, not gut-feel based on intended use.
>
> I'll have another look at the schema change, and how the types are used.

Schema changes:

1. Change MigrationParameters members @tls-creds, @tls-hostname,
   @tls-authz from 'str' to 'StrOrNull'

2. Replace MigrateSetParameters by MigrationParameters.

   No change, since they are identical after change 1.

To determine the patch's impact, we need to examine uses of
MigrationParameters members @tls-FOO before the patch.  These are:

* Return type of query-migrate-parameters

  Introspection shows the type change: the type's set of values now
  includes JSON null.

  Is JSON null possible?  See [*] below.

* migrate_params_init()

  Before the patch, we initialize to "".

  Afterwards, we initialize to "" wrapped in a StrOrNull.

  The initial value means "off" before and after.

* migrate_params_check()

  An error check gets updated.  Ignoring for now.

* migrate_params_test_apply()

  Function deleted in the patch, but you wrote that's wrong.  Ignoring
  for now.

* migrate_params_apply()

  Duplicates the three parameters from argument @parameters into the
  migration object's member parameters.

  Argument @parameters comes from QMP via command
  migrate-set-parameters.  Before the patch,
  qmp_migrate_set_parameters() maps JSON null values to "".  Afterwards,
  it passes the values verbatim.

  Parameters stored in the migration object before and after the patch:

  - When initialized and never changed: char * "", and StrOrNull
    QTYPE_QSTRING "".

  - When set to non-empty string with migrate-set-parameters or
    equivalent: that non-empty string, and QTYPE_QSTRING with that
    non-empty string.

  - When reset with migrate-set-parameters with value "": "", and
    QTYPE_QSTRING "".

  - When reset with migrate-set-parameters with value null: "", and
    QTYPE_QNULL.

  Note that there's now a difference between passing "" and null to
  migrate-set-parameters: the former results in value QTYPE_QSTRING "",
  the latter QTYPE_QNULL.  Both values mean "off".  I hate this.  I very
  much want a single C representation of "off".

* MigrationState member @parameters.

  Uses:

  - Properties "tls-creds", "tls-hostname", "tls-authz"

    These are externally accessible with -global.  The additional null
    value is not accessible there: string input visitor limitation.  It
    could become accessible depending on how we fix the crash bugs
    related to that limitation, but we can worry about that when we do
    it.

    Digression: why do these properties even exist?  I believe we
    created the "migration" (pseudo-)device just so we can use "compat
    props" to apply machine- and accelerator-specific configuration
    tweaks.  We then added configuration for *all* configuration
    parameters, not just the ones that need tweaking.  The external
    exposure of properties via -global is not something we wanted, it
    just came with the part we wanted (compat props).  Accidental
    external interface.  Ugh.

    None of the tls-FOO are tweaked via compat props, so no worries
    there.

    I believe property access with qom-get and qom-set is not possible,
    because the migration object is not part to the QOM tree, and
    therefore is not reachable via any QOM path.  Aside: feels like
    abuse of QOM.

    It's also not part of the device tree rooted at the main system bus,
    which means it isn't visible in "info qtree".  It is visible in
    "info qdm", "device_add migration,help", and "-device
    migration,help".  Output of the latter two changes.  All harmless.

    I *think* that's all.

  - migrate_tls(), migrate_tls_authz(), migrate_tls_creds(),
    migrate_tls_hostname()

    Before the patch, these return the respective migration parameter
    directly.  I believe the value is never NULL.  Value "" is special
    and means "off".

    After the patch, these return the respective migration parameter
    when it's a non-empty QTYPE_QSTRING, else NULL.  Value NULL means
    off.

    Note this maps both C representations of "off" to NULL.

    This changes the return value for "off" from "" to NULL.
    Improvement, because it results in a more pleasant "is off" check.

  - qmp_query_migrate_parameters()

    The three tls_FOO get duplicated into the return value.

    Looks like the two different C representations of "off" bleed into
    QMP (ugh!), and [*] JSON null is possible (incompatible change).

* hmp_info_migrate_parameters()

  The two different C representations of "off" are first mapped to NULL
  with str_from_StrOrNull(), and then mapped to "" with a ?: operator.
  Works.

Bottom line:

* Affected external interfaces:

  - query-migrate-parameters: can now return either "" or null when TLS
    is off.  null is an incompatible change.  Needs fixing.

  - query-qmp-schema: shows null is now possible.  Correctly reflects
    the backward incompatible change.  If we fix compatibility break, we
    get a tolerable loss of typing precision instead.

2. Two different C representations of "off".  Strong dislike.  I
   recommend to fix the compatibility break by switching to a single C
   representation.

Thoughts?

[...]
Peter Xu Oct. 12, 2023, 3:58 p.m. UTC | #9
On Wed, Oct 11, 2023 at 06:28:26AM +0200, Markus Armbruster wrote:
> >> Something like
> >> 
> >>     {"execute": "migrate-set-parameters",
> >>      "arguments": {"tls-authz": null}}
> >> 
> >> Hmm, crashes in migrate_params_apply(), which is a bug.  I'm getting
> >> more and more suspicious about user-facing migration code...
> >
> > Did you apply patch 1 of this series?
> 
> Since we're talking about "how to set it to NULL before", I was using
> master.

I see.  Then it seems we're good.  The fix just landed master branch
(86dec715a7).

> 
> > https://lore.kernel.org/qemu-devel/20230905162335.235619-2-peterx@redhat.com/
> >
> > QMP "migrate-set-parameters" does not go via migration_properties, so even
> > if we change handling of migration_properties, it shouldn't yet affect the
> > QMP interface of that.
> 
> I see.
> 
> I want to understand the impact of the change from 'str' to 'StrOrNull'
> on external interfaces.  The first step is to know where exactly the
> type is exposed externally.  *Know*, not gut-feel based on intended use.
> 
> I'll have another look at the schema change, and how the types are used.

Thanks.

> 
> >> If the migration object is accessible with qom-set, then that's another
> >> way to assign null values.
> >
> > I see what you meant.  IMHO we just don't need to worry on breaking that as
> > I am not aware of anyone using that to set migration parameters, and I
> > think the whole idea of migration_properties is for debugging.  The only
> > legal way an user should set migration parameters should be via QMP, afaik.
> 
> No matter the intended purpose of an interface, its meaning should be
> clear.  If we accept null, what does it mean?

IMHO here it means anything parsed from migration_properties will be a
qstring and impossible to be a qnull, at least if with that of my (probably
unmature, or even wrong..) fix:

https://lore.kernel.org/all/ZRsff7Lmy7TnggK9@x1n/

+static bool string_input_start_alternate(Visitor *v, const char *name,
+                                         GenericAlternate **obj, size_t size,
+                                         Error **errp)
+{
+    *obj = g_malloc0(size);
+    (*obj)->type = QTYPE_QSTRING;          <---------------- constantly set to string type
+    return true;
+}

I think it means when we parse the string, we'll always go with:

visit_type_StrOrNull():
    switch ((*obj)->type) {
    case QTYPE_QSTRING:
        ok = visit_type_str(v, name, &(*obj)->u.s, errp);
        break;

Finally, parse_type_str().

So it'll be impossible to specify null for -global migration.tls_*=XXX.

I suppose it's fine then?  Because it actually matches with previous when
it was still a string, and we use empty string to show tls disabled.

Thanks,
Peter Xu Oct. 12, 2023, 7:28 p.m. UTC | #10
On Wed, Oct 11, 2023 at 04:21:02PM +0200, Markus Armbruster wrote:
> > I'll have another look at the schema change, and how the types are used.
> 
> Schema changes:
> 
> 1. Change MigrationParameters members @tls-creds, @tls-hostname,
>    @tls-authz from 'str' to 'StrOrNull'
> 
> 2. Replace MigrateSetParameters by MigrationParameters.
> 
>    No change, since they are identical after change 1.
> 
> To determine the patch's impact, we need to examine uses of
> MigrationParameters members @tls-FOO before the patch.  These are:
> 
> * Return type of query-migrate-parameters
> 
>   Introspection shows the type change: the type's set of values now
>   includes JSON null.
> 
>   Is JSON null possible?  See [*] below.
> 
> * migrate_params_init()
> 
>   Before the patch, we initialize to "".
> 
>   Afterwards, we initialize to "" wrapped in a StrOrNull.
> 
>   The initial value means "off" before and after.
> 
> * migrate_params_check()
> 
>   An error check gets updated.  Ignoring for now.
> 
> * migrate_params_test_apply()
> 
>   Function deleted in the patch, but you wrote that's wrong.  Ignoring
>   for now.
> 
> * migrate_params_apply()
> 
>   Duplicates the three parameters from argument @parameters into the
>   migration object's member parameters.
> 
>   Argument @parameters comes from QMP via command
>   migrate-set-parameters.  Before the patch,
>   qmp_migrate_set_parameters() maps JSON null values to "".  Afterwards,
>   it passes the values verbatim.
> 
>   Parameters stored in the migration object before and after the patch:
> 
>   - When initialized and never changed: char * "", and StrOrNull
>     QTYPE_QSTRING "".
> 
>   - When set to non-empty string with migrate-set-parameters or
>     equivalent: that non-empty string, and QTYPE_QSTRING with that
>     non-empty string.
> 
>   - When reset with migrate-set-parameters with value "": "", and
>     QTYPE_QSTRING "".
> 
>   - When reset with migrate-set-parameters with value null: "", and
>     QTYPE_QNULL.
> 
>   Note that there's now a difference between passing "" and null to
>   migrate-set-parameters: the former results in value QTYPE_QSTRING "",
>   the latter QTYPE_QNULL.  Both values mean "off".  I hate this.  I very
>   much want a single C representation of "off".

Yes.

One option to avoid such ugliness is we keep using strings for tls fields,
then the only OFF is empty string.  It is not perfect either, but we need
to support empty string anyway as OFF.  That's the simplest to me.

We also have the benefit of decoupling this series from the qapi string
visitor issue, which means I'll just revive v1 which still doesn't use
StrOrNull.

> 
> * MigrationState member @parameters.
> 
>   Uses:
> 
>   - Properties "tls-creds", "tls-hostname", "tls-authz"
> 
>     These are externally accessible with -global.  The additional null
>     value is not accessible there: string input visitor limitation.  It
>     could become accessible depending on how we fix the crash bugs
>     related to that limitation, but we can worry about that when we do
>     it.
> 
>     Digression: why do these properties even exist?  I believe we
>     created the "migration" (pseudo-)device just so we can use "compat
>     props" to apply machine- and accelerator-specific configuration
>     tweaks.  We then added configuration for *all* configuration
>     parameters, not just the ones that need tweaking.  The external
>     exposure of properties via -global is not something we wanted, it
>     just came with the part we wanted (compat props).  Accidental
>     external interface.  Ugh.

IIRC both of them used to be the goals: either allow compat properties for
old machine types, or specify migration parameters in cmdline for easier
debugging and tests.  I use "-global" in mostly every migration test script
after it's introduced.

> 
>     None of the tls-FOO are tweaked via compat props, so no worries
>     there.
> 
>     I believe property access with qom-get and qom-set is not possible,
>     because the migration object is not part to the QOM tree, and
>     therefore is not reachable via any QOM path.  Aside: feels like
>     abuse of QOM.

Yes slightly abused, as the comment shows..

static const TypeInfo migration_type = {
    .name = TYPE_MIGRATION,
    /*
     * NOTE: TYPE_MIGRATION is not really a device, as the object is
     * not created using qdev_new(), it is not attached to the qdev
     * device tree, and it is never realized.
     *
     * TODO: Make this TYPE_OBJECT once QOM provides something like
     * TYPE_DEVICE's "-global" properties.
     */
    .parent = TYPE_DEVICE,

It's just that current code works mostly as expected without much churns
elsewhere.. so the TODO is not really getting much attention..

> 
>     It's also not part of the device tree rooted at the main system bus,
>     which means it isn't visible in "info qtree".  It is visible in
>     "info qdm", "device_add migration,help", and "-device
>     migration,help".  Output of the latter two changes.  All harmless.
> 
>     I *think* that's all.
> 
>   - migrate_tls(), migrate_tls_authz(), migrate_tls_creds(),
>     migrate_tls_hostname()
> 
>     Before the patch, these return the respective migration parameter
>     directly.  I believe the value is never NULL.  Value "" is special
>     and means "off".
> 
>     After the patch, these return the respective migration parameter
>     when it's a non-empty QTYPE_QSTRING, else NULL.  Value NULL means
>     off.
> 
>     Note this maps both C representations of "off" to NULL.
> 
>     This changes the return value for "off" from "" to NULL.
>     Improvement, because it results in a more pleasant "is off" check.

Maybe more than pleasant. :) Internally NULL is a must (even if empty
string), to make things like "if (tls_authz)" work.

> 
>   - qmp_query_migrate_parameters()
> 
>     The three tls_FOO get duplicated into the return value.
> 
>     Looks like the two different C representations of "off" bleed into
>     QMP (ugh!), and [*] JSON null is possible (incompatible change).
> 
> * hmp_info_migrate_parameters()
> 
>   The two different C representations of "off" are first mapped to NULL
>   with str_from_StrOrNull(), and then mapped to "" with a ?: operator.
>   Works.
> 
> Bottom line:
> 
> * Affected external interfaces:
> 
>   - query-migrate-parameters: can now return either "" or null when TLS
>     is off.  null is an incompatible change.  Needs fixing.

If we want to seriously keep using StrOrNull as the interface, I think we
need to allow both NULL or "" to exist?  Because the "logical goal" is to
use NULL but "" for compatibility?

I worry we spend a lot of time working on the NULL/"" but then found nobody
is using NULL anyway.

> 
>   - query-qmp-schema: shows null is now possible.  Correctly reflects
>     the backward incompatible change.  If we fix compatibility break, we
>     get a tolerable loss of typing precision instead.
> 
> 2. Two different C representations of "off".  Strong dislike.  I
>    recommend to fix the compatibility break by switching to a single C
>    representation.

IIUC the goal is to switch to StrOrNull, but that face challenge.

> 
> Thoughts?

If I'd vote (based on our current code), I'd double check tls* fields are
working properly with libvirt on empty strings, then switch all StrOrNull
in migration parameters to strings, for both QAPI and QEMU internally.
Then making sure all migrate_tls_*() helpers do proper job on converting ""
to NULL.

If we design a new interface, just remember to use NULL (hopefully) to be
clear and single way to say "OFF" starting from the 1st day.

But that's only my 2 cents.

Thanks,
Markus Armbruster Oct. 13, 2023, 5:36 a.m. UTC | #11
Peter Xu <peterx@redhat.com> writes:

> On Wed, Oct 11, 2023 at 04:21:02PM +0200, Markus Armbruster wrote:
>> > I'll have another look at the schema change, and how the types are used.
>> 
>> Schema changes:
>> 
>> 1. Change MigrationParameters members @tls-creds, @tls-hostname,
>>    @tls-authz from 'str' to 'StrOrNull'
>> 
>> 2. Replace MigrateSetParameters by MigrationParameters.
>> 
>>    No change, since they are identical after change 1.
>> 
>> To determine the patch's impact, we need to examine uses of
>> MigrationParameters members @tls-FOO before the patch.  These are:
>> 
>> * Return type of query-migrate-parameters
>> 
>>   Introspection shows the type change: the type's set of values now
>>   includes JSON null.
>> 
>>   Is JSON null possible?  See [*] below.
>> 
>> * migrate_params_init()
>> 
>>   Before the patch, we initialize to "".
>> 
>>   Afterwards, we initialize to "" wrapped in a StrOrNull.
>> 
>>   The initial value means "off" before and after.
>> 
>> * migrate_params_check()
>> 
>>   An error check gets updated.  Ignoring for now.
>> 
>> * migrate_params_test_apply()
>> 
>>   Function deleted in the patch, but you wrote that's wrong.  Ignoring
>>   for now.
>> 
>> * migrate_params_apply()
>> 
>>   Duplicates the three parameters from argument @parameters into the
>>   migration object's member parameters.
>> 
>>   Argument @parameters comes from QMP via command
>>   migrate-set-parameters.  Before the patch,
>>   qmp_migrate_set_parameters() maps JSON null values to "".  Afterwards,
>>   it passes the values verbatim.
>> 
>>   Parameters stored in the migration object before and after the patch:
>> 
>>   - When initialized and never changed: char * "", and StrOrNull
>>     QTYPE_QSTRING "".
>> 
>>   - When set to non-empty string with migrate-set-parameters or
>>     equivalent: that non-empty string, and QTYPE_QSTRING with that
>>     non-empty string.
>> 
>>   - When reset with migrate-set-parameters with value "": "", and
>>     QTYPE_QSTRING "".
>> 
>>   - When reset with migrate-set-parameters with value null: "", and
>>     QTYPE_QNULL.
>> 
>>   Note that there's now a difference between passing "" and null to
>>   migrate-set-parameters: the former results in value QTYPE_QSTRING "",
>>   the latter QTYPE_QNULL.  Both values mean "off".  I hate this.  I very
>>   much want a single C representation of "off".
>
> Yes.
>
> One option to avoid such ugliness is we keep using strings for tls fields,
> then the only OFF is empty string.  It is not perfect either, but we need
> to support empty string anyway as OFF.  That's the simplest to me.
>
> We also have the benefit of decoupling this series from the qapi string
> visitor issue, which means I'll just revive v1 which still doesn't use
> StrOrNull.

Another option is to keep the schema as is, but normalize the two "off"
values to just one value on assignment.

>> 
>> * MigrationState member @parameters.
>> 
>>   Uses:
>> 
>>   - Properties "tls-creds", "tls-hostname", "tls-authz"
>> 
>>     These are externally accessible with -global.  The additional null
>>     value is not accessible there: string input visitor limitation.  It
>>     could become accessible depending on how we fix the crash bugs
>>     related to that limitation, but we can worry about that when we do
>>     it.
>> 
>>     Digression: why do these properties even exist?  I believe we
>>     created the "migration" (pseudo-)device just so we can use "compat
>>     props" to apply machine- and accelerator-specific configuration
>>     tweaks.  We then added configuration for *all* configuration
>>     parameters, not just the ones that need tweaking.  The external
>>     exposure of properties via -global is not something we wanted, it
>>     just came with the part we wanted (compat props).  Accidental
>>     external interface.  Ugh.
>
> IIRC both of them used to be the goals: either allow compat properties for
> old machine types, or specify migration parameters in cmdline for easier
> debugging and tests.  I use "-global" in mostly every migration test script
> after it's introduced.

You use -global just because it's easier than using monitor commands,
correct?

>>     None of the tls-FOO are tweaked via compat props, so no worries
>>     there.
>> 
>>     I believe property access with qom-get and qom-set is not possible,
>>     because the migration object is not part to the QOM tree, and
>>     therefore is not reachable via any QOM path.  Aside: feels like
>>     abuse of QOM.
>
> Yes slightly abused, as the comment shows..
>
> static const TypeInfo migration_type = {
>     .name = TYPE_MIGRATION,
>     /*
>      * NOTE: TYPE_MIGRATION is not really a device, as the object is
>      * not created using qdev_new(), it is not attached to the qdev
>      * device tree, and it is never realized.
>      *
>      * TODO: Make this TYPE_OBJECT once QOM provides something like
>      * TYPE_DEVICE's "-global" properties.
>      */
>     .parent = TYPE_DEVICE,
>
> It's just that current code works mostly as expected without much churns
> elsewhere.. so the TODO is not really getting much attention..

I see a number of abuses:

1. We're using TYPE_DEVICE for something that clearly isn't a device.

2. The device is not connected to the qdev device tree rooted at the
main system bus.  I'm not sure this qualifies as abuse.  The connection
the device tree supports is plugging into a bus, which bus-less devices
cannot do.  Instead, they are commonly part of other devices.  There
might be exceptions.  The qdev device tree feels like a relic of the
pre-QOM past.

3. We're keeping the device around unrealized.  No written contract
forbids this, but it's highly irregular.

4. The object is not connected to the QOM composition tree.  No written
contract forbids this, but it's highly irregular, especially for an
object whose purpose is to provide an external configuration interface.

The comment is about the abuse of qdev (items 1 to 3).  Even with the
TODO comment fully addressed, we'd still abuse QOM (item 4).

>> 
>>     It's also not part of the device tree rooted at the main system bus,
>>     which means it isn't visible in "info qtree".  It is visible in
>>     "info qdm", "device_add migration,help", and "-device
>>     migration,help".  Output of the latter two changes.  All harmless.
>> 
>>     I *think* that's all.
>> 
>>   - migrate_tls(), migrate_tls_authz(), migrate_tls_creds(),
>>     migrate_tls_hostname()
>> 
>>     Before the patch, these return the respective migration parameter
>>     directly.  I believe the value is never NULL.  Value "" is special
>>     and means "off".
>> 
>>     After the patch, these return the respective migration parameter
>>     when it's a non-empty QTYPE_QSTRING, else NULL.  Value NULL means
>>     off.
>> 
>>     Note this maps both C representations of "off" to NULL.
>> 
>>     This changes the return value for "off" from "" to NULL.
>>     Improvement, because it results in a more pleasant "is off" check.
>
> Maybe more than pleasant. :) Internally NULL is a must (even if empty
> string), to make things like "if (tls_authz)" work.

Three possible interfaces, and the resulting tests:

* Non-empty string when on, "" when off:
  if (*tls_authz)

* Non-empty string when on, NULL when off:
  if (tls_authz)

* Non-empty string when on, NULL or "" when off:
  if (tls_authz && *tls_authz)
  
All three work equally well, but the second one is more pleasant than
the others.

>>   - qmp_query_migrate_parameters()
>> 
>>     The three tls_FOO get duplicated into the return value.
>> 
>>     Looks like the two different C representations of "off" bleed into
>>     QMP (ugh!), and [*] JSON null is possible (incompatible change).
>> 
>> * hmp_info_migrate_parameters()
>> 
>>   The two different C representations of "off" are first mapped to NULL
>>   with str_from_StrOrNull(), and then mapped to "" with a ?: operator.
>>   Works.
>> 
>> Bottom line:
>> 
>> * Affected external interfaces:
>> 
>>   - query-migrate-parameters: can now return either "" or null when TLS
>>     is off.  null is an incompatible change.  Needs fixing.
>
> If we want to seriously keep using StrOrNull as the interface, I think we
> need to allow both NULL or "" to exist?  Because the "logical goal" is to
> use NULL but "" for compatibility?

The interface was ill-conceived from the start.  Probably so it could be
shoehorned more easily into existing migraion configuration
infrastructure.

We have three syntactically independent parameters: @tls-creds,
@tls-hostname, @tls-authz.  The latter two are meaningless (and silently
ignored) unless the first one is given.  I hate that.

TLS is off by default.  To enable it, set @tls-creds.  Except setting it
to the otherwise invalid value "" disables it.  That's because all we
have for configuration is the "set migration parameter to value"
interface.  Only works because we have an invalid value we can abuse.  I
hate that.

I found this interface days before the freeze, too late to replace it by
a saner one.

> I worry we spend a lot of time working on the NULL/"" but then found nobody
> is using NULL anyway.

I naively assumed deprecating "" would get null used.

>>   - query-qmp-schema: shows null is now possible.  Correctly reflects
>>     the backward incompatible change.  If we fix compatibility break, we
>>     get a tolerable loss of typing precision instead.
>> 
>> 2. Two different C representations of "off".  Strong dislike.  I
>>    recommend to fix the compatibility break by switching to a single C
>>    representation.
>
> IIUC the goal is to switch to StrOrNull, but that face challenge.
>
>> 
>> Thoughts?
>
> If I'd vote (based on our current code), I'd double check tls* fields are
> working properly with libvirt on empty strings, then switch all StrOrNull
> in migration parameters to strings, for both QAPI and QEMU internally.
> Then making sure all migrate_tls_*() helpers do proper job on converting ""
> to NULL.

Accept cleanup of this bad interface is not practical.  Cut our losses
and run.

> If we design a new interface, just remember to use NULL (hopefully) to be
> clear and single way to say "OFF" starting from the 1st day.

Migration configuration and control is in an ugly state.

Configuration is almost entirely special (own QMP commands for
everything), with a little abuse of general infrastructure stirred in
(-global, compat props).  Having capabilities in addition to parameters
is a useless complication.  Too many features of questionable utility
with way too many configuration knobs.

Control is entirely special.  Solves the largely same problems as jobs
do, just different.

Is there hope for improvement, or should I hold my nose and look the
other way?

> But that's only my 2 cents.
>
> Thanks,
Markus Armbruster Oct. 13, 2023, 5:41 a.m. UTC | #12
Peter Xu <peterx@redhat.com> writes:

> On Wed, Oct 11, 2023 at 06:28:26AM +0200, Markus Armbruster wrote:

[...]

>> >> If the migration object is accessible with qom-set, then that's another
>> >> way to assign null values.
>> >
>> > I see what you meant.  IMHO we just don't need to worry on breaking that as
>> > I am not aware of anyone using that to set migration parameters, and I
>> > think the whole idea of migration_properties is for debugging.  The only
>> > legal way an user should set migration parameters should be via QMP, afaik.
>> 
>> No matter the intended purpose of an interface, its meaning should be
>> clear.  If we accept null, what does it mean?
>
> IMHO here it means anything parsed from migration_properties will be a
> qstring and impossible to be a qnull, at least if with that of my (probably
> unmature, or even wrong..) fix:
>
> https://lore.kernel.org/all/ZRsff7Lmy7TnggK9@x1n/
>
> +static bool string_input_start_alternate(Visitor *v, const char *name,
> +                                         GenericAlternate **obj, size_t size,
> +                                         Error **errp)
> +{
> +    *obj = g_malloc0(size);
> +    (*obj)->type = QTYPE_QSTRING;          <---------------- constantly set to string type
> +    return true;
> +}
>
> I think it means when we parse the string, we'll always go with:
>
> visit_type_StrOrNull():
>     switch ((*obj)->type) {
>     case QTYPE_QSTRING:
>         ok = visit_type_str(v, name, &(*obj)->u.s, errp);
>         break;
>
> Finally, parse_type_str().
>
> So it'll be impossible to specify null for -global migration.tls_*=XXX.

Only because -global can't do null.  Changes the moment we improve it to
support JSON values.  We already improved several other CLI options that
way.

> I suppose it's fine then?  Because it actually matches with previous when
> it was still a string, and we use empty string to show tls disabled.
>
> Thanks,
Juan Quintela Oct. 31, 2023, 11:08 a.m. UTC | #13
Markus Armbruster <armbru@redhat.com> wrote:
> Peter Xu <peterx@redhat.com> writes:
>
>> On Wed, Oct 11, 2023 at 04:21:02PM +0200, Markus Armbruster wrote:

>> IIRC both of them used to be the goals: either allow compat properties for
>> old machine types, or specify migration parameters in cmdline for easier
>> debugging and tests.  I use "-global" in mostly every migration test script
>> after it's introduced.
>
> You use -global just because it's easier than using monitor commands,
> correct?

It is long history.  But to make things easier I will try to resume.
In the beggining there was no "defer" method, so it was imposible to
setup migration-channels and that kind of information.
So we created that -global migration properties.

Time pass, and we need to fix that for real, because more and more
migration parameters need to be set bofer we start incoming migration.
So we create migration "defer" method.  And now we can set things from
the command line/QMP.

But when one is testing (i.e. migration developers), using the global
property is much easier.

I am always tempted to modify the monitor command line to allow "read
the commands from this file at startup".

> Configuration is almost entirely special (own QMP commands for
> everything), with a little abuse of general infrastructure stirred in
> (-global, compat props).  Having capabilities in addition to parameters
> is a useless complication.  Too many features of questionable utility
> with way too many configuration knobs.

I also remember that one.
In the beggining all migration options were bools.  So we have named
capabilities.  At some point we needed parameters that were not bools,
so we had to get the parameters thing because all the migration code
supposed that the capabilities were bool.

No, I am not defending the choices we made at the time, but that is how
it happened.  To be fair, when I have a new "bool" to add to migration,
I am never sure if I have to add it as a capability or as a parameter
that returns bool.


Later, Juan.
Markus Armbruster Nov. 2, 2023, 2:25 p.m. UTC | #14
Juan Quintela <quintela@redhat.com> writes:

> Markus Armbruster <armbru@redhat.com> wrote:
>> Peter Xu <peterx@redhat.com> writes:
>>
>>> On Wed, Oct 11, 2023 at 04:21:02PM +0200, Markus Armbruster wrote:
>
>>> IIRC both of them used to be the goals: either allow compat properties for
>>> old machine types, or specify migration parameters in cmdline for easier
>>> debugging and tests.  I use "-global" in mostly every migration test script
>>> after it's introduced.
>>
>> You use -global just because it's easier than using monitor commands,
>> correct?
>
> It is long history.  But to make things easier I will try to resume.
> In the beggining there was no "defer" method, so it was imposible to
> setup migration-channels and that kind of information.
> So we created that -global migration properties.
>
> Time pass, and we need to fix that for real, because more and more
> migration parameters need to be set bofer we start incoming migration.
> So we create migration "defer" method.  And now we can set things from
> the command line/QMP.
>
> But when one is testing (i.e. migration developers), using the global
> property is much easier.
>
> I am always tempted to modify the monitor command line to allow "read
> the commands from this file at startup".
>
>> Configuration is almost entirely special (own QMP commands for
>> everything), with a little abuse of general infrastructure stirred in
>> (-global, compat props).  Having capabilities in addition to parameters
>> is a useless complication.  Too many features of questionable utility
>> with way too many configuration knobs.
>
> I also remember that one.
> In the beggining all migration options were bools.  So we have named
> capabilities.  At some point we needed parameters that were not bools,
> so we had to get the parameters thing because all the migration code
> supposed that the capabilities were bool.
>
> No, I am not defending the choices we made at the time, but that is how
> it happened.

Decisions that make sense at the time can stop making sense later.

>               To be fair, when I have a new "bool" to add to migration,
> I am never sure if I have to add it as a capability or as a parameter
> that returns bool.

I'd be unsure, too.


Migration has its own idiosyncratic configuration interface, even though
its configuration needs are not special at all.  This is due to a long
history of decisions that made sense at the time.

What kind of interface would we choose if we could start over now?

Let's have a look at what I consider the two most complex piece of
configuration to date, namely block backends and QOM objects.

In both cases, configuration is a QAPI object type: BlockdevOptions and
ObjectOptions.

The common members are the configuration common to all block backends /
objects.  One of them is the type of block backend ("driver" in block
parlance) or QOM object ("qom-type").

A type's variant members are the configuration specific to that type.

This is suitably expressive.

We create a state object for a given configuration object with
blockdev-add / object-add.

For block devices, we even have a way to modify a state object's
configuration: blockdev-reopen.  For QOM objects, there's qom-set, but I
don't expect that to work in the general case.  Where "not work" can
range from "does nothing" to "explodes".

Now let's try to apply this to migration.

As long as we can have just one migration, we need just one QAPI object
to configure it.

We could create the object with -object / object_add.  For convenience,
we'd probably want to create one with default configuration
automatically on demand.

We could use qom-set to change configuration.  If we're not comfortable
with using qom-set for production, we could do something like
blockdev-reopen instead.


Could we move towards such a design?  Turn the existing ad hoc interface
into compatibility sugar for it?
Peter Xu Nov. 2, 2023, 6:05 p.m. UTC | #15
On Thu, Nov 02, 2023 at 03:25:25PM +0100, Markus Armbruster wrote:
> Juan Quintela <quintela@redhat.com> writes:
> 
> > Markus Armbruster <armbru@redhat.com> wrote:
> >> Peter Xu <peterx@redhat.com> writes:
> >>
> >>> On Wed, Oct 11, 2023 at 04:21:02PM +0200, Markus Armbruster wrote:
> >
> >>> IIRC both of them used to be the goals: either allow compat properties for
> >>> old machine types, or specify migration parameters in cmdline for easier
> >>> debugging and tests.  I use "-global" in mostly every migration test script
> >>> after it's introduced.
> >>
> >> You use -global just because it's easier than using monitor commands,
> >> correct?
> >
> > It is long history.  But to make things easier I will try to resume.
> > In the beggining there was no "defer" method, so it was imposible to
> > setup migration-channels and that kind of information.
> > So we created that -global migration properties.
> >
> > Time pass, and we need to fix that for real, because more and more
> > migration parameters need to be set bofer we start incoming migration.
> > So we create migration "defer" method.  And now we can set things from
> > the command line/QMP.
> >
> > But when one is testing (i.e. migration developers), using the global
> > property is much easier.
> >
> > I am always tempted to modify the monitor command line to allow "read
> > the commands from this file at startup".
> >
> >> Configuration is almost entirely special (own QMP commands for
> >> everything), with a little abuse of general infrastructure stirred in
> >> (-global, compat props).  Having capabilities in addition to parameters
> >> is a useless complication.  Too many features of questionable utility
> >> with way too many configuration knobs.
> >
> > I also remember that one.
> > In the beggining all migration options were bools.  So we have named
> > capabilities.  At some point we needed parameters that were not bools,
> > so we had to get the parameters thing because all the migration code
> > supposed that the capabilities were bool.
> >
> > No, I am not defending the choices we made at the time, but that is how
> > it happened.
> 
> Decisions that make sense at the time can stop making sense later.
> 
> >               To be fair, when I have a new "bool" to add to migration,
> > I am never sure if I have to add it as a capability or as a parameter
> > that returns bool.
> 
> I'd be unsure, too.
> 
> 
> Migration has its own idiosyncratic configuration interface, even though
> its configuration needs are not special at all.  This is due to a long
> history of decisions that made sense at the time.
> 
> What kind of interface would we choose if we could start over now?
> 
> Let's have a look at what I consider the two most complex piece of
> configuration to date, namely block backends and QOM objects.
> 
> In both cases, configuration is a QAPI object type: BlockdevOptions and
> ObjectOptions.
> 
> The common members are the configuration common to all block backends /
> objects.  One of them is the type of block backend ("driver" in block
> parlance) or QOM object ("qom-type").
> 
> A type's variant members are the configuration specific to that type.
> 
> This is suitably expressive.
> 
> We create a state object for a given configuration object with
> blockdev-add / object-add.
> 
> For block devices, we even have a way to modify a state object's
> configuration: blockdev-reopen.  For QOM objects, there's qom-set, but I
> don't expect that to work in the general case.  Where "not work" can
> range from "does nothing" to "explodes".
> 
> Now let's try to apply this to migration.
> 
> As long as we can have just one migration, we need just one QAPI object
> to configure it.
> 
> We could create the object with -object / object_add.  For convenience,
> we'd probably want to create one with default configuration
> automatically on demand.
> 
> We could use qom-set to change configuration.  If we're not comfortable
> with using qom-set for production, we could do something like
> blockdev-reopen instead.
> 
> Could we move towards such a design?  Turn the existing ad hoc interface
> into compatibility sugar for it?

Sounds doable to me.

I'm not familiar with BlockdevOptions, it looks like something setup once
and for all for all relevant parameters need to be set in the same request?
Migration will require each cap/parameter to be set separately anytime,
e.g., the user can adjust downtime / bandwidth even during migration in
progress.

Making all caps/parameters QOM objects, or one object containing both
attributes, sounds like a good fit.  object_property_* APIs allows setters,
I think that's good enough for migration to trigger whatever needed (e.g.
migration_rate_set() updates after bandwidth modifications).

We can convert e.g. qmp set parameters into a loop of setting each
property, it'll be slightly slower because we'll need to do sanity check
for each property after each change, but that shouldn't be a hot path
anyway so seems fine.

It'l still be a pity that we still cannot reduce the triplications of qapi
docs immediately even with that.  But with that, it seems doable if we will
obsolete QMP migrate-set-parameters after we can do QOM-set.

Thanks,
Markus Armbruster Nov. 14, 2023, 7:27 a.m. UTC | #16
Cc: Paolo for QOM expertise.

Peter Xu <peterx@redhat.com> writes:

> On Thu, Nov 02, 2023 at 03:25:25PM +0100, Markus Armbruster wrote:

[...]

>> Migration has its own idiosyncratic configuration interface, even though
>> its configuration needs are not special at all.  This is due to a long
>> history of decisions that made sense at the time.
>> 
>> What kind of interface would we choose if we could start over now?
>> 
>> Let's have a look at what I consider the two most complex piece of
>> configuration to date, namely block backends and QOM objects.
>> 
>> In both cases, configuration is a QAPI object type: BlockdevOptions and
>> ObjectOptions.
>> 
>> The common members are the configuration common to all block backends /
>> objects.  One of them is the type of block backend ("driver" in block
>> parlance) or QOM object ("qom-type").
>> 
>> A type's variant members are the configuration specific to that type.
>> 
>> This is suitably expressive.
>> 
>> We create a state object for a given configuration object with
>> blockdev-add / object-add.
>> 
>> For block devices, we even have a way to modify a state object's
>> configuration: blockdev-reopen.  For QOM objects, there's qom-set, but I
>> don't expect that to work in the general case.  Where "not work" can
>> range from "does nothing" to "explodes".
>> 
>> Now let's try to apply this to migration.
>> 
>> As long as we can have just one migration, we need just one QAPI object
>> to configure it.
>> 
>> We could create the object with -object / object_add.  For convenience,
>> we'd probably want to create one with default configuration
>> automatically on demand.
>> 
>> We could use qom-set to change configuration.  If we're not comfortable
>> with using qom-set for production, we could do something like
>> blockdev-reopen instead.
>> 
>> Could we move towards such a design?  Turn the existing ad hoc interface
>> into compatibility sugar for it?
>
> Sounds doable to me.
>
> I'm not familiar with BlockdevOptions, it looks like something setup once
> and for all for all relevant parameters need to be set in the same request?

Yes, but you can "reopen", which replaces the entire configuration.

blockdev-add creates a new block backend device, and blockdev-reopen
reopens a set of existing ones.  Both take the same arguments for each
device.

> Migration will require each cap/parameter to be set separately anytime,
> e.g., the user can adjust downtime / bandwidth even during migration in
> progress.

"Replace entire configuration" isn't a good fit then, because users
would have to repeat the entire configuration just to tweak one thing.

> Making all caps/parameters QOM objects, or one object containing both
> attributes, sounds like a good fit.  object_property_* APIs allows setters,
> I think that's good enough for migration to trigger whatever needed (e.g.
> migration_rate_set() updates after bandwidth modifications).
>
> We can convert e.g. qmp set parameters into a loop of setting each
> property, it'll be slightly slower because we'll need to do sanity check
> for each property after each change, but that shouldn't be a hot path
> anyway so seems fine.

I figure doing initial configuration in one command is convenient.  The
obvious existing command for that is object-add.

The obvious interface for modifying configuration is a command to change
just one parameter.  The obvious existing command for that is qom-set.

Problem: qom-set is a death trap in general.  It can modify any QOM
property with a setter, and we test basically none of them.  Using it
for modifying migration configuration would signal it's okay to use
elsewhere, too.  I'm not sure we want to send that message.  Maybe we
want to do the opposite, and make it an unstable interface.

Aside: I believe the root problem is our failure to tie "can write" to
the object's state.  Just because a property can be set with object-add
doesn't mean it can be validly changed at any time during the object's
life.

Problem: when several parameters together have to satisfy constraints,
going from one valid configuration to another valid configuration may
require changing several parameters at once, or else go through invalid
intermediate configurations.

This problem is not at all specific to the migration object.

One solution is careful design to ensure that there's always a sequence
of transitions through valid configuration.  Can become complicated as
configuration evolves.  Possible even impractical or impossible.

Another solution is a command to modify multiple parameters together,
leaving alone the others (unlike blockdev-reopen, which overwrites all
of them).

> It'l still be a pity that we still cannot reduce the triplications of qapi
> docs immediately even with that.  But with that, it seems doable if we will
> obsolete QMP migrate-set-parameters after we can do QOM-set.

Yes.
Daniel P. Berrangé Nov. 14, 2023, 9:13 a.m. UTC | #17
On Thu, Nov 02, 2023 at 03:25:25PM +0100, Markus Armbruster wrote:
> Now let's try to apply this to migration.
>
> As long as we can have just one migration, we need just one QAPI object
> to configure it.
> 
> We could create the object with -object / object_add.  For convenience,
> we'd probably want to create one with default configuration
> automatically on demand.
> 
> We could use qom-set to change configuration.  If we're not comfortable
> with using qom-set for production, we could do something like
> blockdev-reopen instead.

Do we even need to do this via a QAPI object ?

Why are we not just making the obvious design change of passing everything
with the 'migrate' / 'migrate-incoming' commands that kick it off:

ie:

{ 'command': 'migrate',
  'data': {'uri': 'str',
           '*channels': [ 'MigrationChannel' ],
	   '*capabilities': [ 'MigrateCapability' ],
	   '*parameters': [ 'MigrateParameters' ],
           '*detach': 'bool', '*resume': 'bool' } }

     (deprecated bits trimmed for clarity)

and the counterpart:

{ 'command': 'migrate-incoming',
             'data': {'*uri': 'str',
                      '*channels': [ 'MigrationChannel' ],
                      '*capabilities': [ 'MigrateCapability' ],
                      '*parameters': [ 'MigrateParameters' ] } }

such that the design is just like 99% of other commands which take
all their parameters directly. We already have 'migrate-set-parameters'
remaining for the runtime tunables, and can deprecate the usage of this
when migration is not already running, and similarly deprecate
migrate-set-capabilities.

With regards,
Daniel
Markus Armbruster Nov. 14, 2023, 9:53 a.m. UTC | #18
Daniel P. Berrangé <berrange@redhat.com> writes:

> On Thu, Nov 02, 2023 at 03:25:25PM +0100, Markus Armbruster wrote:
>> Now let's try to apply this to migration.
>>
>> As long as we can have just one migration, we need just one QAPI object
>> to configure it.
>> 
>> We could create the object with -object / object_add.  For convenience,
>> we'd probably want to create one with default configuration
>> automatically on demand.
>> 
>> We could use qom-set to change configuration.  If we're not comfortable
>> with using qom-set for production, we could do something like
>> blockdev-reopen instead.
>
> Do we even need to do this via a QAPI object ?

No.  It's merely one way to skin the cat.

> Why are we not just making the obvious design change of passing everything
> with the 'migrate' / 'migrate-incoming' commands that kick it off:
>
> ie:
>
> { 'command': 'migrate',
>   'data': {'uri': 'str',
>            '*channels': [ 'MigrationChannel' ],
> 	   '*capabilities': [ 'MigrateCapability' ],
> 	   '*parameters': [ 'MigrateParameters' ],
>            '*detach': 'bool', '*resume': 'bool' } }
>
>      (deprecated bits trimmed for clarity)
>
> and the counterpart:
>
> { 'command': 'migrate-incoming',
>              'data': {'*uri': 'str',
>                       '*channels': [ 'MigrationChannel' ],
>                       '*capabilities': [ 'MigrateCapability' ],
>                       '*parameters': [ 'MigrateParameters' ] } }
>
> such that the design is just like 99% of other commands which take
> all their parameters directly. We already have 'migrate-set-parameters'
> remaining for the runtime tunables, and can deprecate the usage of this
> when migration is not already running, and similarly deprecate
> migrate-set-capabilities.

We still need commands to adjust configuration while migration runs.
Daniel P. Berrangé Nov. 14, 2023, 9:55 a.m. UTC | #19
On Tue, Nov 14, 2023 at 10:53:20AM +0100, Markus Armbruster wrote:
> Daniel P. Berrangé <berrange@redhat.com> writes:
> 
> > On Thu, Nov 02, 2023 at 03:25:25PM +0100, Markus Armbruster wrote:
> >> Now let's try to apply this to migration.
> >>
> >> As long as we can have just one migration, we need just one QAPI object
> >> to configure it.
> >> 
> >> We could create the object with -object / object_add.  For convenience,
> >> we'd probably want to create one with default configuration
> >> automatically on demand.
> >> 
> >> We could use qom-set to change configuration.  If we're not comfortable
> >> with using qom-set for production, we could do something like
> >> blockdev-reopen instead.
> >
> > Do we even need to do this via a QAPI object ?
> 
> No.  It's merely one way to skin the cat.
> 
> > Why are we not just making the obvious design change of passing everything
> > with the 'migrate' / 'migrate-incoming' commands that kick it off:
> >
> > ie:
> >
> > { 'command': 'migrate',
> >   'data': {'uri': 'str',
> >            '*channels': [ 'MigrationChannel' ],
> > 	   '*capabilities': [ 'MigrateCapability' ],
> > 	   '*parameters': [ 'MigrateParameters' ],
> >            '*detach': 'bool', '*resume': 'bool' } }
> >
> >      (deprecated bits trimmed for clarity)
> >
> > and the counterpart:
> >
> > { 'command': 'migrate-incoming',
> >              'data': {'*uri': 'str',
> >                       '*channels': [ 'MigrationChannel' ],
> >                       '*capabilities': [ 'MigrateCapability' ],
> >                       '*parameters': [ 'MigrateParameters' ] } }
> >
> > such that the design is just like 99% of other commands which take
> > all their parameters directly. We already have 'migrate-set-parameters'
> > remaining for the runtime tunables, and can deprecate the usage of this
> > when migration is not already running, and similarly deprecate
> > migrate-set-capabilities.
> 
> We still need commands to adjust configuration while migration runs.

migrate-set-parameters will still satisfy that use case, we can allow
its usage when migration is active and deprecate its usage when migration
is not active

With regards,
Daniel
Juan Quintela Nov. 14, 2023, 10:20 a.m. UTC | #20
Markus Armbruster <armbru@redhat.com> wrote:
D> Cc: Paolo for QOM expertise.
>
> Peter Xu <peterx@redhat.com> writes:
>
>> On Thu, Nov 02, 2023 at 03:25:25PM +0100, Markus Armbruster wrote:
>
> [...]
>
>>> Migration has its own idiosyncratic configuration interface, even though
>>> its configuration needs are not special at all.  This is due to a long
>>> history of decisions that made sense at the time.
>>> 
>>> What kind of interface would we choose if we could start over now?
>>> 
>>> Let's have a look at what I consider the two most complex piece of
>>> configuration to date, namely block backends and QOM objects.
>>> 
>>> In both cases, configuration is a QAPI object type: BlockdevOptions and
>>> ObjectOptions.
>>> 
>>> The common members are the configuration common to all block backends /
>>> objects.  One of them is the type of block backend ("driver" in block
>>> parlance) or QOM object ("qom-type").
>>> 
>>> A type's variant members are the configuration specific to that type.
>>> 
>>> This is suitably expressive.
>>> 
>>> We create a state object for a given configuration object with
>>> blockdev-add / object-add.
>>> 
>>> For block devices, we even have a way to modify a state object's
>>> configuration: blockdev-reopen.  For QOM objects, there's qom-set, but I
>>> don't expect that to work in the general case.  Where "not work" can
>>> range from "does nothing" to "explodes".
>>> 
>>> Now let's try to apply this to migration.
>>> 
>>> As long as we can have just one migration, we need just one QAPI object
>>> to configure it.
>>> 
>>> We could create the object with -object / object_add.  For convenience,
>>> we'd probably want to create one with default configuration
>>> automatically on demand.
>>> 
>>> We could use qom-set to change configuration.  If we're not comfortable
>>> with using qom-set for production, we could do something like
>>> blockdev-reopen instead.
>>> 
>>> Could we move towards such a design?  Turn the existing ad hoc interface
>>> into compatibility sugar for it?
>>
>> Sounds doable to me.
>>
>> I'm not familiar with BlockdevOptions, it looks like something setup once
>> and for all for all relevant parameters need to be set in the same request?
>
> Yes, but you can "reopen", which replaces the entire configuration.
>
> blockdev-add creates a new block backend device, and blockdev-reopen
> reopens a set of existing ones.  Both take the same arguments for each
> device.
>
>> Migration will require each cap/parameter to be set separately anytime,
>> e.g., the user can adjust downtime / bandwidth even during migration in
>> progress.
>
> "Replace entire configuration" isn't a good fit then, because users
> would have to repeat the entire configuration just to tweak one thing.

We have two types of parameters:
- multifd_channels type: we need to set them before the migration start
- downtime: We can change it at any time, even during migration

So I think we need two types of parameters.
For the parameters that one can't change during migration, it could be a
good idea to set all of them at once, i.e.

(qemu) migration_set multifd on multifd-channels 6 multifd_compression none

Whatever syntax we see fit.  That would make it easier to:
- document what parameters make sense together (they need to be set at
  the same time)
- convert migrate_params_check()/migrate_params_test_apply()/migrate_params_apply()
  into a sane interface.
- this parameters don't need to be atomic, they are set by definition by
  the main thread before the migration starts.  The other parameters
  needs to be atomic.

>> Making all caps/parameters QOM objects, or one object containing both
>> attributes, sounds like a good fit.  object_property_* APIs allows setters,
>> I think that's good enough for migration to trigger whatever needed (e.g.
>> migration_rate_set() updates after bandwidth modifications).
>>
>> We can convert e.g. qmp set parameters into a loop of setting each
>> property, it'll be slightly slower because we'll need to do sanity check
>> for each property after each change, but that shouldn't be a hot path
>> anyway so seems fine.
>
> I figure doing initial configuration in one command is convenient.  The
> obvious existing command for that is object-add.
>
> The obvious interface for modifying configuration is a command to change
> just one parameter.  The obvious existing command for that is qom-set.

As said before, I think we need two commands:

- migrate_set_method method [list of method arguments with values]
  Values that need to be set before migration

- migrate_set_parameter parameter value
  Values that can be changed at any time

> Problem: qom-set is a death trap in general.  It can modify any QOM
> property with a setter, and we test basically none of them.  Using it
> for modifying migration configuration would signal it's okay to use
> elsewhere, too.  I'm not sure we want to send that message.  Maybe we
> want to do the opposite, and make it an unstable interface.
>
> Aside: I believe the root problem is our failure to tie "can write" to
> the object's state.  Just because a property can be set with object-add
> doesn't mean it can be validly changed at any time during the object's
> life.

Yeap.

> Problem: when several parameters together have to satisfy constraints,
> going from one valid configuration to another valid configuration may
> require changing several parameters at once, or else go through invalid
> intermediate configurations.

There are other things that "currently" we are not considering and that
make things really strange.

(qemu) migrate_set_capability xbzrle on
(qemu) migrate_set_parameter xbzrle_cache $foo
(qemu) migrate $bar

migration fails for whatever reason

we try again with another method, let's say multifd

(qemu) migrate_set_capability multifd on
(qemu) migrate_set_parameter multifd-channels $foo2
(qemu) migrate $bar2

At this point, xbrzrle_cache is still set, but it makes no sense.  So I
think that if we change the interface, the migrate_set_method that I
suggested would just clean-up all values to its defaults.


> This problem is not at all specific to the migration object.
>
> One solution is careful design to ensure that there's always a sequence
> of transitions through valid configuration.  Can become complicated as
> configuration evolves.  Possible even impractical or impossible.

This is a nightmare.  See migrate_params_check().  Basically everytime
that we add a capability/parameter we need to go through all the list to
see if anything is compatible/incompatible.  It is much better that we
set all that kind of parameters at once, so this is much easier to understand.

> Another solution is a command to modify multiple parameters together,
> leaving alone the others (unlike blockdev-reopen, which overwrites all
> of them).

I still think we need both methods because we have the two kinds of
parameters.

Actually, it is even worse than that, because there are parameters that
need to be set before the migration start but that are not related to
the migration method at all.



>> It'l still be a pity that we still cannot reduce the triplications of qapi
>> docs immediately even with that.  But with that, it seems doable if we will
>> obsolete QMP migrate-set-parameters after we can do QOM-set.

Trying to be practical, this are the capabilities:

{ 'enum': 'MigrationCapability',
  'data': [

'xbzrle'  <- how we compress, but it is used on top of anything else
             except multifd/rdma
'rdma-pin-all' <- this is a rdma parameter, but at the time there were
             not parameters, so here we are.

'auto-converge' <- this is obsolete.  Dirty limit features are better
                   for this.  But this again is independent of anything
                   else.  I will have to double check to see if it can
                   be set at any moment.
'zero-blocks' <- only for storage migration, clearly a parameter.

{ 'name': 'compress', 'features': [ 'deprecated' ] } <- migration
           compression method
'events' <- this should be default and make it a nop for backwards
           compatibility.  I think libvirt sets it always.

'postcopy-ram' <- we want to do postcopy.  I don't even remember why
                  this is a capability when we have to issue a command
                  to start it anyways

{ 'name': 'x-colo', 'features': [ 'unstable' ] }, <- colo is
                  experimental, not even enter here.

'release-ram' <- This only makes sense if:
                 * we are in postcopy
                 * we are migration with the guest stopped (and not sure
                   if in this case it even works)

{ 'name': 'block', 'features': [ 'deprecated' ] } <- deprecated, don't
           spent time here.

'return-path' <-  basically everything except exec should have this.
                  this is way from destination to give us back errors.
                  exec basically is not able to give any error.

'pause-before-switchover' <-  This is independent of anything else, but
                              it makes sense to require to set it before
                              migration start.  Used when you need to
                              "lock" things, like block devices to a
                              single user.  Think iscsi block devices
                              that can only be enabled at the same time
                              on source or destination, but not both.

'multifd' <- migration method

'dirty-bitmaps' <- block migration is weird.  I don't remember this one.


'postcopy-blocktime' <- don't even remember

'late-block-activate' <- I think this is somehow related to
                         pause-before-switchover, but I don't ever
                         remember the details.  I guess we did it wrong
                         the fist time.

{ 'name': 'x-ignore-shared', 'features': [ 'unstable' ] } <- we need to
           mark this stable.  Basically means that:
           * we are migratin on the same machine
           * RAM is mmaped in source/destination shared
           * so we don't need to migrate it.

'validate-uuid' <- making sure that we are migration to the right
                   destination.  Independent of anything else.  Should
                   be set before migration starts.

'background-snapshot' <- block devices are big and slow.  Migrating them
           is complilaceed.  Forgot about them.

'zero-copy-send' <- only valid for multifd, should be a paramter

'postcopy-preempt' <-  We create a new channel for postcopy, depends on
                       postcopy and needs to be set before migration starts.

'switchover-ack' <- Independent of anything else.  Needs to be set
           before migration starts.

'dirty-limit' <- Autoconverge was bad.  We create another way of doing
                 the same functionality.


And now we go with migration_parametres:

'announce-initial', 'announce-max',
'announce-rounds', 'announce-step',
	I think we should set all of them at the same time.
        Before migration ends.  This is SDN for you, sometimes we need
        to repeat the ARP packets to get they passed through routers.
        Hello openstack.


{ 'name': 'compress-level', 'features': [ 'deprecated' ] },
{ 'name': 'compress-threads', 'features': [ 'deprecated' ] },
{ 'name': 'decompress-threads', 'features': [ 'deprecated' ] },
{ 'name': 'compress-wait-thread', 'features': [ 'deprecated'] },

	Parameters for old compression method.  Should be set before we start.

'throttle-trigger-threshold',
'cpu-throttle-initial',
'cpu-throttle-increment',
'cpu-throttle-tailslow',
'max-cpu-throttle'
	Autoconverge parameters.  Make sense to change them after we start.
        Remember that autoconverge was a good idea and a bad
        implementation.

'tls-creds', 'tls-hostname', 'tls-authz',

        They are only needed if we are using TLS.  But we only need TLS
        depending on the URI we are using (tcp+tls).

'max-bandwidth',
        Can be changed at any time.

'avail-switchover-bandwidth',
        Can be changed at any time.  Better explained, it shouldn't be
        needed.  But we know that it is needed after migration starts,
        so ...

'downtime-limit',
        Can be changed at any point.

{ 'name': 'x-checkpoint-delay', 'features': [ 'unstable' ] },
	Colo parameter.  Needs to be set before migration starts.

{ 'name': 'block-incremental', 'features': [ 'deprecated' ] },
	deprecated, but needs to be set before migration starts.

'multifd-channels',
	Needs to be set before migration starts.

'xbzrle-cache-size'
	Needs to be set before migration starts.
        It *could* be changed after migration starts, but ...

'max-postcopy-bandwidth'
	When we are in postcopy stage, reasonable thing to do is to use
	all available bandwidth.  When that is not true, use this parameter.


'multifd-compression'
'multifd-zlib-level'
'multifd-zstd-level'
	multifd compression parameters.  Depending of
        multifd-compression value, the others make sense or not.

'block-bitmap-mapping',
	I don't understand/remember this new block migration, so I can't comment.

{ 'name': 'x-vcpu-dirty-limit-period', 'features': ['unstable'] },
	Needs to be set before migration starts.
'vcpu-dirty-limit',
	One could defend to change it after migration starts, but ....

'mode'
        set before migration starts


So you can see that we have lots of parameters to try to make sense of.

Later, Juan.
Juan Quintela Nov. 14, 2023, 10:28 a.m. UTC | #21
Daniel P. Berrangé <berrange@redhat.com> wrote:
> On Thu, Nov 02, 2023 at 03:25:25PM +0100, Markus Armbruster wrote:
>> Now let's try to apply this to migration.
>>
>> As long as we can have just one migration, we need just one QAPI object
>> to configure it.
>> 
>> We could create the object with -object / object_add.  For convenience,
>> we'd probably want to create one with default configuration
>> automatically on demand.
>> 
>> We could use qom-set to change configuration.  If we're not comfortable
>> with using qom-set for production, we could do something like
>> blockdev-reopen instead.
>
> Do we even need to do this via a QAPI object ?
>
> Why are we not just making the obvious design change of passing everything
> with the 'migrate' / 'migrate-incoming' commands that kick it off:
>
> ie:
>
> { 'command': 'migrate',
>   'data': {'uri': 'str',
>            '*channels': [ 'MigrationChannel' ],
> 	   '*capabilities': [ 'MigrateCapability' ],
> 	   '*parameters': [ 'MigrateParameters' ],
>            '*detach': 'bool', '*resume': 'bool' } }

Once that we are doing incompatible changes:
- resume can be another parameter
- detach is not needed.  QMP don't use it, and HMP don't need to pass it
  to qmp_migrate() to make the non-detached implemntation.


>      (deprecated bits trimmed for clarity)
>
> and the counterpart:
>
> { 'command': 'migrate-incoming',
>              'data': {'*uri': 'str',
>                       '*channels': [ 'MigrationChannel' ],
>                       '*capabilities': [ 'MigrateCapability' ],
>                       '*parameters': [ 'MigrateParameters' ] } }
>
> such that the design is just like 99% of other commands which take
> all their parameters directly. We already have 'migrate-set-parameters'
> remaining for the runtime tunables, and can deprecate the usage of this
> when migration is not already running, and similarly deprecate
> migrate-set-capabilities.

This makes sense to me, but once that we change, we could try to merge
capabilities and parameters.  See my other email on this topic.
Basically the distition is arbitrary, so just have one of them.

Or better, as I said in the other email, we have two types of
parameters:
- the ones that need to be set before migration starts
- the ones that can be changed at any time

So to be simpler, I think that 1st set should be passed to the commands
themselves and the others should only be set with
migrate_set_parameters.

Later, Juan.
Daniel P. Berrangé Nov. 14, 2023, 10:34 a.m. UTC | #22
On Tue, Nov 14, 2023 at 11:28:28AM +0100, Juan Quintela wrote:
> Daniel P. Berrangé <berrange@redhat.com> wrote:
> > On Thu, Nov 02, 2023 at 03:25:25PM +0100, Markus Armbruster wrote:
> >> Now let's try to apply this to migration.
> >>
> >> As long as we can have just one migration, we need just one QAPI object
> >> to configure it.
> >> 
> >> We could create the object with -object / object_add.  For convenience,
> >> we'd probably want to create one with default configuration
> >> automatically on demand.
> >> 
> >> We could use qom-set to change configuration.  If we're not comfortable
> >> with using qom-set for production, we could do something like
> >> blockdev-reopen instead.
> >
> > Do we even need to do this via a QAPI object ?
> >
> > Why are we not just making the obvious design change of passing everything
> > with the 'migrate' / 'migrate-incoming' commands that kick it off:
> >
> > ie:
> >
> > { 'command': 'migrate',
> >   'data': {'uri': 'str',
> >            '*channels': [ 'MigrationChannel' ],
> > 	   '*capabilities': [ 'MigrateCapability' ],
> > 	   '*parameters': [ 'MigrateParameters' ],
> >            '*detach': 'bool', '*resume': 'bool' } }
> 
> Once that we are doing incompatible changes:

This is not incompatible - it is fully backcompatible with existing
usage initially, which should make it pretty trivial to introduce
to the code. Mgmt apps can carry on using migrate-set-capabilities
and migrate-set-parameters, and ignore these new 'capabilities'
and 'parameters' fields if desired.

Only once we decide to deprecate migrate-set-capabilities, would
it become incompatible.

> - resume can be another parameter

Potentially yes, but 'resume' is conceptually different to all
the other capabilities and parameters, so I could see it remaining
as a distinct field as it is now

> - detach is not needed.  QMP don't use it, and HMP don't need to pass it
>   to qmp_migrate() to make the non-detached implemntation.

We could deprecate that today then.

> 
> 
> >      (deprecated bits trimmed for clarity)
> >
> > and the counterpart:
> >
> > { 'command': 'migrate-incoming',
> >              'data': {'*uri': 'str',
> >                       '*channels': [ 'MigrationChannel' ],
> >                       '*capabilities': [ 'MigrateCapability' ],
> >                       '*parameters': [ 'MigrateParameters' ] } }
> >
> > such that the design is just like 99% of other commands which take
> > all their parameters directly. We already have 'migrate-set-parameters'
> > remaining for the runtime tunables, and can deprecate the usage of this
> > when migration is not already running, and similarly deprecate
> > migrate-set-capabilities.
> 
> This makes sense to me, but once that we change, we could try to merge
> capabilities and parameters.  See my other email on this topic.
> Basically the distition is arbitrary, so just have one of them.
> 
> Or better, as I said in the other email, we have two types of
> parameters:
> - the ones that need to be set before migration starts
> - the ones that can be changed at any time
> 
> So to be simpler, I think that 1st set should be passed to the commands
> themselves and the others should only be set with
> migrate_set_parameters.

As a mgmt app dev I don't want there to be an arbitrary distinction
between what I can pass with 'migrate' and what I have to use a
separate command for. If I'm starting a migration, I just want to
pass all the settings with the 'migrate' command. I should not have
to care about separate 'migrate-set-parameters' command at all, unless
I actually need to change something on the fly (many migrates will
never need this).

With regards,
Daniel
Juan Quintela Nov. 14, 2023, 10:42 a.m. UTC | #23
Daniel P. Berrangé <berrange@redhat.com> wrote:
> On Tue, Nov 14, 2023 at 11:28:28AM +0100, Juan Quintela wrote:
>> Daniel P. Berrangé <berrange@redhat.com> wrote:
>> > On Thu, Nov 02, 2023 at 03:25:25PM +0100, Markus Armbruster wrote:
>> >> Now let's try to apply this to migration.
>> >>
>> >> As long as we can have just one migration, we need just one QAPI object
>> >> to configure it.
>> >> 
>> >> We could create the object with -object / object_add.  For convenience,
>> >> we'd probably want to create one with default configuration
>> >> automatically on demand.
>> >> 
>> >> We could use qom-set to change configuration.  If we're not comfortable
>> >> with using qom-set for production, we could do something like
>> >> blockdev-reopen instead.
>> >
>> > Do we even need to do this via a QAPI object ?
>> >
>> > Why are we not just making the obvious design change of passing everything
>> > with the 'migrate' / 'migrate-incoming' commands that kick it off:
>> >
>> > ie:
>> >
>> > { 'command': 'migrate',
>> >   'data': {'uri': 'str',
>> >            '*channels': [ 'MigrationChannel' ],
>> > 	   '*capabilities': [ 'MigrateCapability' ],
>> > 	   '*parameters': [ 'MigrateParameters' ],
>> >            '*detach': 'bool', '*resume': 'bool' } }
>> 
>> Once that we are doing incompatible changes:
>
> This is not incompatible - it is fully backcompatible with existing
> usage initially, which should make it pretty trivial to introduce
> to the code. Mgmt apps can carry on using migrate-set-capabilities
> and migrate-set-parameters, and ignore these new 'capabilities'
> and 'parameters' fields if desired.
>
> Only once we decide to deprecate migrate-set-capabilities, would
> it become incompatible.

Oh, I mean that the interface is incompatible.  Not that we can't do the
current one on top of this one.

>> - resume can be another parameter
>
> Potentially yes, but 'resume' is conceptually different to all
> the other capabilities and parameters, so I could see it remaining
> as a distinct field as it is now

It is conceptually different.  But it is the _only_ one needed that
capability.  And putting that on the parameters and just checking it
first will achieve the same result.  I think that being special here
don't help, for instance, to check for incompatible things, we need to
also pass resume (it is only valid for postcopy).

>> - detach is not needed.  QMP don't use it, and HMP don't need to pass it
>>   to qmp_migrate() to make the non-detached implemntation.
>
> We could deprecate that today then.

Yeap.  Will do it.

>> >      (deprecated bits trimmed for clarity)
>> >
>> > and the counterpart:
>> >
>> > { 'command': 'migrate-incoming',
>> >              'data': {'*uri': 'str',
>> >                       '*channels': [ 'MigrationChannel' ],
>> >                       '*capabilities': [ 'MigrateCapability' ],
>> >                       '*parameters': [ 'MigrateParameters' ] } }
>> >
>> > such that the design is just like 99% of other commands which take
>> > all their parameters directly. We already have 'migrate-set-parameters'
>> > remaining for the runtime tunables, and can deprecate the usage of this
>> > when migration is not already running, and similarly deprecate
>> > migrate-set-capabilities.
>> 
>> This makes sense to me, but once that we change, we could try to merge
>> capabilities and parameters.  See my other email on this topic.
>> Basically the distition is arbitrary, so just have one of them.
>> 
>> Or better, as I said in the other email, we have two types of
>> parameters:
>> - the ones that need to be set before migration starts
>> - the ones that can be changed at any time
>> 
>> So to be simpler, I think that 1st set should be passed to the commands
>> themselves and the others should only be set with
>> migrate_set_parameters.
>
> As a mgmt app dev I don't want there to be an arbitrary distinction
> between what I can pass with 'migrate' and what I have to use a
> separate command for.

If it ever wants to set the parameter that it "can" change after
migration starts, it needs to know that they are different.

Once told that, I don't write management apps and you do so I am not
discussing further O:-)

> If I'm starting a migration, I just want to
> pass all the settings with the 'migrate' command. I should not have
> to care about separate 'migrate-set-parameters' command at all, unless
> I actually need to change something on the fly (many migrates will
> never need this).


What OpenStack/CNV do?

If my memory is right, at least one of them used progressive downtimes
every couple of iterations or something like that.

Later, Juan.

> With regards,
> Daniel
Daniel P. Berrangé Nov. 14, 2023, 10:45 a.m. UTC | #24
On Tue, Nov 14, 2023 at 11:42:27AM +0100, Juan Quintela wrote:
> Daniel P. Berrangé <berrange@redhat.com> wrote:
> > On Tue, Nov 14, 2023 at 11:28:28AM +0100, Juan Quintela wrote:
> >> Daniel P. Berrangé <berrange@redhat.com> wrote:
> >> > On Thu, Nov 02, 2023 at 03:25:25PM +0100, Markus Armbruster wrote:
> >> >> Now let's try to apply this to migration.
> >> >>
> >> >> As long as we can have just one migration, we need just one QAPI object
> >> >> to configure it.
> >> >> 
> >> >> We could create the object with -object / object_add.  For convenience,
> >> >> we'd probably want to create one with default configuration
> >> >> automatically on demand.
> >> >> 
> >> >> We could use qom-set to change configuration.  If we're not comfortable
> >> >> with using qom-set for production, we could do something like
> >> >> blockdev-reopen instead.
> >> >
> >> > Do we even need to do this via a QAPI object ?
> >> >
> >> > Why are we not just making the obvious design change of passing everything
> >> > with the 'migrate' / 'migrate-incoming' commands that kick it off:
> >> >
> >> > ie:
> >> >
> >> > { 'command': 'migrate',
> >> >   'data': {'uri': 'str',
> >> >            '*channels': [ 'MigrationChannel' ],
> >> > 	   '*capabilities': [ 'MigrateCapability' ],
> >> > 	   '*parameters': [ 'MigrateParameters' ],
> >> >            '*detach': 'bool', '*resume': 'bool' } }
> >> 
> >> Once that we are doing incompatible changes:
> >
> > This is not incompatible - it is fully backcompatible with existing
> > usage initially, which should make it pretty trivial to introduce
> > to the code. Mgmt apps can carry on using migrate-set-capabilities
> > and migrate-set-parameters, and ignore these new 'capabilities'
> > and 'parameters' fields if desired.
> >
> > Only once we decide to deprecate migrate-set-capabilities, would
> > it become incompatible.
> 
> Oh, I mean that the interface is incompatible.  Not that we can't do the
> current one on top of this one.
> 
> >> - resume can be another parameter
> >
> > Potentially yes, but 'resume' is conceptually different to all
> > the other capabilities and parameters, so I could see it remaining
> > as a distinct field as it is now
> 
> It is conceptually different.  But it is the _only_ one needed that
> capability.  And putting that on the parameters and just checking it
> first will achieve the same result.  I think that being special here
> don't help, for instance, to check for incompatible things, we need to
> also pass resume (it is only valid for postcopy).
> 
> >> - detach is not needed.  QMP don't use it, and HMP don't need to pass it
> >>   to qmp_migrate() to make the non-detached implemntation.
> >
> > We could deprecate that today then.
> 
> Yeap.  Will do it.
> 
> >> >      (deprecated bits trimmed for clarity)
> >> >
> >> > and the counterpart:
> >> >
> >> > { 'command': 'migrate-incoming',
> >> >              'data': {'*uri': 'str',
> >> >                       '*channels': [ 'MigrationChannel' ],
> >> >                       '*capabilities': [ 'MigrateCapability' ],
> >> >                       '*parameters': [ 'MigrateParameters' ] } }
> >> >
> >> > such that the design is just like 99% of other commands which take
> >> > all their parameters directly. We already have 'migrate-set-parameters'
> >> > remaining for the runtime tunables, and can deprecate the usage of this
> >> > when migration is not already running, and similarly deprecate
> >> > migrate-set-capabilities.
> >> 
> >> This makes sense to me, but once that we change, we could try to merge
> >> capabilities and parameters.  See my other email on this topic.
> >> Basically the distition is arbitrary, so just have one of them.
> >> 
> >> Or better, as I said in the other email, we have two types of
> >> parameters:
> >> - the ones that need to be set before migration starts
> >> - the ones that can be changed at any time
> >> 
> >> So to be simpler, I think that 1st set should be passed to the commands
> >> themselves and the others should only be set with
> >> migrate_set_parameters.
> >
> > As a mgmt app dev I don't want there to be an arbitrary distinction
> > between what I can pass with 'migrate' and what I have to use a
> > separate command for.
> 
> If it ever wants to set the parameter that it "can" change after
> migration starts, it needs to know that they are different.
> 
> Once told that, I don't write management apps and you do so I am not
> discussing further O:-)
> 
> > If I'm starting a migration, I just want to
> > pass all the settings with the 'migrate' command. I should not have
> > to care about separate 'migrate-set-parameters' command at all, unless
> > I actually need to change something on the fly (many migrates will
> > never need this).
> 
> 
> What OpenStack/CNV do?
> 
> If my memory is right, at least one of them used progressive downtimes
> every couple of iterations or something like that.

If they're using pre-copy, yes, they both can do progressive tuning.

If using post-copy though, you potentially never need to change any
tunable on the fly.

With regards,
Daniel
diff mbox series

Patch

diff --git a/qapi/migration.json b/qapi/migration.json
index 8843e74b59..45d69787ae 100644
--- a/qapi/migration.json
+++ b/qapi/migration.json
@@ -851,189 +851,6 @@ 
            { 'name': 'x-vcpu-dirty-limit-period', 'features': ['unstable'] },
            'vcpu-dirty-limit'] }
 
-##
-# @MigrateSetParameters:
-#
-# @announce-initial: Initial delay (in milliseconds) before sending
-#     the first announce (Since 4.0)
-#
-# @announce-max: Maximum delay (in milliseconds) between packets in
-#     the announcement (Since 4.0)
-#
-# @announce-rounds: Number of self-announce packets sent after
-#     migration (Since 4.0)
-#
-# @announce-step: Increase in delay (in milliseconds) between
-#     subsequent packets in the announcement (Since 4.0)
-#
-# @compress-level: compression level
-#
-# @compress-threads: compression thread count
-#
-# @compress-wait-thread: Controls behavior when all compression
-#     threads are currently busy.  If true (default), wait for a free
-#     compression thread to become available; otherwise, send the page
-#     uncompressed.  (Since 3.1)
-#
-# @decompress-threads: decompression thread count
-#
-# @throttle-trigger-threshold: The ratio of bytes_dirty_period and
-#     bytes_xfer_period to trigger throttling.  It is expressed as
-#     percentage.  The default value is 50. (Since 5.0)
-#
-# @cpu-throttle-initial: Initial percentage of time guest cpus are
-#     throttled when migration auto-converge is activated.  The
-#     default value is 20. (Since 2.7)
-#
-# @cpu-throttle-increment: throttle percentage increase each time
-#     auto-converge detects that migration is not making progress.
-#     The default value is 10. (Since 2.7)
-#
-# @cpu-throttle-tailslow: Make CPU throttling slower at tail stage At
-#     the tail stage of throttling, the Guest is very sensitive to CPU
-#     percentage while the @cpu-throttle -increment is excessive
-#     usually at tail stage.  If this parameter is true, we will
-#     compute the ideal CPU percentage used by the Guest, which may
-#     exactly make the dirty rate match the dirty rate threshold.
-#     Then we will choose a smaller throttle increment between the one
-#     specified by @cpu-throttle-increment and the one generated by
-#     ideal CPU percentage.  Therefore, it is compatible to
-#     traditional throttling, meanwhile the throttle increment won't
-#     be excessive at tail stage.  The default value is false.  (Since
-#     5.1)
-#
-# @tls-creds: ID of the 'tls-creds' object that provides credentials
-#     for establishing a TLS connection over the migration data
-#     channel.  On the outgoing side of the migration, the credentials
-#     must be for a 'client' endpoint, while for the incoming side the
-#     credentials must be for a 'server' endpoint.  Setting this to a
-#     non-empty string enables TLS for all migrations.  An empty
-#     string means that QEMU will use plain text mode for migration,
-#     rather than TLS (Since 2.9) Previously (since 2.7), this was
-#     reported by omitting tls-creds instead.
-#
-# @tls-hostname: hostname of the target host for the migration.  This
-#     is required when using x509 based TLS credentials and the
-#     migration URI does not already include a hostname.  For example
-#     if using fd: or exec: based migration, the hostname must be
-#     provided so that the server's x509 certificate identity can be
-#     validated.  (Since 2.7) An empty string means that QEMU will use
-#     the hostname associated with the migration URI, if any.  (Since
-#     2.9) Previously (since 2.7), this was reported by omitting
-#     tls-hostname instead.
-#
-# @max-bandwidth: to set maximum speed for migration.  maximum speed
-#     in bytes per second.  (Since 2.8)
-#
-# @downtime-limit: set maximum tolerated downtime for migration.
-#     maximum downtime in milliseconds (Since 2.8)
-#
-# @x-checkpoint-delay: the delay time between two COLO checkpoints.
-#     (Since 2.8)
-#
-# @block-incremental: Affects how much storage is migrated when the
-#     block migration capability is enabled.  When false, the entire
-#     storage backing chain is migrated into a flattened image at the
-#     destination; when true, only the active qcow2 layer is migrated
-#     and the destination must already have access to the same backing
-#     chain as was used on the source.  (since 2.10)
-#
-# @multifd-channels: Number of channels used to migrate data in
-#     parallel.  This is the same number that the number of sockets
-#     used for migration.  The default value is 2 (since 4.0)
-#
-# @xbzrle-cache-size: cache size to be used by XBZRLE migration.  It
-#     needs to be a multiple of the target page size and a power of 2
-#     (Since 2.11)
-#
-# @max-postcopy-bandwidth: Background transfer bandwidth during
-#     postcopy.  Defaults to 0 (unlimited).  In bytes per second.
-#     (Since 3.0)
-#
-# @max-cpu-throttle: maximum cpu throttle percentage.  The default
-#     value is 99. (Since 3.1)
-#
-# @multifd-compression: Which compression method to use.  Defaults to
-#     none.  (Since 5.0)
-#
-# @multifd-zlib-level: Set the compression level to be used in live
-#     migration, the compression level is an integer between 0 and 9,
-#     where 0 means no compression, 1 means the best compression
-#     speed, and 9 means best compression ratio which will consume
-#     more CPU. Defaults to 1. (Since 5.0)
-#
-# @multifd-zstd-level: Set the compression level to be used in live
-#     migration, the compression level is an integer between 0 and 20,
-#     where 0 means no compression, 1 means the best compression
-#     speed, and 20 means best compression ratio which will consume
-#     more CPU. Defaults to 1. (Since 5.0)
-#
-# @block-bitmap-mapping: Maps block nodes and bitmaps on them to
-#     aliases for the purpose of dirty bitmap migration.  Such aliases
-#     may for example be the corresponding names on the opposite site.
-#     The mapping must be one-to-one, but not necessarily complete: On
-#     the source, unmapped bitmaps and all bitmaps on unmapped nodes
-#     will be ignored.  On the destination, encountering an unmapped
-#     alias in the incoming migration stream will result in a report,
-#     and all further bitmap migration data will then be discarded.
-#     Note that the destination does not know about bitmaps it does
-#     not receive, so there is no limitation or requirement regarding
-#     the number of bitmaps received, or how they are named, or on
-#     which nodes they are placed.  By default (when this parameter
-#     has never been set), bitmap names are mapped to themselves.
-#     Nodes are mapped to their block device name if there is one, and
-#     to their node name otherwise.  (Since 5.2)
-#
-# @x-vcpu-dirty-limit-period: Periodic time (in milliseconds) of dirty
-#     limit during live migration.  Should be in the range 1 to 1000ms.
-#     Defaults to 1000ms.  (Since 8.1)
-#
-# @vcpu-dirty-limit: Dirtyrate limit (MB/s) during live migration.
-#     Defaults to 1.  (Since 8.1)
-#
-# Features:
-#
-# @unstable: Members @x-checkpoint-delay and @x-vcpu-dirty-limit-period
-#     are experimental.
-#
-# TODO: either fuse back into MigrationParameters, or make
-#     MigrationParameters members mandatory
-#
-# Since: 2.4
-##
-{ 'struct': 'MigrateSetParameters',
-  'data': { '*announce-initial': 'size',
-            '*announce-max': 'size',
-            '*announce-rounds': 'size',
-            '*announce-step': 'size',
-            '*compress-level': 'uint8',
-            '*compress-threads': 'uint8',
-            '*compress-wait-thread': 'bool',
-            '*decompress-threads': 'uint8',
-            '*throttle-trigger-threshold': 'uint8',
-            '*cpu-throttle-initial': 'uint8',
-            '*cpu-throttle-increment': 'uint8',
-            '*cpu-throttle-tailslow': 'bool',
-            '*tls-creds': 'StrOrNull',
-            '*tls-hostname': 'StrOrNull',
-            '*tls-authz': 'StrOrNull',
-            '*max-bandwidth': 'size',
-            '*downtime-limit': 'uint64',
-            '*x-checkpoint-delay': { 'type': 'uint32',
-                                     'features': [ 'unstable' ] },
-            '*block-incremental': 'bool',
-            '*multifd-channels': 'uint8',
-            '*xbzrle-cache-size': 'size',
-            '*max-postcopy-bandwidth': 'size',
-            '*max-cpu-throttle': 'uint8',
-            '*multifd-compression': 'MultiFDCompression',
-            '*multifd-zlib-level': 'uint8',
-            '*multifd-zstd-level': 'uint8',
-            '*block-bitmap-mapping': [ 'BitmapMigrationNodeAlias' ],
-            '*x-vcpu-dirty-limit-period': { 'type': 'uint64',
-                                            'features': [ 'unstable' ] },
-            '*vcpu-dirty-limit': 'uint64'} }
-
 ##
 # @migrate-set-parameters:
 #
@@ -1048,7 +865,7 @@ 
 # <- { "return": {} }
 ##
 { 'command': 'migrate-set-parameters', 'boxed': true,
-  'data': 'MigrateSetParameters' }
+  'data': 'MigrationParameters' }
 
 ##
 # @MigrationParameters:
@@ -1214,9 +1031,9 @@ 
             '*cpu-throttle-initial': 'uint8',
             '*cpu-throttle-increment': 'uint8',
             '*cpu-throttle-tailslow': 'bool',
-            '*tls-creds': 'str',
-            '*tls-hostname': 'str',
-            '*tls-authz': 'str',
+            '*tls-creds': 'StrOrNull',
+            '*tls-hostname': 'StrOrNull',
+            '*tls-authz': 'StrOrNull',
             '*max-bandwidth': 'size',
             '*downtime-limit': 'uint64',
             '*x-checkpoint-delay': { 'type': 'uint32',
diff --git a/include/hw/qdev-properties.h b/include/hw/qdev-properties.h
index e1df08876c..3ae83d8390 100644
--- a/include/hw/qdev-properties.h
+++ b/include/hw/qdev-properties.h
@@ -59,6 +59,7 @@  extern const PropertyInfo qdev_prop_uint64_checkmask;
 extern const PropertyInfo qdev_prop_int64;
 extern const PropertyInfo qdev_prop_size;
 extern const PropertyInfo qdev_prop_string;
+extern const PropertyInfo qdev_prop_str_or_null;
 extern const PropertyInfo qdev_prop_on_off_auto;
 extern const PropertyInfo qdev_prop_size32;
 extern const PropertyInfo qdev_prop_arraylen;
@@ -171,6 +172,8 @@  extern const PropertyInfo qdev_prop_link;
     DEFINE_PROP_UNSIGNED(_n, _s, _f, _d, qdev_prop_size, uint64_t)
 #define DEFINE_PROP_STRING(_n, _s, _f)             \
     DEFINE_PROP(_n, _s, _f, qdev_prop_string, char*)
+#define DEFINE_PROP_STR_OR_NULL(_n, _s, _f)             \
+    DEFINE_PROP(_n, _s, _f, qdev_prop_str_or_null, StrOrNull *)
 #define DEFINE_PROP_ON_OFF_AUTO(_n, _s, _f, _d) \
     DEFINE_PROP_SIGNED(_n, _s, _f, _d, qdev_prop_on_off_auto, OnOffAuto)
 #define DEFINE_PROP_SIZE32(_n, _s, _f, _d)                       \
diff --git a/migration/options.h b/migration/options.h
index 045e2a41a2..124a5d450f 100644
--- a/migration/options.h
+++ b/migration/options.h
@@ -56,6 +56,7 @@  bool migrate_zero_copy_send(void);
 
 bool migrate_multifd_flush_after_each_section(void);
 bool migrate_postcopy(void);
+/* Check whether TLS is enabled for migration */
 bool migrate_tls(void);
 
 /* capabilities helpers */
@@ -90,6 +91,8 @@  const char *migrate_tls_authz(void);
 const char *migrate_tls_creds(void);
 const char *migrate_tls_hostname(void);
 uint64_t migrate_xbzrle_cache_size(void);
+StrOrNull *StrOrNull_from_str(const char *str);
+const char *str_from_StrOrNull(StrOrNull *obj);
 
 /* parameters setters */
 
diff --git a/hw/core/qdev-properties.c b/hw/core/qdev-properties.c
index 357b8761b5..b4bbb52ae9 100644
--- a/hw/core/qdev-properties.c
+++ b/hw/core/qdev-properties.c
@@ -2,6 +2,7 @@ 
 #include "hw/qdev-properties.h"
 #include "qapi/error.h"
 #include "qapi/qapi-types-misc.h"
+#include "qapi/qapi-visit-misc.h"
 #include "qapi/qmp/qerror.h"
 #include "qemu/ctype.h"
 #include "qemu/error-report.h"
@@ -490,6 +491,45 @@  const PropertyInfo qdev_prop_string = {
     .set   = set_string,
 };
 
+/* --- StrOrNull --- */
+
+static void release_str_or_null(Object *obj, const char *name, void *opaque)
+{
+    Property *prop = opaque;
+
+    qapi_free_StrOrNull(*(StrOrNull **)object_field_prop_ptr(obj, prop));
+}
+
+static void get_str_or_null(Object *obj, Visitor *v, const char *name,
+                            void *opaque, Error **errp)
+{
+    Property *prop = opaque;
+    StrOrNull **ptr = object_field_prop_ptr(obj, prop);
+
+    visit_type_StrOrNull(v, name, ptr, errp);
+}
+
+static void set_str_or_null(Object *obj, Visitor *v, const char *name,
+                            void *opaque, Error **errp)
+{
+    Property *prop = opaque;
+    StrOrNull **ptr = object_field_prop_ptr(obj, prop);
+    StrOrNull *son;
+
+    if (!visit_type_StrOrNull(v, name, &son, errp)) {
+        return;
+    }
+    qapi_free_StrOrNull(*ptr);
+    *ptr = son;
+}
+
+const PropertyInfo qdev_prop_str_or_null = {
+    .name  = "StrOrNull",
+    .release = release_str_or_null,
+    .get   = get_str_or_null,
+    .set   = set_str_or_null,
+};
+
 /* --- on/off/auto --- */
 
 const PropertyInfo qdev_prop_on_off_auto = {
diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c
index c115ef2d23..88a8ccb475 100644
--- a/migration/migration-hmp-cmds.c
+++ b/migration/migration-hmp-cmds.c
@@ -257,6 +257,7 @@  void hmp_info_migrate_capabilities(Monitor *mon, const QDict *qdict)
 void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
 {
     MigrationParameters *params;
+    const char *str;
 
     params = qmp_query_migrate_parameters(NULL);
 
@@ -309,14 +310,18 @@  void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
         monitor_printf(mon, "%s: %u\n",
             MigrationParameter_str(MIGRATION_PARAMETER_MAX_CPU_THROTTLE),
             params->max_cpu_throttle);
-        assert(params->tls_creds);
+        str = str_from_StrOrNull(params->tls_creds);
         monitor_printf(mon, "%s: '%s'\n",
             MigrationParameter_str(MIGRATION_PARAMETER_TLS_CREDS),
-            params->tls_creds);
-        assert(params->tls_hostname);
+            str ? str : "");
+        str = str_from_StrOrNull(params->tls_hostname);
         monitor_printf(mon, "%s: '%s'\n",
             MigrationParameter_str(MIGRATION_PARAMETER_TLS_HOSTNAME),
-            params->tls_hostname);
+            str ? str : "");
+        str = str_from_StrOrNull(params->tls_authz);
+        monitor_printf(mon, "%s: '%s'\n",
+            MigrationParameter_str(MIGRATION_PARAMETER_TLS_AUTHZ),
+            str ? str : "");
         assert(params->has_max_bandwidth);
         monitor_printf(mon, "%s: %" PRIu64 " bytes/second\n",
             MigrationParameter_str(MIGRATION_PARAMETER_MAX_BANDWIDTH),
@@ -345,9 +350,6 @@  void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
         monitor_printf(mon, "%s: %" PRIu64 "\n",
             MigrationParameter_str(MIGRATION_PARAMETER_MAX_POSTCOPY_BANDWIDTH),
             params->max_postcopy_bandwidth);
-        monitor_printf(mon, "%s: '%s'\n",
-            MigrationParameter_str(MIGRATION_PARAMETER_TLS_AUTHZ),
-            params->tls_authz);
 
         if (params->has_block_bitmap_mapping) {
             const BitmapMigrationNodeAliasList *bmnal;
@@ -497,7 +499,7 @@  void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
     const char *param = qdict_get_str(qdict, "parameter");
     const char *valuestr = qdict_get_str(qdict, "value");
     Visitor *v = string_input_visitor_new(valuestr);
-    MigrateSetParameters *p = g_new0(MigrateSetParameters, 1);
+    MigrationParameters *p = g_new0(MigrationParameters, 1);
     uint64_t valuebw = 0;
     uint64_t cache_size;
     Error *err = NULL;
@@ -657,7 +659,7 @@  void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict)
     qmp_migrate_set_parameters(p, &err);
 
  cleanup:
-    qapi_free_MigrateSetParameters(p);
+    qapi_free_MigrationParameters(p);
     visit_free(v);
     hmp_handle_error(mon, err);
 }
diff --git a/migration/options.c b/migration/options.c
index 6bbfd4853d..12e392f68c 100644
--- a/migration/options.c
+++ b/migration/options.c
@@ -164,9 +164,12 @@  Property migration_properties[] = {
     DEFINE_PROP_SIZE("announce-step", MigrationState,
                       parameters.announce_step,
                       DEFAULT_MIGRATE_ANNOUNCE_STEP),
-    DEFINE_PROP_STRING("tls-creds", MigrationState, parameters.tls_creds),
-    DEFINE_PROP_STRING("tls-hostname", MigrationState, parameters.tls_hostname),
-    DEFINE_PROP_STRING("tls-authz", MigrationState, parameters.tls_authz),
+    DEFINE_PROP_STR_OR_NULL("tls-creds", MigrationState,
+                            parameters.tls_creds),
+    DEFINE_PROP_STR_OR_NULL("tls-hostname", MigrationState,
+                            parameters.tls_hostname),
+    DEFINE_PROP_STR_OR_NULL("tls-authz", MigrationState,
+                            parameters.tls_authz),
     DEFINE_PROP_UINT64("x-vcpu-dirty-limit-period", MigrationState,
                        parameters.x_vcpu_dirty_limit_period,
                        DEFAULT_MIGRATE_VCPU_DIRTY_LIMIT_PERIOD),
@@ -201,6 +204,38 @@  Property migration_properties[] = {
     DEFINE_PROP_END_OF_LIST(),
 };
 
+/*
+ * NOTE: here we have a trick when converting the empty string (""): we
+ * need to make sure the empty string ("") will be converted to NULL, as
+ * some TLS code may rely on that to detect whether something is enabled
+ * (e.g., the tls_authz field).
+ */
+const char *str_from_StrOrNull(StrOrNull *obj)
+{
+    if (!obj || obj->type == QTYPE_QNULL) {
+        return NULL;
+    } else if (obj->type == QTYPE_QSTRING) {
+        if (obj->u.s[0] == '\0') {
+            return NULL;
+        } else {
+            return obj->u.s;
+        }
+    } else {
+        abort();
+    }
+}
+
+StrOrNull *StrOrNull_from_str(const char *str)
+{
+    StrOrNull *obj = g_new0(StrOrNull, 1);
+
+    assert(str);
+    obj->type = QTYPE_QSTRING;
+    obj->u.s = g_strdup(str);
+
+    return obj;
+}
+
 bool migrate_auto_converge(void)
 {
     MigrationState *s = migrate_get_current();
@@ -378,9 +413,11 @@  bool migrate_postcopy(void)
 
 bool migrate_tls(void)
 {
-    MigrationState *s = migrate_get_current();
-
-    return s->parameters.tls_creds && *s->parameters.tls_creds;
+    /*
+     * The whole TLS feature relies on a non-empty tls-creds set first.
+     * It's disabled otherwise.
+     */
+    return migrate_tls_creds();
 }
 
 typedef enum WriteTrackingSupport {
@@ -827,21 +864,21 @@  const char *migrate_tls_authz(void)
 {
     MigrationState *s = migrate_get_current();
 
-    return s->parameters.tls_authz;
+    return str_from_StrOrNull(s->parameters.tls_authz);
 }
 
 const char *migrate_tls_creds(void)
 {
     MigrationState *s = migrate_get_current();
 
-    return s->parameters.tls_creds;
+    return str_from_StrOrNull(s->parameters.tls_creds);
 }
 
 const char *migrate_tls_hostname(void)
 {
     MigrationState *s = migrate_get_current();
 
-    return s->parameters.tls_hostname;
+    return str_from_StrOrNull(s->parameters.tls_hostname);
 }
 
 uint64_t migrate_xbzrle_cache_size(void)
@@ -911,10 +948,9 @@  MigrationParameters *qmp_query_migrate_parameters(Error **errp)
     params->cpu_throttle_increment = s->parameters.cpu_throttle_increment;
     params->has_cpu_throttle_tailslow = true;
     params->cpu_throttle_tailslow = s->parameters.cpu_throttle_tailslow;
-    params->tls_creds = g_strdup(s->parameters.tls_creds);
-    params->tls_hostname = g_strdup(s->parameters.tls_hostname);
-    params->tls_authz = g_strdup(s->parameters.tls_authz ?
-                                 s->parameters.tls_authz : "");
+    params->tls_creds = QAPI_CLONE(StrOrNull, s->parameters.tls_creds);
+    params->tls_hostname = QAPI_CLONE(StrOrNull, s->parameters.tls_hostname);
+    params->tls_authz = QAPI_CLONE(StrOrNull, s->parameters.tls_authz);
     params->has_max_bandwidth = true;
     params->max_bandwidth = s->parameters.max_bandwidth;
     params->has_downtime_limit = true;
@@ -963,8 +999,9 @@  MigrationParameters *qmp_query_migrate_parameters(Error **errp)
 
 void migrate_params_init(MigrationParameters *params)
 {
-    params->tls_hostname = g_strdup("");
-    params->tls_creds = g_strdup("");
+    params->tls_hostname = StrOrNull_from_str("");
+    params->tls_creds = StrOrNull_from_str("");
+    params->tls_authz = StrOrNull_from_str("");
 
     /* Set has_* up only for parameter checks */
     params->has_compress_level = true;
@@ -1145,7 +1182,7 @@  bool migrate_params_check(MigrationParameters *params, Error **errp)
 #ifdef CONFIG_LINUX
     if (migrate_zero_copy_send() &&
         ((params->has_multifd_compression && params->multifd_compression) ||
-         (params->tls_creds && *params->tls_creds))) {
+         migrate_tls())) {
         error_setg(errp,
                    "Zero copy only available for non-compressed non-TLS multifd migration");
         return false;
@@ -1172,113 +1209,7 @@  bool migrate_params_check(MigrationParameters *params, Error **errp)
     return true;
 }
 
-static void migrate_params_test_apply(MigrateSetParameters *params,
-                                      MigrationParameters *dest)
-{
-    *dest = migrate_get_current()->parameters;
-
-    /* TODO use QAPI_CLONE() instead of duplicating it inline */
-
-    if (params->has_compress_level) {
-        dest->compress_level = params->compress_level;
-    }
-
-    if (params->has_compress_threads) {
-        dest->compress_threads = params->compress_threads;
-    }
-
-    if (params->has_compress_wait_thread) {
-        dest->compress_wait_thread = params->compress_wait_thread;
-    }
-
-    if (params->has_decompress_threads) {
-        dest->decompress_threads = params->decompress_threads;
-    }
-
-    if (params->has_throttle_trigger_threshold) {
-        dest->throttle_trigger_threshold = params->throttle_trigger_threshold;
-    }
-
-    if (params->has_cpu_throttle_initial) {
-        dest->cpu_throttle_initial = params->cpu_throttle_initial;
-    }
-
-    if (params->has_cpu_throttle_increment) {
-        dest->cpu_throttle_increment = params->cpu_throttle_increment;
-    }
-
-    if (params->has_cpu_throttle_tailslow) {
-        dest->cpu_throttle_tailslow = params->cpu_throttle_tailslow;
-    }
-
-    if (params->tls_creds) {
-        assert(params->tls_creds->type == QTYPE_QSTRING);
-        dest->tls_creds = params->tls_creds->u.s;
-    }
-
-    if (params->tls_hostname) {
-        assert(params->tls_hostname->type == QTYPE_QSTRING);
-        dest->tls_hostname = params->tls_hostname->u.s;
-    }
-
-    if (params->has_max_bandwidth) {
-        dest->max_bandwidth = params->max_bandwidth;
-    }
-
-    if (params->has_downtime_limit) {
-        dest->downtime_limit = params->downtime_limit;
-    }
-
-    if (params->has_x_checkpoint_delay) {
-        dest->x_checkpoint_delay = params->x_checkpoint_delay;
-    }
-
-    if (params->has_block_incremental) {
-        dest->block_incremental = params->block_incremental;
-    }
-    if (params->has_multifd_channels) {
-        dest->multifd_channels = params->multifd_channels;
-    }
-    if (params->has_multifd_compression) {
-        dest->multifd_compression = params->multifd_compression;
-    }
-    if (params->has_xbzrle_cache_size) {
-        dest->xbzrle_cache_size = params->xbzrle_cache_size;
-    }
-    if (params->has_max_postcopy_bandwidth) {
-        dest->max_postcopy_bandwidth = params->max_postcopy_bandwidth;
-    }
-    if (params->has_max_cpu_throttle) {
-        dest->max_cpu_throttle = params->max_cpu_throttle;
-    }
-    if (params->has_announce_initial) {
-        dest->announce_initial = params->announce_initial;
-    }
-    if (params->has_announce_max) {
-        dest->announce_max = params->announce_max;
-    }
-    if (params->has_announce_rounds) {
-        dest->announce_rounds = params->announce_rounds;
-    }
-    if (params->has_announce_step) {
-        dest->announce_step = params->announce_step;
-    }
-
-    if (params->has_block_bitmap_mapping) {
-        dest->has_block_bitmap_mapping = true;
-        dest->block_bitmap_mapping = params->block_bitmap_mapping;
-    }
-
-    if (params->has_x_vcpu_dirty_limit_period) {
-        dest->x_vcpu_dirty_limit_period =
-            params->x_vcpu_dirty_limit_period;
-    }
-    if (params->has_vcpu_dirty_limit) {
-        dest->vcpu_dirty_limit = params->vcpu_dirty_limit;
-    }
-}
-
-static void migrate_params_apply(MigrateSetParameters *params, Error **errp)
+static void migrate_params_apply(MigrationParameters *params, Error **errp)
 {
     MigrationState *s = migrate_get_current();
 
@@ -1317,21 +1248,18 @@  static void migrate_params_apply(MigrateSetParameters *params, Error **errp)
     }
 
     if (params->tls_creds) {
-        g_free(s->parameters.tls_creds);
-        assert(params->tls_creds->type == QTYPE_QSTRING);
-        s->parameters.tls_creds = g_strdup(params->tls_creds->u.s);
+        qapi_free_StrOrNull(s->parameters.tls_creds);
+        s->parameters.tls_creds = QAPI_CLONE(StrOrNull, params->tls_creds);
     }
 
     if (params->tls_hostname) {
-        g_free(s->parameters.tls_hostname);
-        assert(params->tls_hostname->type == QTYPE_QSTRING);
-        s->parameters.tls_hostname = g_strdup(params->tls_hostname->u.s);
+        qapi_free_StrOrNull(s->parameters.tls_hostname);
+        s->parameters.tls_hostname = QAPI_CLONE(StrOrNull, params->tls_hostname);
     }
 
     if (params->tls_authz) {
-        g_free(s->parameters.tls_authz);
-        assert(params->tls_authz->type == QTYPE_QSTRING);
-        s->parameters.tls_authz = g_strdup(params->tls_authz->u.s);
+        qapi_free_StrOrNull(s->parameters.tls_authz);
+        s->parameters.tls_authz = QAPI_CLONE(StrOrNull, params->tls_authz);
     }
 
     if (params->has_max_bandwidth) {
@@ -1404,33 +1332,9 @@  static void migrate_params_apply(MigrateSetParameters *params, Error **errp)
     }
 }
 
-void qmp_migrate_set_parameters(MigrateSetParameters *params, Error **errp)
+void qmp_migrate_set_parameters(MigrationParameters *params, Error **errp)
 {
-    MigrationParameters tmp;
-
-    /* TODO Rewrite "" to null instead for all three tls_* parameters */
-    if (params->tls_creds
-        && params->tls_creds->type == QTYPE_QNULL) {
-        qobject_unref(params->tls_creds->u.n);
-        params->tls_creds->type = QTYPE_QSTRING;
-        params->tls_creds->u.s = strdup("");
-    }
-    if (params->tls_hostname
-        && params->tls_hostname->type == QTYPE_QNULL) {
-        qobject_unref(params->tls_hostname->u.n);
-        params->tls_hostname->type = QTYPE_QSTRING;
-        params->tls_hostname->u.s = strdup("");
-    }
-    if (params->tls_authz
-        && params->tls_authz->type == QTYPE_QNULL) {
-        qobject_unref(params->tls_authz->u.n);
-        params->tls_authz->type = QTYPE_QSTRING;
-        params->tls_authz->u.s = strdup("");
-    }
-
-    migrate_params_test_apply(params, &tmp);
-
-    if (!migrate_params_check(&tmp, errp)) {
+    if (!migrate_params_check(params, errp)) {
         /* Invalid parameter */
         return;
     }
diff --git a/migration/tls.c b/migration/tls.c
index fa03d9136c..c2ed4ff557 100644
--- a/migration/tls.c
+++ b/migration/tls.c
@@ -126,7 +126,8 @@  QIOChannelTLS *migration_tls_client_create(QIOChannel *ioc,
     }
 
     const char *tls_hostname = migrate_tls_hostname();
-    if (tls_hostname && *tls_hostname) {
+    /* If tls_hostname, then it must be non-empty string already */
+    if (tls_hostname) {
         hostname = tls_hostname;
     }