diff mbox series

[v2,1/2] qdev-monitor: avoid QemuOpts in QMP device_add

Message ID 20240801140552.1021693-2-stefanha@redhat.com
State New
Headers show
Series qdev-monitor: avoid QemuOpts in QMP device_add | expand

Commit Message

Stefan Hajnoczi Aug. 1, 2024, 2:05 p.m. UTC
The QMP device_add monitor command converts the QDict arguments to
QemuOpts and then back again to QDict. This process only supports scalar
types. Device properties like virtio-blk-pci's iothread-vq-mapping (an
array of objects) are silently dropped by qemu_opts_from_qdict() during
the QemuOpts conversion even though QAPI is capable of validating them.
As a result, hotplugging virtio-blk-pci devices with the
iothread-vq-mapping property does not work as expected (the property is
ignored).

Get rid of the QemuOpts conversion in qmp_device_add() and call
qdev_device_add_from_qdict() with from_json=true. Using the QMP
command's QDict arguments directly allows non-scalar properties.

The HMP is also adjusted since qmp_device_add()'s now expects properly
typed JSON arguments and cannot be used from HMP anymore. Move the code
that was previously in qmp_device_add() (with QemuOpts conversion and
from_json=false) into hmp_device_add() so that its behavior is
unchanged.

This patch changes the behavior of QMP device_add but not HMP
device_add. QMP clients that sent incorrectly typed device_add QMP
commands no longer work. This is a breaking change but clients should be
using the correct types already. See the netdev_add QAPIfication in
commit db2a380c8457 for similar reasoning and object-add in commit
9151e59a8b6e. Unlike those commits, we continue to rely on 'gen': false
for the time being.

Move the drain_call_rcu() invocation into qdev_device_add_from_qdict()
so all callers benefit from it automatically. This avoids code
duplication.

Markus helped me figure this out and even provided a draft patch. The
code ended up very close to what he suggested.

Suggested-by: Markus Armbruster <armbru@redhat.com>
Cc: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
 system/qdev-monitor.c | 56 ++++++++++++++++++++++---------------------
 1 file changed, 29 insertions(+), 27 deletions(-)

Comments

Markus Armbruster Aug. 2, 2024, 8:01 a.m. UTC | #1
Stefan Hajnoczi <stefanha@redhat.com> writes:

> The QMP device_add monitor command converts the QDict arguments to
> QemuOpts and then back again to QDict. This process only supports scalar
> types. Device properties like virtio-blk-pci's iothread-vq-mapping (an
> array of objects) are silently dropped by qemu_opts_from_qdict() during
> the QemuOpts conversion even though QAPI is capable of validating them.
> As a result, hotplugging virtio-blk-pci devices with the
> iothread-vq-mapping property does not work as expected (the property is
> ignored).
>
> Get rid of the QemuOpts conversion in qmp_device_add() and call
> qdev_device_add_from_qdict() with from_json=true. Using the QMP
> command's QDict arguments directly allows non-scalar properties.
>
> The HMP is also adjusted since qmp_device_add()'s now expects properly
> typed JSON arguments and cannot be used from HMP anymore. Move the code
> that was previously in qmp_device_add() (with QemuOpts conversion and
> from_json=false) into hmp_device_add() so that its behavior is
> unchanged.
>
> This patch changes the behavior of QMP device_add but not HMP
> device_add. QMP clients that sent incorrectly typed device_add QMP
> commands no longer work. This is a breaking change but clients should be
> using the correct types already. See the netdev_add QAPIfication in
> commit db2a380c8457 for similar reasoning and object-add in commit
> 9151e59a8b6e. Unlike those commits, we continue to rely on 'gen': false
> for the time being.
>
> Move the drain_call_rcu() invocation into qdev_device_add_from_qdict()
> so all callers benefit from it automatically. This avoids code
> duplication.
>
> Markus helped me figure this out and even provided a draft patch. The
> code ended up very close to what he suggested.
>
> Suggested-by: Markus Armbruster <armbru@redhat.com>
> Cc: Daniel P. Berrangé <berrange@redhat.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  system/qdev-monitor.c | 56 ++++++++++++++++++++++---------------------
>  1 file changed, 29 insertions(+), 27 deletions(-)
>
> diff --git a/system/qdev-monitor.c b/system/qdev-monitor.c
> index 6af6ef7d66..8a756b1a91 100644
> --- a/system/qdev-monitor.c
> +++ b/system/qdev-monitor.c
> @@ -725,6 +725,17 @@ err_del_dev:
>      if (dev) {
>          object_unparent(OBJECT(dev));
>          object_unref(OBJECT(dev));
> +
> +        /*
> +         * Drain all pending RCU callbacks. This is done because
> +         * some bus related operations can delay a device removal
> +         * (in this case this can happen if device is added and then
> +         * removed due to a configuration error)
> +         * to a RCU callback, but user might expect that this interface
> +         * will finish its job completely once qmp command returns result
> +         * to the user
> +         */
> +        drain_call_rcu();
>      }
>      return NULL;
>  }

Moving this from qmp_device_add() adds RCU draining to call chains not
going through qmp_device_add().

Can adding it hurt?  I guess it can't.

Can it fix bugs?  I don't know.

Let's review the callers of qdev_device_add_from_qdict():

* qdev_device_add()

  - called from qmp_device_add()

    No change.

  - called from device_init_func() called from qemu_create_cli_devices()

    See below.

  - called from usbback_portid_add() called from usbback_process_port()
    called from usbback_backend_changed()

    · called from usbback_init()

    · called as XenDevOps method backend_changed()

    This is Xen.  We now drain pending RCU callbacks.  Impact?  Beats
    me.

* qemu_create_cli_devices() called from qmp_x_exit_preconfig()

  - as QMP command with -preconfig, phase must be
    PHASE_MACHINE_INITIALIZED

  - called from qemu_init() without -preconfig

  We now drain pending RCU callbacks.  Can any be pending at this
  early point?  If not, the change is a no-op.

* failover_add_primary() called from virtio_net_set_features() called as
  VirtioDeviceClass method set_features()

  This is virtio-net failover.  We now drain pending RCU callbacks.
  Impact?  Beats me.

My gut feeling is "improvement, possibly even a bug fix".  It deserves
its own commit, doesn't it?

> @@ -849,34 +860,10 @@ void hmp_info_qdm(Monitor *mon, const QDict *qdict)
>  
>  void qmp_device_add(QDict *qdict, QObject **ret_data, Error **errp)
>  {
> -    QemuOpts *opts;
>      DeviceState *dev;
>  
> -    opts = qemu_opts_from_qdict(qemu_find_opts("device"), qdict, errp);
> -    if (!opts) {
> -        return;
> -    }
> -    if (!monitor_cur_is_qmp() && qdev_device_help(opts)) {
> -        qemu_opts_del(opts);
> -        return;
> -    }
> -    dev = qdev_device_add(opts, errp);
> -    if (!dev) {
> -        /*
> -         * Drain all pending RCU callbacks. This is done because
> -         * some bus related operations can delay a device removal
> -         * (in this case this can happen if device is added and then
> -         * removed due to a configuration error)
> -         * to a RCU callback, but user might expect that this interface
> -         * will finish its job completely once qmp command returns result
> -         * to the user
> -         */
> -        drain_call_rcu();
> -
> -        qemu_opts_del(opts);
> -        return;
> -    }
> -    object_unref(OBJECT(dev));
> +    dev = qdev_device_add_from_qdict(qdict, true, errp);
> +    object_unref(dev);
>  }
>  
>  static DeviceState *find_device_state(const char *id, Error **errp)
> @@ -967,8 +954,23 @@ void qmp_device_del(const char *id, Error **errp)
>  void hmp_device_add(Monitor *mon, const QDict *qdict)
>  {
>      Error *err = NULL;
> +    QemuOpts *opts;
> +    DeviceState *dev;
>  
> -    qmp_device_add((QDict *)qdict, NULL, &err);
> +    opts = qemu_opts_from_qdict(qemu_find_opts("device"), qdict, &err);
> +    if (!opts) {
> +        goto out;
> +    }
> +    if (qdev_device_help(opts)) {
> +        qemu_opts_del(opts);
> +        return;
> +    }
> +    dev = qdev_device_add(opts, &err);
> +    if (!dev) {
> +        qemu_opts_del(opts);
> +    }
> +    object_unref(dev);
> +out:
>      hmp_handle_error(mon, err);
>  }

Remainder looks good to me.
Stefan Hajnoczi Aug. 12, 2024, 6:07 p.m. UTC | #2
On Fri, Aug 02, 2024 at 10:01:20AM +0200, Markus Armbruster wrote:
> Stefan Hajnoczi <stefanha@redhat.com> writes:
> 
> > The QMP device_add monitor command converts the QDict arguments to
> > QemuOpts and then back again to QDict. This process only supports scalar
> > types. Device properties like virtio-blk-pci's iothread-vq-mapping (an
> > array of objects) are silently dropped by qemu_opts_from_qdict() during
> > the QemuOpts conversion even though QAPI is capable of validating them.
> > As a result, hotplugging virtio-blk-pci devices with the
> > iothread-vq-mapping property does not work as expected (the property is
> > ignored).
> >
> > Get rid of the QemuOpts conversion in qmp_device_add() and call
> > qdev_device_add_from_qdict() with from_json=true. Using the QMP
> > command's QDict arguments directly allows non-scalar properties.
> >
> > The HMP is also adjusted since qmp_device_add()'s now expects properly
> > typed JSON arguments and cannot be used from HMP anymore. Move the code
> > that was previously in qmp_device_add() (with QemuOpts conversion and
> > from_json=false) into hmp_device_add() so that its behavior is
> > unchanged.
> >
> > This patch changes the behavior of QMP device_add but not HMP
> > device_add. QMP clients that sent incorrectly typed device_add QMP
> > commands no longer work. This is a breaking change but clients should be
> > using the correct types already. See the netdev_add QAPIfication in
> > commit db2a380c8457 for similar reasoning and object-add in commit
> > 9151e59a8b6e. Unlike those commits, we continue to rely on 'gen': false
> > for the time being.
> >
> > Move the drain_call_rcu() invocation into qdev_device_add_from_qdict()
> > so all callers benefit from it automatically. This avoids code
> > duplication.
> >
> > Markus helped me figure this out and even provided a draft patch. The
> > code ended up very close to what he suggested.
> >
> > Suggested-by: Markus Armbruster <armbru@redhat.com>
> > Cc: Daniel P. Berrangé <berrange@redhat.com>
> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > ---
> >  system/qdev-monitor.c | 56 ++++++++++++++++++++++---------------------
> >  1 file changed, 29 insertions(+), 27 deletions(-)
> >
> > diff --git a/system/qdev-monitor.c b/system/qdev-monitor.c
> > index 6af6ef7d66..8a756b1a91 100644
> > --- a/system/qdev-monitor.c
> > +++ b/system/qdev-monitor.c
> > @@ -725,6 +725,17 @@ err_del_dev:
> >      if (dev) {
> >          object_unparent(OBJECT(dev));
> >          object_unref(OBJECT(dev));
> > +
> > +        /*
> > +         * Drain all pending RCU callbacks. This is done because
> > +         * some bus related operations can delay a device removal
> > +         * (in this case this can happen if device is added and then
> > +         * removed due to a configuration error)
> > +         * to a RCU callback, but user might expect that this interface
> > +         * will finish its job completely once qmp command returns result
> > +         * to the user
> > +         */
> > +        drain_call_rcu();
> >      }
> >      return NULL;
> >  }
> 
> Moving this from qmp_device_add() adds RCU draining to call chains not
> going through qmp_device_add().
> 
> Can adding it hurt?  I guess it can't.
> 
> Can it fix bugs?  I don't know.
> 
> Let's review the callers of qdev_device_add_from_qdict():
> 
> * qdev_device_add()
> 
>   - called from qmp_device_add()
> 
>     No change.
> 
>   - called from device_init_func() called from qemu_create_cli_devices()
> 
>     See below.
> 
>   - called from usbback_portid_add() called from usbback_process_port()
>     called from usbback_backend_changed()
> 
>     · called from usbback_init()
> 
>     · called as XenDevOps method backend_changed()
> 
>     This is Xen.  We now drain pending RCU callbacks.  Impact?  Beats
>     me.
> 
> * qemu_create_cli_devices() called from qmp_x_exit_preconfig()
> 
>   - as QMP command with -preconfig, phase must be
>     PHASE_MACHINE_INITIALIZED
> 
>   - called from qemu_init() without -preconfig
> 
>   We now drain pending RCU callbacks.  Can any be pending at this
>   early point?  If not, the change is a no-op.
> 
> * failover_add_primary() called from virtio_net_set_features() called as
>   VirtioDeviceClass method set_features()
> 
>   This is virtio-net failover.  We now drain pending RCU callbacks.
>   Impact?  Beats me.
> 
> My gut feeling is "improvement, possibly even a bug fix".  It deserves
> its own commit, doesn't it?

I thought I'd checked it, but now that I have reviewed the functions you
looked at, I no longer think it's safe to move drain_call_rcu() into
qdev_device_add_from_qdict(). drain_call_rcu() drops the BQL and must
not be called from non-reentrant multi-threaded code paths.

For example, if drain_call_rcu() is called from device emulation code
then another vCPU thread could execute that same device's emulation code
while the BQL is dropped. That's usually unsafe because most device
emulation code only expects to run in one thread at any given time
thanks to the BQL.

Here are the cases:
- qemu_create_cli_devices: safe because this happens before device
  emulation runs in vCPU threads
- Xen: safe because there are no vCPU threads when QEMU just provides
  device emulation in a normal Xen system. But I'm not sure if usbback
  can also be used in a QEMU's Xen HVM guest mode where KVM handles
  vCPUs, it might be a problem there.
- virtio_net_set_features() -> failover_add_primary() is unsafe. Another
  vCPU thread could enter virtio-net emulation code while the thread
  running virtio_net_set_features() drops the lock.

I think it's better to duplicate drain_call_rcu() into hmp_device_add()
than to risk breaking the other code paths.

Stefan
Markus Armbruster Aug. 29, 2024, 2:09 p.m. UTC | #3
Stefan Hajnoczi <stefanha@redhat.com> writes:

> On Fri, Aug 02, 2024 at 10:01:20AM +0200, Markus Armbruster wrote:
>> Stefan Hajnoczi <stefanha@redhat.com> writes:
>> 
>> > The QMP device_add monitor command converts the QDict arguments to
>> > QemuOpts and then back again to QDict. This process only supports scalar
>> > types. Device properties like virtio-blk-pci's iothread-vq-mapping (an
>> > array of objects) are silently dropped by qemu_opts_from_qdict() during
>> > the QemuOpts conversion even though QAPI is capable of validating them.
>> > As a result, hotplugging virtio-blk-pci devices with the
>> > iothread-vq-mapping property does not work as expected (the property is
>> > ignored).
>> >
>> > Get rid of the QemuOpts conversion in qmp_device_add() and call
>> > qdev_device_add_from_qdict() with from_json=true. Using the QMP
>> > command's QDict arguments directly allows non-scalar properties.
>> >
>> > The HMP is also adjusted since qmp_device_add()'s now expects properly
>> > typed JSON arguments and cannot be used from HMP anymore. Move the code
>> > that was previously in qmp_device_add() (with QemuOpts conversion and
>> > from_json=false) into hmp_device_add() so that its behavior is
>> > unchanged.
>> >
>> > This patch changes the behavior of QMP device_add but not HMP
>> > device_add. QMP clients that sent incorrectly typed device_add QMP
>> > commands no longer work. This is a breaking change but clients should be
>> > using the correct types already. See the netdev_add QAPIfication in
>> > commit db2a380c8457 for similar reasoning and object-add in commit
>> > 9151e59a8b6e. Unlike those commits, we continue to rely on 'gen': false
>> > for the time being.
>> >
>> > Move the drain_call_rcu() invocation into qdev_device_add_from_qdict()
>> > so all callers benefit from it automatically. This avoids code
>> > duplication.
>> >
>> > Markus helped me figure this out and even provided a draft patch. The
>> > code ended up very close to what he suggested.
>> >
>> > Suggested-by: Markus Armbruster <armbru@redhat.com>
>> > Cc: Daniel P. Berrangé <berrange@redhat.com>
>> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
>> > ---
>> >  system/qdev-monitor.c | 56 ++++++++++++++++++++++---------------------
>> >  1 file changed, 29 insertions(+), 27 deletions(-)
>> >
>> > diff --git a/system/qdev-monitor.c b/system/qdev-monitor.c
>> > index 6af6ef7d66..8a756b1a91 100644
>> > --- a/system/qdev-monitor.c
>> > +++ b/system/qdev-monitor.c
>> > @@ -725,6 +725,17 @@ err_del_dev:
>> >      if (dev) {
>> >          object_unparent(OBJECT(dev));
>> >          object_unref(OBJECT(dev));
>> > +
>> > +        /*
>> > +         * Drain all pending RCU callbacks. This is done because
>> > +         * some bus related operations can delay a device removal
>> > +         * (in this case this can happen if device is added and then
>> > +         * removed due to a configuration error)
>> > +         * to a RCU callback, but user might expect that this interface
>> > +         * will finish its job completely once qmp command returns result
>> > +         * to the user
>> > +         */
>> > +        drain_call_rcu();
>> >      }
>> >      return NULL;
>> >  }
>> 
>> Moving this from qmp_device_add() adds RCU draining to call chains not
>> going through qmp_device_add().
>> 
>> Can adding it hurt?  I guess it can't.
>> 
>> Can it fix bugs?  I don't know.
>> 
>> Let's review the callers of qdev_device_add_from_qdict():
>> 
>> * qdev_device_add()
>> 
>>   - called from qmp_device_add()
>> 
>>     No change.
>> 
>>   - called from device_init_func() called from qemu_create_cli_devices()
>> 
>>     See below.
>> 
>>   - called from usbback_portid_add() called from usbback_process_port()
>>     called from usbback_backend_changed()
>> 
>>     · called from usbback_init()
>> 
>>     · called as XenDevOps method backend_changed()
>> 
>>     This is Xen.  We now drain pending RCU callbacks.  Impact?  Beats
>>     me.
>> 
>> * qemu_create_cli_devices() called from qmp_x_exit_preconfig()
>> 
>>   - as QMP command with -preconfig, phase must be
>>     PHASE_MACHINE_INITIALIZED
>> 
>>   - called from qemu_init() without -preconfig
>> 
>>   We now drain pending RCU callbacks.  Can any be pending at this
>>   early point?  If not, the change is a no-op.
>> 
>> * failover_add_primary() called from virtio_net_set_features() called as
>>   VirtioDeviceClass method set_features()
>> 
>>   This is virtio-net failover.  We now drain pending RCU callbacks.
>>   Impact?  Beats me.
>> 
>> My gut feeling is "improvement, possibly even a bug fix".  It deserves
>> its own commit, doesn't it?
>
> I thought I'd checked it, but now that I have reviewed the functions you
> looked at, I no longer think it's safe to move drain_call_rcu() into
> qdev_device_add_from_qdict(). drain_call_rcu() drops the BQL and must
> not be called from non-reentrant multi-threaded code paths.
>
> For example, if drain_call_rcu() is called from device emulation code
> then another vCPU thread could execute that same device's emulation code
> while the BQL is dropped. That's usually unsafe because most device
> emulation code only expects to run in one thread at any given time
> thanks to the BQL.
>
> Here are the cases:
> - qemu_create_cli_devices: safe because this happens before device
>   emulation runs in vCPU threads
> - Xen: safe because there are no vCPU threads when QEMU just provides
>   device emulation in a normal Xen system. But I'm not sure if usbback
>   can also be used in a QEMU's Xen HVM guest mode where KVM handles
>   vCPUs, it might be a problem there.
> - virtio_net_set_features() -> failover_add_primary() is unsafe. Another
>   vCPU thread could enter virtio-net emulation code while the thread
>   running virtio_net_set_features() drops the lock.

I see.

Can you make an argument why the reason for RCU draining does not apply
in failover_add_primary()?

If it doesn't apply, capturing the argument in a comment would be nice.

If it does apply, a FIXME comment seems in order.

Also, RCU draining / BQL dropping vs. non-draining/dropping behavior of
non-static functions should be documented in function comments.

All this can be done on top.

> I think it's better to duplicate drain_call_rcu() into hmp_device_add()
> than to risk breaking the other code paths.

Makes sense.
diff mbox series

Patch

diff --git a/system/qdev-monitor.c b/system/qdev-monitor.c
index 6af6ef7d66..8a756b1a91 100644
--- a/system/qdev-monitor.c
+++ b/system/qdev-monitor.c
@@ -725,6 +725,17 @@  err_del_dev:
     if (dev) {
         object_unparent(OBJECT(dev));
         object_unref(OBJECT(dev));
+
+        /*
+         * Drain all pending RCU callbacks. This is done because
+         * some bus related operations can delay a device removal
+         * (in this case this can happen if device is added and then
+         * removed due to a configuration error)
+         * to a RCU callback, but user might expect that this interface
+         * will finish its job completely once qmp command returns result
+         * to the user
+         */
+        drain_call_rcu();
     }
     return NULL;
 }
@@ -849,34 +860,10 @@  void hmp_info_qdm(Monitor *mon, const QDict *qdict)
 
 void qmp_device_add(QDict *qdict, QObject **ret_data, Error **errp)
 {
-    QemuOpts *opts;
     DeviceState *dev;
 
-    opts = qemu_opts_from_qdict(qemu_find_opts("device"), qdict, errp);
-    if (!opts) {
-        return;
-    }
-    if (!monitor_cur_is_qmp() && qdev_device_help(opts)) {
-        qemu_opts_del(opts);
-        return;
-    }
-    dev = qdev_device_add(opts, errp);
-    if (!dev) {
-        /*
-         * Drain all pending RCU callbacks. This is done because
-         * some bus related operations can delay a device removal
-         * (in this case this can happen if device is added and then
-         * removed due to a configuration error)
-         * to a RCU callback, but user might expect that this interface
-         * will finish its job completely once qmp command returns result
-         * to the user
-         */
-        drain_call_rcu();
-
-        qemu_opts_del(opts);
-        return;
-    }
-    object_unref(OBJECT(dev));
+    dev = qdev_device_add_from_qdict(qdict, true, errp);
+    object_unref(dev);
 }
 
 static DeviceState *find_device_state(const char *id, Error **errp)
@@ -967,8 +954,23 @@  void qmp_device_del(const char *id, Error **errp)
 void hmp_device_add(Monitor *mon, const QDict *qdict)
 {
     Error *err = NULL;
+    QemuOpts *opts;
+    DeviceState *dev;
 
-    qmp_device_add((QDict *)qdict, NULL, &err);
+    opts = qemu_opts_from_qdict(qemu_find_opts("device"), qdict, &err);
+    if (!opts) {
+        goto out;
+    }
+    if (qdev_device_help(opts)) {
+        qemu_opts_del(opts);
+        return;
+    }
+    dev = qdev_device_add(opts, &err);
+    if (!dev) {
+        qemu_opts_del(opts);
+    }
+    object_unref(dev);
+out:
     hmp_handle_error(mon, err);
 }