diff mbox series

[2/2] failover: don't allow to migrate a paused VM that needs PCI unplug

Message ID 20210929144311.1168561-3-lvivier@redhat.com
State New
Headers show
Series failover: don't allow to migrate a paused VM that needs PCI unplug | expand

Commit Message

Laurent Vivier Sept. 29, 2021, 2:43 p.m. UTC
As the guest OS is paused, we will never receive the unplug event
from the kernel and the migration cannot continue.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
---
 hw/net/virtio-net.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

Comments

Juan Quintela Nov. 2, 2021, 9:04 a.m. UTC | #1
Laurent Vivier <lvivier@redhat.com> wrote:
> As the guest OS is paused, we will never receive the unplug event
> from the kernel and the migration cannot continue.
>
> Signed-off-by: Laurent Vivier <lvivier@redhat.com>

Reviewed-by: Juan Quintela <quintela@redhat.com>

queued.

I can't think of a better solution.
Michael S. Tsirkin Nov. 2, 2021, 3:04 p.m. UTC | #2
On Wed, Sep 29, 2021 at 04:43:11PM +0200, Laurent Vivier wrote:
> As the guest OS is paused, we will never receive the unplug event
> from the kernel and the migration cannot continue.
> 
> Signed-off-by: Laurent Vivier <lvivier@redhat.com>

Well ... what if user previously did

pause
start migration
unpause

we are breaking it now for no good reason.

Further, how about

start migration
pause

are we going to break this too? by failing pause?


> ---
>  hw/net/virtio-net.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
> index f205331dcf8c..e54b6c8cd86c 100644
> --- a/hw/net/virtio-net.c
> +++ b/hw/net/virtio-net.c
> @@ -37,8 +37,10 @@
>  #include "qapi/qapi-events-migration.h"
>  #include "hw/virtio/virtio-access.h"
>  #include "migration/misc.h"
> +#include "migration/migration.h"
>  #include "standard-headers/linux/ethtool.h"
>  #include "sysemu/sysemu.h"
> +#include "sysemu/runstate.h"
>  #include "trace.h"
>  #include "monitor/qdev.h"
>  #include "hw/pci/pci.h"
> @@ -3279,7 +3281,13 @@ static void virtio_net_handle_migration_primary(VirtIONet *n, MigrationState *s)
>      should_be_hidden = qatomic_read(&n->failover_primary_hidden);
>  
>      if (migration_in_setup(s) && !should_be_hidden) {
> -        if (failover_unplug_primary(n, dev)) {
> +        if (!runstate_is_running()) {
> +            Error *err = NULL;
> +            error_setg(&err,
> +                       "cannot unplug primary device while VM is paused");
> +            migration_cancel(err);
> +            error_free(err);
> +        } else if (failover_unplug_primary(n, dev)) {
>              vmstate_unregister(VMSTATE_IF(dev), qdev_get_vmsd(dev), dev);
>              qapi_event_send_unplug_primary(dev->id);
>              qatomic_set(&n->failover_primary_hidden, true);
> -- 
> 2.31.1
Juan Quintela Nov. 2, 2021, 3:28 p.m. UTC | #3
"Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Wed, Sep 29, 2021 at 04:43:11PM +0200, Laurent Vivier wrote:
>> As the guest OS is paused, we will never receive the unplug event
>> from the kernel and the migration cannot continue.
>> 
>> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
>
> Well ... what if user previously did
>
> pause
> start migration
> unpause
>
> we are breaking it now for no good reason.

No.  we are canceling the migration.  Migration can not finish on that
state.  We are inside the test:

      if (migration_in_setup(s) && !should_be_hidden) {

If you don't have any really weird setup[1], migration setup just takes
milliseconds (low units for small guest, and 200-300ms for really huge
ones).

So I still think this is right.


1: Weird here means things like RDMA, locking all the memory of one
   guest can take forever.  To get an idea about this, until we
   introduce RDMA, we didn't meassured the setup stage time, because it
   was so small that it didn't matter at all.

Unplug from guest is other operation that can take quite a long time,
because it depends on guest cooperation.

> Further, how about
>
> start migration
> pause
>
> are we going to break this too? by failing pause?

I haven't thougth about this one, but it shouldn't matter (famous last
words), beacuse there are to cases:

- migration has started and unplug has already finished, no problem.

- migration has started but we haven't yet arrived to
  virtio_net_handle_migration_primary().  We are paused, and we give the
  guest a good error message about why are we failing.  notice that
  migration can't finish anyways, it would stuck there forever waiting
  for the (stopped guest to unplug the device).

So the only case that I can see that *could* matter is:

- start migration
- pause the guest
   this implies pausing the migration
- unpause
   at this point we can continue the migration

do we really care about this scenary?

I think not, because the migration has advanced so few, that starting
from zero would be the best option anyways.

Later, Juan.

PD1: No, I am not sure what happens if you run "pause" after the event
     to guest is sent, but before that the guest finish the unplug (I
     guess it would stall).  But in this case, we are doing something at
     least fishy.  On the other hand, we know that "pause; migration"
     will never really work.

PD2: Perhaps we could "invet" another state that means:
IN_SETUP_AND_WE_CANT_BE_PAUSED, and change it between we ask for the
device to unplug, and that it unplugs.  But it looks really complicated.
Laurent Vivier Nov. 2, 2021, 5:06 p.m. UTC | #4
On 02/11/2021 16:04, Michael S. Tsirkin wrote:
> On Wed, Sep 29, 2021 at 04:43:11PM +0200, Laurent Vivier wrote:
>> As the guest OS is paused, we will never receive the unplug event
>> from the kernel and the migration cannot continue.
>>
>> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
> 
> Well ... what if user previously did
> 
> pause
> start migration
> unpause
> 
> we are breaking it now for no good reason.
> 
> Further, how about
> 
> start migration
> pause
> 
> are we going to break this too? by failing pause?
> 
> 

TL;DR: This patch only prevents to migrate a VFIO device as failover allows to start a 
migration with a VFIO device plugged in.

Long Story:

* before this patch:

- pause and start migration and unpause-> fails if we unpause too late because we migrate 
a VFIO device, works otherwise
- start migration and pause before we unplug the card -> hangs forever
- start migration and pause after we unplug the card -> it works fine

* After this patch:

- pause and start migration and unpause-> fails if we unpause too late because of the new 
error checking, works otherwise
- start migration and pause before we unplug the card -> fails because of the new error 
checking
- start migration and pause after we unplug the card -> it works fine

Thanks,
Laurent
Michael S. Tsirkin Nov. 2, 2021, 5:08 p.m. UTC | #5
On Tue, Nov 02, 2021 at 06:06:51PM +0100, Laurent Vivier wrote:
> On 02/11/2021 16:04, Michael S. Tsirkin wrote:
> > On Wed, Sep 29, 2021 at 04:43:11PM +0200, Laurent Vivier wrote:
> > > As the guest OS is paused, we will never receive the unplug event
> > > from the kernel and the migration cannot continue.
> > > 
> > > Signed-off-by: Laurent Vivier <lvivier@redhat.com>
> > 
> > Well ... what if user previously did
> > 
> > pause
> > start migration
> > unpause
> > 
> > we are breaking it now for no good reason.
> > 
> > Further, how about
> > 
> > start migration
> > pause
> > 
> > are we going to break this too? by failing pause?
> > 
> > 
> 
> TL;DR: This patch only prevents to migrate a VFIO device as failover allows
> to start a migration with a VFIO device plugged in.
> 
> Long Story:
> 
> * before this patch:
> 
> - pause and start migration and unpause-> fails if we unpause too late
> because we migrate a VFIO device, works otherwise


confused about this one. can you explain pls?

> - start migration and pause before we unplug the card -> hangs forever
> - start migration and pause after we unplug the card -> it works fine
> 
> * After this patch:
> 
> - pause and start migration and unpause-> fails if we unpause too late
> because of the new error checking, works otherwise
> - start migration and pause before we unplug the card -> fails because of
> the new error checking
> - start migration and pause after we unplug the card -> it works fine
> 
> Thanks,
> Laurent
>
Juan Quintela Nov. 2, 2021, 5:26 p.m. UTC | #6
"Michael S. Tsirkin" <mst@redhat.com> wrote:
> On Tue, Nov 02, 2021 at 06:06:51PM +0100, Laurent Vivier wrote:
>> On 02/11/2021 16:04, Michael S. Tsirkin wrote:
>> > On Wed, Sep 29, 2021 at 04:43:11PM +0200, Laurent Vivier wrote:
>> > > As the guest OS is paused, we will never receive the unplug event
>> > > from the kernel and the migration cannot continue.
>> > > 
>> > > Signed-off-by: Laurent Vivier <lvivier@redhat.com>
>> > 
>> > Well ... what if user previously did
>> > 
>> > pause
>> > start migration
>> > unpause
>> > 
>> > we are breaking it now for no good reason.
>> > 
>> > Further, how about
>> > 
>> > start migration
>> > pause
>> > 
>> > are we going to break this too? by failing pause?
>> > 
>> > 
>> 
>> TL;DR: This patch only prevents to migrate a VFIO device as failover allows
>> to start a migration with a VFIO device plugged in.
>> 
>> Long Story:
>> 
>> * before this patch:
>> 
>> - pause and start migration and unpause-> fails if we unpause too late
>> because we migrate a VFIO device, works otherwise
>
>
> confused about this one. can you explain pls?

Pause the guest.
Start migration.

     if (migration_in_setup(s) && !should_be_hidden) {
        if (failover_unplug_primary(n, dev)) {
             vmstate_unregister(VMSTATE_IF(dev), qdev_get_vmsd(dev), dev);
             qapi_event_send_unplug_primary(dev->id);

We send the unplug request, but the guest is paused.

             qatomic_set(&n->failover_primary_hidden, true);

callbacks, callbacks, callbacks.

        while (s->state == MIGRATION_STATUS_WAIT_UNPLUG &&
               qemu_savevm_state_guest_unplug_pending()) {
            qemu_sem_timedwait(&s->wait_unplug_sem, 250);
        }

And we are not able to get out of that loop, because we never get to the
point where the guest send the unplug command.

So, the only other thing that I can think of is putting one timeout
there, but how much?  That is a good question.

Later, Juan.
Laurent Vivier Nov. 2, 2021, 5:43 p.m. UTC | #7
On 02/11/2021 18:08, Michael S. Tsirkin wrote:
> On Tue, Nov 02, 2021 at 06:06:51PM +0100, Laurent Vivier wrote:
>> On 02/11/2021 16:04, Michael S. Tsirkin wrote:
>>> On Wed, Sep 29, 2021 at 04:43:11PM +0200, Laurent Vivier wrote:
>>>> As the guest OS is paused, we will never receive the unplug event
>>>> from the kernel and the migration cannot continue.
>>>>
>>>> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
>>>
>>> Well ... what if user previously did
>>>
>>> pause
>>> start migration
>>> unpause
>>>
>>> we are breaking it now for no good reason.
>>>
>>> Further, how about
>>>
>>> start migration
>>> pause
>>>
>>> are we going to break this too? by failing pause?
>>>
>>>
>>
>> TL;DR: This patch only prevents to migrate a VFIO device as failover allows
>> to start a migration with a VFIO device plugged in.
>>
>> Long Story:
>>
>> * before this patch:
>>
>> - pause and start migration and unpause-> fails if we unpause too late
>> because we migrate a VFIO device, works otherwise
> 
> 
> confused about this one. can you explain pls?
> 

Sorry, I've been confused by another bug: with ACPI unplug, we don't wait the unplug, and 
so if the machine is paused the VFIO is "migrated" and we have an error message on the 
destination side as the card cannot be plugged back.

but with PCIe native hotplug ("-global ICH9-LPC.acpi-pci-hotplug-with-bridge-support=off") 
we have:

before this patch:

if we pause and then start the migration, migration hangs until we unpause the VM. But the 
migration can hangs forever if the VM is never unpaused. Normally migration of a paused VM 
should not hang.

after this patch:

if we pause and then start the migration, the migration fails because of the new error.

Remember that the migration of a VM with a VFIO device normally fails, so a user should 
not try to migrate a VM with a VFIO device except if he knows he is using failover, and in 
this case he should know he must not pause the VM.

Thanks,
Laurent
Laurent Vivier Nov. 2, 2021, 5:47 p.m. UTC | #8
On 02/11/2021 18:26, Juan Quintela wrote:
> "Michael S. Tsirkin" <mst@redhat.com> wrote:
>> On Tue, Nov 02, 2021 at 06:06:51PM +0100, Laurent Vivier wrote:
>>> On 02/11/2021 16:04, Michael S. Tsirkin wrote:
>>>> On Wed, Sep 29, 2021 at 04:43:11PM +0200, Laurent Vivier wrote:
>>>>> As the guest OS is paused, we will never receive the unplug event
>>>>> from the kernel and the migration cannot continue.
>>>>>
>>>>> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
>>>>
>>>> Well ... what if user previously did
>>>>
>>>> pause
>>>> start migration
>>>> unpause
>>>>
>>>> we are breaking it now for no good reason.
>>>>
>>>> Further, how about
>>>>
>>>> start migration
>>>> pause
>>>>
>>>> are we going to break this too? by failing pause?
>>>>
>>>>
>>>
>>> TL;DR: This patch only prevents to migrate a VFIO device as failover allows
>>> to start a migration with a VFIO device plugged in.
>>>
>>> Long Story:
>>>
>>> * before this patch:
>>>
>>> - pause and start migration and unpause-> fails if we unpause too late
>>> because we migrate a VFIO device, works otherwise
>>
>>
>> confused about this one. can you explain pls?
> 
> Pause the guest.
> Start migration.
> 
>       if (migration_in_setup(s) && !should_be_hidden) {
>          if (failover_unplug_primary(n, dev)) {
>               vmstate_unregister(VMSTATE_IF(dev), qdev_get_vmsd(dev), dev);
>               qapi_event_send_unplug_primary(dev->id);
> 
> We send the unplug request, but the guest is paused.
> 
>               qatomic_set(&n->failover_primary_hidden, true);
> 
> callbacks, callbacks, callbacks.
> 
>          while (s->state == MIGRATION_STATUS_WAIT_UNPLUG &&
>                 qemu_savevm_state_guest_unplug_pending()) {
>              qemu_sem_timedwait(&s->wait_unplug_sem, 250);
>          }
> 
> And we are not able to get out of that loop, because we never get to the
> point where the guest send the unplug command.
> 
> So, the only other thing that I can think of is putting one timeout
> there, but how much?  That is a good question.
> 

Please, no timeout, IMHO timeout is worse than a clean exit on failure.

Thanks,
Laurent
Juan Quintela Nov. 2, 2021, 6:09 p.m. UTC | #9
On Tue, Nov 2, 2021, 18:47 Laurent Vivier <lvivier@redhat.com> wrote:

> On 02/11/2021 18:26, Juan Quintela wrote:
> > "Michael S. Tsirkin" <mst@redhat.com> wrote:
> >> On Tue, Nov 02, 2021 at 06:06:51PM +0100, Laurent Vivier wrote:
> >>> On 02/11/2021 16:04, Michael S. Tsirkin wrote:
> >>>> On Wed, Sep 29, 2021 at 04:43:11PM +0200, Laurent Vivier wrote:
> >>>>> As the guest OS is paused, we will never receive the unplug event
> >>>>> from the kernel and the migration cannot continue.
> >>>>>
> >>>>> Signed-off-by: Laurent Vivier <lvivier@redhat.com>
> >>>>
> >>>> Well ... what if user previously did
> >>>>
> >>>> pause
> >>>> start migration
> >>>> unpause
> >>>>
> >>>> we are breaking it now for no good reason.
> >>>>
> >>>> Further, how about
> >>>>
> >>>> start migration
> >>>> pause
> >>>>
> >>>> are we going to break this too? by failing pause?
> >>>>
> >>>>
> >>>
> >>> TL;DR: This patch only prevents to migrate a VFIO device as failover
> allows
> >>> to start a migration with a VFIO device plugged in.
> >>>
> >>> Long Story:
> >>>
> >>> * before this patch:
> >>>
> >>> - pause and start migration and unpause-> fails if we unpause too late
> >>> because we migrate a VFIO device, works otherwise
> >>
> >>
> >> confused about this one. can you explain pls?
> >
> > Pause the guest.
> > Start migration.
> >
> >       if (migration_in_setup(s) && !should_be_hidden) {
> >          if (failover_unplug_primary(n, dev)) {
> >               vmstate_unregister(VMSTATE_IF(dev), qdev_get_vmsd(dev),
> dev);
> >               qapi_event_send_unplug_primary(dev->id);
> >
> > We send the unplug request, but the guest is paused.
> >
> >               qatomic_set(&n->failover_primary_hidden, true);
> >
> > callbacks, callbacks, callbacks.
> >
> >          while (s->state == MIGRATION_STATUS_WAIT_UNPLUG &&
> >                 qemu_savevm_state_guest_unplug_pending()) {
> >              qemu_sem_timedwait(&s->wait_unplug_sem, 250);
> >          }
> >
> > And we are not able to get out of that loop, because we never get to the
> > point where the guest send the unplug command.
> >
> > So, the only other thing that I can think of is putting one timeout
> > there, but how much?  That is a good question.
> >
>
> Please, no timeout, IMHO timeout is worse than a clean exit on failure.
>


How long should we wait for the guest? If not a timeout....


> Thanks,
> Laurent
>
>
diff mbox series

Patch

diff --git a/hw/net/virtio-net.c b/hw/net/virtio-net.c
index f205331dcf8c..e54b6c8cd86c 100644
--- a/hw/net/virtio-net.c
+++ b/hw/net/virtio-net.c
@@ -37,8 +37,10 @@ 
 #include "qapi/qapi-events-migration.h"
 #include "hw/virtio/virtio-access.h"
 #include "migration/misc.h"
+#include "migration/migration.h"
 #include "standard-headers/linux/ethtool.h"
 #include "sysemu/sysemu.h"
+#include "sysemu/runstate.h"
 #include "trace.h"
 #include "monitor/qdev.h"
 #include "hw/pci/pci.h"
@@ -3279,7 +3281,13 @@  static void virtio_net_handle_migration_primary(VirtIONet *n, MigrationState *s)
     should_be_hidden = qatomic_read(&n->failover_primary_hidden);
 
     if (migration_in_setup(s) && !should_be_hidden) {
-        if (failover_unplug_primary(n, dev)) {
+        if (!runstate_is_running()) {
+            Error *err = NULL;
+            error_setg(&err,
+                       "cannot unplug primary device while VM is paused");
+            migration_cancel(err);
+            error_free(err);
+        } else if (failover_unplug_primary(n, dev)) {
             vmstate_unregister(VMSTATE_IF(dev), qdev_get_vmsd(dev), dev);
             qapi_event_send_unplug_primary(dev->id);
             qatomic_set(&n->failover_primary_hidden, true);