Message ID: 58b44a8409ab790a76212958928fa6f3ccbf9096.1329758006.git.jcody@redhat.com
State: New
On 02/20/2012 10:31 AM, Jeff Cody wrote:
> In the case of a failure in a group snapshot, it is possible for
> multiple file image failures to occur - for instance, failure of
> an original snapshot, and then failure of one or more of the
> attempted reopens of the original.
>
> Knowing all of the file images which failed could be useful or
> critical information, so this command returns a list of strings
> containing the filenames of all failures from the last
> invocation of blockdev-group-snapshot-sync.

Meta-question:

Suppose that the guest is running when we issue
blockdev-group-snapshot-sync - in that case, qemu is responsible for
pausing and then resuming the guest.  On success, this makes sense.  But
what happens on failure?

If we only fail at creating one snapshot, but successfully roll back the
rest of the set, should the guest be resumed (as if the command had
never been attempted), or should the guest be left paused?

On the other hand, if we fail at creating one snapshot, as well as fail
at rolling back, then that argues that we _cannot_ resume the guest,
because we no longer have a block device open.

This policy needs to be documented in one (or both) of the two new
monitor commands, and we probably ought to make sure that if the guest
is left paused where it had originally started as running, then an
appropriate event is also emitted.

For blockdev-snapshot-sync, libvirt was always pausing qemu before
issuing the snapshot, then resuming afterwards; but now that we have the
ability to make the set atomic, I'm debating about whether libvirt still
needs to pause qemu, or whether it can now rely on qemu doing the right
things about pausing and resuming as part of the snapshot command.
On 02/20/2012 12:48 PM, Eric Blake wrote:
> On 02/20/2012 10:31 AM, Jeff Cody wrote:
>> In the case of a failure in a group snapshot, it is possible for
>> multiple file image failures to occur - for instance, failure of
>> an original snapshot, and then failure of one or more of the
>> attempted reopens of the original.
>>
>> Knowing all of the file images which failed could be useful or
>> critical information, so this command returns a list of strings
>> containing the filenames of all failures from the last
>> invocation of blockdev-group-snapshot-sync.
>
> Meta-question:
>
> Suppose that the guest is running when we issue
> blockdev-group-snapshot-sync - in that case, qemu is responsible for
> pausing and then resuming the guest.  On success, this makes sense.  But
> what happens on failure?

The guest is not paused in blockdev-group-snapshot-sync; I don't think
that qemu should enforce pause/resume in the live snapshot commands.

> If we only fail at creating one snapshot, but successfully roll back the
> rest of the set, should the guest be resumed (as if the command had
> never been attempted), or should the guest be left paused?
>
> On the other hand, if we fail at creating one snapshot, as well as fail
> at rolling back, then that argues that we _cannot_ resume the guest,
> because we no longer have a block device open.

Is that really true, though?  Depending on which drive failed, the guest
may still be runnable.  To the guest it would be roughly equivalent to a
drive failure: a bad event, but not always fatal.

But I think v2 of the patch may make this moot - I was talking with
Kevin, and he had some good ideas on how to do this without requiring a
close & reopen in the case of a snapshot failure, which means that we
shouldn't have to worry about the second scenario.  I am going to
incorporate those changes into v2.
> This policy needs to be documented in one (or both) of the two new
> monitor commands, and we probably ought to make sure that if the guest
> is left paused where it had originally started as running, then an
> appropriate event is also emitted.

I agree - the documentation should make it clear what is going on.  I
will add that to v2.

> For blockdev-snapshot-sync, libvirt was always pausing qemu before
> issuing the snapshot, then resuming afterwards; but now that we have the
> ability to make the set atomic, I'm debating about whether libvirt still
> needs to pause qemu, or whether it can now rely on qemu doing the right
> things about pausing and resuming as part of the snapshot command.

Again, it doesn't pause automatically, so that is up to libvirt.  The
guest agent is also available to freeze the filesystem, if libvirt wants
to trust it (and it is running); if not, then libvirt can still issue a
pause/resume around the snapshot command (and libvirt may be in a better
position to decide what to do in case of failure, since it has some
knowledge of which drives failed and how they are used).
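To make the intended flow concrete, here is a hypothetical QMP exchange against this patch. The device names, filenames, and the exact wire shape of the error are illustrative assumptions, not taken from the patch or a real session:

```json
{ "execute": "blockdev-group-snapshot-sync",
  "arguments": { "dev": [
      { "device": "virtio0", "snapshot-file": "/images/sn0.qcow2", "format": "qcow2" },
      { "device": "virtio1", "snapshot-file": "/images/sn1.qcow2", "format": "qcow2" } ] } }

{ "error": { "class": "OpenFileFailed",
             "data": { "filename": "/images/sn1.qcow2" } } }

{ "execute": "blockdev-query-group-snapshot-failure" }

{ "return": [ { "failed_file": "/images/sn1.qcow2" } ] }
```

If one of the rollback reopens had also failed, the returned list would contain additional entries, including original image filenames.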
diff --git a/blockdev.c b/blockdev.c
index 0149720..cb44af5 100644
--- a/blockdev.c
+++ b/blockdev.c
@@ -727,6 +727,7 @@ typedef struct BlkGroupSnapshotData {
     QSIMPLEQ_ENTRY(BlkGroupSnapshotData) entry;
 } BlkGroupSnapshotData;
 
+SnapshotFailList *group_snap_fail_list;
 /*
  * 'Atomic' group snapshots. The snapshots are taken as a set, and if any fail
  * then we attempt to undo all of the pivots performed.
@@ -739,10 +740,24 @@ void qmp_blockdev_group_snapshot_sync(SnapshotDevList *dev_list,
     SnapshotDev *dev_info = NULL;
     BlkGroupSnapshotData *snap_entry;
     BlockDriver *proto_drv;
+    SnapshotFailList *fail_entry = group_snap_fail_list;
 
     QSIMPLEQ_HEAD(gsnp_list, BlkGroupSnapshotData) gsnp_list;
     QSIMPLEQ_INIT(&gsnp_list);
 
+    /*
+     * clear out our failure list first, and reclaim memory
+     * we maintain the list, so if a group snapshot fails
+     * we can be queried about which devices failed
+     */
+    SnapshotFailList *fail_entry_next = NULL;
+    while (NULL != fail_entry) {
+        g_free(fail_entry->value);
+        fail_entry_next = fail_entry->next;
+        g_free(fail_entry);
+        fail_entry = fail_entry_next;
+    }
+
     /* We don't do anything in this loop that commits us to the snapshot */
     while (NULL != dev_entry) {
         dev_info = dev_entry->value;
@@ -815,6 +830,16 @@ void qmp_blockdev_group_snapshot_sync(SnapshotDevList *dev_list,
      */
     if (ret != 0) {
         error_set(errp, QERR_OPEN_FILE_FAILED, snap_entry->snapshot_file);
+        /*
+         * We bail on the first failure, but add the failed filename to the
+         * return list in case any of the rollback pivots fail as well
+         */
+        SnapshotFailList *failure;
+        failure = g_malloc0(sizeof(SnapshotFailList));
+        failure->value = g_malloc0(sizeof(*failure->value));
+        failure->value->failed_file = g_strdup(snap_entry->snapshot_file);
+        failure->next = group_snap_fail_list;
+        group_snap_fail_list = failure;
         goto error_rollback;
     }
 }
@@ -829,7 +854,17 @@ error_rollback:
         ret = bdrv_open(snap_entry->bs, snap_entry->old_filename,
                         snap_entry->flags, snap_entry->old_drv);
         if (ret != 0) {
-            /* This is very very bad */
+            /*
+             * This is very very bad. Make sure the caller is aware
+             * of which files failed, since there could be more than
+             * one
+             */
+            SnapshotFailList *failure;
+            failure = g_malloc0(sizeof(SnapshotFailList));
+            failure->value = g_malloc0(sizeof(*failure->value));
+            failure->value->failed_file = g_strdup(snap_entry->old_filename);
+            failure->next = group_snap_fail_list;
+            group_snap_fail_list = failure;
             error_set(errp, QERR_OPEN_FILE_FAILED,
                       snap_entry->old_filename);
         }
@@ -843,6 +878,11 @@ exit:
     return;
 }
 
+SnapshotFailList *qmp_blockdev_query_group_snapshot_failure(Error **errp)
+{
+    return group_snap_fail_list;
+}
+
 static void eject_device(BlockDriverState *bs, int force, Error **errp)
 {
diff --git a/qapi-schema.json b/qapi-schema.json
index b8d66d0..c4b27a3 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -1121,6 +1121,14 @@
     'data': {'device': 'str', 'snapshot-file': 'str', '*format': 'str' } }
 
 ##
+# @SnapshotFail
+#
+# @failed_file: string of the filename that failed
+##
+{ 'type': 'SnapshotFail',
+  'data': {'failed_file': 'str' } }
+
+##
 # @blockdev-group-snapshot-sync
 #
 # Generates a synchronous snapshot of a group of one or more block devices,
@@ -1152,6 +1160,23 @@
     'data': { 'dev': [ 'SnapshotDev' ] } }
 
 ##
+# @blockdev-query-group-snapshot-failure
+#
+#
+# Returns: A list of @SnapshotFail, that contains the filenames for all failures
+#          of the last blockdev-group-snapshot-sync command.
+#
+# Notes:
+#   Since there could potentially be more than one file open or drive
+#   failures, the additional command 'blockdev-query-group-snapshot-failure'
+#   will return a list of all device files that have failed. This could
+#   include the original filename if the reopen of an original image file
+#   failed.
+#
+##
+{ 'command': 'blockdev-query-group-snapshot-failure', 'returns': [ 'SnapshotFail' ] }
+
+##
 # @blockdev-snapshot-sync
 #
 # Generates a synchronous snapshot of a block device.
diff --git a/qmp-commands.hx b/qmp-commands.hx
index 2fe1e6e..9a80e08 100644
--- a/qmp-commands.hx
+++ b/qmp-commands.hx
@@ -706,6 +706,11 @@ Example:
 EQMP
 
     {
+        .name       = "blockdev-query-group-snapshot-failure",
+        .args_type  = "",
+        .mhandler.cmd_new = qmp_marshal_input_blockdev_query_group_snapshot_failure,
+    },
+    {
         .name       = "blockdev-snapshot-sync",
         .args_type  = "device:B,snapshot-file:s,format:s?",
         .mhandler.cmd_new = qmp_marshal_input_blockdev_snapshot_sync,
In the case of a failure in a group snapshot, it is possible for
multiple file image failures to occur - for instance, failure of
an original snapshot, and then failure of one or more of the
attempted reopens of the original.

Knowing all of the file images which failed could be useful or
critical information, so this command returns a list of strings
containing the filenames of all failures from the last
invocation of blockdev-group-snapshot-sync.

Signed-off-by: Jeff Cody <jcody@redhat.com>
---
 blockdev.c       |   42 +++++++++++++++++++++++++++++++++++++++++-
 qapi-schema.json |   25 +++++++++++++++++++++++++
 qmp-commands.hx  |    5 +++++
 3 files changed, 71 insertions(+), 1 deletions(-)