Message ID | 1386244972-528-5-git-send-email-xiawenc@linux.vnet.ibm.com |
---|---|
State | New |
Headers | show |
On Thu, Dec 05, 2013 at 08:02:50PM +0800, Wenchao Xia wrote: > +restore_refcount: > + if (qcow2_update_snapshot_refcount(bs, s->l1_table_offset, s->l1_size, -1) > + < 0 && errp) { > + /* Nothing can be done now, need image check later */ > + error_setg(&err, "%s\nqcow2: Error in restoring refcount in snapshot", > + error_get_pretty(*errp)); > + error_free(*errp); > + *errp = NULL; > + error_propagate(errp, err); > + } We get here if writing the new snapshot list failed. If qcow2_update_snapshot_refcount(..., -1) also fails I think we should skip qcow2_free_clusters() below. We don't know the exact state of the disk image anymore - better to leak clusters than to have a dangling reference. > +dealloc_cluster: > + qcow2_free_clusters(bs, sn->l1_table_offset, > + sn->l1_size * sizeof(uint64_t), > + QCOW2_DISCARD_ALWAYS); > + > fail: > g_free(sn->id_str); > g_free(sn->name); > -- > 1.7.1 > >
于 2013/12/20 22:20, Stefan Hajnoczi 写道: > On Thu, Dec 05, 2013 at 08:02:50PM +0800, Wenchao Xia wrote: >> +restore_refcount: >> + if (qcow2_update_snapshot_refcount(bs, s->l1_table_offset, s->l1_size, -1) >> + < 0 && errp) { >> + /* Nothing can be done now, need image check later */ >> + error_setg(&err, "%s\nqcow2: Error in restoring refcount in snapshot", >> + error_get_pretty(*errp)); >> + error_free(*errp); >> + *errp = NULL; >> + error_propagate(errp, err); >> + } > > We get here if writing the new snapshot list failed. If > qcow2_update_snapshot_refcount(..., -1) also fails I think we should > skip qcow2_free_clusters() below. We don't know the exact state of the > disk image anymore - better to leak clusters than to have a dangling > reference. > Make sense, dangling point should be avoid in any case, will fix, Thanks for reviewing! >> +dealloc_cluster: >> + qcow2_free_clusters(bs, sn->l1_table_offset, >> + sn->l1_size * sizeof(uint64_t), >> + QCOW2_DISCARD_ALWAYS); >> + >> fail: >> g_free(sn->id_str); >> g_free(sn->name); >> -- >> 1.7.1 >> >> >
于 2013/12/23 10:57, Wenchao Xia 写道: > 于 2013/12/20 22:20, Stefan Hajnoczi 写道: >> On Thu, Dec 05, 2013 at 08:02:50PM +0800, Wenchao Xia wrote: >>> +restore_refcount: >>> + if (qcow2_update_snapshot_refcount(bs, s->l1_table_offset, >>> s->l1_size, -1) >>> + < 0 && errp) { >>> + /* Nothing can be done now, need image check later */ >>> + error_setg(&err, "%s\nqcow2: Error in restoring refcount in >>> snapshot", >>> + error_get_pretty(*errp)); >>> + error_free(*errp); >>> + *errp = NULL; >>> + error_propagate(errp, err); >>> + } >> >> We get here if writing the new snapshot list failed. If >> qcow2_update_snapshot_refcount(..., -1) also fails I think we should >> skip qcow2_free_clusters() below. We don't know the exact state of the >> disk image anymore - better to leak clusters than to have a dangling >> reference. >> > > Make sense, dangling point should be avoid in any case, will fix, > Thanks for reviewing! > >>> +dealloc_cluster: >>> + qcow2_free_clusters(bs, sn->l1_table_offset, >>> + sn->l1_size * sizeof(uint64_t), >>> + QCOW2_DISCARD_ALWAYS); >>> + >>> fail: >>> g_free(sn->id_str); >>> g_free(sn->name); >>> -- >>> 1.7.1 >>> >>> >> > > Hi, Stefan I have reconsidered the roll back process, there is many case we should take care, so it is better to summarize a general rule to do such cancel operations. I suggest: do a series of roll back operations, when one fail, skip following roll back operation. For snapshot create, the create action is: allocate new L1 -> refcount+1 -> allocate sn_list -> update header The mirrored rollback action can be: deallocate L1 <- refcount-1 <- deallocate sn_list <- restore header When fail happens in rollback action, simply stop following ones. If you agree, I'd like to reorganize the patch as above.
On Mon, Dec 23, 2013 at 02:12:56PM +0800, Wenchao Xia wrote: > I have reconsidered the roll back process, there is many case we > should take care, so it is better to summarize a general rule to do such > cancel operations. I suggest: do a series of roll back operations, > when one fail, skip following roll back operation. For snapshot create, > the create action is: > allocate new L1 -> refcount+1 -> allocate sn_list -> update header > The mirrored rollback action can be: > deallocate L1 <- refcount-1 <- deallocate sn_list <- restore header > > When fail happens in rollback action, simply stop following ones. > If you agree, I'd like to reorganize the patch as above. I agree. When the steps depend on each other we should skip further operations when an error is returned in the failure path. If the steps are independent we can still safely clean up the independent parts. Stefan
diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c index 55746c4..5f787bc 100644 --- a/block/qcow2-snapshot.c +++ b/block/qcow2-snapshot.c @@ -400,6 +400,7 @@ void qcow2_snapshot_create(BlockDriverState *bs, int i, ret; uint64_t *l1_table = NULL; int64_t l1_table_offset; + Error *err = NULL; memset(sn, 0, sizeof(*sn)); @@ -448,7 +449,7 @@ void qcow2_snapshot_create(BlockDriverState *bs, PRIu64 " with size %" PRIu64, sn->l1_table_offset, (uint64_t)(s->l1_size * sizeof(uint64_t))); - goto fail; + goto dealloc_cluster; } ret = bdrv_pwrite(bs->file, sn->l1_table_offset, l1_table, @@ -459,7 +460,7 @@ void qcow2_snapshot_create(BlockDriverState *bs, PRIu64 " with size %" PRIu64, sn->l1_table_offset, (uint64_t)(s->l1_size * sizeof(uint64_t))); - goto fail; + goto dealloc_cluster; } g_free(l1_table); @@ -476,7 +477,7 @@ void qcow2_snapshot_create(BlockDriverState *bs, "Failed in update of refcount for snapshot at %" PRIu64 " with size %d", s->l1_table_offset, s->l1_size); - goto fail; + goto dealloc_cluster; } /* Append the new snapshot to the snapshot list */ @@ -494,7 +495,7 @@ void qcow2_snapshot_create(BlockDriverState *bs, g_free(s->snapshots); s->snapshots = old_snapshot_list; s->nb_snapshots--; - goto fail; + goto restore_refcount; } g_free(old_snapshot_list); @@ -514,6 +515,22 @@ void qcow2_snapshot_create(BlockDriverState *bs, #endif return; +restore_refcount: + if (qcow2_update_snapshot_refcount(bs, s->l1_table_offset, s->l1_size, -1) + < 0 && errp) { + /* Nothing can be done now, need image check later */ + error_setg(&err, "%s\nqcow2: Error in restoring refcount in snapshot", + error_get_pretty(*errp)); + error_free(*errp); + *errp = NULL; + error_propagate(errp, err); + } + +dealloc_cluster: + qcow2_free_clusters(bs, sn->l1_table_offset, + sn->l1_size * sizeof(uint64_t), + QCOW2_DISCARD_ALWAYS); + fail: g_free(sn->id_str); g_free(sn->name);