diff mbox

[v3,03/12] block: Add overlay BDS pointer into the BlockDriverState struct

Message ID 8c52711b0fc55e2bb44b3cb2d8a7043079e456c6.1401463410.git.jcody@redhat.com
State New
Headers show

Commit Message

Jeff Cody May 30, 2014, 3:35 p.m. UTC
Now that node-names can reference an individual BlockDriverState without
needing to navigate downwards from the 'device' level, in order to find
the top-most image (active layer) we need a pointer to the overlay of a
BDS.

This will allow QMP commands to reference an image solely by its
node-name, without also needing to pass in the corresponding 'device'
string.

This also adds a helper function to set the overlay pointer that is, for
now, trivial.  But since we recently moved away from open coding
bs->backing_hd assignment, we should probably also refrain from setting
bs->overlay directly.

Signed-off-by: Jeff Cody <jcody@redhat.com>
---
 block.c                   | 14 ++++++++++++++
 include/block/block.h     |  1 +
 include/block/block_int.h |  1 +
 3 files changed, 16 insertions(+)

Comments

Eric Blake May 30, 2014, 4 p.m. UTC | #1
On 05/30/2014 09:35 AM, Jeff Cody wrote:
> Now that node-names can reference an individual BlockDriverState without
> needing to navigate downwards from the 'device' level, in order to find
> the top-most image (active layer) we need a pointer to the overlay of a
> BDS.
> 
> This will allow QMP commands to reference an image solely by its
> node-name, without also needing to pass in the corresponding 'device'
> string.
> 
> This also adds a helper function to set the overlay pointer that is, for
> now, trivial.  But since we recently moved away from open coding
> bs->backing_hd assignment, we should probably also refrain from setting
> bs->overlay directly.
> 
> Signed-off-by: Jeff Cody <jcody@redhat.com>
> ---
>  block.c                   | 14 ++++++++++++++
>  include/block/block.h     |  1 +
>  include/block/block_int.h |  1 +
>  3 files changed, 16 insertions(+)

Question - can one bds ever belong to more than one chain?  That is,
could I create a guest that uses:
       / disk1
base <
       \ disk2

for two disks in parallel, but where I use blockdev referencing to pass
in a single readonly copy of base to both chains rather than the default
of opening base twice?  If so, then what does the overlay field get set
to for base: disk1 or disk2?  On the other hand, if base truly is
shared, it's non-deterministic which device is meant when using just the
node-name for the shared base bds.  So maybe the user gets what they
deserved.  Or maybe it argues that doing a bds lookup for a bds that has
more than one device chain returns a failure unless the device chain was
passed as well (but that may mean reference counting or even tracking an
array of overlays).

But if a bds can be in at most one chain at a time, then your patch
makes sense.

Reviewed-by: Eric Blake <eblake@redhat.com>
Jeff Cody May 30, 2014, 4:36 p.m. UTC | #2
On Fri, May 30, 2014 at 10:00:09AM -0600, Eric Blake wrote:
> On 05/30/2014 09:35 AM, Jeff Cody wrote:
> > Now that node-names can reference an individual BlockDriverState without
> > needing to navigate downwards from the 'device' level, in order to find
> > the top-most image (active layer) we need a pointer to the overlay of a
> > BDS.
> > 
> > This will allow QMP commands to reference an image solely by its
> > node-name, without also needing to pass in the corresponding 'device'
> > string.
> > 
> > This also adds a helper function to set the overlay pointer that is, for
> > now, trivial.  But since we recently moved away from open coding
> > bs->backing_hd assignment, we should probably also refrain from setting
> > bs->overlay directly.
> > 
> > Signed-off-by: Jeff Cody <jcody@redhat.com>
> > ---
> >  block.c                   | 14 ++++++++++++++
> >  include/block/block.h     |  1 +
> >  include/block/block_int.h |  1 +
> >  3 files changed, 16 insertions(+)
> 
> Question - can one bds ever belong to more than one chain?  That is,
> could I create a guest that uses:
>        / disk1
> base <
>        \ disk2
> 
> for two disks in parallel, but where I use blockdev referencing to pass
> in a single readonly copy of base to both chains rather than the default
> of opening base twice?  If so, then what does the overlay field get set
> to for base: disk1 or disk2?  On the other hand, if base truly is
> shared, it's non-deterministic which device is meant when using just the
> node-name for the shared base bds.  So maybe the user gets what they
> deserved.  Or maybe it argues that doing a bds lookup for a bds that has
> more than one device chain returns a failure unless the device chain was
> passed as well (but that may mean reference counting or even tracking an
> array of overlays).
> 
> But if a bds can be in at most one chain at a time, then your patch
> makes sense.
> 
> Reviewed-by: Eric Blake <eblake@redhat.com>
>

While an image itself may be a backing file to more than one overlay
image, QEMU would open that as 2 separate chains, as we open the drive
from the active layer downwards, following the backing_file.  So in
the QEMU implementation of your above structure, a BDS will be in at
most one chain at a time:

[0 base] <-- [1 disk1]

[2 base] <-- [3 disk2]

In the above case, each '[]' is a unique BDS structure.

(and of course, if you perform a block-commit to an image that is a
base to multiple other images, then expect Bad Things).
Eric Blake May 30, 2014, 7:12 p.m. UTC | #3
On 05/30/2014 10:36 AM, Jeff Cody wrote:

>> Question - can one bds ever belong to more than one chain?  That is,
>> could I create a guest that uses:
>>        / disk1
>> base <
>>        \ disk2
>>

> While an image itself may be a backing file to more than one overlay
> image, QEMU would open that as 2 separate chains, as we open the drive
> from the active layer downwards, following the backing_file.  So in
> the QEMU implementation of your above structure, a BDS will be in at
> most one chain at a time:
> 
> [0 base] <-- [1 disk1]
> 
> [2 base] <-- [3 disk2]

That's if I only specify disk1 and disk2 as the files to open (and let
qemu follow its own backing chain to create BDS for [0 base] and [2
base]).  But I was thinking more about blockdev-add, which has
provisions in the QMP for creating a block device chain via reference to
an existing bds node-name rather than inline (although I know that qemu
1.7 didn't support that, I still haven't figured out if it can be done
in qemu 2.0 with the addition of node-names).

I've added you in cc to a thread from January on that topic.

> 
> In the above case, each '[]' is a unique BDS structure.
> 
> (and of course, if you perform a block-commit to an image that is a
> base to multiple other images, then expect Bad Things).

On the other hand, doing block-stream to break a shared base into
multiple independent images actually makes sense.  And since this series
changes block-stream to work on just a node name, there is the question
of which of the two devices actually get the streaming action.
diff mbox

Patch

diff --git a/block.c b/block.c
index cf4b296..588046e 100644
--- a/block.c
+++ b/block.c
@@ -1108,12 +1108,22 @@  fail:
     return ret;
 }
 
+void bdrv_set_overlay(BlockDriverState *bs, BlockDriverState *overlay)
+{
+    if (bs) {
+        bs->overlay = overlay;
+    }
+}
+
 void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd)
 {
 
     if (bs->backing_hd) {
         assert(bs->backing_blocker);
         bdrv_op_unblock_all(bs->backing_hd, bs->backing_blocker);
+        if (!backing_hd) {
+            bdrv_set_overlay(bs->backing_hd, NULL);
+        }
     } else if (backing_hd) {
         error_setg(&bs->backing_blocker,
                    "device is used as backing hd of '%s'",
@@ -1126,6 +1136,8 @@  void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd)
         bs->backing_blocker = NULL;
         goto out;
     }
+    bdrv_set_overlay(backing_hd, bs);
+
     bs->open_flags &= ~BDRV_O_NO_BACKING;
     pstrcpy(bs->backing_file, sizeof(bs->backing_file), backing_hd->filename);
     pstrcpy(bs->backing_format, sizeof(bs->backing_format),
@@ -2085,6 +2097,8 @@  void bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top)
     /* The contents of 'tmp' will become bs_top, as we are
      * swapping bs_new and bs_top contents. */
     bdrv_set_backing_hd(bs_top, bs_new);
+    /* make sure that bs_new->backing_hd->overlay points to bs_new */
+    bdrv_set_overlay(bs_new->backing_hd, bs_new);
 }
 
 static void bdrv_delete(BlockDriverState *bs)
diff --git a/include/block/block.h b/include/block/block.h
index 4dc68be..dff5403 100644
--- a/include/block/block.h
+++ b/include/block/block.h
@@ -217,6 +217,7 @@  int bdrv_open_image(BlockDriverState **pbs, const char *filename,
                     QDict *options, const char *bdref_key, int flags,
                     bool allow_none, Error **errp);
 void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd);
+void bdrv_set_overlay(BlockDriverState *bs, BlockDriverState *overlay);
 int bdrv_open_backing_file(BlockDriverState *bs, QDict *options, Error **errp);
 void bdrv_append_temp_snapshot(BlockDriverState *bs, int flags, Error **errp);
 int bdrv_open(BlockDriverState **pbs, const char *filename,
diff --git a/include/block/block_int.h b/include/block/block_int.h
index f2e753f..c0fe90b 100644
--- a/include/block/block_int.h
+++ b/include/block/block_int.h
@@ -303,6 +303,7 @@  struct BlockDriverState {
     char backing_format[16]; /* if non-zero and backing_file exists */
 
     BlockDriverState *backing_hd;
+    BlockDriverState *overlay;
     BlockDriverState *file;
 
     NotifierList close_notifiers;