diff mbox

[1/2] block: change parent backing link when *tqe_prev == NULL

Message ID fd9948b07d7c90e35cecf0be6c4ef87b8324902e.1454129966.git.jcody@redhat.com
State New
Headers show

Commit Message

Jeff Cody Jan. 30, 2016, 5:17 a.m. UTC
In change_parent_backing_link(), we only inserted the new
BlockDriverState entry into the device_list if the tqe_prev pointer was
NULL.   However, we must also allow insertion when the BDS pointed
to by the tqe_prev pointer is NULL as well.

This fixes a bug with external snapshots, and live active layer commits.

After a live snapshot occurs, the active layer and the base layer both
have a non-NULL tqe_prev field in the device_list, although the base
node's tqe_prev field points to a NULL entry.

Once the active commit is finished, bdrv_replace_in_backing_chain() is
called to set the base node as the new active node, and remove the
node that was the prior active layer from the device_list.

If we only check against the tqe_prev pointer field and not the entity
it is pointing to, then we fail to insert base image into the device
list.  The previous active layer is still removed from the device_list,
leaving an empty device_list queue.

With an empty device_list queue, odd behavior occurs - such as not
allowing any more live snapshots.

This commit fixes this issue, by checking for a NULL tqe_prev entity
in the devices_list.

Signed-off-by: Jeff Cody <jcody@redhat.com>
---
 block.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Max Reitz Feb. 1, 2016, 9:43 p.m. UTC | #1
On 30.01.2016 06:17, Jeff Cody wrote:
> In change_parent_backing_link(), we only inserted the new
> BlockDriverState entry into the device_list if the tqe_prev pointer was
> NULL.   However, we must also allow insertion when the BDS pointed
> to by the tqe_prev pointer is NULL as well.
> 
> This fixes a bug with external snapshots, and live active layer commits.
> 
> After a live snapshot occurs, the active layer and the base layer both
> have a non-NULL tqe_prev field in the device_list, although the base
> node's tqe_prev field points to a NULL entry.
> 
> Once the active commit is finished, bdrv_replace_in_backing_chain() is
> called to set the base node as the new active node, and remove the
> node that was the prior active layer from the device_list.
> 
> If we only check against the tqe_prev pointer field and not the entity
> it is pointing to, then we fail to insert base image into the device
> list.  The previous active layer is still removed from the device_list,
> leaving an empty device_list queue.
> 
> With an empty device_list queue, odd behavior occurs - such as not
> allowing any more live snapshots.
> 
> This commit fixes this issue, by checking for a NULL tqe_prev entity
> in the devices_list.
> 
> Signed-off-by: Jeff Cody <jcody@redhat.com>
> ---
>  block.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/block.c b/block.c
> index 5709d3d..0b8526b 100644
> --- a/block.c
> +++ b/block.c
> @@ -2272,7 +2272,7 @@ static void change_parent_backing_link(BlockDriverState *from,
>      }
>      if (from->blk) {
>          blk_set_bs(from->blk, to);
> -        if (!to->device_list.tqe_prev) {
> +        if (!to->device_list.tqe_prev || !*to->device_list.tqe_prev) {

I'm not sure this is the right fix; bdrv_make_anon() clearly states that
we do want device_list.tqe_prev to be NULL if and only if the BDS is not
part of the device list. So this should not be happening.

>              QTAILQ_INSERT_BEFORE(from, to, device_list);
>          }
>          QTAILQ_REMOVE(&bdrv_states, from, device_list);

Inserting a from->device_list.tqe_prev = NULL; here makes the iotest
happy and looks like the better fix to me.

Max

>
Jeff Cody Feb. 2, 2016, 1:29 a.m. UTC | #2
On Mon, Feb 01, 2016 at 10:43:02PM +0100, Max Reitz wrote:
> On 30.01.2016 06:17, Jeff Cody wrote:
> > In change_parent_backing_link(), we only inserted the new
> > BlockDriverState entry into the device_list if the tqe_prev pointer was
> > NULL.   However, we must also allow insertion when the BDS pointed
> > to by the tqe_prev pointer is NULL as well.
> > 
> > This fixes a bug with external snapshots, and live active layer commits.
> > 
> > After a live snapshot occurs, the active layer and the base layer both
> > have a non-NULL tqe_prev field in the device_list, although the base
> > node's tqe_prev field points to a NULL entry.
> > 
> > Once the active commit is finished, bdrv_replace_in_backing_chain() is
> > called to set the base node as the new active node, and remove the
> > node that was the prior active layer from the device_list.
> > 
> > If we only check against the tqe_prev pointer field and not the entity
> > it is pointing to, then we fail to insert base image into the device
> > list.  The previous active layer is still removed from the device_list,
> > leaving an empty device_list queue.
> > 
> > With an empty device_list queue, odd behavior occurs - such as not
> > allowing any more live snapshots.
> > 
> > This commit fixes this issue, by checking for a NULL tqe_prev entity
> > in the devices_list.
> > 
> > Signed-off-by: Jeff Cody <jcody@redhat.com>
> > ---
> >  block.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/block.c b/block.c
> > index 5709d3d..0b8526b 100644
> > --- a/block.c
> > +++ b/block.c
> > @@ -2272,7 +2272,7 @@ static void change_parent_backing_link(BlockDriverState *from,
> >      }
> >      if (from->blk) {
> >          blk_set_bs(from->blk, to);
> > -        if (!to->device_list.tqe_prev) {
> > +        if (!to->device_list.tqe_prev || !*to->device_list.tqe_prev) {
> 
> I'm not sure this is the right fix; bdrv_make_anon() clearly states that
> we do want device_list.tqe_prev to be NULL if and only if the BDS is not
> part of the device list. So this should not be happening.
>

Good point.  This also screams for a helper function to remove a BDS
from the device_list, to enforce this behavior (we remove a BDS from
the device_list is 3 spots, and this time it was missed in one of
them).  Hopefully that will help prevent future errors.

> >              QTAILQ_INSERT_BEFORE(from, to, device_list);
> >          }
> >          QTAILQ_REMOVE(&bdrv_states, from, device_list);
> 
> Inserting a from->device_list.tqe_prev = NULL; here makes the iotest
> happy and looks like the better fix to me.
> 
> Max
>

Thanks!

-Jeff
diff mbox

Patch

diff --git a/block.c b/block.c
index 5709d3d..0b8526b 100644
--- a/block.c
+++ b/block.c
@@ -2272,7 +2272,7 @@  static void change_parent_backing_link(BlockDriverState *from,
     }
     if (from->blk) {
         blk_set_bs(from->blk, to);
-        if (!to->device_list.tqe_prev) {
+        if (!to->device_list.tqe_prev || !*to->device_list.tqe_prev) {
             QTAILQ_INSERT_BEFORE(from, to, device_list);
         }
         QTAILQ_REMOVE(&bdrv_states, from, device_list);