mbox series

[v2,00/10] block: Attempt on fixing 030-reported errors

Message ID 20211111120829.81329-1-hreitz@redhat.com
Headers show
Series block: Attempt on fixing 030-reported errors | expand

Message

Hanna Czenczek Nov. 11, 2021, 12:08 p.m. UTC
Hi,

v1 cover letter:
https://lists.nongnu.org/archive/html/qemu-devel/2021-11/msg01287.html

In v2 I’ve addressed the comments I’ve received from Kevin and Vladimir.
To this end, I’ve retained only the non-controversial part in patch 5,
and split everything else (i.e. the stuff relating to
bdrv_replace_child_tran()) into the dedicated patches 6 and 8.

Kevin’s most important comments (to my understanding) were that:

(A) When bdrv_remove_file_or_backing_child() uses
    bdrv_replace_child_tran(), we have to ensure that the BDS lives as
    long as the transaction does.  This is solved by keeping a strong
    reference to it that’s released only when the transaction is cleaned
    up (and the new patch 7 ensures that the .clean() handler will be
    invoked after all .commit()/.abort() handlers, so the reference will
    really live as long as the transaction does).

(B) bdrv_replace_node_noperm() passes a pointer to loop-local variable,
    which is a really bad idea considering the transaction lives much
    longer than one loop iteration.
    Luckily, the new_bs it passes is always non-NULL, and so
    bdrv_replace_child_tran() actually doesn’t need to store the
    BdrvChild ** pointer, because for a non-NULL new_bs, there is
    nothing to revert in the abort handler.  We just need to clarify
    this, not store the pointer in case of a non-NULL new_bs, and assert
    that bdrv_replace_node_noperm() and its relatives are only used to
    replace an existing node by some other existing (i.e. non-NULL)
    node.


git-backport-diff against v1:

Key:
[----] : patches are identical
[####] : number of functional differences between upstream/downstream patch
[down] : patch is downstream-only
The flags [FC] indicate (F)unctional and (C)ontextual differences, respectively

001/10:[----] [--] 'stream: Traverse graph after modification'
002/10:[0005] [FC] 'block: Manipulate children list in .attach/.detach'
003/10:[----] [--] 'block: Unite remove_empty_child and child_free'
004/10:[----] [--] 'block: Drop detached child from ignore list'
005/10:[0040] [FC] 'block: Pass BdrvChild ** to replace_child_noperm'
006/10:[down] 'block: Restructure remove_file_or_backing_child()'
007/10:[down] 'transactions: Invoke clean() after everything else'
008/10:[down] 'block: Let replace_child_tran keep indirect pointer'
009/10:[0020] [FC] 'block: Let replace_child_noperm free children'
010/10:[----] [--] 'iotests/030: Unthrottle parallel jobs in reverse'


Detailed per-patch changes:

Patch 2:
 - Dropped stale comment about undoing bdrv_attach_child_noperm()’s list
   insertion in the respective abort handler

Patch 5:
 - Split off everything related to bdrv_replace_child_tran()

Patch 6:
 - Added (split off from patch 5)
 - Keep the `if (!child) { return; }` path before getting childp

Patch 7:
 - Added: I wanted bdrv_remove_file_or_backing_child() to keep a strong
   BDS reference throughout the transaction, and the simplest way (i.e.
   the one where I wouldn’t have to think too hard) was to add this
   patch and have transactions invoke .clean() after all
   .commit()/.abort() handlers are done.  This way we can just drop the
   reference in .clean() and need not care about the order the
   transaction actions were added in.

Patch 8:
 - Added (split off from patch 5)
 - The BdrvChild ** pointer is now only stored in the
   BdrvReplaceChildState if new_bs is NULL.  Otherwise, we don’t need
   it, because we won’t need to reinstate the child in the abort
   handler.  This way we don’t have to worry about
   bdrv_replace_node_noperm() passing a pointer to a loop-local
   variable (because the new_bs it passes is never NULL).
 - In the same vein, assert that bdrv_replace_node() and relatives
   cannot be called to replace the node by NULL.
 - Have bdrv_remove_file_or_backing_child() keep a strong reference to
   the parent BDS throughout the transaction, so the &bs->backing or
   &bs->file pointer remains valid
 - Moved the comment explaining why we want to pass &s->child instead of
   s->childp to bdrv_replace_child_noperm() in
   bdrv_replace_child_abort() from patch 9 here (and also keep passing
   &s->child instead of s->childp).  It is already relevant now that
   s->childp is valid only if new_bs is NULL.

Patch 9:
 - The comment this used to add to bdrv_replace_child_abort() is now
   already added by patch 8, we just need to drop the TODO note there
 - Also drop the TODO note above bdrv_replace_child_tran()


Hanna Reitz (10):
  stream: Traverse graph after modification
  block: Manipulate children list in .attach/.detach
  block: Unite remove_empty_child and child_free
  block: Drop detached child from ignore list
  block: Pass BdrvChild ** to replace_child_noperm
  block: Restructure remove_file_or_backing_child()
  transactions: Invoke clean() after everything else
  block: Let replace_child_tran keep indirect pointer
  block: Let replace_child_noperm free children
  iotests/030: Unthrottle parallel jobs in reverse

 include/qemu/transactions.h |   3 +
 block.c                     | 233 +++++++++++++++++++++++++++---------
 block/stream.c              |   7 +-
 util/transactions.c         |   8 +-
 tests/qemu-iotests/030      |  11 +-
 5 files changed, 201 insertions(+), 61 deletions(-)

Comments

Kevin Wolf Nov. 11, 2021, 5:25 p.m. UTC | #1
Am 11.11.2021 um 13:08 hat Hanna Reitz geschrieben:
> Hi,
> 
> v1 cover letter:
> https://lists.nongnu.org/archive/html/qemu-devel/2021-11/msg01287.html
> 
> In v2 I’ve addressed the comments I’ve received from Kevin and Vladimir.
> To this end, I’ve retained only the non-controversial part in patch 5,
> and split everything else (i.e. the stuff relating to
> bdrv_replace_child_tran()) into the dedicated patches 6 and 8.
> 
> Kevin’s most important comments (to my understanding) were that:
> 
> (A) When bdrv_remove_file_or_backing_child() uses
>     bdrv_replace_child_tran(), we have to ensure that the BDS lives as
>     long as the transaction does.  This is solved by keeping a strong
>     reference to it that’s released only when the transaction is cleaned
>     up (and the new patch 7 ensures that the .clean() handler will be
>     invoked after all .commit()/.abort() handlers, so the reference will
>     really live as long as the transaction does).
> 
> (B) bdrv_replace_node_noperm() passes a pointer to loop-local variable,
>     which is a really bad idea considering the transaction lives much
>     longer than one loop iteration.
>     Luckily, the new_bs it passes is always non-NULL, and so
>     bdrv_replace_child_tran() actually doesn’t need to store the
>     BdrvChild ** pointer, because for a non-NULL new_bs, there is
>     nothing to revert in the abort handler.  We just need to clarify
>     this, not store the pointer in case of a non-NULL new_bs, and assert
>     that bdrv_replace_node_noperm() and its relatives are only used to
>     replace an existing node by some other existing (i.e. non-NULL)
>     node.

Thanks, applied to the block branch.

Kevin