Message ID | 20210614082931.24925-1-eesposit@redhat.com |
---|---|
Series | blkdebug: fix racing condition when iterating on |

On 14.06.21 10:29, Emanuele Giuseppe Esposito wrote:
> When qemu_coroutine_enter() is executed in a loop
> (even a QLIST_FOREACH_SAFE one), the entered coroutine can modify the
> list, for example by removing an element, which causes problems when
> control returns to the caller and it continues iterating over the
> same list.
>
> Patch 1 solves the issue in blkdebug_debug_resume by restarting
> the list walk after every qemu_coroutine_enter() if the list has to be
> fully iterated.
> Patches 2, 3 and 4 aim to fix blkdebug_debug_event by gathering
> all actions that the rules make in a counter and invoking
> the respective qemu_coroutine_yield() only after processing all
> requests.
>
> Patches 5 and 6 are somewhat independent of the others: patch 5
> removes the need for the new_state field, and patch 6 adds a lock to
> protect rules and suspended_reqs; right now everything works because
> it's protected by the AioContext lock.
> This is a preparation for the current proposal of removing the
> AioContext lock and instead using smaller granularity locks to allow
> multiple iothread execution in the same block device.
>
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> ---
> v5:
> * Add comment in patch 1 to explain why we don't need _SAFE in for loop
> * Move the state update (s->state = new_state) in patch 5, to maintain
>   the same existing effect in all patches

I’m not sure whether this actually fixes a user-visible bug…? The first
paragraph makes it sound like it, but there is no test, so I’m not sure.

I’m mostly asking because of freeze; but you make it sound like there’s
a bug, and as this only concerns blkdebug (i.e., a block driver used
only for testing), I feel like applying this series after soft freeze
should be fine, so:

Thanks, I’ve applied this series to my block branch:

https://github.com/XanClic/qemu/commits/block

Max

When qemu_coroutine_enter() is executed in a loop
(even a QLIST_FOREACH_SAFE one), the entered coroutine can modify the
list, for example by removing an element, which causes problems when
control returns to the caller and it continues iterating over the
same list.

Patch 1 solves the issue in blkdebug_debug_resume by restarting
the list walk after every qemu_coroutine_enter() if the list has to be
fully iterated.
Patches 2, 3 and 4 aim to fix blkdebug_debug_event by gathering
all actions that the rules make in a counter and invoking
the respective qemu_coroutine_yield() only after processing all
requests.

Patches 5 and 6 are somewhat independent of the others: patch 5
removes the need for the new_state field, and patch 6 adds a lock to
protect rules and suspended_reqs; right now everything works because
it's protected by the AioContext lock.
This is a preparation for the current proposal of removing the
AioContext lock and instead using smaller granularity locks to allow
multiple iothread execution in the same block device.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
v5:
* Add comment in patch 1 to explain why we don't need _SAFE in for loop
* Move the state update (s->state = new_state) in patch 5, to maintain
  the same existing effect in all patches

Emanuele Giuseppe Esposito (6):
  blkdebug: refactor removal of a suspended request
  blkdebug: move post-resume handling to resume_req_by_tag
  blkdebug: track all actions
  blkdebug: do not suspend in the middle of QLIST_FOREACH_SAFE
  block/blkdebug: remove new_state field and instead use a local
    variable
  blkdebug: protect rules and suspended_reqs with a lock

 block/blkdebug.c | 136 ++++++++++++++++++++++++++++++++---------------
 1 file changed, 92 insertions(+), 44 deletions(-)