Message ID | 20221104095700.4117433-3-eesposit@redhat.com
State      | New
Series     | Still more coroutine and various fixes in block layer
On 11/4/22 12:56, Emanuele Giuseppe Esposito wrote:
> These functions end up calling bdrv_common_block_status_above(), a
> generated_co_wrapper function.

generated_co_wrapper is not a coroutine_fn. Conversely, it's a function
that does classic coroutine wrapping: start a coroutine and POLL to wait
for the coroutine to finish.

> In addition, they also happen to be always called in coroutine context,
> meaning all callers are coroutine_fn.

That's also not a reason for marking them coroutine_fn. "coroutine_fn"
means that the function can be called only from coroutine context.

> This means that the g_c_w function will enter the qemu_in_coroutine()
> case and eventually suspend (or in other words call qemu_coroutine_yield()).
> Therefore we need to mark such functions coroutine_fn too.

I don't think so. Moreover, this breaks the concept, as your new
coroutine_fn functions will call generated_co_wrapper functions which
are not marked coroutine_fn and never were.

> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> ---
>  block/block-copy.c | 15 +++++++++------
>  1 file changed, 9 insertions(+), 6 deletions(-)
>
> diff --git a/block/block-copy.c b/block/block-copy.c
> index bb947afdda..f33ab1d0b6 100644
> --- a/block/block-copy.c
> +++ b/block/block-copy.c
> @@ -577,8 +577,9 @@ static coroutine_fn int block_copy_task_entry(AioTask *task)
>      return ret;
>  }
>
> -static int block_copy_block_status(BlockCopyState *s, int64_t offset,
> -                                   int64_t bytes, int64_t *pnum)
> +static coroutine_fn int block_copy_block_status(BlockCopyState *s,
> +                                                int64_t offset,
> +                                                int64_t bytes, int64_t *pnum)
>  {
>      int64_t num;
>      BlockDriverState *base;
> @@ -613,8 +614,9 @@ static int block_copy_block_status(BlockCopyState *s, int64_t offset,
>   * Check if the cluster starting at offset is allocated or not.
>   * return via pnum the number of contiguous clusters sharing this allocation.
>   */
> -static int block_copy_is_cluster_allocated(BlockCopyState *s, int64_t offset,
> -                                           int64_t *pnum)
> +static int coroutine_fn block_copy_is_cluster_allocated(BlockCopyState *s,
> +                                                        int64_t offset,
> +                                                        int64_t *pnum)
>  {
>      BlockDriverState *bs = s->source->bs;
>      int64_t count, total_count = 0;
> @@ -669,8 +671,9 @@ void block_copy_reset(BlockCopyState *s, int64_t offset, int64_t bytes)
>   * @return 0 when the cluster at @offset was unallocated,
>   * 1 otherwise, and -ret on error.
>   */
> -int64_t block_copy_reset_unallocated(BlockCopyState *s,
> -                                     int64_t offset, int64_t *count)
> +int64_t coroutine_fn block_copy_reset_unallocated(BlockCopyState *s,
> +                                                  int64_t offset,
> +                                                  int64_t *count)
>  {
>      int ret;
>      int64_t clusters, bytes;
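[Editor's note: for readers following the thread, the generated_co_wrapper pattern under discussion can be modeled roughly as below. This is a minimal stand-alone sketch, not QEMU's actual generated code: `fake_in_coroutine` and `poll_path_taken` are invented stand-ins for qemu_in_coroutine() and the coroutine spawn + BDRV_POLL_WHILE() machinery.]

```c
#include <stdbool.h>
#include <stdint.h>

/* stand-in for qemu_in_coroutine(); QEMU queries per-thread coroutine state */
static bool fake_in_coroutine;
/* counts how many times the spawn-and-poll path was taken */
static int poll_path_taken;

/* the coroutine_fn doing the real work (trivial here) */
static int bdrv_do_status_co(int64_t offset, int64_t *pnum)
{
    *pnum = 1;
    return offset >= 0 ? 0 : -1;
}

/* the wrapper: callable from both contexts, hence not itself coroutine_fn */
static int bdrv_do_status(int64_t offset, int64_t *pnum)
{
    if (fake_in_coroutine) {
        /* already in coroutine context: call the coroutine_fn directly */
        return bdrv_do_status_co(offset, pnum);
    }
    /* otherwise: start a coroutine running bdrv_do_status_co() and
     * poll until it finishes (the spawn/poll machinery is elided) */
    poll_path_taken++;
    return bdrv_do_status_co(offset, pnum);
}
```

The disagreement in this thread is precisely about functions whose callers only ever reach the first branch: whether they should be annotated coroutine_fn even though the wrapper they call is not.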
Am 08/11/2022 um 15:48 schrieb Vladimir Sementsov-Ogievskiy:
> On 11/4/22 12:56, Emanuele Giuseppe Esposito wrote:
>> These functions end up calling bdrv_common_block_status_above(), a
>> generated_co_wrapper function.
>
> generated_co_wrapper is not a coroutine_fn. Conversely, it's a function
> that does classic coroutine wrapping: start a coroutine and POLL to
> wait for the coroutine to finish.
>
>> In addition, they also happen to be always called in coroutine context,
>> meaning all callers are coroutine_fn.
>
> That's also not a reason for marking them coroutine_fn. "coroutine_fn"
> means that the function can be called only from coroutine context.
>
>> This means that the g_c_w function will enter the qemu_in_coroutine()
>> case and eventually suspend (or in other words call
>> qemu_coroutine_yield()).
>> Therefore we need to mark such functions coroutine_fn too.
>
> I don't think so. Moreover, this breaks the concept, as your new
> coroutine_fn functions will call generated_co_wrapper functions which
> are not marked coroutine_fn and never were.

Theoretically not, because by marking it coroutine_fn we know that we
are going into the if (qemu_in_coroutine()) branch of the g_c_w, so we
could directly replace it with the actual function. Because it's a pain
to do it by hand, and at some point I/someone should use Alberto's
static analyzer to get rid of that, I decided to leave g_c_w there.

As I understand it, it seems that you and Paolo have a different
understanding of what coroutine_fn means and where it should be used.
Honestly I don't understand your point, as you said

> "coroutine_fn" means that the function can be called only from
> coroutine context.

which is the case for these functions. So could you please clarify?

What I do know is that it's extremely confusing to understand whether a
function that is *not* marked coroutine_fn is actually being called
also from coroutines or not.
Which complicates A LOT whatever change or operation I want to perform
on the BlockDriver callbacks or any other function in the block layer,
because in the current approach for the AioContext lock removal (which
you are not aware of, I understand) we need to distinguish between
functions running in coroutine context and not, and throwing g_c_w
functions everywhere makes my work much harder than it already is.

Thank you,
Emanuele

>> [quoted patch snipped]
[add Stefan]

On 11/8/22 18:09, Emanuele Giuseppe Esposito wrote:
> Am 08/11/2022 um 15:48 schrieb Vladimir Sementsov-Ogievskiy:
>> [...]
>> I don't think so. Moreover, this breaks the concept, as your new
>> coroutine_fn functions will call generated_co_wrapper functions which
>> are not marked coroutine_fn and never were.
>
> Theoretically not,

Agree, I was wrong on this point.

> because by marking it coroutine_fn we know that we are
> going into the if (qemu_in_coroutine()) branch of the g_c_w, so we
> could directly replace it with the actual function. Because it's a
> pain to do it by hand, and at some point I/someone should use
> Alberto's static analyzer to get rid of that, I decided to leave
> g_c_w there.
>
> As I understand it, it seems that you and Paolo have a different
> understanding of what coroutine_fn means and where it should be used.

Looks so... But we have documentation in coroutine.h, let's check:

 * Mark a function that executes in coroutine context
 *
 * Functions that execute in coroutine context cannot be called directly from
 * normal functions. In the future it would be nice to enable compiler or
 * static checker support for catching such errors. This annotation might make
 * it possible and in the meantime it serves as documentation.

Not very clear, but still it says:

  coroutine_fn = "function that executes in coroutine context"
  "functions that execute in coroutine context" = "cannot be called
  directly from normal functions"

So, IMHO, that corresponds to my point of view: we shouldn't mark as
coroutine_fn functions that can be called from both coroutine context
and not.

If we want to change the concept, we should start by rewriting this
documentation.

> Honestly I don't understand your point, as you said
>
>> "coroutine_fn" means that the function can be called only from
>> coroutine context.
>
> which is the case for these functions. So could you please clarify?
>
> What I do know is that it's extremely confusing to understand whether
> a function that is *not* marked coroutine_fn is actually being called
> also from coroutines or not.
>
> Which complicates A LOT whatever change or operation I want to perform
> on the BlockDriver callbacks or any other function in the block layer,
> because in the current approach for the AioContext lock removal (which
> you are not aware of, I understand) we need to distinguish between
> functions running in coroutine context and not, and throwing g_c_w
> functions everywhere makes my work much harder than it already is.

OK, I understand the problem.

Hmm. Formally, marking a function that theoretically can be called from
any context as "coroutine_fn" doesn't break things. We just say that
from that moment on we don't allow calling this function from
non-coroutine context.

OK, I tend to agree that you are on the right way, sorry for my skepticism)

PS: you recently introduced a lot of IO_CODE() / GLOBAL_STATE_CODE()
marks, which (will be/already are) turned into assertions. This is a
lot better than our "coroutine_fn" sign, which actually does no
checking (and can't). Don't you plan to swap the "coroutine_fn" no-op
marker for a more meaningful IN_COROUTINE(); (or something like this,
which just does assert(qemu_in_coroutine())) at the start of the
function? It would be a lot safer.

> Thank you,
> Emanuele
>
>>> [quoted patch snipped]
On 11/8/22 19:19, Vladimir Sementsov-Ogievskiy wrote:
> This is a lot better than our "coroutine_fn" sign, which actually does
> no checking (and can't). Don't you plan to swap the "coroutine_fn"
> no-op marker for a more meaningful IN_COROUTINE(); (or something like
> this, which just does assert(qemu_in_coroutine())) at the start of the
> function? It would be a lot safer.
Moreover, we can introduce two macros:
IN_COROUTINE() and NOT_COROUTINE(), mark functions correspondingly, and drop the coroutine_fn mark. Then the picture would be very deterministic:
IN_COROUTINE - function is called only from coroutine context
NOT_COROUTINE - function is never called from coroutine context
<no mark> - function may be called from both coroutine and non-coroutine context
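[Editor's note: a minimal sketch of what such run-time markers could look like. The macro names follow the proposal above; `fake_in_coroutine` is an invented stand-in for qemu_in_coroutine(), and the two example functions are hypothetical.]

```c
#include <assert.h>
#include <stdbool.h>

/* stand-in for qemu_in_coroutine() */
static bool fake_in_coroutine;

/* proposed checking markers: unlike the no-op coroutine_fn annotation,
 * these abort at run time when called from the wrong context */
#define IN_COROUTINE()  assert(fake_in_coroutine)
#define NOT_COROUTINE() assert(!fake_in_coroutine)

/* callable only from coroutine context */
static int read_cluster(void)
{
    IN_COROUTINE();
    return 1;
}

/* never callable from coroutine context */
static int drain_and_read(void)
{
    NOT_COROUTINE();
    return 2;
}
```

With this scheme, a miscategorized caller fails loudly on the first test run instead of silently blocking or corrupting state.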
Am 08/11/2022 um 17:19 schrieb Vladimir Sementsov-Ogievskiy:
> [add Stefan]
>
> On 11/8/22 18:09, Emanuele Giuseppe Esposito wrote:
>> [...]
>> As I understand it, it seems that you and Paolo have a different
>> understanding of what coroutine_fn means and where it should be used.
>
> Looks so...
>
> But we have documentation in coroutine.h, let's check:
>
>  * Mark a function that executes in coroutine context
>  *
>  * Functions that execute in coroutine context cannot be called
>  * directly from normal functions. In the future it would be nice to
>  * enable compiler or static checker support for catching such
>  * errors. This annotation might make it possible and in the meantime
>  * it serves as documentation.
>
> Not very clear, but still it says:
>
> coroutine_fn = "function that executes in coroutine context"
> "functions that execute in coroutine context" = "cannot be called
> directly from normal functions"
>
> So, IMHO, that corresponds to my point of view: we shouldn't mark as
> coroutine_fn functions that can be called from both coroutine context
> and not.

Yes, and the point is that these functions are not called from
non-coroutine context.

> If we want to change the concept, we should start by rewriting this
> documentation.
>
>> Honestly I don't understand your point, as you said
>>
>>> "coroutine_fn" means that the function can be called only from
>>> coroutine context.
>>
>> which is the case for these functions. So could you please clarify?
>>
>> What I do know is that it's extremely confusing to understand whether
>> a function that is *not* marked coroutine_fn is actually being called
>> also from coroutines or not.
>>
>> Which complicates A LOT whatever change or operation I want to perform
>> on the BlockDriver callbacks or any other function in the block layer,
>> because in the current approach for the AioContext lock removal (which
>> you are not aware of, I understand) we need to distinguish between
>> functions running in coroutine context and not, and throwing g_c_w
>> functions everywhere makes my work much harder than it already is.
>
> OK, I understand the problem.
>
> Hmm. Formally, marking a function that theoretically can be called from
> any context as "coroutine_fn" doesn't break things.
> We just say that from that moment on we don't allow calling this
> function from non-coroutine context.
>
> OK, I tend to agree that you are on the right way, sorry for my
> skepticism)
>
> PS: you recently introduced a lot of IO_CODE() / GLOBAL_STATE_CODE()
> marks, which (will be/already are) turned into assertions.
>
> This is a lot better than our "coroutine_fn" sign, which actually does
> no checking (and can't). Don't you plan to swap the "coroutine_fn"
> no-op marker for a more meaningful IN_COROUTINE(); (or something like
> this, which just does assert(qemu_in_coroutine())) at the start of the
> function? It would be a lot safer.

CCing also Alberto and Paolo

So basically I think what we need is something that scans the whole
block layer code and puts the right coroutine_fn annotations (or
assertions, if you want) in the right places.

The rule should be (anyone please correct me if I am wrong):

- if a function is only called by coroutine_fn functions, then it's a
  coroutine_fn. Theoretically all these functions should eventually end
  up calling qemu_coroutine_yield or similar.

- if it's called only from non-coroutine code, then leave it as it is.
  Probably asserting is a good idea.

- if it's used by both, then it's a case-by-case decision: I think for
  simple functions, we can use a special annotation and document that
  they should always consider that they could run in both cases.
  If it's a big function like the g_c_w, call only the _co_ version in
  coroutine_fn, and create a coroutine or call the non-coroutine
  counterpart if we are not in coroutine context.
  Which is also what I am trying to do with g_c_w_simple.

However, it would be nice to assign this to someone and do it
automatically, not by hand. I am not sure whether Alberto's static
analyzer is currently capable of doing that.

What do you all think?
Thank you,
Emanuele

>>>> [quoted patch snipped]
On Wed, Nov 9, 2022 at 12:24 PM Emanuele Giuseppe Esposito
<eesposit@redhat.com> wrote:
> CCing also Alberto and Paolo
>
> So basically I think what we need is something that scans the whole
> block layer code and puts the right coroutine_fn annotations (or
> assertions, if you want) in the right places.
>
> The rule should be (anyone please correct me if I am wrong):
> - if a function is only called by coroutine_fn functions, then it's a
> coroutine_fn. Theoretically all these functions should eventually end
> up calling qemu_coroutine_yield or similar.
>
> - if it's called only from non-coroutine, then leave it as it is.
> Probably asserting is a good idea.
>
> - if it's used by both, then it's a case-by-case decision: I think for
> simple functions, we can use a special annotation and document that
> they should always consider that they could run in both cases.
> If it's a big function like the g_c_w, call only the _co_ version in
> coroutine_fn, and create a coroutine or call the non-coroutine
> counterpart if we are not in coroutine context.
> Which is also what I am trying to do with g_c_w_simple.
>
> However, it would be nice to assign this to someone and do this
> automatically, not doing it by hand. I am not sure if Alberto static
> analyzer is currently capable of doing that.
>
> What do you all think?

From what I've seen so far of coroutine_fn, its intended semantics seem
to align completely with the `async` of many languages. The only
restriction is that a function that calls coroutine_fn functions
(a.k.a. coroutines) must itself be coroutine_fn. Marking other
functions as coroutine_fn, as you mention in your first bullet above,
just artificially restricts them to coroutine context.

Similarly, restricting functions to non-coroutine context may not
generally be useful, except when there is an alternative version of the
function that is optimized for coroutine context in some way (e.g.,
calling a coroutine_fn instead of the corresponding
generated_co_wrapper).

But maybe you're writing a function that you predict will eventually
need to call a coroutine, even though it doesn't today. In those cases
it could make sense to mark it coroutine_fn, to prevent non-coroutine
callers from appearing and later breaking, but this should probably be
the exception, not the rule.

My WIP static analyzer [1] should be able to find most cases where a
non-coroutine_fn function calls a coroutine (some complicated cases
involving function pointers and typedefs are not yet implemented). It
also complains about cases where a coroutine calls a
generated_co_wrapper (see the no_coroutine_fn annotation, which you can
also apply to functions other than generated_co_wrappers).

You can use it today: just merge [1] with your code and run (after
building QEMU):

    ./static-analyzer.py build block

Alberto

[1] https://gitlab.com/albertofaria/qemu/-/tree/static-analysis
On Wed, Nov 9, 2022 at 1:24 PM Emanuele Giuseppe Esposito
<eesposit@redhat.com> wrote:
>>> What I do know is that it's extremely confusing to understand if a
>>> function that is *not* marked as coroutine_fn is actually being
>>> called also from coroutines or not.

Agreed. This is a huge point in favor of pushing coroutine wrappers as
far up in the call stack as possible, because it means more
coroutine_fns and fewer mixed functions.

>> This is a lot better than our "coroutine_fn" sign, which actually
>> does no checking (and can't). Don't you plan to swap the
>> "coroutine_fn" no-op marker for a more meaningful IN_COROUTINE(); (or
>> something like this, which just does assert(qemu_in_coroutine())) at
>> the start of the function? It would be a lot safer.
>
> CCing also Alberto and Paolo
>
> So basically I think what we need is something that scans the whole
> block layer code and puts the right coroutine_fn annotations (or
> assertions, if you want) in the right places.

coroutine_fn markers are done by Alberto's static analyzer, which I
used to add coroutine_fn pretty much everywhere in the code base where
they are *needed*. My rules are simple:

* there MUST be no calls from non-coroutine_fn to coroutine_fn, this
  is obvious

* there MUST be no blocking in coroutine_fn

* there SHOULD be no calls from coroutine_fn to generated_co_wrapper;
  use the wrapped *_co_* function directly instead.

To catch the last one, or possibly the last two, Alberto added
no_coroutine_fn. In a perfect world non-marked functions would be
"valid either in coroutine or non-coroutine context": they would call
neither coroutine_fns nor no_coroutine_fns.

This is unfortunately easier said than done, but in order to move
towards that case, I think we can look again at vrc and extend it with
new commands. Alberto's work covers *local* tests, looking at one
caller and one callee at a time. With vrc's knowledge of the global
call graph, we can find *all* paths from a coroutine_fn to a
generated_co_wrapper, including those that go through unmarked
functions. Then there are two cases:

* if the unmarked function is never called from outside a coroutine,
  call the wrapped function and change it to coroutine_fn

* if the unmarked function can be called from outside a coroutine,
  change it to a coroutine_fn (renaming it) and add a
  generated_co_wrapper. Rinse and repeat.

> However, it would be nice to assign this to someone and do this
> automatically, not doing it by hand. I am not sure if Alberto static
> analyzer is currently capable of doing that.

I think the first step has to be done by hand, because it entails
creating new generated_co_wrappers. Checking for regressions can then
be done automatically though.

Paolo
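[Editor's note: the global query described above boils down to reachability over the call graph. A toy version follows; the four-function graph, its names, and its edges are invented for illustration and are not QEMU's real call graph.]

```c
#include <stdbool.h>

/* toy call graph; indices:
 *   0 = mirror_flush     (marked coroutine_fn)
 *   1 = helper_unmarked  (no annotation)
 *   2 = blk_flush        (generated_co_wrapper, i.e. no_coroutine_fn)
 *   3 = blk_co_flush     (the wrapped coroutine_fn)
 */
enum { NFUNCS = 4 };

/* calls[i][j] is true when function i calls function j */
static const bool calls[NFUNCS][NFUNCS] = {
    { 0, 1, 0, 0 },   /* mirror_flush -> helper_unmarked */
    { 0, 0, 1, 0 },   /* helper_unmarked -> blk_flush    */
    { 0, 0, 0, 1 },   /* blk_flush -> blk_co_flush       */
    { 0, 0, 0, 0 },
};

/* iterative depth-first search: is `to` reachable from `from`? */
static bool reaches(int from, int to)
{
    bool seen[NFUNCS] = { false };
    int stack[NFUNCS], top = 0;

    stack[top++] = from;
    seen[from] = true;
    while (top > 0) {
        int cur = stack[--top];
        if (cur == to) {
            return true;
        }
        for (int next = 0; next < NFUNCS; next++) {
            if (calls[cur][next] && !seen[next]) {
                seen[next] = true;
                stack[top++] = next;
            }
        }
    }
    return false;
}
```

Here reaches(0, 2) holds even though the path runs through an unmarked function, which is exactly the coroutine_fn-to-generated_co_wrapper path that a purely local caller/callee check cannot see.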
To sum up what was discussed in this series, I don't really see any
strong objection against these patches, so I will soon send v3, which
is pretty much the same except for patch 1, which will be removed.

I think these patches are useful and will be even more meaningful to
the reviewer when in the next few days I send all the rwlock patches.

What has been discussed so far (using QEMU_IN_COROUTINE, using some
sort of tool to automate everything, etc.) has been noted and, as I
understand, will be researched by Alberto.

Thank you,
Emanuele

Am 10/11/2022 um 11:52 schrieb Paolo Bonzini:
> On Wed, Nov 9, 2022 at 1:24 PM Emanuele Giuseppe Esposito
> <eesposit@redhat.com> wrote:
>>>> What I do know is that it's extremely confusing to understand if a
>>>> function that is *not* marked as coroutine_fn is actually being
>>>> called also from coroutines or not.
>
> Agreed. This is a huge point in favor of pushing coroutine wrappers as
> far up in the call stack as possible, because it means more
> coroutine_fns and fewer mixed functions.
>
> [...]
On 11/15/22 16:41, Emanuele Giuseppe Esposito wrote:
> To sum up what was discussed in this series, I don't really see any
> strong objection against these patches, so I will soon send v3, which
> is pretty much the same except for patch 1, which will be removed.
>
> I think these patches are useful and will be even more meaningful to
> the reviewer when in the next few days I send all the rwlock patches.

Yes, I agree.

FWIW I implemented path search in vrc and it found 133 candidates
(functions that are called only by coroutine_fns but are not marked
coroutine_fn themselves). I only list them after the signature because,
as expected, most of them are pointless; however some are obviously
correct:

1) some have _co_ in their name :)

2) these five directly call a generated_co_wrapper so they're an easy
catch:

   vhdx_log_write_and_flush -> bdrv_flush
   vhdx_log_write_and_flush -> bdrv_pread
   vhdx_log_write_and_flush -> bdrv_pwrite
   mirror_flush -> blk_flush
   qcow2_check_refcounts -> bdrv_pwrite
   qcow2_check_refcounts -> bdrv_pwrite_sync
   qcow2_check_refcounts -> bdrv_pread
   qcow2_read_extensions -> bdrv_pread
   check_directory_consistency -> bdrv_pwrite

   (vrc lets me query this with "paths [coroutine_fn_candidate]
   [no_coroutine_fn]")

3) I can also query (with "paths [coroutine_fn_candidate] ...
[no_coroutine_fn]") those that end up calling a generated_co_wrapper.
Among these, vrc catches block_copy_reset_unallocated from this patch:

   block_copy_reset_unallocated
   block_crypto_co_create_generic
   calculate_l2_meta
   check_directory_consistency
   commit_direntries
   commit_one_file
   is_zero
   mirror_flush
   qcow2_alloc_bytes
   qcow2_alloc_cluster_abort
   qcow2_alloc_clusters_at
   qcow2_check_refcounts
   qcow2_get_last_cluster
   qcow2_read_extensions
   qcow2_read_snapshots
   qcow2_truncate_bitmaps_check
   qcow2_update_options
   vhdx_log_write_and_flush
   vmdk_is_cid_valid
   zero_l2_subclusters

Another possibility is to identify common "entry points" in the paths
to the no_coroutine_fn and make them generated_co_wrappers.
For example in qcow2 these include bitmap_list_load, update_refcount and
get_cluster_table, and the qcow2_snapshot_* functions.  Of course the
analysis would have to be rerun after every change.

The most time consuming part is labeling coroutine_fn/no_coroutine_fn,
which would be useful to do with clang (and at this point you might as
well extract the CFG with it).  Doing the queries totally by hand doesn't
quite scale (for example vrc's blind spot is inlining and I forgot to
disable it, but I only noticed too late...), but it should be scriptable
since after all vrc is just a Python package + a nice CLI.

Thanks,

Paolo

label coroutine_fn_candidate aio_get_thread_pool
label coroutine_fn_candidate aio_task_pool_free
label coroutine_fn_candidate aio_task_pool_status
label coroutine_fn_candidate bdrv_bsc_fill
label coroutine_fn_candidate bdrv_bsc_invalidate_range
label coroutine_fn_candidate bdrv_bsc_is_data
label coroutine_fn_candidate bdrv_can_write_zeroes_with_unmap
label coroutine_fn_candidate bdrv_check_request
label coroutine_fn_candidate bdrv_dirty_bitmap_get
label coroutine_fn_candidate bdrv_dirty_bitmap_get_locked
label coroutine_fn_candidate bdrv_dirty_bitmap_lock
label coroutine_fn_candidate bdrv_dirty_bitmap_next_dirty_area
label coroutine_fn_candidate bdrv_dirty_bitmap_next_zero
label coroutine_fn_candidate bdrv_dirty_bitmap_set_inconsistent
label coroutine_fn_candidate bdrv_dirty_bitmap_status
label coroutine_fn_candidate bdrv_dirty_bitmap_truncate
label coroutine_fn_candidate bdrv_dirty_bitmap_unlock
label coroutine_fn_candidate bdrv_dirty_iter_free
label coroutine_fn_candidate bdrv_dirty_iter_new
label coroutine_fn_candidate bdrv_dirty_iter_next
label coroutine_fn_candidate bdrv_has_readonly_bitmaps
label coroutine_fn_candidate bdrv_inc_in_flight
label coroutine_fn_candidate bdrv_min_mem_align
label coroutine_fn_candidate bdrv_pad_request
label coroutine_fn_candidate bdrv_probe_all
label coroutine_fn_candidate bdrv_reset_dirty_bitmap_locked
label coroutine_fn_candidate bdrv_round_to_clusters
label coroutine_fn_candidate bdrv_set_dirty
label coroutine_fn_candidate bdrv_set_dirty_iter
label coroutine_fn_candidate bdrv_write_threshold_check_write
label coroutine_fn_candidate blk_check_byte_request
label coroutine_fn_candidate blkverify_err
label coroutine_fn_candidate block_copy_async
label coroutine_fn_candidate block_copy_call_cancel
label coroutine_fn_candidate block_copy_call_cancelled
label coroutine_fn_candidate block_copy_call_failed
label coroutine_fn_candidate block_copy_call_finished
label coroutine_fn_candidate block_copy_call_free
label coroutine_fn_candidate block_copy_call_status
label coroutine_fn_candidate block_copy_call_succeeded
label coroutine_fn_candidate block_copy_reset_unallocated
label coroutine_fn_candidate block_copy_set_skip_unallocated
label coroutine_fn_candidate block_crypto_co_create_generic
label coroutine_fn_candidate BlockDriver.bdrv_aio_flush
label coroutine_fn_candidate BlockDriver.bdrv_aio_ioctl
label coroutine_fn_candidate BlockDriver.bdrv_aio_pdiscard
label coroutine_fn_candidate BlockDriver.bdrv_aio_preadv
label coroutine_fn_candidate BlockDriver.bdrv_aio_pwritev
label coroutine_fn_candidate BlockDriver.bdrv_co_drain_end
label coroutine_fn_candidate block_job_ratelimit_get_delay
label coroutine_fn_candidate block_job_ratelimit_get_delay
label coroutine_fn_candidate block_status
label coroutine_fn_candidate calculate_l2_meta
label coroutine_fn_candidate check_directory_consistency
label coroutine_fn_candidate commit_direntries
label coroutine_fn_candidate commit_one_file
label coroutine_fn_candidate count_single_write_clusters
label coroutine_fn_candidate iov_send_recv
label coroutine_fn_candidate is_allocated_sectors
label coroutine_fn_candidate iscsi_allocmap_is_allocated
label coroutine_fn_candidate iscsi_allocmap_is_valid
label coroutine_fn_candidate iscsi_allocmap_update
label coroutine_fn_candidate iscsi_populate_target_desc
label coroutine_fn_candidate is_sector_request_lun_aligned
label coroutine_fn_candidate is_zero
label coroutine_fn_candidate job_cancel_requested
label coroutine_fn_candidate job_enter
label coroutine_fn_candidate job_is_cancelled
label coroutine_fn_candidate job_progress_increase_remaining
label coroutine_fn_candidate job_progress_set_remaining
label coroutine_fn_candidate job_progress_update
label coroutine_fn_candidate job_transition_to_ready
label coroutine_fn_candidate mirror_flush
label coroutine_fn_candidate mirror_perform
label coroutine_fn_candidate nbd_client_will_reconnect
label coroutine_fn_candidate nbd_client_will_reconnect
label coroutine_fn_candidate nbd_err_lookup
label coroutine_fn_candidate nbd_errno_to_system_errno
label coroutine_fn_candidate nbd_iter_channel_error
label coroutine_fn_candidate nfs_file_co_create
label coroutine_fn_candidate nvme_get_free_req
label coroutine_fn_candidate nvme_put_free_req_and_wake
label coroutine_fn_candidate pstrcat
label coroutine_fn_candidate qapi_event_send_quorum_failure
label coroutine_fn_candidate qcow2_alloc_bytes
label coroutine_fn_candidate qcow2_alloc_cluster_abort
label coroutine_fn_candidate qcow2_alloc_clusters_at
label coroutine_fn_candidate qcow2_check_refcounts
label coroutine_fn_candidate qcow2_check_vmstate_request
label coroutine_fn_candidate qcow2_get_last_cluster
label coroutine_fn_candidate qcow2_read_extensions
label coroutine_fn_candidate qcow2_read_snapshots
label coroutine_fn_candidate qcow2_truncate_bitmaps_check
label coroutine_fn_candidate qcow2_update_options
label coroutine_fn_candidate qcrypto_block_decrypt
label coroutine_fn_candidate qcrypto_block_encrypt
label coroutine_fn_candidate qdict_rename_keys
label coroutine_fn_candidate qed_alloc_l2_cache_entry
label coroutine_fn_candidate qed_alloc_table
label coroutine_fn_candidate qed_commit_l2_cache_entry
label coroutine_fn_candidate qed_set_used_clusters
label coroutine_fn_candidate qemu_blockalign0
label coroutine_fn_candidate qemu_coroutine_enter_if_inactive
label coroutine_fn_candidate qemu_iovec_clone
label coroutine_fn_candidate qemu_iovec_compare
label coroutine_fn_candidate qemu_iovec_init_slice
label coroutine_fn_candidate qemu_iovec_is_zero
label coroutine_fn_candidate qemu_iovec_subvec_niov
label coroutine_fn_candidate qemu_iovec_to_buf
label coroutine_fn_candidate qemu_rbd_co_create
label coroutine_fn_candidate qio_channel_readv
label coroutine_fn_candidate qio_channel_readv_all
label coroutine_fn_candidate qio_channel_readv_all_eof
label coroutine_fn_candidate quorum_copy_qiov
label coroutine_fn_candidate quorum_report_bad_acb
label coroutine_fn_candidate quorum_vote_error
label coroutine_fn_candidate reqlist_find_conflict
label coroutine_fn_candidate reqlist_init_req
label coroutine_fn_candidate rule_check
label coroutine_fn_candidate throttle_account
label coroutine_fn_candidate tracked_request_set_serialising
label coroutine_fn_candidate vhdx_block_translate
label coroutine_fn_candidate vhdx_log_write_and_flush
label coroutine_fn_candidate vhdx_metadata_entry_le_export
label coroutine_fn_candidate vhdx_metadata_header_le_export
label coroutine_fn_candidate vhdx_region_entry_le_export
label coroutine_fn_candidate vhdx_region_header_le_export
label coroutine_fn_candidate vhdx_update_bat_table_entry
label coroutine_fn_candidate vmdk_is_cid_valid
label coroutine_fn_candidate warn_reportf_err
label coroutine_fn_candidate yank_register_function
label coroutine_fn_candidate zero_l2_subclusters
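The candidate search described above ("functions that are only called by
coroutine_fns but are not marked coroutine_fn themselves") can be sketched
independently of vrc as a small query over a call graph.  The graph and
labels below are toy data for illustration, not QEMU's real call graph or
vrc's actual API:

```python
def coroutine_fn_candidates(calls, coroutine_fns):
    """Return functions all of whose callers are coroutine_fn.

    calls: dict mapping caller name -> set of callee names.
    coroutine_fns: set of function names already marked coroutine_fn.
    """
    # Invert the call graph: callee -> set of callers.
    callers = {}
    for caller, callees in calls.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)
    # A candidate is unmarked, has at least one caller, and every
    # caller is already a coroutine_fn.
    return {
        fn for fn, cs in callers.items()
        if fn not in coroutine_fns and cs and cs <= coroutine_fns
    }

# Toy graph mirroring the shape of this patch:
calls = {
    "block_copy_task_entry": {"block_copy_block_status"},
    "block_copy_block_status": {"bdrv_block_status_above"},
}
coroutine_fns = {"block_copy_task_entry"}

print(sorted(coroutine_fn_candidates(calls, coroutine_fns)))
# -> ['block_copy_block_status']
```

Note that bdrv_block_status_above is not reported: one of its callers
(block_copy_block_status) is not yet marked, so candidates must be
promoted and the query rerun iteratively, matching Paolo's remark that
the analysis has to be redone after every change.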
diff --git a/block/block-copy.c b/block/block-copy.c
index bb947afdda..f33ab1d0b6 100644
--- a/block/block-copy.c
+++ b/block/block-copy.c
@@ -577,8 +577,9 @@ static coroutine_fn int block_copy_task_entry(AioTask *task)
     return ret;
 }
 
-static int block_copy_block_status(BlockCopyState *s, int64_t offset,
-                                   int64_t bytes, int64_t *pnum)
+static coroutine_fn int block_copy_block_status(BlockCopyState *s,
+                                                int64_t offset,
+                                                int64_t bytes, int64_t *pnum)
 {
     int64_t num;
     BlockDriverState *base;
@@ -613,8 +614,9 @@ static int block_copy_block_status(BlockCopyState *s, int64_t offset,
  * Check if the cluster starting at offset is allocated or not.
  * return via pnum the number of contiguous clusters sharing this allocation.
  */
-static int block_copy_is_cluster_allocated(BlockCopyState *s, int64_t offset,
-                                           int64_t *pnum)
+static int coroutine_fn block_copy_is_cluster_allocated(BlockCopyState *s,
+                                                        int64_t offset,
+                                                        int64_t *pnum)
 {
     BlockDriverState *bs = s->source->bs;
     int64_t count, total_count = 0;
@@ -669,8 +671,9 @@ void block_copy_reset(BlockCopyState *s, int64_t offset, int64_t bytes)
  * @return 0 when the cluster at @offset was unallocated,
  * 1 otherwise, and -ret on error.
  */
-int64_t block_copy_reset_unallocated(BlockCopyState *s,
-                                     int64_t offset, int64_t *count)
+int64_t coroutine_fn block_copy_reset_unallocated(BlockCopyState *s,
+                                                  int64_t offset,
+                                                  int64_t *count)
 {
     int ret;
     int64_t clusters, bytes;
These functions end up calling bdrv_common_block_status_above(), a
generated_co_wrapper function.

In addition, they also happen to be always called in coroutine context,
meaning all callers are coroutine_fn.

This means that the g_c_w function will enter the qemu_in_coroutine()
case and eventually suspend (or in other words call
qemu_coroutine_yield()).  Therefore we need to mark such functions
coroutine_fn too.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
 block/block-copy.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)
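For readers following the argument in the commit message, the two paths
inside a generated_co_wrapper can be modeled with plain Python generators.
This is a hedged toy sketch, not QEMU code: the generator's yield stands
in for qemu_coroutine_yield(), the drive loop stands in for
qemu_coroutine_enter() plus BDRV_POLL_WHILE(), and all names are
illustrative:

```python
def co_read():
    """Stands in for a coroutine_fn: it can suspend (yield) mid-way."""
    yield "I/O in flight"      # models qemu_coroutine_yield()
    return 42

def wrapper_outside_coroutine():
    """Path when qemu_in_coroutine() is false: the wrapper starts the
    coroutine itself and polls until it completes."""
    co = co_read()
    while True:
        try:
            next(co)           # "poll": resume the suspended coroutine
        except StopIteration as done:
            return done.value

def caller_inside_coroutine():
    """Path when qemu_in_coroutine() is true: the body runs within the
    calling coroutine, so the caller itself may be suspended.  This is
    the case the commit message argues requires callers on this path to
    be marked coroutine_fn."""
    value = yield from co_read()
    return value

assert wrapper_outside_coroutine() == 42
suspended = caller_inside_coroutine()
assert next(suspended) == "I/O in flight"   # caller is now suspended
```

The asymmetry is the point of the review discussion: the wrapper itself is
callable from both contexts, while only the in-coroutine path propagates a
suspension to the caller.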