Message ID | 20220616131835.2004262-6-eesposit@redhat.com |
---|---|
State | New |
Headers | show |
Series | job: replace AioContext lock with job_mutex | expand |
On 6/16/22 16:18, Emanuele Giuseppe Esposito wrote: > In preparation to the job_lock/unlock usage, create _locked > duplicates of some functions, since they will be sometimes called with > job_mutex held (mostly within job.c), > and sometimes without (mostly from JobDrivers using the job API). > > Therefore create a _locked version of such function, so that it > can be used in both cases. > > List of functions duplicated as _locked: > job_is_ready (both versions are public) > job_is_completed (both versions are public) > job_is_cancelled (_locked version is public, needed by mirror.c) > job_pause_point (_locked version is static, purely done to simplify the code) > job_cancel_requested (_locked version is static) > > Note: at this stage, job_{lock/unlock} and job lock guard macros > are *nop*. Great description, thanks! > > Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> > Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> Hmm, after this patch, part of public API has "called with/without lock" comments. But there are still public job_* functions that doesn't have this mark. That look inconsistent. I think, all public API without _locked suffix, should be called without a lock? If so, we don't need to write it for each function. And only mark _locked() functions with "called with lock held" marks. > --- > include/qemu/job.h | 25 +++++++++++++++++++++--- > job.c | 48 ++++++++++++++++++++++++++++++++++++++++------ > 2 files changed, 64 insertions(+), 9 deletions(-) > [..] > > -/** Returns whether the job is ready to be completed. */ > +/** Just like job_is_completed, but called between job_lock and job_unlock */ I'd prefer phrasing "called with job_lock held". You wording make me think about job_lock() ... job_unlock() foo() job_lock() ... job_unlock() - foo() actually called between job_lock and job_unlock :) (it's a nitpicking, you may ignore it :) > +bool job_is_completed_locked(Job *job); > + > +/** > + * Returns whether the job is ready to be completed. > + * Called with job_mutex *not* held. > + */ > bool job_is_ready(Job *job); > > +/** Just like job_is_ready, but called between job_lock and job_unlock */ > +bool job_is_ready_locked(Job *job); > + > /** > * Request @job to pause at the next pause point. Must be paired with > * job_resume(). If the job is supposed to be resumed by user action, call > diff --git a/job.c b/job.c > index cafd597ba4..c4776985c4 100644 > --- a/job.c > +++ b/job.c > @@ -236,19 +236,32 @@ const char *job_type_str(const Job *job) > return JobType_str(job_type(job)); > } > > -bool job_is_cancelled(Job *job) > +bool job_is_cancelled_locked(Job *job) > { > /* force_cancel may be true only if cancelled is true, too */ > assert(job->cancelled || !job->force_cancel); > return job->force_cancel; > } > > -bool job_cancel_requested(Job *job) > +bool job_is_cancelled(Job *job) > +{ > + JOB_LOCK_GUARD(); > + return job_is_cancelled_locked(job); > +} > + > +/* Called with job_mutex held. */ > +static bool job_cancel_requested_locked(Job *job) > { > return job->cancelled; > } > > -bool job_is_ready(Job *job) > +bool job_cancel_requested(Job *job) > +{ > + JOB_LOCK_GUARD(); > + return job_cancel_requested_locked(job); > +} > + > +bool job_is_ready_locked(Job *job) > { > switch (job->status) { > case JOB_STATUS_UNDEFINED: > @@ -270,7 +283,13 @@ bool job_is_ready(Job *job) > return false; > } > > -bool job_is_completed(Job *job) > +bool job_is_ready(Job *job) > +{ > + JOB_LOCK_GUARD(); > + return job_is_ready_locked(job); > +} > + > +bool job_is_completed_locked(Job *job) > { > switch (job->status) { > case JOB_STATUS_UNDEFINED: > @@ -292,6 +311,12 @@ bool job_is_completed(Job *job) > return false; > } > > +bool job_is_completed(Job *job) > +{ > + JOB_LOCK_GUARD(); > + return job_is_completed_locked(job); > +} > + > static bool job_started(Job *job) > { > return job->co; > @@ -521,7 +546,8 @@ static void coroutine_fn job_do_yield(Job *job, uint64_t ns) > assert(job->busy); > } > > -void coroutine_fn job_pause_point(Job *job) > +/* Called with job_mutex held, but releases it temporarily. */ > +static void coroutine_fn job_pause_point_locked(Job *job) > { > assert(job && job_started(job)); In this function, we should now use job_pause_point_locked(), otherwise it looks incorrect. (I remember that lock is noop for now, but still, let's keep think as correct as possible) And job_do_yield() takes lock by itself. How to resolve it? > > @@ -552,6 +578,12 @@ void coroutine_fn job_pause_point(Job *job) > } > } > > +void coroutine_fn job_pause_point(Job *job) > +{ > + JOB_LOCK_GUARD(); > + job_pause_point_locked(job); > +} > + > void job_yield(Job *job) > { > assert(job->busy); > @@ -949,11 +981,15 @@ static void job_completed(Job *job) > } > } > > -/** Useful only as a type shim for aio_bh_schedule_oneshot. */ > +/** > + * Useful only as a type shim for aio_bh_schedule_oneshot. > + * Called with job_mutex *not* held. > + */ > static void job_exit(void *opaque) > { > Job *job = (Job *)opaque; > AioContext *ctx; > + JOB_LOCK_GUARD(); That's not part of this patch.. Doesn't relate to "add _locked duplicates" > > job_ref(job); > aio_context_acquire(job->aio_context);
Am 21/06/2022 um 17:03 schrieb Vladimir Sementsov-Ogievskiy: > On 6/16/22 16:18, Emanuele Giuseppe Esposito wrote: >> In preparation to the job_lock/unlock usage, create _locked >> duplicates of some functions, since they will be sometimes called with >> job_mutex held (mostly within job.c), >> and sometimes without (mostly from JobDrivers using the job API). >> >> Therefore create a _locked version of such function, so that it >> can be used in both cases. >> >> List of functions duplicated as _locked: >> job_is_ready (both versions are public) >> job_is_completed (both versions are public) >> job_is_cancelled (_locked version is public, needed by mirror.c) >> job_pause_point (_locked version is static, purely done to simplify >> the code) >> job_cancel_requested (_locked version is static) >> >> Note: at this stage, job_{lock/unlock} and job lock guard macros >> are *nop*. > > Great description, thanks! > >> >> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> >> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> > > Hmm, after this patch, part of public API has "called with/without lock" > comments. But there are still public job_* functions that doesn't have > this mark. That look inconsistent. I think, all public API without > _locked suffix, should be called without a lock? If so, we don't need to > write it for each function. And only mark _locked() functions with > "called with lock held" marks. > >> --- >> include/qemu/job.h | 25 +++++++++++++++++++++--- >> job.c | 48 ++++++++++++++++++++++++++++++++++++++++------ >> 2 files changed, 64 insertions(+), 9 deletions(-) >> > > [..] > >> -/** Returns whether the job is ready to be completed. */ >> +/** Just like job_is_completed, but called between job_lock and >> job_unlock */ > > I'd prefer phrasing "called with job_lock held". You wording make me > think about > > job_lock() > ... > job_unlock() > > foo() > > job_lock() > ... > job_unlock() > > - foo() actually called between job_lock and job_unlock :) > > (it's a nitpicking, you may ignore it :) > >> +bool job_is_completed_locked(Job *job); >> + >> +/** >> + * Returns whether the job is ready to be completed. >> + * Called with job_mutex *not* held. >> + */ >> bool job_is_ready(Job *job); >> +/** Just like job_is_ready, but called between job_lock and >> job_unlock */ >> +bool job_is_ready_locked(Job *job); >> + >> /** >> * Request @job to pause at the next pause point. Must be paired with >> * job_resume(). If the job is supposed to be resumed by user >> action, call >> diff --git a/job.c b/job.c >> index cafd597ba4..c4776985c4 100644 >> --- a/job.c >> +++ b/job.c >> @@ -236,19 +236,32 @@ const char *job_type_str(const Job *job) >> return JobType_str(job_type(job)); >> } >> -bool job_is_cancelled(Job *job) >> +bool job_is_cancelled_locked(Job *job) >> { >> /* force_cancel may be true only if cancelled is true, too */ >> assert(job->cancelled || !job->force_cancel); >> return job->force_cancel; >> } >> -bool job_cancel_requested(Job *job) >> +bool job_is_cancelled(Job *job) >> +{ >> + JOB_LOCK_GUARD(); >> + return job_is_cancelled_locked(job); >> +} >> + >> +/* Called with job_mutex held. */ >> +static bool job_cancel_requested_locked(Job *job) >> { >> return job->cancelled; >> } >> -bool job_is_ready(Job *job) >> +bool job_cancel_requested(Job *job) >> +{ >> + JOB_LOCK_GUARD(); >> + return job_cancel_requested_locked(job); >> +} >> + >> +bool job_is_ready_locked(Job *job) >> { >> switch (job->status) { >> case JOB_STATUS_UNDEFINED: >> @@ -270,7 +283,13 @@ bool job_is_ready(Job *job) >> return false; >> } >> -bool job_is_completed(Job *job) >> +bool job_is_ready(Job *job) >> +{ >> + JOB_LOCK_GUARD(); >> + return job_is_ready_locked(job); >> +} >> + >> +bool job_is_completed_locked(Job *job) >> { >> switch (job->status) { >> case JOB_STATUS_UNDEFINED: >> @@ -292,6 +311,12 @@ bool job_is_completed(Job *job) >> return false; >> } >> +bool job_is_completed(Job *job) >> +{ >> + JOB_LOCK_GUARD(); >> + return job_is_completed_locked(job); >> +} >> + >> static bool job_started(Job *job) >> { >> return job->co; >> @@ -521,7 +546,8 @@ static void coroutine_fn job_do_yield(Job *job, >> uint64_t ns) >> assert(job->busy); >> } >> -void coroutine_fn job_pause_point(Job *job) >> +/* Called with job_mutex held, but releases it temporarily. */ >> +static void coroutine_fn job_pause_point_locked(Job *job) >> { >> assert(job && job_started(job)); > > In this function, we should now use job_pause_point_locked(), otherwise > it looks incorrect. (I remember that lock is noop for now, but still, > let's keep think as correct as possible) > I miss your point here. What is incorrect? > > And job_do_yield() takes lock by itself. How to resolve it? You mean the real_job_lock/unlock taken in job_do_yield? > >> @@ -552,6 +578,12 @@ void coroutine_fn job_pause_point(Job *job) >> } >> } >> +void coroutine_fn job_pause_point(Job *job) >> +{ >> + JOB_LOCK_GUARD(); >> + job_pause_point_locked(job); >> +} >> + >> void job_yield(Job *job) >> { >> assert(job->busy); >> @@ -949,11 +981,15 @@ static void job_completed(Job *job) >> } >> } >> -/** Useful only as a type shim for aio_bh_schedule_oneshot. */ >> +/** >> + * Useful only as a type shim for aio_bh_schedule_oneshot. >> + * Called with job_mutex *not* held. >> + */ >> static void job_exit(void *opaque) >> { >> Job *job = (Job *)opaque; >> AioContext *ctx; >> + JOB_LOCK_GUARD(); > > That's not part of this patch.. Doesn't relate to "add _locked duplicates" > >> job_ref(job); >> aio_context_acquire(job->aio_context); > >
On 6/22/22 17:26, Emanuele Giuseppe Esposito wrote: > > > Am 21/06/2022 um 17:03 schrieb Vladimir Sementsov-Ogievskiy: >> On 6/16/22 16:18, Emanuele Giuseppe Esposito wrote: >>> In preparation to the job_lock/unlock usage, create _locked >>> duplicates of some functions, since they will be sometimes called with >>> job_mutex held (mostly within job.c), >>> and sometimes without (mostly from JobDrivers using the job API). >>> >>> Therefore create a _locked version of such function, so that it >>> can be used in both cases. >>> >>> List of functions duplicated as _locked: >>> job_is_ready (both versions are public) >>> job_is_completed (both versions are public) >>> job_is_cancelled (_locked version is public, needed by mirror.c) >>> job_pause_point (_locked version is static, purely done to simplify >>> the code) >>> job_cancel_requested (_locked version is static) >>> >>> Note: at this stage, job_{lock/unlock} and job lock guard macros >>> are *nop*. >> >> Great description, thanks! >> >>> >>> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> >>> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com> >> >> Hmm, after this patch, part of public API has "called with/without lock" >> comments. But there are still public job_* functions that doesn't have >> this mark. That look inconsistent. I think, all public API without >> _locked suffix, should be called without a lock? If so, we don't need to >> write it for each function. And only mark _locked() functions with >> "called with lock held" marks. >> >>> --- >>> include/qemu/job.h | 25 +++++++++++++++++++++--- >>> job.c | 48 ++++++++++++++++++++++++++++++++++++++++------ >>> 2 files changed, 64 insertions(+), 9 deletions(-) >>> >> >> [..] >> >>> -/** Returns whether the job is ready to be completed. */ >>> +/** Just like job_is_completed, but called between job_lock and >>> job_unlock */ >> >> I'd prefer phrasing "called with job_lock held". You wording make me >> think about >> >> job_lock() >> ... >> job_unlock() >> >> foo() >> >> job_lock() >> ... >> job_unlock() >> >> - foo() actually called between job_lock and job_unlock :) >> >> (it's a nitpicking, you may ignore it :) >> >>> +bool job_is_completed_locked(Job *job); >>> + >>> +/** >>> + * Returns whether the job is ready to be completed. >>> + * Called with job_mutex *not* held. >>> + */ >>> bool job_is_ready(Job *job); >>> +/** Just like job_is_ready, but called between job_lock and >>> job_unlock */ >>> +bool job_is_ready_locked(Job *job); >>> + >>> /** >>> * Request @job to pause at the next pause point. Must be paired with >>> * job_resume(). If the job is supposed to be resumed by user >>> action, call >>> diff --git a/job.c b/job.c >>> index cafd597ba4..c4776985c4 100644 >>> --- a/job.c >>> +++ b/job.c >>> @@ -236,19 +236,32 @@ const char *job_type_str(const Job *job) >>> return JobType_str(job_type(job)); >>> } >>> -bool job_is_cancelled(Job *job) >>> +bool job_is_cancelled_locked(Job *job) >>> { >>> /* force_cancel may be true only if cancelled is true, too */ >>> assert(job->cancelled || !job->force_cancel); >>> return job->force_cancel; >>> } >>> -bool job_cancel_requested(Job *job) >>> +bool job_is_cancelled(Job *job) >>> +{ >>> + JOB_LOCK_GUARD(); >>> + return job_is_cancelled_locked(job); >>> +} >>> + >>> +/* Called with job_mutex held. */ >>> +static bool job_cancel_requested_locked(Job *job) >>> { >>> return job->cancelled; >>> } >>> -bool job_is_ready(Job *job) >>> +bool job_cancel_requested(Job *job) >>> +{ >>> + JOB_LOCK_GUARD(); >>> + return job_cancel_requested_locked(job); >>> +} >>> + >>> +bool job_is_ready_locked(Job *job) >>> { >>> switch (job->status) { >>> case JOB_STATUS_UNDEFINED: >>> @@ -270,7 +283,13 @@ bool job_is_ready(Job *job) >>> return false; >>> } >>> -bool job_is_completed(Job *job) >>> +bool job_is_ready(Job *job) >>> +{ >>> + JOB_LOCK_GUARD(); >>> + return job_is_ready_locked(job); >>> +} >>> + >>> +bool job_is_completed_locked(Job *job) >>> { >>> switch (job->status) { >>> case JOB_STATUS_UNDEFINED: >>> @@ -292,6 +311,12 @@ bool job_is_completed(Job *job) >>> return false; >>> } >>> +bool job_is_completed(Job *job) >>> +{ >>> + JOB_LOCK_GUARD(); >>> + return job_is_completed_locked(job); >>> +} >>> + >>> static bool job_started(Job *job) >>> { >>> return job->co; >>> @@ -521,7 +546,8 @@ static void coroutine_fn job_do_yield(Job *job, >>> uint64_t ns) >>> assert(job->busy); >>> } >>> -void coroutine_fn job_pause_point(Job *job) >>> +/* Called with job_mutex held, but releases it temporarily. */ >>> +static void coroutine_fn job_pause_point_locked(Job *job) >>> { >>> assert(job && job_started(job)); >> >> In this function, we should now use job_pause_point_locked(), otherwise >> it looks incorrect. (I remember that lock is noop for now, but still, >> let's keep think as correct as possible) >> > > I miss your point here. What is incorrect? Function called with lock held. But it calls job_pause_point(), which do lock the mutex. This will deadlock. That doesn't deadlock only because our mutex is noop. That's why I say "it looks incorrect". >> >> And job_do_yield() takes lock by itself. How to resolve it? > > You mean the real_job_lock/unlock taken in job_do_yield? Yes.. Hmm, but we can consider real_job_lock as something other, that can be taken under job_lock. > >> >>> @@ -552,6 +578,12 @@ void coroutine_fn job_pause_point(Job *job) >>> } >>> } >>> +void coroutine_fn job_pause_point(Job *job) >>> +{ >>> + JOB_LOCK_GUARD(); >>> + job_pause_point_locked(job); >>> +} >>> + >>> void job_yield(Job *job) >>> { >>> assert(job->busy); >>> @@ -949,11 +981,15 @@ static void job_completed(Job *job) >>> } >>> } >>> -/** Useful only as a type shim for aio_bh_schedule_oneshot. */ >>> +/** >>> + * Useful only as a type shim for aio_bh_schedule_oneshot. >>> + * Called with job_mutex *not* held. >>> + */ >>> static void job_exit(void *opaque) >>> { >>> Job *job = (Job *)opaque; >>> AioContext *ctx; >>> + JOB_LOCK_GUARD(); >> >> That's not part of this patch.. Doesn't relate to "add _locked duplicates" >> >>> job_ref(job); >>> aio_context_acquire(job->aio_context); >> >> >
diff --git a/include/qemu/job.h b/include/qemu/job.h index 4b64eb15f7..275d593715 100644 --- a/include/qemu/job.h +++ b/include/qemu/job.h @@ -475,21 +475,40 @@ const char *job_type_str(const Job *job); /** Returns true if the job should not be visible to the management layer. */ bool job_is_internal(Job *job); -/** Returns whether the job is being cancelled. */ +/** + * Returns whether the job is being cancelled. + * Called with job_mutex *not* held. + */ bool job_is_cancelled(Job *job); +/** Just like job_is_cancelled, but called between job_lock and job_unlock */ +bool job_is_cancelled_locked(Job *job); + /** * Returns whether the job is scheduled for cancellation (at an * indefinite point). + * Called with job_mutex *not* held. */ bool job_cancel_requested(Job *job); -/** Returns whether the job is in a completed state. */ +/** + * Returns whether the job is in a completed state. + * Called with job_mutex *not* held. + */ bool job_is_completed(Job *job); -/** Returns whether the job is ready to be completed. */ +/** Just like job_is_completed, but called between job_lock and job_unlock */ +bool job_is_completed_locked(Job *job); + +/** + * Returns whether the job is ready to be completed. + * Called with job_mutex *not* held. + */ bool job_is_ready(Job *job); +/** Just like job_is_ready, but called between job_lock and job_unlock */ +bool job_is_ready_locked(Job *job); + /** * Request @job to pause at the next pause point. Must be paired with * job_resume(). If the job is supposed to be resumed by user action, call diff --git a/job.c b/job.c index cafd597ba4..c4776985c4 100644 --- a/job.c +++ b/job.c @@ -236,19 +236,32 @@ const char *job_type_str(const Job *job) return JobType_str(job_type(job)); } -bool job_is_cancelled(Job *job) +bool job_is_cancelled_locked(Job *job) { /* force_cancel may be true only if cancelled is true, too */ assert(job->cancelled || !job->force_cancel); return job->force_cancel; } -bool job_cancel_requested(Job *job) +bool job_is_cancelled(Job *job) +{ + JOB_LOCK_GUARD(); + return job_is_cancelled_locked(job); +} + +/* Called with job_mutex held. */ +static bool job_cancel_requested_locked(Job *job) { return job->cancelled; } -bool job_is_ready(Job *job) +bool job_cancel_requested(Job *job) +{ + JOB_LOCK_GUARD(); + return job_cancel_requested_locked(job); +} + +bool job_is_ready_locked(Job *job) { switch (job->status) { case JOB_STATUS_UNDEFINED: @@ -270,7 +283,13 @@ bool job_is_ready(Job *job) return false; } -bool job_is_completed(Job *job) +bool job_is_ready(Job *job) +{ + JOB_LOCK_GUARD(); + return job_is_ready_locked(job); +} + +bool job_is_completed_locked(Job *job) { switch (job->status) { case JOB_STATUS_UNDEFINED: @@ -292,6 +311,12 @@ bool job_is_completed(Job *job) return false; } +bool job_is_completed(Job *job) +{ + JOB_LOCK_GUARD(); + return job_is_completed_locked(job); +} + static bool job_started(Job *job) { return job->co; @@ -521,7 +546,8 @@ static void coroutine_fn job_do_yield(Job *job, uint64_t ns) assert(job->busy); } -void coroutine_fn job_pause_point(Job *job) +/* Called with job_mutex held, but releases it temporarily. */ +static void coroutine_fn job_pause_point_locked(Job *job) { assert(job && job_started(job)); @@ -552,6 +578,12 @@ void coroutine_fn job_pause_point(Job *job) } } +void coroutine_fn job_pause_point(Job *job) +{ + JOB_LOCK_GUARD(); + job_pause_point_locked(job); +} + void job_yield(Job *job) { assert(job->busy); @@ -949,11 +981,15 @@ static void job_completed(Job *job) } } -/** Useful only as a type shim for aio_bh_schedule_oneshot. */ +/** + * Useful only as a type shim for aio_bh_schedule_oneshot. + * Called with job_mutex *not* held. + */ static void job_exit(void *opaque) { Job *job = (Job *)opaque; AioContext *ctx; + JOB_LOCK_GUARD(); job_ref(job); aio_context_acquire(job->aio_context);