| Message ID | 4E81F43B.1040708@codemonkey.ws |
|---|---|
| State | New |
On 09/27/2011 06:05 PM, Anthony Liguori wrote:
> Actually, for posix-aio, we can just switch to using g_idle_add().
> g_idle_add() uses g_source_attach which is thread safe. g_idle_add()
> gives you a thread safe mechanism to defer a piece of work to the
> main loop which is really what we want here.

For that, a bottom half would also do (apart from the fact that I am not sure it is async-safe with TCG). In fact, that would make sense, since all of posix_aio_process_queue could become a bottom half. But the problem to be solved here is waking up qemu_aio_wait, and scheduling a bottom half currently wakes up only the vl.c main loop.

> This can actually be made to work with sync I/O emulation too by
> having another GMainLoop in the sync I/O loop although I thought I
> recalled a patch series to remove that stuff.

... which stuff? :)

Another GMainLoop in the sync I/O loop is problematic for two reasons: 1) the sync I/O loop does not relinquish the I/O thread mutex, which makes it very different from the outer loop; 2) a nested GMainLoop keeps polling all the file descriptors in the outer loop, which requires you to cope with reentrancy in those monitor commands that flush AIO.

Paolo
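For readers following along, here is a minimal sketch of the g_idle_add() pattern under discussion, written against the GLib API of the time. This is plain GLib, not QEMU code; the names `complete_request` and `worker` are invented for illustration:

```c
/* Minimal sketch (not QEMU code) of cross-thread completion via
 * g_idle_add(): a worker thread attaches an idle source, and the
 * callback runs in whichever thread iterates the default GMainContext.
 * Build (GLib < 2.32 era): gcc demo.c $(pkg-config --cflags --libs gthread-2.0)
 */
#include <glib.h>
#include <stdio.h>

static GMainLoop *loop;

/* Dispatched by the main loop; returning FALSE removes the one-shot source. */
static gboolean complete_request(gpointer data)
{
    printf("completion %d ran in the main loop thread\n",
           GPOINTER_TO_INT(data));
    g_main_loop_quit(loop);
    return FALSE;
}

/* Stands in for an AIO worker thread. */
static gpointer worker(gpointer data)
{
    /* g_idle_add() goes through g_source_attach(), which is thread safe,
     * so this both queues the work and wakes up the main loop. */
    g_idle_add(complete_request, GINT_TO_POINTER(42));
    return NULL;
}

int main(void)
{
    g_thread_init(NULL);                      /* required before GLib 2.32 */
    loop = g_main_loop_new(NULL, FALSE);
    g_thread_create(worker, NULL, FALSE, NULL);
    g_main_loop_run(loop);
    g_main_loop_unref(loop);
    return 0;
}
```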
On 09/27/2011 11:39 AM, Paolo Bonzini wrote:
> On 09/27/2011 06:05 PM, Anthony Liguori wrote:
>> Actually, for posix-aio, we can just switch to using g_idle_add().
>> g_idle_add() uses g_source_attach which is thread safe. g_idle_add()
>> gives you a thread safe mechanism to defer a piece of work to the
>> main loop which is really what we want here.
>
> For that, a bottom half would also do (apart from the fact that I am not
> sure it is async-safe with TCG). In fact, that would make sense, since
> all of posix_aio_process_queue could become a bottom half.

Bottom halves are signal safe, not thread safe.

To make bottom halves thread safe, you would (at the very least) have to add some barriers when reading/writing the scheduling flag. I think it's much better to just use GIdle sources, though.

>> This can actually be made to work with sync I/O emulation too by
>> having another GMainLoop in the sync I/O loop although I thought I
>> recalled a patch series to remove that stuff.
>
> ... which stuff? :)

The sync I/O emulation. Since sync I/O is done in block drivers, they can just use coroutine I/O instead of sync I/O.

> Another GMainLoop in the sync I/O loop is problematic for two reasons:
> 1) the sync I/O loop does not relinquish the I/O thread mutex, which
> makes it very different from the outer loop; 2) a nested GMainLoop keeps
> polling all the file descriptors in the outer loop, which requires you
> to cope with reentrancy in those monitor commands that flush AIO.

Re: (2), you can use a separate AIO GMainContext. The trick is that you need to keep a global current main context that switches when you enter the AIO main loop. It's doable, but since I think we're on the verge of eliminating sync I/O emulation, that's probably a better path to pursue.

Regards,

Anthony Liguori
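To make the barrier point concrete, here is a rough sketch of the ordering a thread-safe scheduling flag would need. The types and names are invented for illustration, and this is not QEMU's actual bottom-half code; without the barriers, the main loop could observe `scheduled == true` before the worker's writes to the completion data become visible:

```c
#include <stdbool.h>

typedef struct BH {
    void (*cb)(void *opaque);
    void *opaque;                /* completion data filled in by the worker */
    volatile bool scheduled;     /* volatile alone gives no ordering */
} BH;

/* Worker thread side. */
void bh_schedule(BH *bh)
{
    /* ...fill in completion data through bh->opaque... */
    __sync_synchronize();        /* write barrier: data before flag */
    bh->scheduled = true;
    /* ...then wake the main loop, e.g. via qemu_notify_event(). */
}

/* Main loop side. */
void bh_poll(BH *bh)
{
    if (bh->scheduled) {
        __sync_synchronize();    /* read barrier: flag before data */
        bh->scheduled = false;
        bh->cb(bh->opaque);
    }
}
```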
On 09/27/2011 11:23 PM, Anthony Liguori wrote:
> On 09/27/2011 11:39 AM, Paolo Bonzini wrote:
>> On 09/27/2011 06:05 PM, Anthony Liguori wrote:
>>> Actually, for posix-aio, we can just switch to using g_idle_add().
>>> g_idle_add() uses g_source_attach which is thread safe. g_idle_add()
>>> gives you a thread safe mechanism to defer a piece of work to the
>>> main loop which is really what we want here.
>>
>> For that, a bottom half would also do (apart from the fact that I am not
>> sure it is async-safe with TCG). In fact, that would make sense, since
>> all of posix_aio_process_queue could become a bottom half.
>
> Bottom halves are signal safe, not thread safe.
>
> To make bottom halves thread safe, you would (at the very least) have to
> add some barriers when reading/writing the scheduling flag.

You can probably assume that qemu_notify_event (and, dually, the read in the main loop) are respectively write/read memory barriers. Or even full barriers.

If we were to switch entirely to GSources, it would be nice to use them. But since we aren't, and our main loop functionality is quite different from glib's (it doesn't rely on abstractions for file descriptors, for example), it is just a painful, incomplete transition to use glib's idle sources to do the exact same thing that is done by bottom halves (which are already in our toolbox).

Paolo
On 2011-09-28 08:27, Paolo Bonzini wrote:
> On 09/27/2011 11:23 PM, Anthony Liguori wrote:
>> On 09/27/2011 11:39 AM, Paolo Bonzini wrote:
>>> On 09/27/2011 06:05 PM, Anthony Liguori wrote:
>>>> Actually, for posix-aio, we can just switch to using g_idle_add().
>>>> g_idle_add() uses g_source_attach which is thread safe. g_idle_add()
>>>> gives you a thread safe mechanism to defer a piece of work to the
>>>> main loop which is really what we want here.
>>>
>>> For that, a bottom half would also do (apart from the fact that I am not
>>> sure it is async-safe with TCG). In fact, that would make sense, since
>>> all of posix_aio_process_queue could become a bottom half.
>>
>> Bottom halves are signal safe, not thread safe.
>>
>> To make bottom halves thread safe, you would (at the very least) have to
>> add some barriers when reading/writing the scheduling flag.
>
> You can probably assume that qemu_notify_event (and, dually, the read in
> the main loop) are respectively write/read memory barriers. Or even full
> barriers.
>
> If we were to switch entirely to GSources, it would be nice to use them.
> But since we aren't, and our main loop functionality is quite different
> from glib's (it doesn't rely on abstractions for file descriptors, for
> example), it is just a painful, incomplete transition to use glib's idle
> sources to do the exact same thing that is done by bottom halves (which
> are already in our toolbox).

BTW, I just wondered if there is anything conceptually preventing us from skipping this ping-pong between the AIO thread and the main loop and just running the completion in the former's context (under global lock protection, of course).

Jan
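A self-contained sketch of what Jan is proposing, with all names invented for illustration: the worker thread takes the "global lock" and runs the completion callback itself, so no main-loop wakeup is needed at all (in QEMU the lock would be qemu_mutex_lock_iothread()):

```c
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;

typedef struct Request {
    int ret;
    void (*complete)(struct Request *r);
} Request;

static void on_complete(Request *r)
{
    /* Would touch guest/device state, hence the lock. */
    printf("request completed with %d\n", r->ret);
}

static void *aio_worker(void *opaque)
{
    Request *r = opaque;

    r->ret = 0;                        /* stand-in for the actual I/O */

    pthread_mutex_lock(&global_lock);  /* qemu_mutex_lock_iothread() in QEMU */
    r->complete(r);                    /* completion runs in this thread */
    pthread_mutex_unlock(&global_lock);
    return NULL;
}

int main(void)
{
    Request r = { .complete = on_complete };
    pthread_t t;

    pthread_create(&t, NULL, aio_worker, &r);
    pthread_join(&t, NULL);
    return 0;
}
```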
On 09/28/2011 09:52 AM, Jan Kiszka wrote:
>> You can probably assume that qemu_notify_event (and, dually, the read in
>> the main loop) are respectively write/read memory barriers. Or even full
>> barriers.
>>
>> If we were to switch entirely to GSources, it would be nice to use them.
>> But since we aren't, and our main loop functionality is quite different
>> from glib's (it doesn't rely on abstractions for file descriptors, for
>> example), it is just a painful, incomplete transition to use glib's idle
>> sources to do the exact same thing that is done by bottom halves (which
>> are already in our toolbox).
>
> BTW, I just wondered if there is anything conceptually preventing us from
> skipping this ping-pong between the AIO thread and the main loop and just
> running the completion in the former's context (under global lock
> protection, of course).

It would be a good idea, but it would require some refactoring, because the tools do not have a global lock.

Paolo
On Tue, Sep 27, 2011 at 10:23 PM, Anthony Liguori <anthony@codemonkey.ws> wrote:
> On 09/27/2011 11:39 AM, Paolo Bonzini wrote:
>> On 09/27/2011 06:05 PM, Anthony Liguori wrote:
>>> Actually, for posix-aio, we can just switch to using g_idle_add().
>>> g_idle_add() uses g_source_attach which is thread safe. g_idle_add()
>>> gives you a thread safe mechanism to defer a piece of work to the
>>> main loop which is really what we want here.
>>
>> For that, a bottom half would also do (apart from the fact that I am not
>> sure it is async-safe with TCG). In fact, that would make sense, since
>> all of posix_aio_process_queue could become a bottom half.
>
> Bottom halves are signal safe, not thread safe.
>
> To make bottom halves thread safe, you would (at the very least) have to
> add some barriers when reading/writing the scheduling flag. I think it's
> much better to just use GIdle sources, though.
>
>>> This can actually be made to work with sync I/O emulation too by
>>> having another GMainLoop in the sync I/O loop although I thought I
>>> recalled a patch series to remove that stuff.
>>
>> ... which stuff? :)
>
> The sync I/O emulation. Since sync I/O is done in block drivers, they can
> just use coroutine I/O instead of sync I/O.

Yes, I think we should convert the sync I/O code to use coroutines, which is quite natural. The users of sync I/O today are:

1. Hardware emulation - lesser-used or not performance-critical code paths
   still use bdrv_read/write() in places, e.g. sd, nand, fdc, ide pio.

2. Block drivers. Some image formats are synchronous, but converting them
   to coroutines is pretty easy (see the sketch after this message).

3. qemu-tools, including qemu-img and qemu-io. With Paolo's work to make
   the event loop available in qemu-tools, it should be possible to
   convert and eliminate the synchronous I/O interfaces.

Stefan
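As a rough sketch of the conversion meant in (2), using the block-layer and coroutine interfaces QEMU had at the time; the `myfmt` driver is hypothetical, and this only compiles inside the QEMU tree:

```c
/* Before: a synchronous read that blocks the caller (and, under the
 * hood, spins in qemu_aio_wait()) until the I/O finishes. */
static int myfmt_read(BlockDriverState *bs, int64_t sector_num,
                      uint8_t *buf, int nb_sectors)
{
    return bdrv_read(bs->file, sector_num, buf, nb_sectors);
}

/* After: a coroutine_fn; bdrv_co_readv() yields back to the main loop
 * while the request is in flight and resumes the coroutine when the
 * request completes. */
static int coroutine_fn myfmt_co_readv(BlockDriverState *bs,
                                       int64_t sector_num, int nb_sectors,
                                       QEMUIOVector *qiov)
{
    return bdrv_co_readv(bs->file, sector_num, nb_sectors, qiov);
}
```

The driver would then install `.bdrv_co_readv` in its BlockDriver instead of `.bdrv_read`; roughly speaking, the block layer takes care of spawning a coroutine for callers that are not already running in one.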
```diff
From cf036a192f09ea76d89648b83ec84a4226d14172 Mon Sep 17 00:00:00 2001
From: Anthony Liguori <aliguori@us.ibm.com>
Date: Tue, 27 Sep 2011 11:01:51 -0500
Subject: [PATCH] posix-aio-compat: use g_idle_add()

Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
---
 posix-aio-compat.c |   51 +++++++--------------------------------------------
 1 files changed, 7 insertions(+), 44 deletions(-)

diff --git a/posix-aio-compat.c b/posix-aio-compat.c
index d3c1174..1f746be 100644
--- a/posix-aio-compat.c
+++ b/posix-aio-compat.c
@@ -52,7 +52,6 @@ struct qemu_paiocb {
 };
 
 typedef struct PosixAioState {
-    int rfd, wfd;
     struct qemu_paiocb *first_aio;
 } PosixAioState;
 
@@ -466,7 +465,7 @@ static int qemu_paio_error(struct qemu_paiocb *aiocb)
     return ret;
 }
 
-static int posix_aio_process_queue(void *opaque)
+static gboolean posix_aio_process_queue(void *opaque)
 {
     PosixAioState *s = opaque;
     struct qemu_paiocb *acb, **pacb;
@@ -477,8 +476,9 @@ static int posix_aio_process_queue(void *opaque)
     pacb = &s->first_aio;
     for(;;) {
         acb = *pacb;
-        if (!acb)
-            return result;
+        if (!acb) {
+            goto out;
+        }
 
         ret = qemu_paio_error(acb);
         if (ret == ECANCELED) {
@@ -513,27 +513,8 @@
         }
     }
 
-    return result;
-}
-
-static void posix_aio_read(void *opaque)
-{
-    PosixAioState *s = opaque;
-    ssize_t len;
-
-    /* read all bytes from signal pipe */
-    for (;;) {
-        char bytes[16];
-
-        len = read(s->rfd, bytes, sizeof(bytes));
-        if (len == -1 && errno == EINTR)
-            continue; /* try again */
-        if (len == sizeof(bytes))
-            continue; /* more to read */
-        break;
-    }
-
-    posix_aio_process_queue(s);
+out:
+    return FALSE;
 }
 
 static int posix_aio_flush(void *opaque)
@@ -546,12 +527,7 @@ static PosixAioState *posix_aio_state;
 
 static void posix_aio_notify_event(void)
 {
-    char byte = 0;
-    ssize_t ret;
-
-    ret = write(posix_aio_state->wfd, &byte, sizeof(byte));
-    if (ret < 0 && errno != EAGAIN)
-        die("write()");
+    g_idle_add(posix_aio_process_queue, posix_aio_state);
 }
 
 static void paio_remove(struct qemu_paiocb *acb)
@@ -665,19 +641,6 @@ int paio_init(void)
     s = g_malloc(sizeof(PosixAioState));
     s->first_aio = NULL;
 
-    if (qemu_pipe(fds) == -1) {
-        fprintf(stderr, "failed to create pipe\n");
-        return -1;
-    }
-
-    s->rfd = fds[0];
-    s->wfd = fds[1];
-
-    fcntl(s->rfd, F_SETFL, O_NONBLOCK);
-    fcntl(s->wfd, F_SETFL, O_NONBLOCK);
-
-    qemu_aio_set_fd_handler(s->rfd, posix_aio_read, NULL, posix_aio_flush,
-                            posix_aio_process_queue, s);
 
     ret = pthread_attr_init(&attr);
     if (ret)
-- 
1.7.4.1
```