Message ID | 1359979973-31338-3-git-send-email-stefanha@redhat.com |
---|---|
State | New |
Headers | show |
Comments in-line. On 02/04/13 13:12, Stefan Hajnoczi wrote: > Use g_poll(3) instead of select(2). Well, this is kind of a cheat. > It's true that we're now using g_poll(3) on POSIX hosts but the *_fill() > and *_poll() functions are still using rfds/wfds/xfds. > > We've set the scene to start converting *_fill() and *_poll() functions > step-by-step until no more rfds/wfds/xfds users remain. Then we'll drop > the temporary gpollfds_from_select() and gpollfds_to_select() functions > and be left with native g_poll(2). > > On Windows things are a little crazy: convert from rfds/wfds/xfds to > GPollFDs, back to rfds/wfds/xfds, call select(2), rfds/wfds/xfds back to > GPollFDs, and finally back to rfds/wfds/xfds again. This is only > temporary and keeps the Windows build working through the following > patches. We'll drop this excessive conversion later and be left with a > single GPollFDs -> select(2) -> GPollFDs sequence that allows Windows to > use select(2) while the rest of QEMU only knows about GPollFD. > > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> > --- > main-loop.c | 135 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++---- > 1 file changed, 127 insertions(+), 8 deletions(-) I assume this is complex because the pre-patch situation is overly complex as well: slirp -> fd_set iohandler -> fd_set unix windows --------------------------------- -------------------------------------------- before after before after -------------- ----------------- --------------------- --------------------- glib -> fd_set glib -> fd_set glib -> GPollFD glib -> GPollFD fd_set -> GPollFD WaitObj -> GPollFD WaitObj -> GPollFD SELECT G_POLL G_POLL (glib/WaitObj) G_POLL (glib/WaitObj) fd_set -> GPollFD GPollFD -> fd_set SELECT (slirp/ioh.) SELECT (slirp/ioh.) (I'm ignoring the after-select / after-poll operations; they (should) mirror the pre-select / pre-poll ones.) No idea why the windows version has been mixing g_poll() and select(). I'd hope that this series kills select() for uniformity's sake, but the 00/10 subject and the commit msg for this patch indicate otherwise. "main-loop.c" is full of global variables which makes it a *pain* to read. > > diff --git a/main-loop.c b/main-loop.c > index d0d8fe4..f1dcd14 100644 > --- a/main-loop.c > +++ b/main-loop.c > @@ -117,6 +117,8 @@ void qemu_notify_event(void) > aio_notify(qemu_aio_context); > } > > +static GArray *gpollfds; > + > int qemu_init_main_loop(void) > { > int ret; > @@ -133,6 +135,7 @@ int qemu_init_main_loop(void) > return ret; > } > > + gpollfds = g_array_new(FALSE, FALSE, sizeof(GPollFD)); > qemu_aio_context = aio_context_new(); > src = aio_get_g_source(qemu_aio_context); > g_source_attach(src, NULL); > @@ -146,6 +149,62 @@ static GPollFD poll_fds[1024 * 2]; /* this is probably overkill */ > static int n_poll_fds; > static int max_priority; > > +/* Load rfds/wfds/xfds into gpollfds. Will be removed a few commits later. */ > +static void gpollfds_from_select(void) > +{ > + int fd; > + for (fd = 0; fd <= nfds; fd++) { "nfds" is an example of a global variable being global ("file scope") for no good reason. Anyway it does seem to have inclusive meaning ("highest file descriptor in all sets"). > + int events = 0; > + if (FD_ISSET(fd, &rfds)) { > + events |= G_IO_IN | G_IO_HUP | G_IO_ERR; > + } ("=" would have sufficed) > + if (FD_ISSET(fd, &wfds)) { > + events |= G_IO_OUT | G_IO_ERR; > + } > + if (FD_ISSET(fd, &xfds)) { > + events |= G_IO_PRI; > + } > + if (events) { > + GPollFD pfd = { > + .fd = fd, > + .events = events, > + }; > + g_array_append_val(gpollfds, pfd); > + } > + } > +} (I don't like "gpollfds" being global, but) this seems OK. It matches the glib docs and also How the Mapping Should Be Done (TM). This function corresponds to the "unix / after / fd_set -> GPollFD" entry of the diagram, and as such it is composed with "glib -> fd_set" (glib_select_fill()). I'll note that glib_select_fill() sets "xfds" for G_IO_ERR but not G_IO_PRI. (According to SUSv4, in our case xfds is the logical-or of "error pending" and "OOB/urgent pending".) We can assume that each entry stored by g_main_context_query() will have at least one of IN and OUT set in "events". Further assuming that glib follows its own documentation, that implies that G_IO_ERR will be set unconditionally (because it always accompanies each of IN and OUT). glib_select_fill() will map that to xfds, which we then map here to G_IO_PRI. The total effect is that G_IO_PRI is set on all file descriptors, always, even if we only try to write. I think ultimately support for OOB data should be killed throughout, as a policy, including xfds & G_IO_PRI. Pending errors are returned by read()/write()/etc just fine; see eg. libvirt commit d19149dd. > + > +/* Store gpollfds revents into rfds/wfds/xfds. Will be removed a few commits > + * later. > + */ > +static void gpollfds_to_select(int ret) > +{ > + int i; > + > + FD_ZERO(&rfds); > + FD_ZERO(&wfds); > + FD_ZERO(&xfds); > + > + if (ret <= 0) { > + return; > + } ie. for g_poll() timeouts, errors & signal interrupts, we clear the sets and that's it. OK. > + > + for (i = 0; i < gpollfds->len; i++) { > + int fd = g_array_index(gpollfds, GPollFD, i).fd; > + int revents = g_array_index(gpollfds, GPollFD, i).revents; > + > + if (revents & (G_IO_IN | G_IO_HUP | G_IO_ERR)) { > + FD_SET(fd, &rfds); > + } > + if (revents & (G_IO_OUT | G_IO_ERR)) { > + FD_SET(fd, &wfds); > + } > + if (revents & G_IO_PRI) { > + FD_SET(fd, &xfds); > + } > + } > +} Looks good. > + > #ifndef _WIN32 > static void glib_select_fill(int *max_fd, fd_set *rfds, fd_set *wfds, > fd_set *xfds, uint32_t *cur_timeout) > @@ -212,22 +271,22 @@ static void glib_select_poll(fd_set *rfds, fd_set *wfds, fd_set *xfds, > > static int os_host_main_loop_wait(uint32_t timeout) > { > - struct timeval tv, *tvarg = NULL; > int ret; > > glib_select_fill(&nfds, &rfds, &wfds, &xfds, &timeout); > > - if (timeout < UINT32_MAX) { > - tvarg = &tv; > - tv.tv_sec = timeout / 1000; > - tv.tv_usec = (timeout % 1000) * 1000; > - } > - > if (timeout > 0) { > qemu_mutex_unlock_iothread(); > } > > - ret = select(nfds + 1, &rfds, &wfds, &xfds, tvarg); > + /* We'll eventually drop fd_set completely. But for now we still have > + * *_fill() and *_poll() functions that use rfds/wfds/xfds. > + */ > + gpollfds_from_select(); > + > + ret = g_poll((GPollFD *)gpollfds->data, gpollfds->len, timeout); This exploits for the timeout parameter that (int)UINT32_MAX equals -1 on all supported platforms. Timeout values in [ INT_MAX+1, UINT_MAX32-1 ] lose their meaning, but I guess that's no problem in practice. > + > + gpollfds_to_select(ret); > > if (timeout > 0) { > qemu_mutex_lock_iothread(); > @@ -327,6 +386,55 @@ void qemu_fd_register(int fd) > FD_CONNECT | FD_WRITE | FD_OOB); > } > > +static int pollfds_fill(GArray *pollfds, fd_set *rfds, fd_set *wfds, > + fd_set *xfds) > +{ > + int nfds = -1; "nfds" shadows the global "nfds". The return value of pollfds_fill() is then assigned to the global "nfds". Not very elegant (but it's all rooted in the existense of the global nfds). > + int i; > + > + for (i = 0; i < pollfds->len; i++) { > + GPollFD *pfd = &g_array_index(pollfds, GPollFD, i); > + int fd = pfd->fd; > + int events = pfd->events; > + if (events & (G_IO_IN | G_IO_HUP | G_IO_ERR)) { > + FD_SET(fd, rfds); > + nfds = MAX(nfds, fd); > + } > + if (events & (G_IO_OUT | G_IO_ERR)) { > + FD_SET(fd, wfds); > + nfds = MAX(nfds, fd); > + } > + if (events & G_IO_PRI) { > + FD_SET(fd, xfds); > + nfds = MAX(nfds, fd); > + } > + } > + return nfds; > +} > + > +static void pollfds_poll(GArray *pollfds, int nfds, fd_set *rfds, > + fd_set *wfds, fd_set *xfds) > +{ The "nfds" parameter both shadows the global variable and is unused in the function. > + int i; > + > + for (i = 0; i < pollfds->len; i++) { > + GPollFD *pfd = &g_array_index(pollfds, GPollFD, i); > + int fd = pfd->fd; > + int revents = 0; > + > + if (FD_ISSET(fd, rfds)) { > + revents |= G_IO_IN | G_IO_HUP | G_IO_ERR; > + } ("=" would have sufficed) > + if (FD_ISSET(fd, wfds)) { > + revents |= G_IO_OUT | G_IO_ERR; > + } > + if (FD_ISSET(fd, xfds)) { > + revents |= G_IO_PRI; > + } > + pfd->revents |= revents & pfd->events; I find this |= vs. = more confusing than the other two. "pfd->revents" is zero here. > + } > +} > + > static int os_host_main_loop_wait(uint32_t timeout) > { > GMainContext *context = g_main_context_default(); > @@ -382,12 +490,22 @@ static int os_host_main_loop_wait(uint32_t timeout) > * improve socket latency. > */ > > + /* This back-and-forth between GPollFDs and select(2) is temporary. We'll > + * drop it in a couple of patches, I promise :). > + */ > + gpollfds_from_select(); > + FD_ZERO(&rfds); > + FD_ZERO(&wfds); > + FD_ZERO(&xfds); > + nfds = pollfds_fill(gpollfds, &rfds, &wfds, &xfds); > if (nfds >= 0) { > select_ret = select(nfds + 1, &rfds, &wfds, &xfds, &tv0); > if (select_ret != 0) { > timeout = 0; > + pollfds_poll(gpollfds, nfds, &rfds, &wfds, &xfds); This function shouldn't be invoked for "select_ret == -1" (in case that's possible here); "revents" will end up a copy of "events": "On failure, the objects pointed to by the readfds, writefds, and errorfds arguments shall not be modified." > } > } > + gpollfds_to_select(select_ret || g_poll_ret); I can see this logical-or mimics the return value below; however I think it's not robust. If one of the operands is -1, and the other is non-positive, then gpollfds_to_select() should not traverse the GPollFDs, but it will. ... Actually, "g_poll_ret" should have absolutely no effect on gpollfds_to_select(): "g_poll_ret" corresponds to the "poll_fds" array, which covers the glib-internal file descriptors and the WaitObjects. Here (= in the windows case), "gpollfds" covers only slirp & iohandler, thus only "select_ret" should be passed in. > > return select_ret || g_poll_ret; I find this return value unfathomable (how is (-1||0) equivalent to (1||1)?), but it's out of scope for now. > } > @@ -403,6 +521,7 @@ int main_loop_wait(int nonblocking) > } > > /* poll any events */ > + g_array_set_size(gpollfds, 0); /* reset for new iteration */ > /* XXX: separate device handlers from system ones */ > nfds = -1; > FD_ZERO(&rfds); I can't decide (yet) if any of the above is important; probably not. From a quick look at the tree at the series' end, most of it seems to be gone by then. I'll let you decide. Reviewed-by: Laszlo Ersek <lersek@redhat.com>
On Mon, Feb 04, 2013 at 03:37:39PM +0100, Laszlo Ersek wrote: > I assume this is complex because the pre-patch situation is overly > complex as well: > > slirp -> fd_set > iohandler -> fd_set > unix windows > --------------------------------- -------------------------------------------- > before after before after > -------------- ----------------- --------------------- --------------------- > glib -> fd_set glib -> fd_set glib -> GPollFD glib -> GPollFD > fd_set -> GPollFD WaitObj -> GPollFD WaitObj -> GPollFD > > SELECT G_POLL G_POLL (glib/WaitObj) G_POLL (glib/WaitObj) > > fd_set -> GPollFD > GPollFD -> fd_set > > SELECT (slirp/ioh.) SELECT (slirp/ioh.) > > (I'm ignoring the after-select / after-poll operations; they (should) > mirror the pre-select / pre-poll ones.) > > No idea why the windows version has been mixing g_poll() and select(). > I'd hope that this series kills select() for uniformity's sake, but the > 00/10 subject and the commit msg for this patch indicate otherwise. This is why I CCed Fabien. I left the g_poll-followed-by-select behavior. It may be to do with Windows treating sockets different from other objects. Paolo may know the answer, too. > "main-loop.c" is full of global variables which makes it a *pain* to read. Yes. > > + if (FD_ISSET(fd, &wfds)) { > > + events |= G_IO_OUT | G_IO_ERR; > > + } > > + if (FD_ISSET(fd, &xfds)) { > > + events |= G_IO_PRI; > > + } > > + if (events) { > > + GPollFD pfd = { > > + .fd = fd, > > + .events = events, > > + }; > > + g_array_append_val(gpollfds, pfd); > > + } > > + } > > +} > > (I don't like "gpollfds" being global, but) this seems OK. It matches > the glib docs and also How the Mapping Should Be Done (TM). > > This function corresponds to the "unix / after / fd_set -> GPollFD" > entry of the diagram, and as such it is composed with "glib -> fd_set" > (glib_select_fill()). I'll note that glib_select_fill() sets "xfds" for > G_IO_ERR but not G_IO_PRI. (According to SUSv4, in our case xfds is the > logical-or of "error pending" and "OOB/urgent pending".) > > We can assume that each entry stored by g_main_context_query() will have > at least one of IN and OUT set in "events". Further assuming that glib > follows its own documentation, that implies that G_IO_ERR will be set > unconditionally (because it always accompanies each of IN and OUT). > glib_select_fill() will map that to xfds, which we then map here to > G_IO_PRI. The total effect is that G_IO_PRI is set on all file > descriptors, always, even if we only try to write. > > I think ultimately support for OOB data should be killed throughout, as > a policy, including xfds & G_IO_PRI. Pending errors are returned by > read()/write()/etc just fine; see eg. libvirt commit d19149dd. slirp uses OOB because it implements TCP. I don't think we can drop it. However, later in the series we drop the glib_select_fill() rfds/wfds/xfds conversion and use GPollFD directly. That solves these conversion issues. > > + > > + gpollfds_to_select(ret); > > > > if (timeout > 0) { > > qemu_mutex_lock_iothread(); > > @@ -327,6 +386,55 @@ void qemu_fd_register(int fd) > > FD_CONNECT | FD_WRITE | FD_OOB); > > } > > > > +static int pollfds_fill(GArray *pollfds, fd_set *rfds, fd_set *wfds, > > + fd_set *xfds) > > +{ > > + int nfds = -1; > > "nfds" shadows the global "nfds". The return value of pollfds_fill() is > then assigned to the global "nfds". Not very elegant (but it's all > rooted in the existense of the global nfds). Global nfds is dropped later in the series. I had real trouble picking good names due to the globals and ended up with temporary shadowing, which is fixed later. > > + if (FD_ISSET(fd, wfds)) { > > + revents |= G_IO_OUT | G_IO_ERR; > > + } > > + if (FD_ISSET(fd, xfds)) { > > + revents |= G_IO_PRI; > > + } > > + pfd->revents |= revents & pfd->events; > > I find this |= vs. = more confusing than the other two. "pfd->revents" > is zero here. Yes, it's an interesting feature of pollfds_poll(). It does not clobber revents. This means you can apply additional *_poll()-style functions before calling pollfds_poll(). I thought this was neat but I guess we can drop it. > > > + } > > +} > > + > > static int os_host_main_loop_wait(uint32_t timeout) > > { > > GMainContext *context = g_main_context_default(); > > @@ -382,12 +490,22 @@ static int os_host_main_loop_wait(uint32_t timeout) > > * improve socket latency. > > */ > > > > + /* This back-and-forth between GPollFDs and select(2) is temporary. We'll > > + * drop it in a couple of patches, I promise :). > > + */ > > + gpollfds_from_select(); > > + FD_ZERO(&rfds); > > + FD_ZERO(&wfds); > > + FD_ZERO(&xfds); > > + nfds = pollfds_fill(gpollfds, &rfds, &wfds, &xfds); > > if (nfds >= 0) { > > select_ret = select(nfds + 1, &rfds, &wfds, &xfds, &tv0); > > if (select_ret != 0) { > > timeout = 0; > > + pollfds_poll(gpollfds, nfds, &rfds, &wfds, &xfds); > > This function shouldn't be invoked for "select_ret == -1" (in case > that's possible here); "revents" will end up a copy of "events": > > "On failure, the objects pointed to by the readfds, writefds, and > errorfds arguments shall not be modified." Thanks, will fix. > > > } > > } > > + gpollfds_to_select(select_ret || g_poll_ret); > > I can see this logical-or mimics the return value below; however I think > it's not robust. If one of the operands is -1, and the other is > non-positive, then gpollfds_to_select() should not traverse the > GPollFDs, but it will. > > ... Actually, "g_poll_ret" should have absolutely no effect on > gpollfds_to_select(): "g_poll_ret" corresponds to the "poll_fds" array, > which covers the glib-internal file descriptors and the WaitObjects. > Here (= in the windows case), "gpollfds" covers only slirp & iohandler, > thus only "select_ret" should be passed in. Will take a look at this for the next version of this patch. > I can't decide (yet) if any of the above is important; probably not. > From a quick look at the tree at the series' end, most of it seems to be > gone by then. I'll let you decide. Thanks for the review. If I need to respin I'll address comments. Stefan
On 02/04/13 16:07, Stefan Hajnoczi wrote: > slirp uses OOB because it implements TCP. I don't think we can drop it. I considered slirp for that reason, and I did grep the tree for OOB and saw some hits. I didn't track it down but I thought slirp would use lower level sockets (SOCK_RAW?) for the TCP implementation. Ie. it would provide OOB but not depend on it (because OOB doesn't make sense at the IP level; the urgent pointer is in the TCP header). Anyway it's not important. > Yes, it's an interesting feature of pollfds_poll(). It does not clobber > revents. This means you can apply additional *_poll()-style functions > before calling pollfds_poll(). I thought this was neat but I guess we > can drop it. Ah I didn't realize that. I think a comment should be enough, should you respin it. > Thanks for the review. If I need to respin I'll address comments. It's more clear now, thanks! Laszlo
Il 04/02/2013 16:07, Stefan Hajnoczi ha scritto: >> > No idea why the windows version has been mixing g_poll() and select(). >> > I'd hope that this series kills select() for uniformity's sake, but the >> > 00/10 subject and the commit msg for this patch indicate otherwise. > This is why I CCed Fabien. I left the g_poll-followed-by-select > behavior. It may be to do with Windows treating sockets different from > other objects. Paolo may know the answer, too. > Yes, indeed. Windows uses g_poll for HANDLEs and select for sockets. All sockets are configured to trigger an event handle when there is a change in the socket availability, so g_poll exits and select is reexecuted. Paolo
On 02/04/2013 06:35 PM, Paolo Bonzini wrote: > Il 04/02/2013 16:07, Stefan Hajnoczi ha scritto: >>>> No idea why the windows version has been mixing g_poll() and select(). >>>> I'd hope that this series kills select() for uniformity's sake, but the >>>> 00/10 subject and the commit msg for this patch indicate otherwise. >> This is why I CCed Fabien. I left the g_poll-followed-by-select >> behavior. It may be to do with Windows treating sockets different from >> other objects. Paolo may know the answer, too. >> > > Yes, indeed. > > Windows uses g_poll for HANDLEs and select for sockets. All sockets are > configured to trigger an event handle when there is a change in the > socket availability, so g_poll exits and select is reexecuted. > Yes, the reason for this select() call is that we know when an event occurred on one socket, but we don't know what kind of event and which socket. Would it be different if we had a HANDLE for each socket, instead of one for all?
Il 04/02/2013 18:51, Fabien Chouteau ha scritto: > On 02/04/2013 06:35 PM, Paolo Bonzini wrote: >> Il 04/02/2013 16:07, Stefan Hajnoczi ha scritto: >>>>> No idea why the windows version has been mixing g_poll() and select(). >>>>> I'd hope that this series kills select() for uniformity's sake, but the >>>>> 00/10 subject and the commit msg for this patch indicate otherwise. >>> This is why I CCed Fabien. I left the g_poll-followed-by-select >>> behavior. It may be to do with Windows treating sockets different from >>> other objects. Paolo may know the answer, too. >> >> Windows uses g_poll for HANDLEs and select for sockets. All sockets are >> configured to trigger an event handle when there is a change in the >> socket availability, so g_poll exits and select is reexecuted. > > Yes, the reason for this select() call is that we know when an event > occurred on one socket, but we don't know what kind of event and which > socket. > > Would it be different if we had a HANDLE for each socket, instead of > one for all? Nope, I think you cannot assign different events for read and write. Paolo
diff --git a/main-loop.c b/main-loop.c index d0d8fe4..f1dcd14 100644 --- a/main-loop.c +++ b/main-loop.c @@ -117,6 +117,8 @@ void qemu_notify_event(void) aio_notify(qemu_aio_context); } +static GArray *gpollfds; + int qemu_init_main_loop(void) { int ret; @@ -133,6 +135,7 @@ int qemu_init_main_loop(void) return ret; } + gpollfds = g_array_new(FALSE, FALSE, sizeof(GPollFD)); qemu_aio_context = aio_context_new(); src = aio_get_g_source(qemu_aio_context); g_source_attach(src, NULL); @@ -146,6 +149,62 @@ static GPollFD poll_fds[1024 * 2]; /* this is probably overkill */ static int n_poll_fds; static int max_priority; +/* Load rfds/wfds/xfds into gpollfds. Will be removed a few commits later. */ +static void gpollfds_from_select(void) +{ + int fd; + for (fd = 0; fd <= nfds; fd++) { + int events = 0; + if (FD_ISSET(fd, &rfds)) { + events |= G_IO_IN | G_IO_HUP | G_IO_ERR; + } + if (FD_ISSET(fd, &wfds)) { + events |= G_IO_OUT | G_IO_ERR; + } + if (FD_ISSET(fd, &xfds)) { + events |= G_IO_PRI; + } + if (events) { + GPollFD pfd = { + .fd = fd, + .events = events, + }; + g_array_append_val(gpollfds, pfd); + } + } +} + +/* Store gpollfds revents into rfds/wfds/xfds. Will be removed a few commits + * later. + */ +static void gpollfds_to_select(int ret) +{ + int i; + + FD_ZERO(&rfds); + FD_ZERO(&wfds); + FD_ZERO(&xfds); + + if (ret <= 0) { + return; + } + + for (i = 0; i < gpollfds->len; i++) { + int fd = g_array_index(gpollfds, GPollFD, i).fd; + int revents = g_array_index(gpollfds, GPollFD, i).revents; + + if (revents & (G_IO_IN | G_IO_HUP | G_IO_ERR)) { + FD_SET(fd, &rfds); + } + if (revents & (G_IO_OUT | G_IO_ERR)) { + FD_SET(fd, &wfds); + } + if (revents & G_IO_PRI) { + FD_SET(fd, &xfds); + } + } +} + #ifndef _WIN32 static void glib_select_fill(int *max_fd, fd_set *rfds, fd_set *wfds, fd_set *xfds, uint32_t *cur_timeout) @@ -212,22 +271,22 @@ static void glib_select_poll(fd_set *rfds, fd_set *wfds, fd_set *xfds, static int os_host_main_loop_wait(uint32_t timeout) { - struct timeval tv, *tvarg = NULL; int ret; glib_select_fill(&nfds, &rfds, &wfds, &xfds, &timeout); - if (timeout < UINT32_MAX) { - tvarg = &tv; - tv.tv_sec = timeout / 1000; - tv.tv_usec = (timeout % 1000) * 1000; - } - if (timeout > 0) { qemu_mutex_unlock_iothread(); } - ret = select(nfds + 1, &rfds, &wfds, &xfds, tvarg); + /* We'll eventually drop fd_set completely. But for now we still have + * *_fill() and *_poll() functions that use rfds/wfds/xfds. + */ + gpollfds_from_select(); + + ret = g_poll((GPollFD *)gpollfds->data, gpollfds->len, timeout); + + gpollfds_to_select(ret); if (timeout > 0) { qemu_mutex_lock_iothread(); @@ -327,6 +386,55 @@ void qemu_fd_register(int fd) FD_CONNECT | FD_WRITE | FD_OOB); } +static int pollfds_fill(GArray *pollfds, fd_set *rfds, fd_set *wfds, + fd_set *xfds) +{ + int nfds = -1; + int i; + + for (i = 0; i < pollfds->len; i++) { + GPollFD *pfd = &g_array_index(pollfds, GPollFD, i); + int fd = pfd->fd; + int events = pfd->events; + if (events & (G_IO_IN | G_IO_HUP | G_IO_ERR)) { + FD_SET(fd, rfds); + nfds = MAX(nfds, fd); + } + if (events & (G_IO_OUT | G_IO_ERR)) { + FD_SET(fd, wfds); + nfds = MAX(nfds, fd); + } + if (events & G_IO_PRI) { + FD_SET(fd, xfds); + nfds = MAX(nfds, fd); + } + } + return nfds; +} + +static void pollfds_poll(GArray *pollfds, int nfds, fd_set *rfds, + fd_set *wfds, fd_set *xfds) +{ + int i; + + for (i = 0; i < pollfds->len; i++) { + GPollFD *pfd = &g_array_index(pollfds, GPollFD, i); + int fd = pfd->fd; + int revents = 0; + + if (FD_ISSET(fd, rfds)) { + revents |= G_IO_IN | G_IO_HUP | G_IO_ERR; + } + if (FD_ISSET(fd, wfds)) { + revents |= G_IO_OUT | G_IO_ERR; + } + if (FD_ISSET(fd, xfds)) { + revents |= G_IO_PRI; + } + pfd->revents |= revents & pfd->events; + } +} + static int os_host_main_loop_wait(uint32_t timeout) { GMainContext *context = g_main_context_default(); @@ -382,12 +490,22 @@ static int os_host_main_loop_wait(uint32_t timeout) * improve socket latency. */ + /* This back-and-forth between GPollFDs and select(2) is temporary. We'll + * drop it in a couple of patches, I promise :). + */ + gpollfds_from_select(); + FD_ZERO(&rfds); + FD_ZERO(&wfds); + FD_ZERO(&xfds); + nfds = pollfds_fill(gpollfds, &rfds, &wfds, &xfds); if (nfds >= 0) { select_ret = select(nfds + 1, &rfds, &wfds, &xfds, &tv0); if (select_ret != 0) { timeout = 0; + pollfds_poll(gpollfds, nfds, &rfds, &wfds, &xfds); } } + gpollfds_to_select(select_ret || g_poll_ret); return select_ret || g_poll_ret; } @@ -403,6 +521,7 @@ int main_loop_wait(int nonblocking) } /* poll any events */ + g_array_set_size(gpollfds, 0); /* reset for new iteration */ /* XXX: separate device handlers from system ones */ nfds = -1; FD_ZERO(&rfds);
Use g_poll(3) instead of select(2). Well, this is kind of a cheat. It's true that we're now using g_poll(3) on POSIX hosts but the *_fill() and *_poll() functions are still using rfds/wfds/xfds. We've set the scene to start converting *_fill() and *_poll() functions step-by-step until no more rfds/wfds/xfds users remain. Then we'll drop the temporary gpollfds_from_select() and gpollfds_to_select() functions and be left with native g_poll(2). On Windows things are a little crazy: convert from rfds/wfds/xfds to GPollFDs, back to rfds/wfds/xfds, call select(2), rfds/wfds/xfds back to GPollFDs, and finally back to rfds/wfds/xfds again. This is only temporary and keeps the Windows build working through the following patches. We'll drop this excessive conversion later and be left with a single GPollFDs -> select(2) -> GPollFDs sequence that allows Windows to use select(2) while the rest of QEMU only knows about GPollFD. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> --- main-loop.c | 135 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 127 insertions(+), 8 deletions(-)