Message ID | 20110207160350.GA26332@amt.cnet |
---|---|
State | New |
On 02/07/2011 05:03 PM, Marcelo Tosatti wrote:
> Is there any other issue that prevents turning CONFIG_IOTHREAD on by
> default?

I think Windows support.

Signal support is actually easy because we can "hack" the IPI as
"suspend the VCPU thread+do work in the iothread context+resume the VCPU
thread" (the IPI handler doesn't longjmp).

Threading primitives support is tricky but not hard (there is lots of
code around, especially if you can make assumptions such as "always hold
the mutex while signaling a cond. variable").

Paolo

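For readers unfamiliar with the assumption Paolo mentions, here is a minimal POSIX sketch of the "always hold the mutex while signaling a cond. variable" discipline. It is not taken from QEMU; `struct work_flag` and all function names are hypothetical and exist only for illustration. The thread does not show how a Win32 port would use this assumption, so the sketch only demonstrates the calling convention itself.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical flag shared between a producer and a consumer thread. */
struct work_flag {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            pending;
};

static void work_flag_init(struct work_flag *w)
{
    int rc = pthread_mutex_init(&w->lock, NULL);
    assert(rc == 0);
    rc = pthread_cond_init(&w->cond, NULL);
    assert(rc == 0);
    (void)rc;
    w->pending = false;
}

/* Producer: the predicate is updated and the signal is sent
 * while the mutex is still held. */
static void work_flag_post(struct work_flag *w)
{
    pthread_mutex_lock(&w->lock);
    w->pending = true;
    pthread_cond_signal(&w->cond);
    pthread_mutex_unlock(&w->lock);
}

/* Consumer: re-check the predicate in a loop, since condition
 * waits may wake up spuriously. */
static void work_flag_wait(struct work_flag *w)
{
    pthread_mutex_lock(&w->lock);
    while (!w->pending) {
        pthread_cond_wait(&w->cond, &w->lock);
    }
    w->pending = false;
    pthread_mutex_unlock(&w->lock);
}

/* Thread body: post a single wakeup. */
static void *poster_thread(void *arg)
{
    work_flag_post(arg);
    return NULL;
}

/* Spawn a poster thread and wait for its wakeup; returns 0 on success. */
static int work_flag_demo(void)
{
    struct work_flag w;
    pthread_t tid;

    work_flag_init(&w);
    if (pthread_create(&tid, NULL, poster_thread, &w) != 0) {
        return -1;
    }
    work_flag_wait(&w);   /* returns once the poster has run */
    pthread_join(tid, NULL);
    return w.pending ? -1 : 0;
}
```

Because the poster changes `pending` and signals under the same lock the waiter uses, the waiter either sees the flag already set or is already blocked in `pthread_cond_wait()` when the signal arrives, regardless of which thread runs first.
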
On 2011-02-07 17:23, Paolo Bonzini wrote:
> On 02/07/2011 05:03 PM, Marcelo Tosatti wrote:
>> Is there any other issue that prevents turning CONFIG_IOTHREAD on by
>> default?
>
> I think Windows support.
>
> Signal support is actually easy because we can "hack" the IPI as
> "suspend the VCPU thread+do work in the iothread context+resume the VCPU
> thread" (the IPI handler doesn't longjmp).
>
> Threading primitives support is tricky but not hard (there is lots of
> code around, especially if you can make assumptions such as "always hold
> the mutex while signaling a cond. variable").

!CONFIG_IOTHREAD code is doomed to bitrot once we switch to default
iothread mode. So if Windows support is not converted to a threading
model with moderate differences to POSIX, it will likely bitrot as well.
Therefore, conversion should be started sooner rather than later (by
someone interested in that platform).

Jan

On Mon, Feb 07, 2011 at 02:03:50PM -0200, Marcelo Tosatti wrote:
> On Mon, Feb 07, 2011 at 08:12:55AM -0200, Marcelo Tosatti wrote:
> > > > One more thing I didn't mention on the email-thread or on IRC is
> > > > that last time I checked, qemu with io-thread was performing
> > > > significantly slower than non io-thread builds. That was with
> > > > TCG emulation (not kvm). Somewhere between 5 - 10% slower, IIRC.
> >
> > Can you recall what was the test ?
> >
> > > > Also, although -icount & iothread no longer deadlocks, icount
> > > > still sometimes performs incredibly slow with the io-thread (compared
> > > > to non-io-thread qemu). In particular when not using -icount auto but
> > > > a fixed ticks per insn values. Sometimes it's so slow I thought it
> > > > actually deadlocked, but no it was crawling :) I haven't had time
> > > > to look at it any closer but I hope to do soon.
>
> Edgar, please give the attached patch a try with fixed icount value. The
> calculation for next event makes no sense for iothread timeout, only for
> vcpu context.

Thanks Marcelo, this patch fixes the problems I was seeing here.

Cheers

> diff --git a/cpus.c b/cpus.c
> index 9c50a34..2280db1 100644
> --- a/cpus.c
> +++ b/cpus.c
> @@ -748,7 +748,7 @@ static void qemu_tcg_wait_io_event(void)
>      CPUState *env;
>
>      while (!any_cpu_has_work())
> -        qemu_cond_timedwait(tcg_halt_cond, &qemu_global_mutex, 1000);
> +        qemu_cond_timedwait(tcg_halt_cond, &qemu_global_mutex, qemu_calculate_timeout());
>
>      qemu_mutex_unlock(&qemu_global_mutex);
>
> diff --git a/vl.c b/vl.c
> index 837be97..dbd81a1 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -1323,7 +1323,7 @@ void main_loop_wait(int nonblocking)
>      if (nonblocking)
>          timeout = 0;
>      else {
> -        timeout = qemu_calculate_timeout();
> +        timeout = 1000;
>          qemu_bh_update_timeout(&timeout);
>      }
>

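The patch above changes how long the VCPU thread sleeps on a condition variable when no CPU has work. The thread does not show how `qemu_cond_timedwait()` is implemented, so the following is only a sketch of how a bounded condition wait can be layered on POSIX, assuming the third argument in the patch is a timeout in milliseconds. The helper names `deadline_after_ms` and `wait_for_work` are hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Convert a relative timeout in milliseconds into the absolute
 * CLOCK_REALTIME deadline that pthread_cond_timedwait() expects. */
static struct timespec deadline_after_ms(uint64_t msecs)
{
    struct timespec ts;

    clock_gettime(CLOCK_REALTIME, &ts);
    ts.tv_sec  += (time_t)(msecs / 1000);
    ts.tv_nsec += (long)(msecs % 1000) * 1000000L;
    if (ts.tv_nsec >= 1000000000L) {
        ts.tv_sec  += 1;
        ts.tv_nsec -= 1000000000L;
    }
    return ts;
}

/* Wait until *has_work becomes true or msecs elapse.
 * The caller must hold `lock`. Returns the final value of *has_work. */
static bool wait_for_work(pthread_mutex_t *lock, pthread_cond_t *cond,
                          const bool *has_work, uint64_t msecs)
{
    struct timespec deadline = deadline_after_ms(msecs);

    while (!*has_work) {
        int rc = pthread_cond_timedwait(cond, lock, &deadline);
        if (rc != 0) {
            /* ETIMEDOUT (or an unexpected error): stop waiting. */
            break;
        }
    }
    return *has_work;
}
```

Computing the deadline once, outside the loop, keeps the total wait bounded by `msecs` even if the thread is woken spuriously several times.
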
On Mon, Feb 07, 2011 at 02:03:50PM -0200, Marcelo Tosatti wrote: > On Mon, Feb 07, 2011 at 08:12:55AM -0200, Marcelo Tosatti wrote: > > > > One more thing I didn't mention on the email-thread or on IRC is > > > > that last time I checked, qemu with io-thread was performing > > > > significantly slower than non io-thread builds. That was with > > > > TCG emulation (not kvm). Somewhere between 5 - 10% slower, IIRC. > > > > Can you recall what was the test ? > > It's also something I've seen using network transfer in guest. IIRC the biggest slowdown was using the smc91c111 card under qemu-system-arm where it was about 20% slower. Other cards on other architectures (I remember testing powerpc, mips and sh4) are more in the 5 to 10 % area.
On 02/07/2011 11:10 AM, Jan Kiszka wrote:
> On 2011-02-07 17:23, Paolo Bonzini wrote:
>
>> On 02/07/2011 05:03 PM, Marcelo Tosatti wrote:
>>
>>> Is there any other issue that prevents turning CONFIG_IOTHREAD on by
>>> default?
>>>
>> I think Windows support.
>>
>> Signal support is actually easy because we can "hack" the IPI as
>> "suspend the VCPU thread+do work in the iothread context+resume the VCPU
>> thread" (the IPI handler doesn't longjmp).
>>
>> Threading primitives support is tricky but not hard (there is lots of
>> code around, especially if you can make assumptions such as "always hold
>> the mutex while signaling a cond. variable").
>>
> !CONFIG_IOTHREAD code is doomed to bitrot once we switch to default
> iothread mode. So if Windows support is not converted to a threading
> model with moderate differences to POSIX, it will likely bitrot as well.
> Therefore, conversion should be started sooner rather than later (by
> someone interested in that platform).
>

As far as I'm concerned, Windows support is already deprecated, as no one
has stepped up to enhance or support it for a number of years now. We
shouldn't remove existing code that supports it or refuse to take
reasonable patches, but if enabling the IO thread by default breaks it, so be
it.

Regards,

Anthony Liguori

> Jan

On Mon, 7 Feb 2011 14:03:50 -0200 Marcelo Tosatti <mtosatti@redhat.com> wrote: > Is there any other issue that prevents turning CONFIG_IOTHREAD on by > default? > This patch is needed for ppce500_mpc8544ds and ppc440_bamboo to work with I/O thread enabled: http://patchwork.ozlabs.org/patch/66743/ -Scott
On Mon, Feb 07, 2011 at 03:02:03PM -0600, Anthony Liguori wrote: > On 02/07/2011 11:10 AM, Jan Kiszka wrote: >> On 2011-02-07 17:23, Paolo Bonzini wrote: >> >>> On 02/07/2011 05:03 PM, Marcelo Tosatti wrote: >>> >>>> Is there any other issue that prevents turning CONFIG_IOTHREAD on by >>>> default? >>>> >>> I think Windows support. >>> >>> Signal support is actually easy because we can "hack" the IPI as >>> "suspend the VCPU thread+do work in the iothread context+resume the VCPU >>> thread" (the IPI handler doesn't longjmp). >>> >>> Threading primitives support is tricky but not hard (there is lots of >>> code around, especially if you can make assumptions such as "always hold >>> the mutex while signaling a cond. variable"). >>> >> !CONFIG_IOTHREAD code is doomed to bitrot once we switch to default >> iothread mode. So if Windows support is not converted to a threading >> model with moderate differences to POSIX, it will likely bitrot a well. >> Therefore, conversion should be started rather sooner than later (by >> someone interested in that platform). >> > > As far as I'm concerned, Windows support is already deprecated as noone > has stepped up to enhance it or support for a number of years now. We > shouldn't remove existing code that supports it or refuse to take > reasonable patches but if enabling IO thread by default breaks it, so be > it. As far as I see, Blue Swirl and Stefan Weil are regularly committing fixes for win32. Stefan Weil is also providing win32 binaries on his website [1]. I wouldn't call that deprecated. [1] http://qemu.weilnetz.de/
On 02/07/2011 03:45 PM, Aurelien Jarno wrote:
> On Mon, Feb 07, 2011 at 03:02:03PM -0600, Anthony Liguori wrote:
>
>> As far as I'm concerned, Windows support is already deprecated, as no one
>> has stepped up to enhance or support it for a number of years now. We
>> shouldn't remove existing code that supports it or refuse to take
>> reasonable patches, but if enabling the IO thread by default breaks it, so be
>> it.
>>
> As far as I see, Blue Swirl and Stefan Weil are regularly committing
> fixes for win32. Stefan Weil is also providing win32 binaries on his
> website [1]. I wouldn't call that deprecated.
>

Occasional compile fixes are a long way from something that is regularly
tested and well maintained.

Win32 still doesn't have a proper AIO implementation, which is probably
close to a 4-year-old FIXME.

Regards,

Anthony Liguori

> [1] http://qemu.weilnetz.de/

On Mon, Feb 07, 2011 at 08:09:52PM -0600, Anthony Liguori wrote:
> On 02/07/2011 03:45 PM, Aurelien Jarno wrote:
>> On Mon, Feb 07, 2011 at 03:02:03PM -0600, Anthony Liguori wrote:
>>
>>> As far as I'm concerned, Windows support is already deprecated, as no one
>>> has stepped up to enhance or support it for a number of years now. We
>>> shouldn't remove existing code that supports it or refuse to take
>>> reasonable patches, but if enabling the IO thread by default breaks it, so be
>>> it.
>>>
>> As far as I see, Blue Swirl and Stefan Weil are regularly committing
>> fixes for win32. Stefan Weil is also providing win32 binaries on his
>> website [1]. I wouldn't call that deprecated.
>>
>
> Occasional compile fixes are a long way from something that is regularly
> tested and well maintained.
>
> Win32 still doesn't have a proper AIO implementation, which is probably
> close to a 4-year-old FIXME.
>

I don't remember when we decided that AIO should be implemented on
any host OS. Any pointer?

On 02/08/2011 08:26 AM, Aurelien Jarno wrote:
> I don't remember when we decided that AIO should be implemented on
> any host OS. Any pointer?

To be fair, I/O-heavy workloads are almost unusable without AIO. For
Windows targets, they also crash under SMP due to the Windows AP
watchdog. But then TCG and SMP do not go very well together anyway.

However, I think deprecating Win32 support would be a very bad idea.

Paolo

On 2011-02-08 09:08, Paolo Bonzini wrote:
> On 02/08/2011 08:26 AM, Aurelien Jarno wrote:
>> I don't remember when we decided that AIO should be implemented on
>> any host OS. Any pointer?
>
> To be fair, I/O-heavy workloads are almost unusable without AIO. For
> Windows targets, they also crash under SMP due to the Windows AP
> watchdog. But then TCG and SMP do not go very well together anyway.
>
> However, I think deprecating Win32 support would be a very bad idea.

It would be too early at this point.

But if Windows ever becomes the only reason to keep tons of hardly tested
code paths around or to invest significant additional effort to change
logic or interfaces in this area, then I would prefer that step. I have
been hacking on IOTHREAD vs. !IOTHREAD for some weeks now, and all those
subtle differences are really a PITA and a source of various breakages.

People interested in that platform should finally realize that its fate
is coupled to reducing the #ifdefs as well as the design differences we
see right now and even more in the future.

Jan

Jan Kiszka wrote:
> On 2011-02-08 09:08, Paolo Bonzini wrote:
>> On 02/08/2011 08:26 AM, Aurelien Jarno wrote:
>>> I don't remember when we decided that AIO should be implemented on
>>> any host OS. Any pointer?
>> To be fair, I/O-heavy workloads are almost unusable without AIO. For
>> Windows targets, they also crash under SMP due to the Windows AP
>> watchdog. But then TCG and SMP do not go very well together anyway.
>>
>> However, I think deprecating Win32 support would be a very bad idea.
>
> It would be too early at this point.
>
> But if Windows ever becomes the only reason to keep tons of hardly tested
> code paths around or to invest significant additional effort to change
> logic or interfaces in this area, then I would prefer that step. I have
> been hacking on IOTHREAD vs. !IOTHREAD for some weeks now, and all those
> subtle differences are really a PITA and a source of various breakages.
>
> People interested in that platform should finally realize that its fate
> is coupled to reducing the #ifdefs as well as the design differences we
> see right now and even more in the future.
>

The guilty party here is IOTHREAD. Windows support predates the IOTHREAD
concept; it's just that the people who introduced IOTHREAD didn't care about
Windows support at all and added these #ifdefs. Disabling Windows support
because of that is not fair.

We should probably get rid of KVM support in QEMU, so if someone has an
idea for a cool TCG feature that can't be supported in KVM, it's the
moment to submit it. We can add it with #ifdef, and in one year just ask
for KVM support removal.

On 02/08/2011 03:05 AM, Aurelien Jarno wrote: > Jan Kiszka a écrit : > >> On 2011-02-08 09:08, Paolo Bonzini wrote: >> >>> On 02/08/2011 08:26 AM, Aurelien Jarno wrote: >>> >>>> I forget to remember when we decided that AIO should be implemented on >>>> any host OS. Any pointer? >>>> >>> To be fair, I/O-heavy workloads are almost unusable without AIO. For >>> Window targets, they also crash under SMP due to the Windows AP >>> watchdog. But then TCG and SMP do not go very well together anyway. >>> >>> However, I think deprecating Win32 support would be a very bad idea. >>> >> It would be too early at this point. >> >> But if Windows is once the only reason to keep tons of hardly tested >> code paths around or to invest significant additional effort to change >> logic or interfaces in this area, than I would prefer that step. I'm >> hacking on IOTHREAD vs. !IOTHREAD for some weeks now, and all those >> subtle differences are really a PITA and source of various breakages. >> >> People interested in that platform should finally realize that its fate >> is coupled to reducing the #ifdefs as well as the design differences we >> see right now and even more in the future. >> >> > The guilty here is IOTHREAD. Windows support predates IOTHREAD concept, > IOTHREAD is actually just as necessary for TCG as it is for KVM. Otherwise, you have a signal select race that cannot be avoided. QEMU has never "supported" Windows. It happens to compile on Windows, but historically the Windows build has been non-functional for long periods of time and is still missing basic features (like AIO). Regards, Anthony Liguori > it's just that people who introduce IOTHREAD didn't care about Windows > support at all and added these #ifdef. Disabling Windows support because > of that is not fair. > > We should probably get rid of KVM support in QEMU, so if someone has an > idea for a cool TCG feature that can't be supported in KVM, it's the > moment to submit it. 
We can add it with #ifdef, and in one year just ask > for KVM support removal. > >
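Anthony does not spell out the "signal select race" he refers to. As general background only: a signal that arrives after a program has checked for pending work but before it enters `select()` is not noticed until `select()` returns for some other reason. One widely used POSIX workaround, outside of any QEMU context, is the self-pipe trick sketched below. This is not QEMU code and makes no claim about whether the approach would suit QEMU's main loop; `wake_pipe`, `wake_setup` and `wake_wait` are hypothetical names.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical pipe used to turn a signal into a readable file descriptor. */
static int wake_pipe[2] = { -1, -1 };

/* Signal handler: only async-signal-safe calls are used here. */
static void wake_handler(int signo)
{
    int saved_errno = errno;
    char byte = 0;
    ssize_t n;

    (void)signo;
    /* If the pipe is full, a wakeup is already pending, so ignore errors. */
    n = write(wake_pipe[1], &byte, 1);
    (void)n;
    errno = saved_errno;
}

/* Create the non-blocking pipe and install the handler; 0 on success. */
static int wake_setup(int signo)
{
    struct sigaction sa;
    int i;

    if (pipe(wake_pipe) < 0) {
        return -1;
    }
    for (i = 0; i < 2; i++) {
        int flags = fcntl(wake_pipe[i], F_GETFL);
        if (flags < 0 || fcntl(wake_pipe[i], F_SETFL, flags | O_NONBLOCK) < 0) {
            return -1;
        }
    }
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = wake_handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/* Wait for a signal wakeup: 1 if one was consumed, 0 on timeout, -1 on error. */
static int wake_wait(long timeout_ms)
{
    struct timeval tv;
    fd_set rfds;
    char buf[64];
    int rc;

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    do {
        FD_ZERO(&rfds);
        FD_SET(wake_pipe[0], &rfds);
        /* After EINTR the remaining time in tv is system-dependent. */
        rc = select(wake_pipe[0] + 1, &rfds, NULL, NULL, &tv);
    } while (rc < 0 && errno == EINTR);
    if (rc <= 0) {
        return rc;
    }
    /* Drain every pending byte so the next wait starts clean. */
    while (read(wake_pipe[0], buf, sizeof(buf)) > 0) {
    }
    return 1;
}
```

A signal delivered before `wake_wait()` is entered leaves a byte in the pipe, so the subsequent `select()` returns immediately instead of sleeping for the full timeout.
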
On 02/08/2011 10:12 AM, Anthony Liguori wrote:
>> The guilty here is IOTHREAD. Windows support predates IOTHREAD concept,
>
> QEMU has never "supported" Windows.

I think both assertions are false. What's true is that the Win32 port
has never evolved beyond "barely functional", at least by the standards
with which QEMU is judged under Linux.

Paolo

On 2011-02-08 10:05, Aurelien Jarno wrote:
> Jan Kiszka wrote:
>> On 2011-02-08 09:08, Paolo Bonzini wrote:
>>> On 02/08/2011 08:26 AM, Aurelien Jarno wrote:
>>>> I don't remember when we decided that AIO should be implemented on
>>>> any host OS. Any pointer?
>>> To be fair, I/O-heavy workloads are almost unusable without AIO. For
>>> Windows targets, they also crash under SMP due to the Windows AP
>>> watchdog. But then TCG and SMP do not go very well together anyway.
>>>
>>> However, I think deprecating Win32 support would be a very bad idea.
>>
>> It would be too early at this point.
>>
>> But if Windows ever becomes the only reason to keep tons of hardly tested
>> code paths around or to invest significant additional effort to change
>> logic or interfaces in this area, then I would prefer that step. I have
>> been hacking on IOTHREAD vs. !IOTHREAD for some weeks now, and all those
>> subtle differences are really a PITA and a source of various breakages.
>>
>> People interested in that platform should finally realize that its fate
>> is coupled to reducing the #ifdefs as well as the design differences we
>> see right now and even more in the future.
>>
>
> The guilty party here is IOTHREAD. Windows support predates the IOTHREAD
> concept; it's just that the people who introduced IOTHREAD didn't care about
> Windows support at all and added these #ifdefs. Disabling Windows support
> because of that is not fair.

The TCG execution model won't scale long-term. It's already a pain to
boot a quad or just dual core VM, even more so when your host has at least
as many real cores. I'm sure we'll see multi-threaded TCG CPUs in the
future, and the iothread will just be one of 7, 17 or 257 threads.

Jan

Jan Kiszka a écrit : > On 2011-02-08 10:05, Aurelien Jarno wrote: >> Jan Kiszka a écrit : >>> On 2011-02-08 09:08, Paolo Bonzini wrote: >>>> On 02/08/2011 08:26 AM, Aurelien Jarno wrote: >>>>> I forget to remember when we decided that AIO should be implemented on >>>>> any host OS. Any pointer? >>>> To be fair, I/O-heavy workloads are almost unusable without AIO. For >>>> Window targets, they also crash under SMP due to the Windows AP >>>> watchdog. But then TCG and SMP do not go very well together anyway. >>>> >>>> However, I think deprecating Win32 support would be a very bad idea. >>> It would be too early at this point. >>> >>> But if Windows is once the only reason to keep tons of hardly tested >>> code paths around or to invest significant additional effort to change >>> logic or interfaces in this area, than I would prefer that step. I'm >>> hacking on IOTHREAD vs. !IOTHREAD for some weeks now, and all those >>> subtle differences are really a PITA and source of various breakages. >>> >>> People interested in that platform should finally realize that its fate >>> is coupled to reducing the #ifdefs as well as the design differences we >>> see right now and even more in the future. >>> >> The guilty here is IOTHREAD. Windows support predates IOTHREAD concept, >> it's just that people who introduce IOTHREAD didn't care about Windows >> support at all and added these #ifdef. Disabling Windows support because >> of that is not fair. > > The TCG execution model won't scale long-term. It's already a main to > boot a quad or just dual core VM, even more when your host has at least > as many real cores. I'm sure we'll see multi-threaded TCG CPUs in the > future, and the iothread will just be one of 7, 17 or 257 threads. > And what's the issue with that? People don't always look for performance when using QEMU. They even often try to emulate old machines (and non x86 ones), which anyway only have one CPU. 
This won't change in 5 years, the only thing is that those machines will be 5 years older. People have to keep in mind that QEMU doesn't mean only virtualization and doesn't mean only x86.
On 2011-02-08 10:58, Aurelien Jarno wrote: > Jan Kiszka a écrit : >> On 2011-02-08 10:05, Aurelien Jarno wrote: >>> Jan Kiszka a écrit : >>>> On 2011-02-08 09:08, Paolo Bonzini wrote: >>>>> On 02/08/2011 08:26 AM, Aurelien Jarno wrote: >>>>>> I forget to remember when we decided that AIO should be implemented on >>>>>> any host OS. Any pointer? >>>>> To be fair, I/O-heavy workloads are almost unusable without AIO. For >>>>> Window targets, they also crash under SMP due to the Windows AP >>>>> watchdog. But then TCG and SMP do not go very well together anyway. >>>>> >>>>> However, I think deprecating Win32 support would be a very bad idea. >>>> It would be too early at this point. >>>> >>>> But if Windows is once the only reason to keep tons of hardly tested >>>> code paths around or to invest significant additional effort to change >>>> logic or interfaces in this area, than I would prefer that step. I'm >>>> hacking on IOTHREAD vs. !IOTHREAD for some weeks now, and all those >>>> subtle differences are really a PITA and source of various breakages. >>>> >>>> People interested in that platform should finally realize that its fate >>>> is coupled to reducing the #ifdefs as well as the design differences we >>>> see right now and even more in the future. >>>> >>> The guilty here is IOTHREAD. Windows support predates IOTHREAD concept, >>> it's just that people who introduce IOTHREAD didn't care about Windows >>> support at all and added these #ifdef. Disabling Windows support because >>> of that is not fair. >> >> The TCG execution model won't scale long-term. It's already a main to >> boot a quad or just dual core VM, even more when your host has at least >> as many real cores. I'm sure we'll see multi-threaded TCG CPUs in the >> future, and the iothread will just be one of 7, 17 or 257 threads. >> > > And what's the issue with that? People don't always look for performance > when using QEMU. 
They even often try to emulate old machines (and non > x86 ones), which anyway only have one CPU. This won't change in 5 years, > the only thing is that those machines will be 5 years older. > > People have to keep in mind that QEMU doesn't mean only virtualization > and doesn't mean only x86. I'm not talking about virtualization here. I'm talking about usable emulation of today's (!) embedded multi-core platforms. It matters a lot if your test roundtrip for booting into an SMP guest and running some apps is a few tens of seconds, a few minutes or even not practically working. Ever tried to boot a 16 core VM in emulation mode? I did, for fun. I just hope I'll never depend on this for work. Jan
On 02/08/2011 10:58 AM, Aurelien Jarno wrote:
> And what's the issue with that? People don't always look for performance
> when using QEMU. They even often try to emulate old machines (and non
> x86 ones), which anyway only have one CPU. This won't change in 5 years,
> the only thing is that those machines will be 5 years older.
>
> People have to keep in mind that QEMU doesn't mean only virtualization
> and doesn't mean only x86.

AFAIU nobody is proposing to rip out linux-user or TCG, just to improve
their implementation. You just as well have to understand that AIO means
fewer Windows blue screens of death and not only better performance.

Paolo

Jan Kiszka a écrit : > On 2011-02-08 10:58, Aurelien Jarno wrote: >> Jan Kiszka a écrit : >>> On 2011-02-08 10:05, Aurelien Jarno wrote: >>>> Jan Kiszka a écrit : >>>>> On 2011-02-08 09:08, Paolo Bonzini wrote: >>>>>> On 02/08/2011 08:26 AM, Aurelien Jarno wrote: >>>>>>> I forget to remember when we decided that AIO should be implemented on >>>>>>> any host OS. Any pointer? >>>>>> To be fair, I/O-heavy workloads are almost unusable without AIO. For >>>>>> Window targets, they also crash under SMP due to the Windows AP >>>>>> watchdog. But then TCG and SMP do not go very well together anyway. >>>>>> >>>>>> However, I think deprecating Win32 support would be a very bad idea. >>>>> It would be too early at this point. >>>>> >>>>> But if Windows is once the only reason to keep tons of hardly tested >>>>> code paths around or to invest significant additional effort to change >>>>> logic or interfaces in this area, than I would prefer that step. I'm >>>>> hacking on IOTHREAD vs. !IOTHREAD for some weeks now, and all those >>>>> subtle differences are really a PITA and source of various breakages. >>>>> >>>>> People interested in that platform should finally realize that its fate >>>>> is coupled to reducing the #ifdefs as well as the design differences we >>>>> see right now and even more in the future. >>>>> >>>> The guilty here is IOTHREAD. Windows support predates IOTHREAD concept, >>>> it's just that people who introduce IOTHREAD didn't care about Windows >>>> support at all and added these #ifdef. Disabling Windows support because >>>> of that is not fair. >>> The TCG execution model won't scale long-term. It's already a main to >>> boot a quad or just dual core VM, even more when your host has at least >>> as many real cores. I'm sure we'll see multi-threaded TCG CPUs in the >>> future, and the iothread will just be one of 7, 17 or 257 threads. >>> >> And what's the issue with that? People don't always look for performance >> when using QEMU. 
They even often try to emulate old machines (and non >> x86 ones), which anyway only have one CPU. This won't change in 5 years, >> the only thing is that those machines will be 5 years older. >> >> People have to keep in mind that QEMU doesn't mean only virtualization >> and doesn't mean only x86. > > I'm not talking about virtualization here. I'm talking about usable > emulation of today's (!) embedded multi-core platforms. It matters a lot > if your test roundtrip for booting into a SMP guest and running some > apps is a few 10 seconds, a few minutes or even not practically working. > Ever tried to boot a 16 core VM in emulation mode? I did, for fun. I > just hope I'll never depend on this for work. Yes, it's slow. But is it a problem? You assume that people use QEMU only for emulating SMP platforms. This is a wrong assumption. Besides the x86 target, only sparc really supports SMP emulation.
On 08.02.2011, at 11:06, Aurelien Jarno wrote: > Jan Kiszka a écrit : >> On 2011-02-08 10:58, Aurelien Jarno wrote: >>> Jan Kiszka a écrit : >>>> On 2011-02-08 10:05, Aurelien Jarno wrote: >>>>> Jan Kiszka a écrit : >>>>>> On 2011-02-08 09:08, Paolo Bonzini wrote: >>>>>>> On 02/08/2011 08:26 AM, Aurelien Jarno wrote: >>>>>>>> I forget to remember when we decided that AIO should be implemented on >>>>>>>> any host OS. Any pointer? >>>>>>> To be fair, I/O-heavy workloads are almost unusable without AIO. For >>>>>>> Window targets, they also crash under SMP due to the Windows AP >>>>>>> watchdog. But then TCG and SMP do not go very well together anyway. >>>>>>> >>>>>>> However, I think deprecating Win32 support would be a very bad idea. >>>>>> It would be too early at this point. >>>>>> >>>>>> But if Windows is once the only reason to keep tons of hardly tested >>>>>> code paths around or to invest significant additional effort to change >>>>>> logic or interfaces in this area, than I would prefer that step. I'm >>>>>> hacking on IOTHREAD vs. !IOTHREAD for some weeks now, and all those >>>>>> subtle differences are really a PITA and source of various breakages. >>>>>> >>>>>> People interested in that platform should finally realize that its fate >>>>>> is coupled to reducing the #ifdefs as well as the design differences we >>>>>> see right now and even more in the future. >>>>>> >>>>> The guilty here is IOTHREAD. Windows support predates IOTHREAD concept, >>>>> it's just that people who introduce IOTHREAD didn't care about Windows >>>>> support at all and added these #ifdef. Disabling Windows support because >>>>> of that is not fair. >>>> The TCG execution model won't scale long-term. It's already a main to >>>> boot a quad or just dual core VM, even more when your host has at least >>>> as many real cores. I'm sure we'll see multi-threaded TCG CPUs in the >>>> future, and the iothread will just be one of 7, 17 or 257 threads. >>>> >>> And what's the issue with that? 
People don't always look for performance >>> when using QEMU. They even often try to emulate old machines (and non >>> x86 ones), which anyway only have one CPU. This won't change in 5 years, >>> the only thing is that those machines will be 5 years older. >>> >>> People have to keep in mind that QEMU doesn't mean only virtualization >>> and doesn't mean only x86. >> >> I'm not talking about virtualization here. I'm talking about usable >> emulation of today's (!) embedded multi-core platforms. It matters a lot >> if your test roundtrip for booting into a SMP guest and running some >> apps is a few 10 seconds, a few minutes or even not practically working. >> Ever tried to boot a 16 core VM in emulation mode? I did, for fun. I >> just hope I'll never depend on this for work. > > Yes, it's slow. But is it a problem? You assume that people use QEMU > only for emulating SMP platforms. This is a wrong assumption. Beside the > x86 target, only sparc really supports SMP emulation. I guess his point here really is that soon SMP is commodity. Most new ARM cores move to SMP by default, MIPS is there already and even embedded PPC is multi-core for a while now. Sure, you can work around things by only emulating a single core at times, but it's not always good enough - especially if you're working on interrupt handling code. Either way, the whole discussion is moot. We either do support Windows or we don't. Most of the developers don't even have windows machines, so it's very hard for them to do it - even less so do they have windows programming knowledge. So what we really need is for someone to implement the thread infrastructure and aio support on windows and then all is great (until the next big infrastructure feature of course). If only the Android people wouldn't simply fork every project out there, but work upstream, we'd probably have quite a few folks happy to support windows from that crowd, as they depend on it heavily. Alex
Introducing IOTHREAD made !CONFIG_IOTHREAD platforms second class citizens. I think you'd like people to provide full support when they introduce new features. This is a good motivator to use glib and have a unified code path for TCG/KVM and Linux/Windows. Yes it will require some work and some optimization, but at the end we'll have better host platform parity and a simpler main loop for TCG/KVM to interact with. Stefan
On 2011-02-08 11:06, Aurelien Jarno wrote: > Jan Kiszka a écrit : >> On 2011-02-08 10:58, Aurelien Jarno wrote: >>> Jan Kiszka a écrit : >>>> On 2011-02-08 10:05, Aurelien Jarno wrote: >>>>> Jan Kiszka a écrit : >>>>>> On 2011-02-08 09:08, Paolo Bonzini wrote: >>>>>>> On 02/08/2011 08:26 AM, Aurelien Jarno wrote: >>>>>>>> I forget to remember when we decided that AIO should be implemented on >>>>>>>> any host OS. Any pointer? >>>>>>> To be fair, I/O-heavy workloads are almost unusable without AIO. For >>>>>>> Window targets, they also crash under SMP due to the Windows AP >>>>>>> watchdog. But then TCG and SMP do not go very well together anyway. >>>>>>> >>>>>>> However, I think deprecating Win32 support would be a very bad idea. >>>>>> It would be too early at this point. >>>>>> >>>>>> But if Windows is once the only reason to keep tons of hardly tested >>>>>> code paths around or to invest significant additional effort to change >>>>>> logic or interfaces in this area, than I would prefer that step. I'm >>>>>> hacking on IOTHREAD vs. !IOTHREAD for some weeks now, and all those >>>>>> subtle differences are really a PITA and source of various breakages. >>>>>> >>>>>> People interested in that platform should finally realize that its fate >>>>>> is coupled to reducing the #ifdefs as well as the design differences we >>>>>> see right now and even more in the future. >>>>>> >>>>> The guilty here is IOTHREAD. Windows support predates IOTHREAD concept, >>>>> it's just that people who introduce IOTHREAD didn't care about Windows >>>>> support at all and added these #ifdef. Disabling Windows support because >>>>> of that is not fair. >>>> The TCG execution model won't scale long-term. It's already a main to >>>> boot a quad or just dual core VM, even more when your host has at least >>>> as many real cores. I'm sure we'll see multi-threaded TCG CPUs in the >>>> future, and the iothread will just be one of 7, 17 or 257 threads. >>>> >>> And what's the issue with that? 
People don't always look for performance >>> when using QEMU. They even often try to emulate old machines (and non >>> x86 ones), which anyway only have one CPU. This won't change in 5 years, >>> the only thing is that those machines will be 5 years older. >>> >>> People have to keep in mind that QEMU doesn't mean only virtualization >>> and doesn't mean only x86. >> >> I'm not talking about virtualization here. I'm talking about usable >> emulation of today's (!) embedded multi-core platforms. It matters a lot >> if your test roundtrip for booting into a SMP guest and running some >> apps is a few 10 seconds, a few minutes or even not practically working. >> Ever tried to boot a 16 core VM in emulation mode? I did, for fun. I >> just hope I'll never depend on this for work. > > Yes, it's slow. But is it a problem? You assume that people use QEMU > only for emulating SMP platforms. This is a wrong assumption. Beside the > x86 target, only sparc really supports SMP emulation. That's too nearsighted. SMP will be a commodity on practically _any_ arch within the next few years. And if QEMU doesn't keep up with it, feature- and performance-wise, it will lose market share. Jan
Jan Kiszka a écrit : > On 2011-02-08 11:06, Aurelien Jarno wrote: >> Jan Kiszka a écrit : >>> On 2011-02-08 10:58, Aurelien Jarno wrote: >>>> Jan Kiszka a écrit : >>>>> On 2011-02-08 10:05, Aurelien Jarno wrote: >>>>>> Jan Kiszka a écrit : >>>>>>> On 2011-02-08 09:08, Paolo Bonzini wrote: >>>>>>>> On 02/08/2011 08:26 AM, Aurelien Jarno wrote: >>>>>>>>> I forget to remember when we decided that AIO should be implemented on >>>>>>>>> any host OS. Any pointer? >>>>>>>> To be fair, I/O-heavy workloads are almost unusable without AIO. For >>>>>>>> Window targets, they also crash under SMP due to the Windows AP >>>>>>>> watchdog. But then TCG and SMP do not go very well together anyway. >>>>>>>> >>>>>>>> However, I think deprecating Win32 support would be a very bad idea. >>>>>>> It would be too early at this point. >>>>>>> >>>>>>> But if Windows is once the only reason to keep tons of hardly tested >>>>>>> code paths around or to invest significant additional effort to change >>>>>>> logic or interfaces in this area, than I would prefer that step. I'm >>>>>>> hacking on IOTHREAD vs. !IOTHREAD for some weeks now, and all those >>>>>>> subtle differences are really a PITA and source of various breakages. >>>>>>> >>>>>>> People interested in that platform should finally realize that its fate >>>>>>> is coupled to reducing the #ifdefs as well as the design differences we >>>>>>> see right now and even more in the future. >>>>>>> >>>>>> The guilty here is IOTHREAD. Windows support predates IOTHREAD concept, >>>>>> it's just that people who introduce IOTHREAD didn't care about Windows >>>>>> support at all and added these #ifdef. Disabling Windows support because >>>>>> of that is not fair. >>>>> The TCG execution model won't scale long-term. It's already a main to >>>>> boot a quad or just dual core VM, even more when your host has at least >>>>> as many real cores. 
I'm sure we'll see multi-threaded TCG CPUs in the >>>>> future, and the iothread will just be one of 7, 17 or 257 threads. >>>>> >>>> And what's the issue with that? People don't always look for performance >>>> when using QEMU. They even often try to emulate old machines (and non >>>> x86 ones), which anyway only have one CPU. This won't change in 5 years, >>>> the only thing is that those machines will be 5 years older. >>>> >>>> People have to keep in mind that QEMU doesn't mean only virtualization >>>> and doesn't mean only x86. >>> I'm not talking about virtualization here. I'm talking about usable >>> emulation of today's (!) embedded multi-core platforms. It matters a lot >>> if your test roundtrip for booting into a SMP guest and running some >>> apps is a few 10 seconds, a few minutes or even not practically working. >>> Ever tried to boot a 16 core VM in emulation mode? I did, for fun. I >>> just hope I'll never depend on this for work. >> Yes, it's slow. But is it a problem? You assume that people use QEMU >> only for emulating SMP platforms. This is a wrong assumption. Beside the >> x86 target, only sparc really supports SMP emulation. > > That's too nearsighted. SMP will be commodity on practically _any_ arch > within the next years. And if QEMU doesn't keep up with it, feature and > performance-wise, it will loose market share. > Oh, commercial arguments now. I am looking for something that answers my needs; I am not concerned about market share.
Stefan Hajnoczi wrote: > Introducing IOTHREAD made !CONFIG_IOTHREAD platforms second class > citizens. I think you'd like people to provide full support when they > introduce new features. > I think you have really pinpointed the problem here. We should probably add a feature that will make KVM a second-class citizen so that people can understand what it means.
On 2011-02-08 11:26, Aurelien Jarno wrote: > Jan Kiszka a écrit : >> On 2011-02-08 11:06, Aurelien Jarno wrote: >>> Jan Kiszka a écrit : >>>> On 2011-02-08 10:58, Aurelien Jarno wrote: >>>>> Jan Kiszka a écrit : >>>>>> On 2011-02-08 10:05, Aurelien Jarno wrote: >>>>>>> Jan Kiszka a écrit : >>>>>>>> On 2011-02-08 09:08, Paolo Bonzini wrote: >>>>>>>>> On 02/08/2011 08:26 AM, Aurelien Jarno wrote: >>>>>>>>>> I forget to remember when we decided that AIO should be implemented on >>>>>>>>>> any host OS. Any pointer? >>>>>>>>> To be fair, I/O-heavy workloads are almost unusable without AIO. For >>>>>>>>> Window targets, they also crash under SMP due to the Windows AP >>>>>>>>> watchdog. But then TCG and SMP do not go very well together anyway. >>>>>>>>> >>>>>>>>> However, I think deprecating Win32 support would be a very bad idea. >>>>>>>> It would be too early at this point. >>>>>>>> >>>>>>>> But if Windows is once the only reason to keep tons of hardly tested >>>>>>>> code paths around or to invest significant additional effort to change >>>>>>>> logic or interfaces in this area, than I would prefer that step. I'm >>>>>>>> hacking on IOTHREAD vs. !IOTHREAD for some weeks now, and all those >>>>>>>> subtle differences are really a PITA and source of various breakages. >>>>>>>> >>>>>>>> People interested in that platform should finally realize that its fate >>>>>>>> is coupled to reducing the #ifdefs as well as the design differences we >>>>>>>> see right now and even more in the future. >>>>>>>> >>>>>>> The guilty here is IOTHREAD. Windows support predates IOTHREAD concept, >>>>>>> it's just that people who introduce IOTHREAD didn't care about Windows >>>>>>> support at all and added these #ifdef. Disabling Windows support because >>>>>>> of that is not fair. >>>>>> The TCG execution model won't scale long-term. It's already a main to >>>>>> boot a quad or just dual core VM, even more when your host has at least >>>>>> as many real cores. 
I'm sure we'll see multi-threaded TCG CPUs in the >>>>>> future, and the iothread will just be one of 7, 17 or 257 threads. >>>>>> >>>>> And what's the issue with that? People don't always look for performance >>>>> when using QEMU. They even often try to emulate old machines (and non >>>>> x86 ones), which anyway only have one CPU. This won't change in 5 years, >>>>> the only thing is that those machines will be 5 years older. >>>>> >>>>> People have to keep in mind that QEMU doesn't mean only virtualization >>>>> and doesn't mean only x86. >>>> I'm not talking about virtualization here. I'm talking about usable >>>> emulation of today's (!) embedded multi-core platforms. It matters a lot >>>> if your test roundtrip for booting into a SMP guest and running some >>>> apps is a few 10 seconds, a few minutes or even not practically working. >>>> Ever tried to boot a 16 core VM in emulation mode? I did, for fun. I >>>> just hope I'll never depend on this for work. >>> Yes, it's slow. But is it a problem? You assume that people use QEMU >>> only for emulating SMP platforms. This is a wrong assumption. Beside the >>> x86 target, only sparc really supports SMP emulation. >> >> That's too nearsighted. SMP will be commodity on practically _any_ arch >> within the next years. And if QEMU doesn't keep up with it, feature and >> performance-wise, it will loose market share. >> > > Oh commercial arguments now. I am looking for something that answer my > needs, not about market share. > "Market share" simply means user base, for commercial or for hobby, academic, whatever use. QEMU has a nice position here ATM. Even commercial competitors can help continuously comparing their solutions with QEMU (I once enjoyed such a product presentation). However, time does not stand still. Jan
On 02/08/2011 11:27 AM, Aurelien Jarno wrote: > Stefan Hajnoczi a écrit : >> Introducing IOTHREAD made !CONFIG_IOTHREAD platforms second class >> citizens. I think you'd like people to provide full support when they >> introduce new features. > > I think you really pointed the problem here. We should probably add a > feature that will make KVM second class citizen so that people can > understand what it means. I actually don't think introducing IOTHREAD made Windows a second class citizen, since it was left as a non-default choice for years. People care about IOTHREAD now only because (after years) there is serious thought about making it the default. I'm sure that if you add such a killer feature that is TCG-only, KVM people will try to support it in a shorter timeframe. Paolo
On 2011-02-08 11:27, Aurelien Jarno wrote: > Stefan Hajnoczi a écrit : >> Introducing IOTHREAD made !CONFIG_IOTHREAD platforms second class >> citizens. I think you'd like people to provide full support when they >> introduce new features. >> > > I think you really pointed the problem here. We should probably add a > feature that will make KVM second class citizen so that people can > understand what it means. There are people out there who already thought loudly about forking or rewriting those QEMU bits required for KVM support just to make "life easier". I already disagreed on this, and I continue to do so as both use cases nicely benefit from each other. KVM is driving QEMU features today that would otherwise have taken years to show up, if at all. On the other side, all those bits related to the cross-arch platform emulation of non-x86 helps and will continue to help KVM support on those archs as well (we already have it on PPC, we'll see on ARM and likely more in the future). So, please let's stop this useless finger pointing, on both sides. KVM and QEMU is a symbiosis. Unfortunately, this is not (yet?) the case for POSIX vs. Windows hosts. Jan
On Feb 8, 2011, at 6:58 PM, Anthony Liguori wrote: > On 02/08/2011 04:06 AM, Aurelien Jarno wrote: >> Yes, it's slow. But is it a problem? You assume that people use QEMU >> only for emulating SMP platforms. This is a wrong assumption. Beside the >> x86 target, only sparc really supports SMP emulation. >> > > It's *not* just about performance. > > TCG requires a signal to break out of a tight chained TB loop. If you have a guest in a tight loop waiting for something external (like polling on a in-memory flag), the device emulation will not get to run until a signal is fired. > > Unless you set SIGIO on every file descriptor that selects polls on (and you can't because there are a number that just don't support SIGIO), then you have a race condition. A race condition? It looks like you are describing a deadlock. But the deadlock doesn't happen, because the timer periodically forces an exit from TCG. Hence the performance issue. > This can be fixed by running TCG in a separate thread than select() and sending a signal to the TCG VCPU when select() returns (effectively SIGIO in userspace). > > This is exactly what the I/O thread does. (Was nobody able to make it work on Windows, or was nobody interested?) Tristan.
Anthony Liguori wrote: > On 02/08/2011 04:06 AM, Aurelien Jarno wrote: >> Yes, it's slow. But is it a problem? You assume that people use QEMU >> only for emulating SMP platforms. This is a wrong assumption. Beside the >> x86 target, only sparc really supports SMP emulation. >> > > It's *not* just about performance. > > TCG requires a signal to break out of a tight chained TB loop. If you > have a guest in a tight loop waiting for something external (like > polling on a in-memory flag), the device emulation will not get to run > until a signal is fired. > > Unless you set SIGIO on every file descriptor that selects polls on (and > you can't because there are a number that just don't support SIGIO), > then you have a race condition. > In practice you will get a signal when the next timer event expires. I agree it's suboptimal, but it works, and has been like that for years. Having that fixed through an I/O thread is actually quite nice; however, it should not be done while ignoring all the *current* drawbacks of the iothread mode. We know them (at least some of them), so let's try to solve them. And I don't buy the argument "it's been there for years": it was *disabled* by default.
Anthony Liguori wrote: > On 02/08/2011 04:27 AM, Aurelien Jarno wrote: >> Stefan Hajnoczi wrote: >> >>> Introducing IOTHREAD made !CONFIG_IOTHREAD platforms second class >>> citizens. I think you'd like people to provide full support when they >>> introduce new features. >>> >>> >> I think you really pointed the problem here. We should probably add a >> feature that will make KVM second class citizen so that people can >> understand what it means. >> > > Aurelien, > > Have you actually run QEMU on Windows and tried to use it to do > something useful? > > As an exercise, walk through the various releases of QEMU and compare > how well it works on Windows to any Unix platform. Windows support in > QEMU has always been a second class citizen. I never tried to get it working on Windows, but I know some people using it there. We just shouldn't ignore them. Maybe it's not perfect, but it is enough for those people. > If someone is willing to stand up and properly maintain it, I'm all for > doing whatever we can to be supportive of that person but as of right > now, that doesn't exist. There are regular patches for Windows support, and Stefan Weil is producing builds regularly. Maybe it doesn't have all the features, but people are making sure it basically works. Now you want to break that because the *new* feature you want to introduce is not supported on Windows. I insist that it is a new feature simply because it was *disabled* by default. So I don't buy the argument that "that person doesn't exist". Send a call for help on that subject, give people some time to come up with a solution (let's set a deadline), and then if nobody appears we can definitely consider Windows support dead. But it should not be done arbitrarily.
On Tue, Feb 08, 2011 at 12:07:02PM +0100, Tristan Gingold wrote: > > On Feb 8, 2011, at 6:58 PM, Anthony Liguori wrote: > > > On 02/08/2011 04:06 AM, Aurelien Jarno wrote: > >> Yes, it's slow. But is it a problem? You assume that people use QEMU > >> only for emulating SMP platforms. This is a wrong assumption. Beside the > >> x86 target, only sparc really supports SMP emulation. > >> > > > > It's *not* just about performance. > > > > TCG requires a signal to break out of a tight chained TB loop. If you have a guest in a tight loop waiting for something external (like polling on a in-memory flag), the device emulation will not get to run until a signal is fired. > > > > Unless you set SIGIO on every file descriptor that selects polls on (and you can't because there are a number that just don't support SIGIO), then you have a race condition. > > A race condition ? Looks like you are describing a dead-lock. > > But the dead lock doesn't happen because of the timer which periodically exits from TCG. Hence the performance issue. > > > This can be fixed by running TCG in a separate thread than select() and sending a signal to the TCG VCPU when select() returns (effectively SIGIO in userspace). > > > > This is exactly what the I/O thread does. > > > (Nobody was able to make it working on Windows - or nobody was interested in ?) > Given the I/O thread is disabled by default, my guess is that nobody really sees an interest in looking at that.
On 02/08/2011 12:46 PM, Aurelien Jarno wrote: > Given the I/O thread is disabled by default, my guess is that nobody > really see an interest in looking at that. I had started looking at it in my free time. I stopped because the thread pool series were continuously changing the QemuThread APIs. I can resume looking at it. Paolo
On 02/08/2011 12:15 PM, Aurelien Jarno wrote: > however > it should not be done ignoring all the*current* drawbacks of the > iothread mode. We know them (at least for some of them), so let's try to > solve them. Let's also enumerate them. > And now, I don't buy the argument "it's been there for years", it was > *disabled* by default. It was disabled by default only because it is most useful for KVM and people were using qemu-kvm's iothread. Paolo
On Tue, Feb 08, 2011 at 12:05:31PM -0600, Anthony Liguori wrote: > Aurelien, > Have you actually run QEMU on Windows and tried to use it to do > something useful? I'm not Aurelien, but we do use QEMU on win32 as part of the Nokia Qt SDK. While it is second class in many ways compared to Linux QEMU (or even OS X), it is still quite useful for us. Riku
Anthony Liguori a écrit : > On 02/08/2011 05:15 AM, Aurelien Jarno wrote: >> Anthony Liguori a écrit : >> >>> On 02/08/2011 04:06 AM, Aurelien Jarno wrote: >>> >>>> Yes, it's slow. But is it a problem? You assume that people use QEMU >>>> only for emulating SMP platforms. This is a wrong assumption. Beside the >>>> x86 target, only sparc really supports SMP emulation. >>>> >>>> >>> It's *not* just about performance. >>> >>> TCG requires a signal to break out of a tight chained TB loop. If you >>> have a guest in a tight loop waiting for something external (like >>> polling on a in-memory flag), the device emulation will not get to run >>> until a signal is fired. >>> >>> Unless you set SIGIO on every file descriptor that selects polls on (and >>> you can't because there are a number that just don't support SIGIO), >>> then you have a race condition. >>> >>> >> In practice you will get a signal when the next timer event expire. I >> agree it's suboptimal, but it works, and has been like that for here. >> > > During early boot up before the periodic timer is enabled can cause > quite a noticable issue here. > > I think it's cris specifically that does polling I/O in the early > startup before any periodic timer is enabled. > >> Having that fixed through an I/O thread is actually quite nice, however >> it should not be done ignoring all the *current* drawbacks of the >> iothread mode. We know them (at least for some of them), so let's try to >> solve them. >> > > Yes, agree 100%. > >> And now, I don't buy the argument "it's been there for years", it was >> *disabled* by default. >> > > Yeah, I think we need to enable it by default and commit to fixing all > of the outstanding issues. So the strategy is let's break everything and wait for the maintainer to fix that? This strategy doesn't work, we have seen for example that with the SeaBIOS switch. While it brings nice features, it has broken the isapc machine. And it's still not fixed... 
Also this strategy doesn't scale: the maintainers end up spending their time fixing bugs introduced because others didn't care. Resources are not unlimited, especially for those doing that in their free time. > I think we've fixed all that we're aware of but we probably won't find > the rest unless we enable it universally. I agree that we are going to discover bugs, and it's normal. QEMU is quite complex and it's not possible to test every combination. That said, we are already aware of some bugs, so why not fix them, or at least try to fix them? For example, we haven't fixed the performance regression with TCG (at least it wasn't the case two weeks ago).
Paolo Bonzini wrote: > On 02/08/2011 12:15 PM, Aurelien Jarno wrote: >> however >> it should not be done ignoring all the*current* drawbacks of the >> iothread mode. We know them (at least for some of them), so let's try to >> solve them. > > Let's also enumerate them. > From what I know: - performance regression in TCG mode - Windows support I am going to look again at the first one tonight to provide some numbers.
Aurelien Jarno wrote: > Paolo Bonzini wrote: >> On 02/08/2011 12:15 PM, Aurelien Jarno wrote: >>> however >>> it should not be done ignoring all the*current* drawbacks of the >>> iothread mode. We know them (at least for some of them), so let's try to >>> solve them. >> Let's also enumerate them. >> > > From what I know: > - performance regression in TCG mode I set up an x86_64 guest on an x86_64 host (Intel Xeon E5345). Nothing was running except the standard daemons, and the CPU governor was set to "performance" on all CPUs. I then compared the network performance using netperf in default mode, through a tap interface and a virtio NIC. I got the following results (quite reproducible, standard deviation below 0.5): - without IO thread: 107.36 MB/s - with IO thread: 89.93 MB/s I haven't redone the tests I did two weeks ago on MIPS, ARM, PowerPC and SH4 (using different emulated network cards: smc91c111, rtl8139, e1000, virtio), but it was roughly the same slowdown, except on ARM where it was more pronounced.
Anthony Liguori a écrit : > On 02/08/2011 07:30 AM, Aurelien Jarno wrote: >> So the strategy is let's break everything and wait for the maintainer to >> fix that? This strategy doesn't work, we have seen for example that with >> the SeaBIOS switch. While it brings nice features, it has broken the >> isapc machine. And it's still not fixed... >> > > The fundamental problem is that poorly thought out features have been > committed in the past. isapc is a good example of this. > > You can't just remove a chipset but leave an ISA bus implementation and > expect things to just keep working. Even the early ISA-only systems had > a chipset that firmware interfaced with. > >> Also this strategy doesn't scale, then the maintainers are spending >> their time fixing bugs introduced because others didn't care. Resources >> are not unlimited, especially for those doing that on their free time. >> > > So are you suggesting that every half baked feature should hold up any > other future developments? I think the real problem is exactly the > opposite of what you describe. Why should we waste finite resources > keeping something like Windows support limping along? > > We need to do a better job of not adding features that there is no > serious intention of every supporting in a meaningful way. I think the > recent discussion of w64 is a good example of this. I can't imagine > trying to support w64 in QEMU until someone actually makes w32 work in a > reasonable way. Yes, we should at least leave people time to find a solution. If nobody comes with a solution, let's consider it deprecated. >>> I think we've fixed all that we're aware of but we probably won't find >>> the rest unless we enable it universally. >>> >> I agree that we are going to discover bugs, and it's normal. QEMU is >> quite complex and it's not possible to test every combination. That said >> we are already aware of some bugs, why not fix them, or at least try to >> fix them? 
For example we haven't fixed the performance regression with >> TCG (at least it wasn't the case two weeks ago). >> > > If there are known issues, yes, let's fix them before enabling it. > So please look at this TCG performance regression instead of talking about enabling this just after the release. I don't consider TCG a half-baked feature; for those who have forgotten, it's the original QEMU mode.
On 02/08/2011 04:06 AM, Aurelien Jarno wrote: > Yes, it's slow. But is it a problem? You assume that people use QEMU > only for emulating SMP platforms. This is a wrong assumption. Beside the > x86 target, only sparc really supports SMP emulation. > It's *not* just about performance. TCG requires a signal to break out of a tight chained TB loop. If you have a guest in a tight loop waiting for something external (like polling on a in-memory flag), the device emulation will not get to run until a signal is fired. Unless you set SIGIO on every file descriptor that selects polls on (and you can't because there are a number that just don't support SIGIO), then you have a race condition. This can be fixed by running TCG in a separate thread than select() and sending a signal to the TCG VCPU when select() returns (effectively SIGIO in userspace). This is exactly what the I/O thread does. Regards, Anthony Liguori
On 02/08/2011 04:27 AM, Aurelien Jarno wrote: > Stefan Hajnoczi a écrit : > >> Introducing IOTHREAD made !CONFIG_IOTHREAD platforms second class >> citizens. I think you'd like people to provide full support when they >> introduce new features. >> >> > I think you really pointed the problem here. We should probably add a > feature that will make KVM second class citizen so that people can > understand what it means. > Aurelien, Have you actually run QEMU on Windows and tried to use it to do something useful? As an exercise, walk through the various releases of QEMU and compare how well it works on Windows to any Unix platform. Windows support in QEMU has always been a second class citizen. If someone is willing to stand up and properly maintain it, I'm all for doing whatever we can to be supportive of that person but as of right now, that doesn't exist. Regards, Anthony Liguori
On 02/08/2011 05:15 AM, Aurelien Jarno wrote: > Anthony Liguori wrote: > >> On 02/08/2011 04:06 AM, Aurelien Jarno wrote: >> >>> Yes, it's slow. But is it a problem? You assume that people use QEMU >>> only for emulating SMP platforms. This is a wrong assumption. Beside the >>> x86 target, only sparc really supports SMP emulation. >>> >>> >> It's *not* just about performance. >> >> TCG requires a signal to break out of a tight chained TB loop. If you >> have a guest in a tight loop waiting for something external (like >> polling on a in-memory flag), the device emulation will not get to run >> until a signal is fired. >> >> Unless you set SIGIO on every file descriptor that selects polls on (and >> you can't because there are a number that just don't support SIGIO), >> then you have a race condition. >> >> > In practice you will get a signal when the next timer event expire. I > agree it's suboptimal, but it works, and has been like that for here. > Early boot-up, before the periodic timer is enabled, can cause quite a noticeable issue here. I think it's cris specifically that does polling I/O in the early startup before any periodic timer is enabled. > Having that fixed through an I/O thread is actually quite nice, however > it should not be done ignoring all the *current* drawbacks of the > iothread mode. We know them (at least for some of them), so let's try to > solve them. > Yes, agree 100%. > And now, I don't buy the argument "it's been there for years", it was > *disabled* by default. > Yeah, I think we need to enable it by default and commit to fixing all of the outstanding issues. I think we've fixed all that we're aware of but we probably won't find the rest unless we enable it universally. Regards, Anthony Liguori
On 02/08/2011 05:46 AM, Aurelien Jarno wrote: > On Tue, Feb 08, 2011 at 12:07:02PM +0100, Tristan Gingold wrote: > >> On Feb 8, 2011, at 6:58 PM, Anthony Liguori wrote: >> >> >>> On 02/08/2011 04:06 AM, Aurelien Jarno wrote: >>> >>>> Yes, it's slow. But is it a problem? You assume that people use QEMU >>>> only for emulating SMP platforms. This is a wrong assumption. Beside the >>>> x86 target, only sparc really supports SMP emulation. >>>> >>>> >>> It's *not* just about performance. >>> >>> TCG requires a signal to break out of a tight chained TB loop. If you have a guest in a tight loop waiting for something external (like polling on a in-memory flag), the device emulation will not get to run until a signal is fired. >>> >>> Unless you set SIGIO on every file descriptor that selects polls on (and you can't because there are a number that just don't support SIGIO), then you have a race condition. >>> >> A race condition ? Looks like you are describing a dead-lock. >> >> But the dead lock doesn't happen because of the timer which periodically exits from TCG. Hence the performance issue. >> With dynticks, you don't always have a periodic timer (unless the guest has a periodic timer enabled). There's a good bit of early startup code that runs without a periodic timer enabled. Now that said, we never truly sleep forever. We'll set something like a 5 second timeout. But 5 seconds might as well be forever and this is certainly a giant hack. Regards, Anthony Liguori >>> This can be fixed by running TCG in a separate thread than select() and sending a signal to the TCG VCPU when select() returns (effectively SIGIO in userspace). >>> >>> This is exactly what the I/O thread does. >>> >> >> (Nobody was able to make it working on Windows - or nobody was interested in ?) >> >> > Given the I/O thread is disabled by default, my guess is that nobody > really see an interest in looking at that. > >
On 02/08/2011 07:30 AM, Aurelien Jarno wrote: > So the strategy is let's break everything and wait for the maintainer to > fix that? This strategy doesn't work, we have seen for example that with > the SeaBIOS switch. While it brings nice features, it has broken the > isapc machine. And it's still not fixed... > The fundamental problem is that poorly thought out features have been committed in the past. isapc is a good example of this. You can't just remove a chipset but leave an ISA bus implementation and expect things to just keep working. Even the early ISA-only systems had a chipset that firmware interfaced with. > Also this strategy doesn't scale, then the maintainers are spending > their time fixing bugs introduced because others didn't care. Resources > are not unlimited, especially for those doing that on their free time. > So are you suggesting that every half-baked feature should hold up any other future developments? I think the real problem is exactly the opposite of what you describe. Why should we waste finite resources keeping something like Windows support limping along? We need to do a better job of not adding features that there is no serious intention of ever supporting in a meaningful way. I think the recent discussion of w64 is a good example of this. I can't imagine trying to support w64 in QEMU until someone actually makes w32 work in a reasonable way. >> I think we've fixed all that we're aware of but we probably won't find >> the rest unless we enable it universally. >> > I agree that we are going to discover bugs, and it's normal. QEMU is > quite complex and it's not possible to test every combination. That said > we are already aware of some bugs, why not fix them, or at least try to > fix them? For example we haven't fixed the performance regression with > TCG (at least it wasn't the case two weeks ago). > If there are known issues, yes, let's fix them before enabling it. Regards, Anthony Liguori
On Tue, Feb 8, 2011 at 5:09 PM, Aurelien Jarno <aurelien@aurel32.net> wrote: > Anthony Liguori wrote: >> On 02/08/2011 07:30 AM, Aurelien Jarno wrote: >>> So the strategy is let's break everything and wait for the maintainer to >>> fix that? This strategy doesn't work, we have seen for example that with >>> the SeaBIOS switch. While it brings nice features, it has broken the >>> isapc machine. And it's still not fixed... >>> >> >> The fundamental problem is that poorly thought out features have been >> committed in the past. isapc is a good example of this. >> >> You can't just remove a chipset but leave an ISA bus implementation and >> expect things to just keep working. Even the early ISA-only systems had >> a chipset that firmware interfaced with. >> >>> Also this strategy doesn't scale, then the maintainers are spending >>> their time fixing bugs introduced because others didn't care. Resources >>> are not unlimited, especially for those doing that on their free time. >>> >> >> So are you suggesting that every half baked feature should hold up any >> other future developments? I think the real problem is exactly the >> opposite of what you describe. Why should we waste finite resources >> keeping something like Windows support limping along? >> >> We need to do a better job of not adding features that there is no >> serious intention of every supporting in a meaningful way. I think the >> recent discussion of w64 is a good example of this. I can't imagine >> trying to support w64 in QEMU until someone actually makes w32 work in a >> reasonable way. > > Yes, we should at least leave people time to find a solution. If nobody > comes with a solution, let's consider it deprecated. I think the win32 situation is somewhat similar to kqemu's (though not nearly as bad). It was useful for some users, but there was no maintenance, and when it got in the way it was removed because nobody could fix it. 
But I'd prefer a solution where somebody steps up as Windows maintainer. I'm also doing regular mingw32 builds but otherwise not much.
On Tue, Feb 08, 2011 at 04:08:28PM +0100, Aurelien Jarno wrote:
> Aurelien Jarno wrote:
> > Paolo Bonzini wrote:
> >> On 02/08/2011 12:15 PM, Aurelien Jarno wrote:
> >>> however
> >>> it should not be done ignoring all the *current* drawbacks of the
> >>> iothread mode. We know them (at least for some of them), so let's try to
> >>> solve them.
> >> Let's also enumerate them.
> >>
> >
> > From what I know:
> > - performance regression in TCG mode
>
> I set up an x86_64 guest on an x86_64 host (Intel Xeon E5345). Nothing
> was running except the standard daemons and the CPU governor was set to
> "performance" on all CPUs. I then compared the network performance using
> netperf in default mode, through a tap interface and a virtio nic. I got
> the following results (quite reproducible, std below 0.5):
> - without IO thread: 107.36 MB/s
> - with IO thread: 89.93 MB/s
>

And the same test on the code from September 2009:
- without IO thread: 141.8 MB/s
On 02/09/2011 06:35 PM, Aurelien Jarno wrote:
> On Tue, Feb 08, 2011 at 04:08:28PM +0100, Aurelien Jarno wrote:
>> Aurelien Jarno wrote:
>>> Paolo Bonzini wrote:
>>>> On 02/08/2011 12:15 PM, Aurelien Jarno wrote:
>>>>> however
>>>>> it should not be done ignoring all the *current* drawbacks of the
>>>>> iothread mode. We know them (at least for some of them), so let's try to
>>>>> solve them.
>>>>>
>>>> Let's also enumerate them.
>>>>
>>> From what I know:
>>> - performance regression in TCG mode
>>>
>> I set up an x86_64 guest on an x86_64 host (Intel Xeon E5345). Nothing
>> was running except the standard daemons and the CPU governor was set to
>> "performance" on all CPUs. I then compared the network performance using
>> netperf in default mode, through a tap interface and a virtio nic. I got
>> the following results (quite reproducible, std below 0.5):
>> - without IO thread: 107.36 MB/s
>> - with IO thread: 89.93 MB/s
>>
> And the same test on the code from September 2009:
> - without IO thread: 141.8 MB/s
>

virtio-net is super finicky regarding mitigation strategies and their
relationship to the I/O thread. Different benchmarks will behave
differently. virtio-blk is probably a better device to test, as you'll
get much more consistent results across different types of I/O patterns.

Regards,

Anthony Liguori
On 09.02.2011 18:13, Blue Swirl wrote:
> I think win32 situation is somewhat similar (but not nearly as bad as)
> to kqemu's. It was useful for some users, but there was no
> maintenance and when it got in the way, it was removed because nobody
> could fix it.
>
> But I'd prefer a solution where somebody steps up as Windows
> maintainer. I'm also doing regular mingw32 builds but otherwise not
> much.
>

VNC threads can be compiled for W32, too. A short test of the resulting
executable was successful, with no problems.

The patch is available here:
http://repo.or.cz/w/qemu/ar7.git/commitdiff/aabf11dc0a938b84d76d7c147cbf0445d7bee297

I decided to create a new directory structure hosts/w32, so files can
be moved from the root to hosts/posix, hosts/w32, or hosts/xxx.
Include chains reduce code modifications and conditional compilations.
And people who don't want to see w32 support can remove it easily :-)

Supporting I/O threads for W32 will be possible, too.

I don't think that W32 support is a big problem. It never was.
Some of the real problems were already named in the previous mails.

Regards,
Stefan Weil
On 02/09/2011 11:16 PM, Stefan Weil wrote:
>
> I decided to create a new directory structure hosts/w32, so files can
> be moved from the root to hosts/posix, hosts/w32, or hosts/xxx.
> Include chains reduce code modifications and conditional compilations.
> And people who don't want to see w32 support can remove it easily :-)
>
> Supporting I/O threads for W32 will be possible, too.

I have patches for the Win32 iothread; I'm just posting the series split
into multiple pieces.

Paolo
On 02/09/2011 11:16 PM, Stefan Weil wrote:
> The patch is available here:
> http://repo.or.cz/w/qemu/ar7.git/commitdiff/aabf11dc0a938b84d76d7c147cbf0445d7bee297

> diff --git a/hosts/w32/include/signal.h b/hosts/w32/include/signal.h
> new file mode 100644
> index 0000000..e45f03c
> --- /dev/null
> +++ b/hosts/w32/include/signal.h
> @@ -0,0 +1,20 @@
> +/*
> + * QEMU w32 support
> + *
> + * Copyright (C) 2011 Stefan Weil
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#ifndef WIN32_SIGNAL_H
> +#define WIN32_SIGNAL_H
> +
> +#include_next <signal.h>
> +#include <sys/types.h> /* sigset_t */
> +
> +int pthread_sigmask(int how, const sigset_t *set, sigset_t *oldset);
> +int sigfillset(sigset_t *set);
> +
> +#endif /* WIN32_SIGNAL_H */
> diff --git a/hosts/w32/include/time.h b/hosts/w32/include/time.h
> new file mode 100644
> index 0000000..0b997d3
> --- /dev/null
> +++ b/hosts/w32/include/time.h
> @@ -0,0 +1,31 @@
> +/*
> + * QEMU w32 support
> + *
> + * Copyright (C) 2011 Stefan Weil
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2 or later.
> + * See the COPYING file in the top-level directory.
> + *
> + */
> +
> +#if !defined(W32_TIME_H)
> +#define W32_TIME_H
> +
> +#include_next <time.h>
> +
> +#ifndef HAVE_STRUCT_TIMESPEC
> +#define HAVE_STRUCT_TIMESPEC 1
> +struct timespec {
> +    long tv_sec;
> +    long tv_nsec;
> +};
> +#endif /* HAVE_STRUCT_TIMESPEC */
> +
> +typedef enum {
> +    CLOCK_REALTIME = 0
> +} clockid_t;
> +
> +int clock_getres (clockid_t clock_id, struct timespec *res);
> +int clock_gettime(clockid_t clock_id, struct timespec *pTimespec);
> +
> +#endif /* W32_TIME_H */
> diff --git a/os-win32.c b/os-win32.c
> index b214e6a..7778366 100644
> --- a/os-win32.c
> +++ b/os-win32.c
> @@ -36,6 +36,45 @@
>  /***********************************************************/
>  /* Functions missing in mingw */
>
> +#if defined(CONFIG_THREAD)
> +
> +int clock_gettime(clockid_t clock_id, struct timespec *pTimespec)
> +{
> +    int result = 0;
> +    if (clock_id == CLOCK_REALTIME && pTimespec != 0) {
> +        DWORD t = GetTickCount();
> +        const unsigned cps = 1000;
> +        struct timespec ts;
> +        ts.tv_sec = t / cps;
> +        ts.tv_nsec = (t % cps) * (1000000000UL / cps);
> +        *pTimespec = ts;
> +    } else {
> +        errno = EINVAL;
> +        result = -1;
> +    }
> +    return result;
> +}

Why is this needed? The only user of clock_gettime in the POSIX case
is using CLOCK_MONOTONIC, and actually has a Win32 version already.

> +int pthread_sigmask(int how, const sigset_t *set, sigset_t *oldset)
> +{
> +    /* Dummy, do nothing. */
> +    return EINVAL;
> +}
> +
> +int sigfillset(sigset_t *set)
> +{
> +    int result = 0;
> +    if (set) {
> +        *(set) = (sigset_t)(-1);
> +    } else {
> +        errno = EINVAL;
> +        result = -1;
> +    }
> +    return result;
> +}

Instead of these, it's better to provide a Win32 implementation of
mutexes and condvars. I'll submit it next week hopefully.

Paolo
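The tick-to-timespec arithmetic in the quoted clock_gettime() can be checked in isolation. The following is only a portable sketch of that conversion, not the patch itself: `ticks_ms_to_timespec` and `struct ts_parts` are hypothetical names introduced here, and a plain `uint32_t` stands in for the `DWORD` that `GetTickCount()` returns.

```c
#include <stdint.h>

/* Hypothetical stand-in for struct timespec, so the sketch builds anywhere. */
struct ts_parts {
    long tv_sec;   /* whole seconds */
    long tv_nsec;  /* remaining nanoseconds, 0..999999999 */
};

/*
 * Split a millisecond tick count into seconds and nanoseconds,
 * mirroring the arithmetic used in the quoted patch.
 */
static struct ts_parts ticks_ms_to_timespec(uint32_t ticks_ms)
{
    struct ts_parts ts;

    ts.tv_sec = (long)(ticks_ms / 1000u);
    /* Each leftover millisecond is 1,000,000 ns. */
    ts.tv_nsec = (long)(ticks_ms % 1000u) * 1000000L;
    return ts;
}
```

Note that GetTickCount() counts milliseconds since system start and wraps after roughly 49.7 days, so a value derived this way is an uptime-based timestamp rather than wall-clock time.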
On 10.02.2011 10:54, Paolo Bonzini wrote:
> On 02/09/2011 11:16 PM, Stefan Weil wrote:
>> The patch is available here:
>> http://repo.or.cz/w/qemu/ar7.git/commitdiff/aabf11dc0a938b84d76d7c147cbf0445d7bee297
>> [snip]
>> diff --git a/os-win32.c b/os-win32.c
>> index b214e6a..7778366 100644
>> --- a/os-win32.c
>> +++ b/os-win32.c
>> @@ -36,6 +36,45 @@
>>  /***********************************************************/
>>  /* Functions missing in mingw */
>>
>> +#if defined(CONFIG_THREAD)
>> +
>> +int clock_gettime(clockid_t clock_id, struct timespec *pTimespec)
>> +{
>> +    int result = 0;
>> +    if (clock_id == CLOCK_REALTIME && pTimespec != 0) {
>> +        DWORD t = GetTickCount();
>> +        const unsigned cps = 1000;
>> +        struct timespec ts;
>> +        ts.tv_sec = t / cps;
>> +        ts.tv_nsec = (t % cps) * (1000000000UL / cps);
>> +        *pTimespec = ts;
>> +    } else {
>> +        errno = EINVAL;
>> +        result = -1;
>> +    }
>> +    return result;
>> +}
>
> Why is this needed? The only user of clock_gettime in the POSIX case
> is using CLOCK_MONOTONIC, and actually has a Win32 version already.

qemu-thread.c uses clock_gettime(CLOCK_REALTIME, ...)

>
>> +int pthread_sigmask(int how, const sigset_t *set, sigset_t *oldset)
>> +{
>> +    /* Dummy, do nothing. */
>> +    return EINVAL;
>> +}
>> +
>> +int sigfillset(sigset_t *set)
>> +{
>> +    int result = 0;
>> +    if (set) {
>> +        *(set) = (sigset_t)(-1);
>> +    } else {
>> +        errno = EINVAL;
>> +        result = -1;
>> +    }
>> +    return result;
>> +}
>
> Instead of these, it's better to provide a Win32 implementation of
> mutexes and condvars. I'll submit it next week hopefully.
>
> Paolo

That's good news. My patch was only a quick hack to make threaded VNC
work.

Thanks,
Stefan
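For context on why a timed condition wait needs clock_gettime(CLOCK_REALTIME, ...): pthread_cond_timedwait() takes an absolute deadline, by default measured against CLOCK_REALTIME, so the caller reads the current time and adds the relative timeout to it. A minimal sketch of that addition, assuming a millisecond timeout and an already normalized `now`; `deadline_after_ms` is a hypothetical helper written for illustration, not a function taken from qemu-thread.c:

```c
#include <stdint.h>
#include <time.h>

/*
 * Return `now` advanced by `msecs` milliseconds, keeping tv_nsec
 * normalized to the range 0..999999999.
 */
static struct timespec deadline_after_ms(struct timespec now, uint64_t msecs)
{
    struct timespec ts = now;

    ts.tv_sec += (time_t)(msecs / 1000);
    ts.tv_nsec += (long)(msecs % 1000) * 1000000L;
    if (ts.tv_nsec >= 1000000000L) {
        /* One carry is enough: both addends are below one second. */
        ts.tv_nsec -= 1000000000L;
        ts.tv_sec += 1;
    }
    return ts;
}
```

A caller would fill `now` with clock_gettime(CLOCK_REALTIME, &now) and pass the returned value to pthread_cond_timedwait().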
On Wed, Feb 09, 2011 at 09:07:52PM +0100, Anthony Liguori wrote:
> On 02/09/2011 06:35 PM, Aurelien Jarno wrote:
> >On Tue, Feb 08, 2011 at 04:08:28PM +0100, Aurelien Jarno wrote:
> >>Aurelien Jarno wrote:
> >>>Paolo Bonzini wrote:
> >>>>On 02/08/2011 12:15 PM, Aurelien Jarno wrote:
> >>>>>however
> >>>>>it should not be done ignoring all the *current* drawbacks of the
> >>>>>iothread mode. We know them (at least for some of them), so let's try to
> >>>>>solve them.
> >>>>Let's also enumerate them.
> >>>>
> >>> From what I know:
> >>>- performance regression in TCG mode
> >>I set up an x86_64 guest on an x86_64 host (Intel Xeon E5345). Nothing
> >>was running except the standard daemons and the CPU governor was set to
> >>"performance" on all CPUs. I then compared the network performance using
> >>netperf in default mode, through a tap interface and a virtio nic. I got
> >>the following results (quite reproducible, std below 0.5):
> >>- without IO thread: 107.36 MB/s
> >>- with IO thread: 89.93 MB/s
> >>
> >And the same test on the code from September 2009:
> >- without IO thread: 141.8 MB/s
> virtio-net is super finicky regarding mitigation strategies and
> their relationship to the I/O thread. Different benchmarks will
> behave differently. virtio-blk is probably a better device to test
> as you'll get much more consistent results across different types of
> I/O patterns.

netperf server on guest, RHEL5.4 guest (e1000), uq/master branch, TCG:

iothread: 236 MB/s
no iothread: 215 MB/s

I also noticed scp was slightly faster with iothread earlier this week,
but I don't remember the numbers.
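To put these figures next to the virtio ones quoted earlier, each pair can be expressed as a relative change against the non-iothread baseline. A small sketch; `percent_change` is a hypothetical helper and the inputs are simply the throughput numbers reported in this thread:

```c
/*
 * Relative change of `measured` against `baseline`, in percent.
 * Positive means faster than the baseline, negative means slower.
 */
static double percent_change(double baseline, double measured)
{
    return (measured - baseline) / baseline * 100.0;
}
```

With the virtio numbers (107.36 MB/s without, 89.93 MB/s with the I/O thread) this comes to about -16.2%; with the e1000 numbers (215 MB/s without, 236 MB/s with) it comes to about +9.8%. The two setups differ in guest, NIC model and netperf direction, so the results are not directly comparable.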
diff --git a/cpus.c b/cpus.c
index 9c50a34..2280db1 100644
--- a/cpus.c
+++ b/cpus.c
@@ -748,7 +748,7 @@ static void qemu_tcg_wait_io_event(void)
     CPUState *env;

     while (!any_cpu_has_work())
-        qemu_cond_timedwait(tcg_halt_cond, &qemu_global_mutex, 1000);
+        qemu_cond_timedwait(tcg_halt_cond, &qemu_global_mutex, qemu_calculate_timeout());

     qemu_mutex_unlock(&qemu_global_mutex);

diff --git a/vl.c b/vl.c
index 837be97..dbd81a1 100644
--- a/vl.c
+++ b/vl.c
@@ -1323,7 +1323,7 @@ void main_loop_wait(int nonblocking)
     if (nonblocking)
         timeout = 0;
     else {
-        timeout = qemu_calculate_timeout();
+        timeout = 1000;
         qemu_bh_update_timeout(&timeout);
     }