Message ID | 20110118001950.GA11802@laped.lan |
---|---|
State | New |
Headers | show |
On 2011-01-18 01:19, Edgar E. Iglesias wrote: > On Mon, Jan 17, 2011 at 11:03:08AM +0100, Edgar E. Iglesias wrote: >> Hi, >> >> I'm running an io-thread enabled qemu-system-mipsel with icount. >> When the guest (linux) goes to sleep through the wait insn (waiting >> to be woken up by future timer interrupts), the thing deadlocks. >> >> IIUC, this is because vm timers are driven by icount, but the CPU is >> halted so icount makes no progress and time stands still. >> >> I've locally disabled vcpu halting when icount is enabled, that >> works around my problem but of course makes qemu consume 100% host cpu. >> >> I don't know why I only see this problem with io-thread builds? >> Could be related timing and luck. >> >> Would be interesting to know if someone has any info on how this was >> intended to work (if it was)? And if there are ideas for better >> workarounds or fixes that don't disable vcpu halting entirely. > > Hi, > > I've found the problem. For some reason io-thread builds use a > static timeout for wait loops. The entire chunk of code that > makes sure qemu_icount makes forward progress when the CPU's > are idle has been ifdef'ed away... > > This fixes the problem for me, hopefully without affecting > io-thread runs without icount. > > commit 0f4f3a919952500b487b438c5520f07a1c6be35b > Author: Edgar E. Iglesias <edgar@axis.com> > Date: Tue Jan 18 01:01:57 2011 +0100 > > qemu-timer: Fix timeout calc for io-thread with icount > > Make sure we always make forward progress with qemu_icount to > avoid deadlocks. For io-thread, use the static 1000 timeout > only if icount is disabled. > > Signed-off-by: Edgar E. Iglesias <edgar@axis.com> > > diff --git a/qemu-timer.c b/qemu-timer.c > index 95814af..db1ec49 100644 > --- a/qemu-timer.c > +++ b/qemu-timer.c > @@ -110,7 +110,6 @@ static int64_t cpu_get_clock(void) > } > } > > -#ifndef CONFIG_IOTHREAD > static int64_t qemu_icount_delta(void) > { > if (!use_icount) { > @@ -124,7 +123,6 @@ static int64_t qemu_icount_delta(void) > return cpu_get_icount() - cpu_get_clock(); > } > } > -#endif > > /* enable cpu_get_ticks() */ > void cpu_enable_ticks(void) > @@ -1077,9 +1075,17 @@ void quit_timers(void) > > int qemu_calculate_timeout(void) > { > -#ifndef CONFIG_IOTHREAD > int timeout; > > +#ifdef CONFIG_IOTHREAD > + /* When using icount, making forward progress with qemu_icount when the > + guest CPU is idle is critical. We only use the static io-thread timeout > + for non icount runs. */ > + if (!use_icount) { > + return 1000; > + } > +#endif > + > if (!vm_running) > timeout = 5000; > else { > @@ -1110,8 +1116,5 @@ int qemu_calculate_timeout(void) > } > > return timeout; > -#else /* CONFIG_IOTHREAD */ > - return 1000; > -#endif > } > > > This logic and timeout values were imported on iothread merge. And I bet at least the timeout value of 1s (vs. 5s) can still be found in qemu-kvm. Maybe someone over there can remember the rationales behind choosing this value. Jan
On 2011-01-18 11:00, Jan Kiszka wrote: > On 2011-01-18 01:19, Edgar E. Iglesias wrote: >> On Mon, Jan 17, 2011 at 11:03:08AM +0100, Edgar E. Iglesias wrote: >>> Hi, >>> >>> I'm running an io-thread enabled qemu-system-mipsel with icount. >>> When the guest (linux) goes to sleep through the wait insn (waiting >>> to be woken up by future timer interrupts), the thing deadlocks. >>> >>> IIUC, this is because vm timers are driven by icount, but the CPU is >>> halted so icount makes no progress and time stands still. >>> >>> I've locally disabled vcpu halting when icount is enabled, that >>> works around my problem but of course makes qemu consume 100% host cpu. >>> >>> I don't know why I only see this problem with io-thread builds? >>> Could be related timing and luck. >>> >>> Would be interesting to know if someone has any info on how this was >>> intended to work (if it was)? And if there are ideas for better >>> workarounds or fixes that don't disable vcpu halting entirely. >> >> Hi, >> >> I've found the problem. For some reason io-thread builds use a >> static timeout for wait loops. The entire chunk of code that >> makes sure qemu_icount makes forward progress when the CPU's >> are idle has been ifdef'ed away... >> >> This fixes the problem for me, hopefully without affecting >> io-thread runs without icount. >> >> commit 0f4f3a919952500b487b438c5520f07a1c6be35b >> Author: Edgar E. Iglesias <edgar@axis.com> >> Date: Tue Jan 18 01:01:57 2011 +0100 >> >> qemu-timer: Fix timeout calc for io-thread with icount >> >> Make sure we always make forward progress with qemu_icount to >> avoid deadlocks. For io-thread, use the static 1000 timeout >> only if icount is disabled. >> >> Signed-off-by: Edgar E. Iglesias <edgar@axis.com> >> >> diff --git a/qemu-timer.c b/qemu-timer.c >> index 95814af..db1ec49 100644 >> --- a/qemu-timer.c >> +++ b/qemu-timer.c >> @@ -110,7 +110,6 @@ static int64_t cpu_get_clock(void) >> } >> } >> >> -#ifndef CONFIG_IOTHREAD >> static int64_t qemu_icount_delta(void) >> { >> if (!use_icount) { >> @@ -124,7 +123,6 @@ static int64_t qemu_icount_delta(void) >> return cpu_get_icount() - cpu_get_clock(); >> } >> } >> -#endif >> >> /* enable cpu_get_ticks() */ >> void cpu_enable_ticks(void) >> @@ -1077,9 +1075,17 @@ void quit_timers(void) >> >> int qemu_calculate_timeout(void) >> { >> -#ifndef CONFIG_IOTHREAD >> int timeout; >> >> +#ifdef CONFIG_IOTHREAD >> + /* When using icount, making forward progress with qemu_icount when the >> + guest CPU is idle is critical. We only use the static io-thread timeout >> + for non icount runs. */ >> + if (!use_icount) { >> + return 1000; >> + } >> +#endif >> + >> if (!vm_running) >> timeout = 5000; >> else { >> @@ -1110,8 +1116,5 @@ int qemu_calculate_timeout(void) >> } >> >> return timeout; >> -#else /* CONFIG_IOTHREAD */ >> - return 1000; >> -#endif >> } >> >> >> > > This logic and timeout values were imported on iothread merge. And I bet > at least the timeout value of 1s (vs. 5s) can still be found in > qemu-kvm. Maybe someone over there can remember the rationales behind > choosing this value. Correction: qemu-kvm does _not_ use a fixed timeout value for the iothread nor the removed code path (as CONFIG_IOTHREAD is off in qemu-kvm, that tree does not even build when you enable it). Still, the reason for once introducing this difference to qemu should be reflected. Jan
On Tue, Jan 18, 2011 at 11:00:57AM +0100, Jan Kiszka wrote: > On 2011-01-18 01:19, Edgar E. Iglesias wrote: > > On Mon, Jan 17, 2011 at 11:03:08AM +0100, Edgar E. Iglesias wrote: > >> Hi, > >> > >> I'm running an io-thread enabled qemu-system-mipsel with icount. > >> When the guest (linux) goes to sleep through the wait insn (waiting > >> to be woken up by future timer interrupts), the thing deadlocks. > >> > >> IIUC, this is because vm timers are driven by icount, but the CPU is > >> halted so icount makes no progress and time stands still. > >> > >> I've locally disabled vcpu halting when icount is enabled, that > >> works around my problem but of course makes qemu consume 100% host cpu. > >> > >> I don't know why I only see this problem with io-thread builds? > >> Could be related timing and luck. > >> > >> Would be interesting to know if someone has any info on how this was > >> intended to work (if it was)? And if there are ideas for better > >> workarounds or fixes that don't disable vcpu halting entirely. > > > > Hi, > > > > I've found the problem. For some reason io-thread builds use a > > static timeout for wait loops. The entire chunk of code that > > makes sure qemu_icount makes forward progress when the CPU's > > are idle has been ifdef'ed away... > > > > This fixes the problem for me, hopefully without affecting > > io-thread runs without icount. > > > > commit 0f4f3a919952500b487b438c5520f07a1c6be35b > > Author: Edgar E. Iglesias <edgar@axis.com> > > Date: Tue Jan 18 01:01:57 2011 +0100 > > > > qemu-timer: Fix timeout calc for io-thread with icount > > > > Make sure we always make forward progress with qemu_icount to > > avoid deadlocks. For io-thread, use the static 1000 timeout > > only if icount is disabled. > > > > Signed-off-by: Edgar E. Iglesias <edgar@axis.com> > > > > diff --git a/qemu-timer.c b/qemu-timer.c > > index 95814af..db1ec49 100644 > > --- a/qemu-timer.c > > +++ b/qemu-timer.c > > @@ -110,7 +110,6 @@ static int64_t cpu_get_clock(void) > > } > > } > > > > -#ifndef CONFIG_IOTHREAD > > static int64_t qemu_icount_delta(void) > > { > > if (!use_icount) { > > @@ -124,7 +123,6 @@ static int64_t qemu_icount_delta(void) > > return cpu_get_icount() - cpu_get_clock(); > > } > > } > > -#endif > > > > /* enable cpu_get_ticks() */ > > void cpu_enable_ticks(void) > > @@ -1077,9 +1075,17 @@ void quit_timers(void) > > > > int qemu_calculate_timeout(void) > > { > > -#ifndef CONFIG_IOTHREAD > > int timeout; > > > > +#ifdef CONFIG_IOTHREAD > > + /* When using icount, making forward progress with qemu_icount when the > > + guest CPU is idle is critical. We only use the static io-thread timeout > > + for non icount runs. */ > > + if (!use_icount) { > > + return 1000; > > + } > > +#endif > > + > > if (!vm_running) > > timeout = 5000; > > else { > > @@ -1110,8 +1116,5 @@ int qemu_calculate_timeout(void) > > } > > > > return timeout; > > -#else /* CONFIG_IOTHREAD */ > > - return 1000; > > -#endif > > } > > > > > > > > This logic and timeout values were imported on iothread merge. And I bet > at least the timeout value of 1s (vs. 5s) can still be found in > qemu-kvm. Maybe someone over there can remember the rationales behind > choosing this value. > > Jan This timeout is for the main select() call. So there is not a lot of reasoning, how long to wait when there's no activity on the file descriptors.
On Wed, Jan 19, 2011 at 03:02:26PM -0200, Marcelo Tosatti wrote: > On Tue, Jan 18, 2011 at 11:00:57AM +0100, Jan Kiszka wrote: > > On 2011-01-18 01:19, Edgar E. Iglesias wrote: > > > On Mon, Jan 17, 2011 at 11:03:08AM +0100, Edgar E. Iglesias wrote: > > >> Hi, > > >> > > >> I'm running an io-thread enabled qemu-system-mipsel with icount. > > >> When the guest (linux) goes to sleep through the wait insn (waiting > > >> to be woken up by future timer interrupts), the thing deadlocks. > > >> > > >> IIUC, this is because vm timers are driven by icount, but the CPU is > > >> halted so icount makes no progress and time stands still. > > >> > > >> I've locally disabled vcpu halting when icount is enabled, that > > >> works around my problem but of course makes qemu consume 100% host cpu. > > >> > > >> I don't know why I only see this problem with io-thread builds? > > >> Could be related timing and luck. > > >> > > >> Would be interesting to know if someone has any info on how this was > > >> intended to work (if it was)? And if there are ideas for better > > >> workarounds or fixes that don't disable vcpu halting entirely. > > > > > > Hi, > > > > > > I've found the problem. For some reason io-thread builds use a > > > static timeout for wait loops. The entire chunk of code that > > > makes sure qemu_icount makes forward progress when the CPU's > > > are idle has been ifdef'ed away... > > > > > > This fixes the problem for me, hopefully without affecting > > > io-thread runs without icount. > > > > > > commit 0f4f3a919952500b487b438c5520f07a1c6be35b > > > Author: Edgar E. Iglesias <edgar@axis.com> > > > Date: Tue Jan 18 01:01:57 2011 +0100 > > > > > > qemu-timer: Fix timeout calc for io-thread with icount > > > > > > Make sure we always make forward progress with qemu_icount to > > > avoid deadlocks. For io-thread, use the static 1000 timeout > > > only if icount is disabled. > > > > > > Signed-off-by: Edgar E. Iglesias <edgar@axis.com> > > > > > > diff --git a/qemu-timer.c b/qemu-timer.c > > > index 95814af..db1ec49 100644 > > > --- a/qemu-timer.c > > > +++ b/qemu-timer.c > > > @@ -110,7 +110,6 @@ static int64_t cpu_get_clock(void) > > > } > > > } > > > > > > -#ifndef CONFIG_IOTHREAD > > > static int64_t qemu_icount_delta(void) > > > { > > > if (!use_icount) { > > > @@ -124,7 +123,6 @@ static int64_t qemu_icount_delta(void) > > > return cpu_get_icount() - cpu_get_clock(); > > > } > > > } > > > -#endif > > > > > > /* enable cpu_get_ticks() */ > > > void cpu_enable_ticks(void) > > > @@ -1077,9 +1075,17 @@ void quit_timers(void) > > > > > > int qemu_calculate_timeout(void) > > > { > > > -#ifndef CONFIG_IOTHREAD > > > int timeout; > > > > > > +#ifdef CONFIG_IOTHREAD > > > + /* When using icount, making forward progress with qemu_icount when the > > > + guest CPU is idle is critical. We only use the static io-thread timeout > > > + for non icount runs. */ > > > + if (!use_icount) { > > > + return 1000; > > > + } > > > +#endif > > > + > > > if (!vm_running) > > > timeout = 5000; > > > else { > > > @@ -1110,8 +1116,5 @@ int qemu_calculate_timeout(void) > > > } > > > > > > return timeout; > > > -#else /* CONFIG_IOTHREAD */ > > > - return 1000; > > > -#endif > > > } > > > > > > > > > > > > > This logic and timeout values were imported on iothread merge. And I bet > > at least the timeout value of 1s (vs. 5s) can still be found in > > qemu-kvm. Maybe someone over there can remember the rationales behind > > choosing this value. > > > > Jan > > This timeout is for the main select() call. So there is not a lot > of reasoning, how long to wait when there's no activity on the file > descriptors. OK, I suspected something like that. Thanks both of you for the info. I'll give people a couple of days to complain at the patch, if noone does I'll apply it. Cheers
On Wed, Jan 19, 2011 at 08:02:28PM +0100, Edgar E. Iglesias wrote: > On Wed, Jan 19, 2011 at 03:02:26PM -0200, Marcelo Tosatti wrote: > > On Tue, Jan 18, 2011 at 11:00:57AM +0100, Jan Kiszka wrote: > > > On 2011-01-18 01:19, Edgar E. Iglesias wrote: > > > > On Mon, Jan 17, 2011 at 11:03:08AM +0100, Edgar E. Iglesias wrote: > > > >> Hi, > > > >> > > > >> I'm running an io-thread enabled qemu-system-mipsel with icount. > > > >> When the guest (linux) goes to sleep through the wait insn (waiting > > > >> to be woken up by future timer interrupts), the thing deadlocks. > > > >> > > > >> IIUC, this is because vm timers are driven by icount, but the CPU is > > > >> halted so icount makes no progress and time stands still. > > > >> > > > >> I've locally disabled vcpu halting when icount is enabled, that > > > >> works around my problem but of course makes qemu consume 100% host cpu. > > > >> > > > >> I don't know why I only see this problem with io-thread builds? > > > >> Could be related timing and luck. > > > >> > > > >> Would be interesting to know if someone has any info on how this was > > > >> intended to work (if it was)? And if there are ideas for better > > > >> workarounds or fixes that don't disable vcpu halting entirely. > > > > > > > > Hi, > > > > > > > > I've found the problem. For some reason io-thread builds use a > > > > static timeout for wait loops. The entire chunk of code that > > > > makes sure qemu_icount makes forward progress when the CPU's > > > > are idle has been ifdef'ed away... > > > > > > > > This fixes the problem for me, hopefully without affecting > > > > io-thread runs without icount. > > > > > > > > commit 0f4f3a919952500b487b438c5520f07a1c6be35b > > > > Author: Edgar E. Iglesias <edgar@axis.com> > > > > Date: Tue Jan 18 01:01:57 2011 +0100 > > > > > > > > qemu-timer: Fix timeout calc for io-thread with icount > > > > > > > > Make sure we always make forward progress with qemu_icount to > > > > avoid deadlocks. For io-thread, use the static 1000 timeout > > > > only if icount is disabled. > > > > > > > > Signed-off-by: Edgar E. Iglesias <edgar@axis.com> > > > > > > > > diff --git a/qemu-timer.c b/qemu-timer.c > > > > index 95814af..db1ec49 100644 > > > > --- a/qemu-timer.c > > > > +++ b/qemu-timer.c > > > > @@ -110,7 +110,6 @@ static int64_t cpu_get_clock(void) > > > > } > > > > } > > > > > > > > -#ifndef CONFIG_IOTHREAD > > > > static int64_t qemu_icount_delta(void) > > > > { > > > > if (!use_icount) { > > > > @@ -124,7 +123,6 @@ static int64_t qemu_icount_delta(void) > > > > return cpu_get_icount() - cpu_get_clock(); > > > > } > > > > } > > > > -#endif > > > > > > > > /* enable cpu_get_ticks() */ > > > > void cpu_enable_ticks(void) > > > > @@ -1077,9 +1075,17 @@ void quit_timers(void) > > > > > > > > int qemu_calculate_timeout(void) > > > > { > > > > -#ifndef CONFIG_IOTHREAD > > > > int timeout; > > > > > > > > +#ifdef CONFIG_IOTHREAD > > > > + /* When using icount, making forward progress with qemu_icount when the > > > > + guest CPU is idle is critical. We only use the static io-thread timeout > > > > + for non icount runs. */ > > > > + if (!use_icount) { > > > > + return 1000; > > > > + } > > > > +#endif > > > > + > > > > if (!vm_running) > > > > timeout = 5000; > > > > else { > > > > @@ -1110,8 +1116,5 @@ int qemu_calculate_timeout(void) > > > > } > > > > > > > > return timeout; > > > > -#else /* CONFIG_IOTHREAD */ > > > > - return 1000; > > > > -#endif > > > > } > > > > > > > > > > > > > > > > > > This logic and timeout values were imported on iothread merge. And I bet > > > at least the timeout value of 1s (vs. 5s) can still be found in > > > qemu-kvm. Maybe someone over there can remember the rationales behind > > > choosing this value. > > > > > > Jan > > > > This timeout is for the main select() call. So there is not a lot > > of reasoning, how long to wait when there's no activity on the file > > descriptors. > > OK, I suspected something like that. Thanks both of you for the info. > I'll give people a couple of days to complain at the patch, if noone > does I'll apply it. Silence - so I've applied this one, thanks. Cheers
diff --git a/qemu-timer.c b/qemu-timer.c index 95814af..db1ec49 100644 --- a/qemu-timer.c +++ b/qemu-timer.c @@ -110,7 +110,6 @@ static int64_t cpu_get_clock(void) } } -#ifndef CONFIG_IOTHREAD static int64_t qemu_icount_delta(void) { if (!use_icount) { @@ -124,7 +123,6 @@ static int64_t qemu_icount_delta(void) return cpu_get_icount() - cpu_get_clock(); } } -#endif /* enable cpu_get_ticks() */ void cpu_enable_ticks(void) @@ -1077,9 +1075,17 @@ void quit_timers(void) int qemu_calculate_timeout(void) { -#ifndef CONFIG_IOTHREAD int timeout; +#ifdef CONFIG_IOTHREAD + /* When using icount, making forward progress with qemu_icount when the + guest CPU is idle is critical. We only use the static io-thread timeout + for non icount runs. */ + if (!use_icount) { + return 1000; + } +#endif + if (!vm_running) timeout = 5000; else { @@ -1110,8 +1116,5 @@ int qemu_calculate_timeout(void) } return timeout; -#else /* CONFIG_IOTHREAD */ - return 1000; -#endif }