Message ID | 1396429188-6447-1-git-send-email-stefan.bader@canonical.com |
---|---|
State | New |
Headers | show |
On Wed, Apr 02, 2014 at 10:59:48AM +0200, Stefan Bader wrote: > Unfortunately completely failed to proceed with this one (me and > upstream). This hack is curing at least the big issue with kvm. > As noted in the patch, there is a series of fixes after 3.13 which > I had been picking up while looking into this. These would be stable > material (minus maybe the idle changes though they might be useful for > other reasons and I did not want to make guesses about the things that > patch #3 changes there). > > 1. x86, acpi, idle: Restructure the mwait idle routines > 2. x86, idle: Use static_cpu_has() for CLFLUSH workaround, add barriers > 3. sched/preempt: Fix up missed PREEMPT_NEED_RESCHED folding > 4. sched/preempt/x86: Fix voluntary preempt for x86 > > This just for documentation and in case anything else looks odd in > the scheduler and this might be a reason. Its now too late to pick those > and the hack works without. > > -Stefan > > --- > > From 951bca51ab8456d2a2cbca33f440680ae675b5e4 Mon Sep 17 00:00:00 2001 > From: Stefan Bader <stefan.bader@canonical.com> > Date: Tue, 1 Apr 2014 20:26:00 +0200 > Subject: [PATCH] UBUNTU: SAUCE: kvm: Use schedule instead of cond_reschedule > on i386 > > Since commit c2daa3bed53a81171cf8c1a36db798e82b91afe8 > Author: Peter Zijlstra <peterz@infradead.org> > sched, x86: Provide a per-cpu preempt_count implementation > > there is a problem on 32bit x86 which causes kvm to run wildly in a > tight loop. The cause is that since the preempt_count change it seems > possible on 32bit to have the thread info flag TIF_NEED_RESCHED set > but the IPI that in theory should update that flag into the per-cpu > counter does not see it. This leaves us in a bad place as this > counts as the process should reschule but is not allowed to. And > kvm unfortunately relies on those things to be in sync. > > There is actually a list of fixes to this in general which was not yet > pushed to stable (since we never found a resolution to this issue). So > i386 may behave unfair in general. This here just allows to run kvm on > i386 at all. > > Signed-off-by: Stefan Bader <stefan.bader@canonical.com> > --- > arch/x86/kvm/x86.c | 14 ++++++++++++++ > 1 file changed, 14 insertions(+) > > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c > index 6dc4904..512b1ce 100644 > --- a/arch/x86/kvm/x86.c > +++ b/arch/x86/kvm/x86.c > @@ -6092,7 +6092,21 @@ static int __vcpu_run(struct kvm_vcpu *vcpu) > } > if (need_resched()) { > srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx); > +#ifdef CONFIG_X86_32 > + /* > + * This is a hack around a current bug on i386 > + * that causes TIF_NEED_RESCHED not being propagated > + * into the per-cpu counter. This causes cond_resched() > + * which is what kvm_resched() calls to exit without > + * actually rescheduling. We continue immediately with > + * the loop but vcpu_enter_guest checks for the process > + * flag and immediately exits and this loops busily > + * runs forever. > + */ > + schedule(); > +#else > kvm_resched(vcpu); > +#endif > vcpu->srcu_idx = srcu_read_lock(&kvm->srcu); > } > } I am slightly worried that switching to schedule() here is triggering more work that kvm_resched() would have done. As the stated bug is that TIF_NEED_RESCHED does not get copied over correctly "sometimes". Could we not make this patch add like: if (test_tsk_need_resched(current)) set_preempt_need_resched(); Just before kvm_resched() in the 32bit case? -apw
On 02.04.2014 11:31, Andy Whitcroft wrote: > On Wed, Apr 02, 2014 at 10:59:48AM +0200, Stefan Bader wrote: >> Unfortunately completely failed to proceed with this one (me and >> upstream). This hack is curing at least the big issue with kvm. >> As noted in the patch, there is a series of fixes after 3.13 which >> I had been picking up while looking into this. These would be stable >> material (minus maybe the idle changes though they might be useful for >> other reasons and I did not want to make guesses about the things that >> patch #3 changes there). >> >> 1. x86, acpi, idle: Restructure the mwait idle routines >> 2. x86, idle: Use static_cpu_has() for CLFLUSH workaround, add barriers >> 3. sched/preempt: Fix up missed PREEMPT_NEED_RESCHED folding >> 4. sched/preempt/x86: Fix voluntary preempt for x86 >> >> This just for documentation and in case anything else looks odd in >> the scheduler and this might be a reason. Its now too late to pick those >> and the hack works without. >> >> -Stefan >> >> --- >> >> From 951bca51ab8456d2a2cbca33f440680ae675b5e4 Mon Sep 17 00:00:00 2001 >> From: Stefan Bader <stefan.bader@canonical.com> >> Date: Tue, 1 Apr 2014 20:26:00 +0200 >> Subject: [PATCH] UBUNTU: SAUCE: kvm: Use schedule instead of cond_reschedule >> on i386 >> >> Since commit c2daa3bed53a81171cf8c1a36db798e82b91afe8 >> Author: Peter Zijlstra <peterz@infradead.org> >> sched, x86: Provide a per-cpu preempt_count implementation >> >> there is a problem on 32bit x86 which causes kvm to run wildly in a >> tight loop. The cause is that since the preempt_count change it seems >> possible on 32bit to have the thread info flag TIF_NEED_RESCHED set >> but the IPI that in theory should update that flag into the per-cpu >> counter does not see it. This leaves us in a bad place as this >> counts as the process should reschule but is not allowed to. And >> kvm unfortunately relies on those things to be in sync. >> >> There is actually a list of fixes to this in general which was not yet >> pushed to stable (since we never found a resolution to this issue). So >> i386 may behave unfair in general. This here just allows to run kvm on >> i386 at all. >> >> Signed-off-by: Stefan Bader <stefan.bader@canonical.com> >> --- >> arch/x86/kvm/x86.c | 14 ++++++++++++++ >> 1 file changed, 14 insertions(+) >> >> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c >> index 6dc4904..512b1ce 100644 >> --- a/arch/x86/kvm/x86.c >> +++ b/arch/x86/kvm/x86.c >> @@ -6092,7 +6092,21 @@ static int __vcpu_run(struct kvm_vcpu *vcpu) >> } >> if (need_resched()) { >> srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx); >> +#ifdef CONFIG_X86_32 >> + /* >> + * This is a hack around a current bug on i386 >> + * that causes TIF_NEED_RESCHED not being propagated >> + * into the per-cpu counter. This causes cond_resched() >> + * which is what kvm_resched() calls to exit without >> + * actually rescheduling. We continue immediately with >> + * the loop but vcpu_enter_guest checks for the process >> + * flag and immediately exits and this loops busily >> + * runs forever. >> + */ >> + schedule(); >> +#else >> kvm_resched(vcpu); >> +#endif >> vcpu->srcu_idx = srcu_read_lock(&kvm->srcu); >> } >> } > > I am slightly worried that switching to schedule() here is triggering > more work that kvm_resched() would have done. As the stated bug is that > TIF_NEED_RESCHED does not get copied over correctly "sometimes". Could > we not make this patch add like: > > if (test_tsk_need_resched(current)) > set_preempt_need_resched(); > > Just before kvm_resched() in the 32bit case? > > -apw > I am working on something equivalent. Will update here when I got it compiled and tested.
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 6dc4904..512b1ce 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -6092,7 +6092,21 @@ static int __vcpu_run(struct kvm_vcpu *vcpu) } if (need_resched()) { srcu_read_unlock(&kvm->srcu, vcpu->srcu_idx); +#ifdef CONFIG_X86_32 + /* + * This is a hack around a current bug on i386 + * that causes TIF_NEED_RESCHED not being propagated + * into the per-cpu counter. This causes cond_resched() + * which is what kvm_resched() calls to exit without + * actually rescheduling. We continue immediately with + * the loop but vcpu_enter_guest checks for the process + * flag and immediately exits and this loops busily + * runs forever. + */ + schedule(); +#else kvm_resched(vcpu); +#endif vcpu->srcu_idx = srcu_read_lock(&kvm->srcu); } }