
[v4,01/10] hw/mips/cputimer: Don't start periodic timer in KVM mode

Message ID 1394801281-18997-2-git-send-email-james.hogan@imgtec.com
State New

Commit Message

James Hogan March 14, 2014, 12:47 p.m. UTC
From: Sanjay Lal <sanjayl@kymasys.com>

Compare/Count timer interrupts are handled in-kernel for KVM, so don't
bother starting it in QEMU.

Signed-off-by: Sanjay Lal <sanjayl@kymasys.com>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
---
Changes in v2:
 - Expand commit message
 - Rebase on v1.7.0
 - Wrap comment
---
 hw/mips/cputimer.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

Comments

Paolo Bonzini March 19, 2014, 4:29 p.m. UTC | #1
Il 14/03/2014 13:47, James Hogan ha scritto:
> From: Sanjay Lal <sanjayl@kymasys.com>
>
> Compare/Count timer interrupts are handled in-kernel for KVM, so don't
> bother starting it in QEMU.
>
> Signed-off-by: Sanjay Lal <sanjayl@kymasys.com>
> Signed-off-by: James Hogan <james.hogan@imgtec.com>
> Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
> ---
> Changes in v2:
>  - Expand commit message
>  - Rebase on v1.7.0
>  - Wrap comment
> ---
>  hw/mips/cputimer.c | 13 ++++++++++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
>
> diff --git a/hw/mips/cputimer.c b/hw/mips/cputimer.c
> index c8b4b00..52570fd 100644
> --- a/hw/mips/cputimer.c
> +++ b/hw/mips/cputimer.c
> @@ -23,6 +23,7 @@
>  #include "hw/hw.h"
>  #include "hw/mips/cpudevs.h"
>  #include "qemu/timer.h"
> +#include "sysemu/kvm.h"
>
>  #define TIMER_FREQ	100 * 1000 * 1000
>
> @@ -141,7 +142,13 @@ static void mips_timer_cb (void *opaque)
>
>  void cpu_mips_clock_init (CPUMIPSState *env)
>  {
> -    env->timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, &mips_timer_cb, env);
> -    env->CP0_Compare = 0;
> -    cpu_mips_store_count(env, 1);
> +    /*
> +     * If we're in KVM mode, don't start the periodic timer, that is handled in
> +     * kernel.
> +     */
> +    if (!kvm_enabled()) {
> +        env->timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, &mips_timer_cb, env);
> +        env->CP0_Compare = 0;
> +        cpu_mips_store_count(env, 1);
> +    }
>  }
>

I hate to make you do unrelated changes, but... initializing CP0_Compare 
is unnecessary, it should already be 0; and for CP0_Count it should not 
be done here, but in the cpu_state_reset function.  Then here you can call 
qemu_register_reset to register another reset callback, and call 
cpu_mips_timer_update in that callback.

I'm asking because while

     if (!kvm_enabled()) {
         env->timer = ...
         qemu_register_reset(...);
     }

is fine, changing values of registers conditionally is not.
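
As a rough, standalone illustration of that split (register values set
unconditionally at reset, timer setup and its reset callback only without
KVM), here is a sketch built on hypothetical stand-ins. DemoCPU,
demo_register_reset, demo_cpu_state_reset and demo_timer_reset are not QEMU
code; they only mimic the roles of CPUMIPSState, qemu_register_reset,
cpu_state_reset and a callback that would call cpu_mips_timer_update. The
initial Count value of 1 is carried over from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration only, not QEMU code. */
typedef void DemoResetHandler(void *opaque);

typedef struct DemoCPU {
    uint32_t cp0_count;
    bool has_timer;          /* stands in for env->timer != NULL */
    unsigned timer_updates;  /* counts simulated timer re-arms */
} DemoCPU;

/* A one-slot registry playing the role of qemu_register_reset(). */
static DemoResetHandler *demo_handler;
static void *demo_handler_opaque;

static void demo_register_reset(DemoResetHandler *func, void *opaque)
{
    demo_handler = func;
    demo_handler_opaque = opaque;
}

/* Register state is initialised here regardless of the accelerator. */
static void demo_cpu_state_reset(DemoCPU *cpu)
{
    cpu->cp0_count = 1;
}

/* Reset callback: only re-arms the timer, never touches register values. */
static void demo_timer_reset(void *opaque)
{
    DemoCPU *cpu = opaque;
    cpu->timer_updates++;    /* stands in for cpu_mips_timer_update() */
}

/* Only the timer and its callback depend on whether KVM is in use. */
static void demo_clock_init(DemoCPU *cpu, bool kvm)
{
    assert(cpu != NULL);
    if (!kvm) {
        cpu->has_timer = true;
        demo_register_reset(demo_timer_reset, cpu);
    }
}

/* Simulated system reset: CPU state first, then registered callbacks. */
static void demo_system_reset(DemoCPU *cpu)
{
    demo_cpu_state_reset(cpu);
    if (demo_handler) {
        demo_handler(demo_handler_opaque);
    }
}
```

With this shape, a reset leaves cp0_count at the same value whether or not
the timer exists; only the re-arm step is skipped under KVM.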

Also, I noticed two things in the implementation of the CPU timer that 
should be fixed:

1) right now the hypervisor's frequency is hardcoded to 1/4th of the 
host, while QEMU's is 100 MHz.  It would be nice to make them either 
consistent, or customizable (you can use another ONE_REG interface to 
set CPU parameters).

2) in KVM, CP0_Count does not start at the same value on guest reset. 
There is a comment that "Linux doesn't seem to write into COUNT", but 
QEMU does.  So KVM should implement CP0_Count writes and adjust the 
"bias" of the guest CP0_Count.

In fact, right now kvm_mips_te_put_cp0_registers should always return 
-EINVAL because KVM_REG_MIPS_CP0_COUNT is not handled in 
kvm_mips_get/set_reg.  Am I missing something?

Paolo
James Hogan March 20, 2014, 9:57 a.m. UTC | #2
On 19/03/14 16:29, Paolo Bonzini wrote:
> Il 14/03/2014 13:47, James Hogan ha scritto:
>> From: Sanjay Lal <sanjayl@kymasys.com>
>>
>> Compare/Count timer interrupts are handled in-kernel for KVM, so don't
>> bother starting it in QEMU.
>>
>> Signed-off-by: Sanjay Lal <sanjayl@kymasys.com>
>> Signed-off-by: James Hogan <james.hogan@imgtec.com>
>> Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
>> ---
>> Changes in v2:
>>  - Expand commit message
>>  - Rebase on v1.7.0
>>  - Wrap comment
>> ---
>>  hw/mips/cputimer.c | 13 ++++++++++---
>>  1 file changed, 10 insertions(+), 3 deletions(-)
>>
>> diff --git a/hw/mips/cputimer.c b/hw/mips/cputimer.c
>> index c8b4b00..52570fd 100644
>> --- a/hw/mips/cputimer.c
>> +++ b/hw/mips/cputimer.c
>> @@ -23,6 +23,7 @@
>>  #include "hw/hw.h"
>>  #include "hw/mips/cpudevs.h"
>>  #include "qemu/timer.h"
>> +#include "sysemu/kvm.h"
>>
>>  #define TIMER_FREQ    100 * 1000 * 1000
>>
>> @@ -141,7 +142,13 @@ static void mips_timer_cb (void *opaque)
>>
>>  void cpu_mips_clock_init (CPUMIPSState *env)
>>  {
>> -    env->timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, &mips_timer_cb, env);
>> -    env->CP0_Compare = 0;
>> -    cpu_mips_store_count(env, 1);
>> +    /*
>> +     * If we're in KVM mode, don't start the periodic timer, that is handled in
>> +     * kernel.
>> +     */
>> +    if (!kvm_enabled()) {
>> +        env->timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, &mips_timer_cb, env);
>> +        env->CP0_Compare = 0;
>> +        cpu_mips_store_count(env, 1);
>> +    }
>>  }
>>
> 
> I hate to make you do unrelated changes, but... initializing CP0_Compare
> is unnecessary, it should already be 0;

You mean because of the memset in object_initialize_with_type, when
object_new is called? That wouldn't handle reset, though technically
the reset state of Compare is undefined.

> and for CP0_Count it should not
> be done here. but in cpu_state_reset function.  Then here you can call
> qemu_register_reset to register another reset callback, and call
> cpu_mips_timer_update in that callback.
> 
> I'm asking because while
> 
>     if (!kvm_enabled()) {
>         env->timer = ...
>         qemu_register_reset(...);
>     }
> 
> is fine, changing values of registers conditionally is not.

Okay, makes sense.

> 
> Also, I noticed two things in the implementation of the CPU timer that
> should be fixed:
> 
> 1) right now the hypervisor's frequency is hardcoded to 1/4th of the
> host, while QEMU's is 100 MHz.  It would be nice to make them either
> consistent, or customizable (you can use another ONE_REG interface to
> set CPU parameters).

Agreed. I'm in the middle of fixing the count/compare timer in KVM to be
based on real time (ktime_get()), so I'll make it default to 100MHz to
match QEMU for now. I can imagine it being useful to be able to control
it too depending on whether you're running on a slow FPGA/emulator or
fast silicon.

> 2) in KVM, CP0_Count does not start at the same value on guest reset.
> There is a comment that "Linux doesn't seem to write into COUNT", but
> QEMU does.  So KVM should implement CP0_Count writes and adjust the
> "bias" of the guest CP0_Count.

True, I hadn't considered qemu writing those registers yet.

Am I right that the correct way to prevent clock drift is for
kvm_arch_put_registers to only set the Count register if level !=
KVM_PUT_RUNTIME_STATE?

> In fact, right now kvm_mips_te_put_cp0_registers should always return
> -EINVAL because KVM_REG_MIPS_CP0_COUNT is not handled in
> kvm_mips_get/set_reg.  Am I missing something?

Yes, you appear to be right!

Thanks a lot for reviewing

Cheers
James
Paolo Bonzini March 20, 2014, 10:36 p.m. UTC | #3
Il 20/03/2014 10:57, James Hogan ha scritto:
> On 19/03/14 16:29, Paolo Bonzini wrote:
>> Il 14/03/2014 13:47, James Hogan ha scritto:
>>> From: Sanjay Lal <sanjayl@kymasys.com>
>>>
>>> Compare/Count timer interrupts are handled in-kernel for KVM, so don't
>>> bother starting it in QEMU.
>>>
>>> Signed-off-by: Sanjay Lal <sanjayl@kymasys.com>
>>> Signed-off-by: James Hogan <james.hogan@imgtec.com>
>>> Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
>>> ---
>>> Changes in v2:
>>>  - Expand commit message
>>>  - Rebase on v1.7.0
>>>  - Wrap comment
>>> ---
>>>  hw/mips/cputimer.c | 13 ++++++++++---
>>>  1 file changed, 10 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/hw/mips/cputimer.c b/hw/mips/cputimer.c
>>> index c8b4b00..52570fd 100644
>>> --- a/hw/mips/cputimer.c
>>> +++ b/hw/mips/cputimer.c
>>> @@ -23,6 +23,7 @@
>>>  #include "hw/hw.h"
>>>  #include "hw/mips/cpudevs.h"
>>>  #include "qemu/timer.h"
>>> +#include "sysemu/kvm.h"
>>>
>>>  #define TIMER_FREQ    100 * 1000 * 1000
>>>
>>> @@ -141,7 +142,13 @@ static void mips_timer_cb (void *opaque)
>>>
>>>  void cpu_mips_clock_init (CPUMIPSState *env)
>>>  {
>>> -    env->timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, &mips_timer_cb, env);
>>> -    env->CP0_Compare = 0;
>>> -    cpu_mips_store_count(env, 1);
>>> +    /*
>>> +     * If we're in KVM mode, don't start the periodic timer, that is handled in
>>> +     * kernel.
>>> +     */
>>> +    if (!kvm_enabled()) {
>>> +        env->timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, &mips_timer_cb, env);
>>> +        env->CP0_Compare = 0;
>>> +        cpu_mips_store_count(env, 1);
>>> +    }
>>>  }
>>>
>>
>> I hate to make you do unrelated changes, but... initializing CP0_Compare
>> is unnecessary, it should already be 0;
> 
> You mean because of the memset in object_initialize_with_type, when
> object_new is called? Although that wouldn't handle reset, although
> technically the reset state of Compare is undefined.

No, see mips_cpu_reset:

static void mips_cpu_reset(CPUState *s)
{
    MIPSCPU *cpu = MIPS_CPU(s);
    MIPSCPUClass *mcc = MIPS_CPU_GET_CLASS(cpu);
    CPUMIPSState *env = &cpu->env;

    mcc->parent_reset(s);

    memset(env, 0, offsetof(CPUMIPSState, mvp));
    tlb_flush(s, 1);

    cpu_state_reset(env);
}

Fields before mvp are reset to zero (including CP0_Compare and CP0_Count).
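
To make the idiom concrete, here is a small standalone sketch of zeroing
only the fields declared before a marker field. DemoState and demo_reset
are hypothetical stand-ins for CPUMIPSState and mips_cpu_reset, and the
field after the marker is invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for CPUMIPSState: everything declared before
 * `mvp` is cleared on reset, everything from `mvp` onward is preserved. */
typedef struct DemoState {
    uint32_t cp0_count;    /* before the marker: zeroed on reset */
    uint32_t cp0_compare;  /* before the marker: zeroed on reset */
    void *mvp;             /* marker: first field that survives reset */
    int cpu_model_id;      /* invented field: survives reset */
} DemoState;

static void demo_reset(DemoState *env)
{
    assert(env != NULL);
    /* offsetof gives the byte offset of `mvp`, so exactly the bytes in
     * front of it are cleared, as in the memset quoted above. */
    memset(env, 0, offsetof(DemoState, mvp));
}
```

After demo_reset, cp0_count and cp0_compare read as 0 while mvp and
cpu_model_id keep whatever values they had before the reset.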

> Am I right that the correct way to prevent clock drift is for
> kvm_arch_put_registers to only set the Count register if level !=
> KVM_PUT_RUNTIME_STATE?

Yes, that makes sense.  Or, better, do not provide a set_onereg 
interface for CP0_Count.  Instead, in the kernel you can base the CPU 
timer on the value of CLOCK_MONOTONIC, like this:

+static inline u64 get_monotonic_ns(void)
+{
+	struct timespec ts;
+
+	ktime_get_ts(&ts);
+	return timespec_to_ns(&ts);
+}
+

Then you provide three set_onereg interfaces.  One is normal cp0_count, 
but it is only used if the timer is not running (according to 
cp0_cause).  The second is the rate at which the timer counts 
(cp0_count_hz).  The third is used when the timer is running, and
it is:

	cp0_count_bias
	   = cp0_count * 10^9 / cp0_count_hz - get_monotonic_ns()

So when the timer is running cp0_count is computed as follows:

	cp0_count
	  = (get_monotonic_ns() + cp0_count_bias) * cp0_count_hz / 10^9

QEMU can then set:

  cp0_count = cpu_mips_get_count(env)
  cp0_count_bias =
     cpu_mips_get_count(env) * 10^9 / cp0_count_hz - qemu_get_clock_ns(rt_clock)

Note that QEMU's qemu_get_clock_ns(rt_clock) == kernel's get_monotonic_ns().

So when the guest reads cp0_count (and the timer was running at the time
kvm_arch_put_registers was called), the kernel will return:

	cp0_count
	 = (get_monotonic_ns() + cp0_count_bias) * cp0_count_hz / 10^9
	 = env->cp0_count
	   + (get_monotonic_ns() - qemu_get_clock_ns(rt_clock)
	      + qemu_get_clock_ns(vm_clock)) * cp0_count_hz / 10^9
	 = env->cp0_count + qemu_get_clock_ns(vm_clock) * cp0_count_hz / 10^9
	 = cpu_mips_get_count(env)
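
The bias arithmetic above can be checked with a small standalone sketch.
The helper names count_bias_ns and count_at are hypothetical, the clock
value is passed in as a plain parameter instead of being read from
CLOCK_MONOTONIC, and the count is truncated to 32 bits on the assumption
that it models the 32-bit CP0_Count register. The conversion back to a
count is split into whole seconds and a remainder only to avoid 64-bit
overflow; it is otherwise the same formula.

```c
#include <assert.h>
#include <stdint.h>

#define NS_PER_SEC UINT64_C(1000000000)

/* Hypothetical helper: bias in ns such that the timer reads `count`
 * at host time `now_ns`:
 *   bias = count * 10^9 / hz - now */
static int64_t count_bias_ns(uint32_t count, uint64_t hz, int64_t now_ns)
{
    assert(hz > 0);
    return (int64_t)((uint64_t)count * NS_PER_SEC / hz) - now_ns;
}

/* Hypothetical helper: count seen at host time `now_ns`:
 *   count = (now + bias) * hz / 10^9, truncated to 32 bits. */
static uint32_t count_at(int64_t bias_ns, uint64_t hz, int64_t now_ns)
{
    int64_t guest_ns = now_ns + bias_ns;

    assert(hz > 0 && guest_ns >= 0);
    /* Split into seconds and remainder so the products stay in range. */
    uint64_t secs = (uint64_t)guest_ns / NS_PER_SEC;
    uint64_t rem = (uint64_t)guest_ns % NS_PER_SEC;
    return (uint32_t)(secs * hz + rem * hz / NS_PER_SEC);
}
```

For example, at 100 MHz a bias computed for a count of 12345 reproduces
12345 when read at the same instant, and the count then advances by 100
for every microsecond of host time.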
	
Paolo

Patch

diff --git a/hw/mips/cputimer.c b/hw/mips/cputimer.c
index c8b4b00..52570fd 100644
--- a/hw/mips/cputimer.c
+++ b/hw/mips/cputimer.c
@@ -23,6 +23,7 @@ 
 #include "hw/hw.h"
 #include "hw/mips/cpudevs.h"
 #include "qemu/timer.h"
+#include "sysemu/kvm.h"
 
 #define TIMER_FREQ	100 * 1000 * 1000
 
@@ -141,7 +142,13 @@  static void mips_timer_cb (void *opaque)
 
 void cpu_mips_clock_init (CPUMIPSState *env)
 {
-    env->timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, &mips_timer_cb, env);
-    env->CP0_Compare = 0;
-    cpu_mips_store_count(env, 1);
+    /*
+     * If we're in KVM mode, don't start the periodic timer, that is handled in
+     * kernel.
+     */
+    if (!kvm_enabled()) {
+        env->timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, &mips_timer_cb, env);
+        env->CP0_Compare = 0;
+        cpu_mips_store_count(env, 1);
+    }
 }