[5/6] hw/ppc: Always store the decrementer value

Message ID 20230726182230.433945-6-npiggin@gmail.com
State New
Series ppc fixes possibly for 8.1

Commit Message

Nicholas Piggin July 26, 2023, 6:22 p.m. UTC
When writing a value to the decrementer that raises an exception, the
irq is raised, but the value is not stored so the store doesn't appear
to have changed the register when it is read again.

Always store the write value to the register.

Fixes: e81a982aa53 ("PPC: Clean up DECR implementation")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 hw/ppc/ppc.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)
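
Editorial sketch of the ordering problem described above. This is a toy model, not the QEMU code: ToyDecr, toy_now and the toy_* functions are hypothetical names, and the early return on the exception path is an assumption, since the hunks in this patch do not show that part of the function. It only illustrates why a value that raises the interrupt never shows up on read-back if the "next" bookkeeping is updated after the interrupt check.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy model of a decrementer; not QEMU code. */
typedef struct {
    uint64_t next;      /* tick at which the decrementer reaches zero */
    bool irq_raised;    /* whether the DEC interrupt is asserted */
} ToyDecr;

static uint64_t toy_now; /* stand-in for the current timebase value */

/* Old ordering: the exception path leaves before 'next' is updated. */
static void toy_store_decr_old(ToyDecr *d, int64_t value)
{
    if (value < 0) {
        d->irq_raised = true;
        return;                 /* assumed early return: value is lost */
    }
    d->irq_raised = false;
    d->next = toy_now + (uint64_t)value;
}

/* New ordering: record 'next' first, then decide about the interrupt. */
static void toy_store_decr_new(ToyDecr *d, int64_t value)
{
    d->next = toy_now + (uint64_t)value;
    if (value < 0) {
        d->irq_raised = true;
        return;
    }
    d->irq_raised = false;
}

/* A read derives the register value from 'next', as a load would. */
static int64_t toy_load_decr(const ToyDecr *d)
{
    return (int64_t)(d->next - toy_now);
}
```

In this model, storing -1 with the old ordering raises the interrupt but a later load still returns the previously stored value; with the new ordering the load returns -1.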

Comments

Cédric Le Goater July 27, 2023, 12:26 p.m. UTC | #1
Hello Nick,

On 7/26/23 20:22, Nicholas Piggin wrote:
> When writing a value to the decrementer that raises an exception, the
> irq is raised, but the value is not stored so the store doesn't appear
> to have changed the register when it is read again.
> 
> Always store the write value to the register.

This change has a serious performance impact when a guest is run under
PowerNV. Could you please take a look ?

Thanks,

C.

PS: We should really introduce avocado tests for nested.
  
> Fixes: e81a982aa53 ("PPC: Clean up DECR implementation")
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>   hw/ppc/ppc.c | 10 +++++-----
>   1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/ppc/ppc.c b/hw/ppc/ppc.c
> index fa60f76dd4..cd1993e9c1 100644
> --- a/hw/ppc/ppc.c
> +++ b/hw/ppc/ppc.c
> @@ -812,6 +812,11 @@ static void __cpu_ppc_store_decr(PowerPCCPU *cpu, uint64_t *nextp,
>           return;
>       }
>   
> +    /* Calculate the next timer event */
> +    now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
> +    next = muldiv64(now, tb_env->decr_freq, NANOSECONDS_PER_SECOND) + value;
> +    *nextp = next; /* nextp is in timebase units */
> +
>       /*
>        * Going from 1 -> 0 or 0 -> -1 is the event to generate a DEC interrupt.
>        *
> @@ -833,11 +838,6 @@ static void __cpu_ppc_store_decr(PowerPCCPU *cpu, uint64_t *nextp,
>           (*lower_excp)(cpu);
>       }
>   
> -    /* Calculate the next timer event */
> -    now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
> -    next = muldiv64(now, tb_env->decr_freq, NANOSECONDS_PER_SECOND) + value;
> -    *nextp = next; /* nextp is in timebase units */
> -
>       /* Adjust timer */
>       timer_mod(timer, muldiv64(next, NANOSECONDS_PER_SECOND, tb_env->decr_freq));
>   }
Nicholas Piggin July 30, 2023, 9:40 a.m. UTC | #2
On Thu Jul 27, 2023 at 10:26 PM AEST, Cédric Le Goater wrote:
> Hello Nick,
>
> On 7/26/23 20:22, Nicholas Piggin wrote:
> > When writing a value to the decrementer that raises an exception, the
> > irq is raised, but the value is not stored so the store doesn't appear
> > to have changed the register when it is read again.
> > 
> > Always store the write value to the register.
>
> This change has a serious performance impact when a guest is run under
> PowerNV. Could you please take a look ?

Yeah, the decrementer load doesn't sign-extend the value correctly as
it should for the large-decrementer option. It makes skiboot detect
the decrementer size as 64 bits instead of 56, and things go bad from
there. KVM seems more affected because it's saving and restoring DEC
frequently.
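
Editorial sketch of the sign extension mentioned above, assuming a 56-bit large decrementer as in the message. toy_sign_extend is a hypothetical helper written for illustration; it is not the function QEMU uses and it does not model how skiboot probes the width. It only shows that a 56-bit value with its top bit set must read back as a negative 64-bit number.

```c
#include <stdint.h>

/*
 * Sign-extend the low 'nr_bits' bits of 'value' to 64 bits.
 * Valid for 1 <= nr_bits <= 64. Hypothetical helper, not QEMU code.
 */
static int64_t toy_sign_extend(uint64_t value, unsigned nr_bits)
{
    uint64_t sign = UINT64_C(1) << (nr_bits - 1); /* sign bit of the field */
    uint64_t mask = sign | (sign - 1);            /* low nr_bits set */

    value &= mask;
    /* Flipping then subtracting the sign bit propagates it upwards. */
    return (int64_t)((value ^ sign) - sign);
}
```

For example, toy_sign_extend(0x00FFFFFFFFFFFFFFULL, 56) yields -1, whereas returning the raw value without extension would look like a large positive count to the caller.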

The fix seems simple but considering the compounding series of bugs
and issues coming up with this, I think it will be better to defer
the decrementer work until 8.2 unfortunately.

Thanks,
Nick

> Thanks,
>
> C.
>
> PS: We should really introduce avocado tests for nested.

Yeah agreed. Both for pseries and powernv, ideally.

Thanks,
Nick
Cédric Le Goater July 30, 2023, 4:18 p.m. UTC | #3
On 7/30/23 11:40, Nicholas Piggin wrote:
> On Thu Jul 27, 2023 at 10:26 PM AEST, Cédric Le Goater wrote:
>> Hello Nick,
>>
>> On 7/26/23 20:22, Nicholas Piggin wrote:
>>> When writing a value to the decrementer that raises an exception, the
>>> irq is raised, but the value is not stored so the store doesn't appear
>>> to have changed the register when it is read again.
>>>
>>> Always store the write value to the register.
>>
>> This change has a serious performance impact when a guest is run under
>> PowerNV. Could you please take a look ?
> 
> Yeah, the decrementer load doesn't sign-extend the value correctly as
> it should for the large-decrementer option. It makes skiboot detect
> the decrementer size as 64 bits instead of 56, and things go bad from
> there. KVM seems more affected because it's saving and restoring DEC
> frequently.
> 
> The fix seems simple but considering the compounding series of bugs
> and issues coming up with this, I think it will be better to defer
> the decrementer work until 8.2 unfortunately.

Yes. QEMU 8.1 already has a lot: fixes, tests and models [1].

>> PS: We should really introduce avocado tests for nested.
> 
> Yeah agreed. Both for pseries and powernv, ideally.

The same disk image could be used for the 3 HV implementations. This would
be a nice addition to Harsh's series [2]

Thanks,

C.

[1] https://wiki.qemu.org/ChangeLog/8.1#PowerPC
[2] https://patchwork.ozlabs.org/project/qemu-ppc/list/?series=364386

Patch

diff --git a/hw/ppc/ppc.c b/hw/ppc/ppc.c
index fa60f76dd4..cd1993e9c1 100644
--- a/hw/ppc/ppc.c
+++ b/hw/ppc/ppc.c
@@ -812,6 +812,11 @@  static void __cpu_ppc_store_decr(PowerPCCPU *cpu, uint64_t *nextp,
         return;
     }
 
+    /* Calculate the next timer event */
+    now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
+    next = muldiv64(now, tb_env->decr_freq, NANOSECONDS_PER_SECOND) + value;
+    *nextp = next; /* nextp is in timebase units */
+
     /*
      * Going from 1 -> 0 or 0 -> -1 is the event to generate a DEC interrupt.
      *
@@ -833,11 +838,6 @@  static void __cpu_ppc_store_decr(PowerPCCPU *cpu, uint64_t *nextp,
         (*lower_excp)(cpu);
     }
 
-    /* Calculate the next timer event */
-    now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
-    next = muldiv64(now, tb_env->decr_freq, NANOSECONDS_PER_SECOND) + value;
-    *nextp = next; /* nextp is in timebase units */
-
     /* Adjust timer */
     timer_mod(timer, muldiv64(next, NANOSECONDS_PER_SECOND, tb_env->decr_freq));
 }
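
Editorial sketch of the unit conversion in the moved lines: the current time in nanoseconds is scaled to decrementer ticks before the written value is added, and the result is scaled back to nanoseconds for the timer. The helpers below are hypothetical stand-ins, not QEMU's muldiv64; toy_muldiv64 is only exact while (a % c) * b fits in 64 bits, which holds for the nanosecond and frequency magnitudes used here. The 512 MHz frequency in the usage note is an example value, not taken from the patch.

```c
#include <stdint.h>

#define TOY_NS_PER_SECOND UINT64_C(1000000000)

/*
 * Compute a * b / c without overflowing on a * b, using
 * a * b / c == (a / c) * b + (a % c) * b / c.
 * Assumes (a % c) * b and the final result fit in 64 bits.
 */
static uint64_t toy_muldiv64(uint64_t a, uint64_t b, uint64_t c)
{
    return (a / c) * b + (a % c) * b / c;
}

/* Nanoseconds -> decrementer ticks at 'freq' Hz. */
static uint64_t toy_ns_to_ticks(uint64_t ns, uint64_t freq)
{
    return toy_muldiv64(ns, freq, TOY_NS_PER_SECOND);
}

/* Decrementer ticks -> nanoseconds at 'freq' Hz. */
static uint64_t toy_ticks_to_ns(uint64_t ticks, uint64_t freq)
{
    return toy_muldiv64(ticks, TOY_NS_PER_SECOND, freq);
}

/* Tick at which a decrementer written with 'value' at 'now_ns' expires. */
static uint64_t toy_next_tick(uint64_t now_ns, uint64_t freq, uint64_t value)
{
    return toy_ns_to_ticks(now_ns, freq) + value;
}
```

With a 512 MHz example frequency, writing 512 at the one-second mark gives toy_next_tick(1000000000, 512000000, 512) == 512000512 ticks, which toy_ticks_to_ns maps to a timer deadline of 1000001000 ns.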