Message ID | 20230726182230.433945-6-npiggin@gmail.com |
---|---|
State | New |
Series | ppc fixes possibly for 8.1 |
Hello Nick,

On 7/26/23 20:22, Nicholas Piggin wrote:
> When writing a value to the decrementer that raises an exception, the
> irq is raised, but the value is not stored so the store doesn't appear
> to have changed the register when it is read again.
>
> Always store the write value to the register.

This change has a serious performance impact when a guest is run under
PowerNV. Could you please take a look ?

Thanks,

C.

PS: We should really introduce avocado tests for nested.

> Fixes: e81a982aa53 ("PPC: Clean up DECR implementation")
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>  hw/ppc/ppc.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/hw/ppc/ppc.c b/hw/ppc/ppc.c
> index fa60f76dd4..cd1993e9c1 100644
> --- a/hw/ppc/ppc.c
> +++ b/hw/ppc/ppc.c
> @@ -812,6 +812,11 @@ static void __cpu_ppc_store_decr(PowerPCCPU *cpu, uint64_t *nextp,
>          return;
>      }
>
> +    /* Calculate the next timer event */
> +    now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
> +    next = muldiv64(now, tb_env->decr_freq, NANOSECONDS_PER_SECOND) + value;
> +    *nextp = next; /* nextp is in timebase units */
> +
>      /*
>       * Going from 1 -> 0 or 0 -> -1 is the event to generate a DEC interrupt.
>       *
> @@ -833,11 +838,6 @@ static void __cpu_ppc_store_decr(PowerPCCPU *cpu, uint64_t *nextp,
>          (*lower_excp)(cpu);
>      }
>
> -    /* Calculate the next timer event */
> -    now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
> -    next = muldiv64(now, tb_env->decr_freq, NANOSECONDS_PER_SECOND) + value;
> -    *nextp = next; /* nextp is in timebase units */
> -
>      /* Adjust timer */
>      timer_mod(timer, muldiv64(next, NANOSECONDS_PER_SECOND, tb_env->decr_freq));
>  }
On Thu Jul 27, 2023 at 10:26 PM AEST, Cédric Le Goater wrote:
> Hello Nick,
>
> On 7/26/23 20:22, Nicholas Piggin wrote:
> > When writing a value to the decrementer that raises an exception, the
> > irq is raised, but the value is not stored so the store doesn't appear
> > to have changed the register when it is read again.
> >
> > Always store the write value to the register.
>
> This change has a serious performance impact when a guest is run under
> PowerNV. Could you please take a look ?

Yeah, the decrementer load doesn't sign-extend the value correctly as
it should for the large-decrementer option. It makes skiboot detect
the decrementer size as 64 bits instead of 56, and things go bad from
there. KVM seems more affected because it's saving and restoring DEC
frequently.

The fix seems simple but considering the compounding series of bugs
and issues coming up with this, I think it will be better to defer
the decrementer work until 8.2 unfortunately.

Thanks,
Nick

> Thanks,
>
> C.
>
> PS: We should really introduce avocado tests for nested.

Yeah agreed. Both for pseries and powernv, ideally.

Thanks,
Nick
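To illustrate the sign-extension issue Nick describes, here is a minimal sketch of widening an N-bit decrementer value to 64 bits. The helper name `decr_sign_extend` and its signature are hypothetical, chosen for this example; this is not the code in `hw/ppc/ppc.c`, and it assumes `nr_bits` is between 1 and 64.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not QEMU code): sign-extend the low nr_bits of a
 * decrementer value to a full 64-bit register image.
 * Assumes 1 <= nr_bits <= 64.
 */
static uint64_t decr_sign_extend(uint64_t value, unsigned nr_bits)
{
    uint64_t sign = 1ULL << (nr_bits - 1);   /* sign bit of the N-bit field */
    uint64_t mask = (sign << 1) - 1;         /* all ones when nr_bits == 64 */
    uint64_t field = value & mask;           /* discard bits above the field */

    /* Flipping then subtracting the sign bit copies it into the upper bits. */
    return (field ^ sign) - sign;
}
```

With a 56-bit decrementer, `decr_sign_extend(0x00FFFFFFFFFFFFFFULL, 56)` yields `0xFFFFFFFFFFFFFFFF` (that is, -1), while a value with bit 55 clear comes back unchanged. A load that skips this step returns a different bit pattern for negative values, which is the kind of mismatch the message above blames for skiboot's size detection going wrong.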
On 7/30/23 11:40, Nicholas Piggin wrote:
> On Thu Jul 27, 2023 at 10:26 PM AEST, Cédric Le Goater wrote:
>> Hello Nick,
>>
>> On 7/26/23 20:22, Nicholas Piggin wrote:
>>> When writing a value to the decrementer that raises an exception, the
>>> irq is raised, but the value is not stored so the store doesn't appear
>>> to have changed the register when it is read again.
>>>
>>> Always store the write value to the register.
>>
>> This change has a serious performance impact when a guest is run under
>> PowerNV. Could you please take a look ?
>
> Yeah, the decrementer load doesn't sign-extend the value correctly as
> it should for the large-decrementer option. It makes skiboot detect
> the decrementer size as 64 bits instead of 56, and things go bad from
> there. KVM seems more affected because it's saving and restoring DEC
> frequently.
>
> The fix seems simple but considering the compounding series of bugs
> and issues coming up with this, I think it will be better to defer
> the decrementer work until 8.2 unfortunately.

Yes. QEMU 8.1 already has a lot: fixes, tests and models [1].

>> PS: We should really introduce avocado tests for nested.
>
> Yeah agreed. Both for pseries and powernv, ideally.

The same disk image could be used for the 3 HV implementations.
This would be a nice addition to Harsh's series [2].

Thanks,

C.

[1] https://wiki.qemu.org/ChangeLog/8.1#PowerPC
[2] https://patchwork.ozlabs.org/project/qemu-ppc/list/?series=364386
diff --git a/hw/ppc/ppc.c b/hw/ppc/ppc.c
index fa60f76dd4..cd1993e9c1 100644
--- a/hw/ppc/ppc.c
+++ b/hw/ppc/ppc.c
@@ -812,6 +812,11 @@ static void __cpu_ppc_store_decr(PowerPCCPU *cpu, uint64_t *nextp,
         return;
     }
 
+    /* Calculate the next timer event */
+    now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
+    next = muldiv64(now, tb_env->decr_freq, NANOSECONDS_PER_SECOND) + value;
+    *nextp = next; /* nextp is in timebase units */
+
     /*
      * Going from 1 -> 0 or 0 -> -1 is the event to generate a DEC interrupt.
      *
@@ -833,11 +838,6 @@ static void __cpu_ppc_store_decr(PowerPCCPU *cpu, uint64_t *nextp,
         (*lower_excp)(cpu);
     }
 
-    /* Calculate the next timer event */
-    now = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL);
-    next = muldiv64(now, tb_env->decr_freq, NANOSECONDS_PER_SECOND) + value;
-    *nextp = next; /* nextp is in timebase units */
-
     /* Adjust timer */
     timer_mod(timer, muldiv64(next, NANOSECONDS_PER_SECOND, tb_env->decr_freq));
 }
When writing a value to the decrementer that raises an exception, the
irq is raised, but the value is not stored so the store doesn't appear
to have changed the register when it is read again.

Always store the write value to the register.

Fixes: e81a982aa53 ("PPC: Clean up DECR implementation")
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 hw/ppc/ppc.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)
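The lines the patch moves convert between nanoseconds and decrementer ticks. The following standalone sketch mirrors that arithmetic outside of QEMU. The names `muldiv64_sketch`, `decr_next_ticks` and `decr_deadline_ns` are hypothetical stand-ins for this example, and the 128-bit intermediate assumes a compiler that provides `unsigned __int128` (GCC or Clang).

```c
#include <stdint.h>

#define NANOSECONDS_PER_SECOND 1000000000ULL

/*
 * Stand-in for QEMU's muldiv64(): computes a * b / c using a 128-bit
 * intermediate so the product cannot overflow. Assumes c != 0.
 */
static uint64_t muldiv64_sketch(uint64_t a, uint32_t b, uint32_t c)
{
    return (uint64_t)(((unsigned __int128)a * b) / c);
}

/*
 * Mirrors "next = muldiv64(now, decr_freq, NANOSECONDS_PER_SECOND) + value":
 * convert the current time to ticks, then add the value being stored.
 * The result is in timebase units, like *nextp in the patch.
 */
static uint64_t decr_next_ticks(uint64_t now_ns, uint32_t decr_freq,
                                uint64_t value)
{
    return muldiv64_sketch(now_ns, decr_freq, NANOSECONDS_PER_SECOND) + value;
}

/*
 * Mirrors the timer_mod() argument: convert the tick deadline back to
 * nanoseconds.
 */
static uint64_t decr_deadline_ns(uint64_t next_ticks, uint32_t decr_freq)
{
    return muldiv64_sketch(next_ticks, NANOSECONDS_PER_SECOND, decr_freq);
}
```

With an example frequency of 512000000 Hz and `now_ns` of 2000000000, storing the value 100 gives `decr_next_ticks(...) == 1024000100`. Because this computation only depends on the current time and the stored value, it can run before the interrupt checks, which is what lets the patch record `*nextp` on every store.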