Message ID | 20230316051025.1424093-3-kconsul@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | Rejected, archived |
Headers | show |
Series | Improving calls to kvmppc_hv_entry | expand |
Context | Check | Description |
---|---|---|
snowpatch_ozlabs/github-powerpc_ppctests | success | Successfully ran 8 jobs. |
snowpatch_ozlabs/github-powerpc_selftests | success | Successfully ran 8 jobs. |
snowpatch_ozlabs/github-powerpc_clang | success | Successfully ran 6 jobs. |
snowpatch_ozlabs/github-powerpc_kernel_qemu | success | Successfully ran 24 jobs. |
snowpatch_ozlabs/github-powerpc_sparse | success | Successfully ran 4 jobs. |
On Thu Mar 16, 2023 at 3:10 PM AEST, Kautuk Consul wrote: > kvmppc_hv_entry is called from only 2 locations within > book3s_hv_rmhandlers.S. Both of those locations set r4 > as HSTATE_KVM_VCPU(r13) before calling kvmppc_hv_entry. > So, shift the r4 load instruction to kvmppc_hv_entry and > thus modify the calling convention of this function. Is r4 there only used for CONFIG_KVM_BOOK3S_HV_P8_TIMING? Could put it under there. Although you then lose the barrier if it's disabled, that is okay if you're sure that's the only memory operation being ordered. I'm not sure how much new work we want to put into changing this asm code, since it's POWER7/8 only. I would love to move this (and the other) KVM implementations to C like we did with P9. It's a pretty big job though. Thanks, Nick > > Signed-off-by: Kautuk Consul <kconsul@linux.vnet.ibm.com> > --- > arch/powerpc/kvm/book3s_hv_rmhandlers.S | 11 ++++++----- > 1 file changed, 6 insertions(+), 5 deletions(-) > > diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S > index b81ba4ee0521..b61f0b2c677b 100644 > --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S > +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S > @@ -85,7 +85,7 @@ _GLOBAL_TOC(kvmppc_hv_entry_trampoline) > RFI_TO_KERNEL > > kvmppc_call_hv_entry: > - ld r4, HSTATE_KVM_VCPU(r13) > + /* Enter guest. */ > bl kvmppc_hv_entry > > /* Back from guest - restore host state and return to caller */ > @@ -352,9 +352,7 @@ kvm_secondary_got_guest: > mtspr SPRN_LDBAR, r0 > isync > 63: > - /* Order load of vcpu after load of vcore */ > - lwsync > - ld r4, HSTATE_KVM_VCPU(r13) > + /* Enter guest. */ > bl kvmppc_hv_entry > > /* Back from the guest, go back to nap */ > @@ -506,7 +504,6 @@ SYM_INNER_LABEL(kvmppc_hv_entry, SYM_L_LOCAL) > > /* Required state: > * > - * R4 = vcpu pointer (or NULL) > * MSR = ~IR|DR > * R13 = PACA > * R1 = host R1 > @@ -524,6 +521,10 @@ SYM_INNER_LABEL(kvmppc_hv_entry, SYM_L_LOCAL) > li r6, KVM_GUEST_MODE_HOST_HV > stb r6, HSTATE_IN_GUEST(r13) > > + /* Order load of vcpu after load of vcore */ > + lwsync > + ld r4, HSTATE_KVM_VCPU(r13) > + > #ifdef CONFIG_KVM_BOOK3S_HV_P8_TIMING > /* Store initial timestamp */ > cmpdi r4, 0 > -- > 2.39.2
Hi, On 2023-03-21 14:15:14, Nicholas Piggin wrote: > On Thu Mar 16, 2023 at 3:10 PM AEST, Kautuk Consul wrote: > > kvmppc_hv_entry is called from only 2 locations within > > book3s_hv_rmhandlers.S. Both of those locations set r4 > > as HSTATE_KVM_VCPU(r13) before calling kvmppc_hv_entry. > > So, shift the r4 load instruction to kvmppc_hv_entry and > > thus modify the calling convention of this function. > > Is r4 there only used for CONFIG_KVM_BOOK3S_HV_P8_TIMING? Could put it > under there. Although you then lose the barrier if it's disabled, that > is okay if you're sure that's the only memory operation being ordered. r4 is also used by the secondary_too_late label. So I decided against moving it inside the CONFIG_KVM_BOOK3S_HV_P8_TIMING #ifdef as then I would need to again load from HSTATE_KVM_VCPU(r13) in secondary_too_late. > > I'm not sure how much new work we want to put into changing this asm > code, since it's POWER7/8 only. I would love to move this (and the > other) KVM implementations to C like we did with P9. It's a pretty big > job though. Yeah. I was just reviewing this code and decided to send this small improvement to the mailing list. I will check with my team and ask them if we could move this implementation to C. > > Thanks, > Nick > > > > > Signed-off-by: Kautuk Consul <kconsul@linux.vnet.ibm.com> > > --- > > arch/powerpc/kvm/book3s_hv_rmhandlers.S | 11 ++++++----- > > 1 file changed, 6 insertions(+), 5 deletions(-) > > > > diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S > > index b81ba4ee0521..b61f0b2c677b 100644 > > --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S > > +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S > > @@ -85,7 +85,7 @@ _GLOBAL_TOC(kvmppc_hv_entry_trampoline) > > RFI_TO_KERNEL > > > > kvmppc_call_hv_entry: > > - ld r4, HSTATE_KVM_VCPU(r13) > > + /* Enter guest. */ > > bl kvmppc_hv_entry > > > > /* Back from guest - restore host state and return to caller */ > > @@ -352,9 +352,7 @@ kvm_secondary_got_guest: > > mtspr SPRN_LDBAR, r0 > > isync > > 63: > > - /* Order load of vcpu after load of vcore */ > > - lwsync > > - ld r4, HSTATE_KVM_VCPU(r13) > > + /* Enter guest. */ > > bl kvmppc_hv_entry > > > > /* Back from the guest, go back to nap */ > > @@ -506,7 +504,6 @@ SYM_INNER_LABEL(kvmppc_hv_entry, SYM_L_LOCAL) > > > > /* Required state: > > * > > - * R4 = vcpu pointer (or NULL) > > * MSR = ~IR|DR > > * R13 = PACA > > * R1 = host R1 > > @@ -524,6 +521,10 @@ SYM_INNER_LABEL(kvmppc_hv_entry, SYM_L_LOCAL) > > li r6, KVM_GUEST_MODE_HOST_HV > > stb r6, HSTATE_IN_GUEST(r13) > > > > + /* Order load of vcpu after load of vcore */ > > + lwsync > > + ld r4, HSTATE_KVM_VCPU(r13) > > + > > #ifdef CONFIG_KVM_BOOK3S_HV_P8_TIMING > > /* Store initial timestamp */ > > cmpdi r4, 0 > > -- > > 2.39.2 >
On 2023-03-21 10:24:36, Kautuk Consul wrote: > > Is r4 there only used for CONFIG_KVM_BOOK3S_HV_P8_TIMING? Could put it > > under there. Although you then lose the barrier if it's disabled, that > > is okay if you're sure that's the only memory operation being ordered. > r4 is also used by the secondary_too_late label. So I decided against > moving it inside the CONFIG_KVM_BOOK3S_HV_P8_TIMING #ifdef as then I > would need to again load from HSTATE_KVM_VCPU(r13) in secondary_too_late. > > Sorry, forgot to mention, r4 is also being used in the 10f label. Thats one more reason to not shift the r4 load into the #ifdef. > > I'm not sure how much new work we want to put into changing this asm > > code, since it's POWER7/8 only. I would love to move this (and the > > other) KVM implementations to C like we did with P9. It's a pretty big > > job though. > Yeah. I was just reviewing this code and decided to send this small > improvement to the mailing list. I will check with my team and ask > them if we could move this implementation to C. > > > > Thanks, > > Nick > > > > > > > > Signed-off-by: Kautuk Consul <kconsul@linux.vnet.ibm.com> > > > --- > > > arch/powerpc/kvm/book3s_hv_rmhandlers.S | 11 ++++++----- > > > 1 file changed, 6 insertions(+), 5 deletions(-) > > > > > > diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S > > > index b81ba4ee0521..b61f0b2c677b 100644 > > > --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S > > > +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S > > > @@ -85,7 +85,7 @@ _GLOBAL_TOC(kvmppc_hv_entry_trampoline) > > > RFI_TO_KERNEL > > > > > > kvmppc_call_hv_entry: > > > - ld r4, HSTATE_KVM_VCPU(r13) > > > + /* Enter guest. */ > > > bl kvmppc_hv_entry > > > > > > /* Back from guest - restore host state and return to caller */ > > > @@ -352,9 +352,7 @@ kvm_secondary_got_guest: > > > mtspr SPRN_LDBAR, r0 > > > isync > > > 63: > > > - /* Order load of vcpu after load of vcore */ > > > - lwsync > > > - ld r4, HSTATE_KVM_VCPU(r13) > > > + /* Enter guest. */ > > > bl kvmppc_hv_entry > > > > > > /* Back from the guest, go back to nap */ > > > @@ -506,7 +504,6 @@ SYM_INNER_LABEL(kvmppc_hv_entry, SYM_L_LOCAL) > > > > > > /* Required state: > > > * > > > - * R4 = vcpu pointer (or NULL) > > > * MSR = ~IR|DR > > > * R13 = PACA > > > * R1 = host R1 > > > @@ -524,6 +521,10 @@ SYM_INNER_LABEL(kvmppc_hv_entry, SYM_L_LOCAL) > > > li r6, KVM_GUEST_MODE_HOST_HV > > > stb r6, HSTATE_IN_GUEST(r13) > > > > > > + /* Order load of vcpu after load of vcore */ > > > + lwsync > > > + ld r4, HSTATE_KVM_VCPU(r13) > > > + > > > #ifdef CONFIG_KVM_BOOK3S_HV_P8_TIMING > > > /* Store initial timestamp */ > > > cmpdi r4, 0 > > > -- > > > 2.39.2 > >
Kautuk Consul <kconsul@linux.vnet.ibm.com> writes: > kvmppc_hv_entry is called from only 2 locations within > book3s_hv_rmhandlers.S. Both of those locations set r4 > as HSTATE_KVM_VCPU(r13) before calling kvmppc_hv_entry. > So, shift the r4 load instruction to kvmppc_hv_entry and > thus modify the calling convention of this function. > > Signed-off-by: Kautuk Consul <kconsul@linux.vnet.ibm.com> > --- > arch/powerpc/kvm/book3s_hv_rmhandlers.S | 11 ++++++----- > 1 file changed, 6 insertions(+), 5 deletions(-) > > diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S > index b81ba4ee0521..b61f0b2c677b 100644 > --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S > +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S > @@ -85,7 +85,7 @@ _GLOBAL_TOC(kvmppc_hv_entry_trampoline) > RFI_TO_KERNEL > > kvmppc_call_hv_entry: > - ld r4, HSTATE_KVM_VCPU(r13) > + /* Enter guest. */ > bl kvmppc_hv_entry > > /* Back from guest - restore host state and return to caller */ > @@ -352,9 +352,7 @@ kvm_secondary_got_guest: > mtspr SPRN_LDBAR, r0 > isync > 63: > - /* Order load of vcpu after load of vcore */ > - lwsync > - ld r4, HSTATE_KVM_VCPU(r13) > + /* Enter guest. */ > bl kvmppc_hv_entry > > /* Back from the guest, go back to nap */ > @@ -506,7 +504,6 @@ SYM_INNER_LABEL(kvmppc_hv_entry, SYM_L_LOCAL) > > /* Required state: > * > - * R4 = vcpu pointer (or NULL) > * MSR = ~IR|DR > * R13 = PACA > * R1 = host R1 > @@ -524,6 +521,10 @@ SYM_INNER_LABEL(kvmppc_hv_entry, SYM_L_LOCAL) > li r6, KVM_GUEST_MODE_HOST_HV > stb r6, HSTATE_IN_GUEST(r13) > > + /* Order load of vcpu after load of vcore */ Just copying the comment here doesn't work. It doesn't make sense on its own here, because the VCORE is loaded (again) a few lines below (536). So as written this comment seems backward vs the code. The comment would need to expand to explain that the barrier is for the case where we came from kvm_secondary_got_guest. > + lwsync > + ld r4, HSTATE_KVM_VCPU(r13) > + > #ifdef CONFIG_KVM_BOOK3S_HV_P8_TIMING > /* Store initial timestamp */ > cmpdi r4, 0 But as Nick says I don't think it's worth investing effort in small tweaks to this code. The risk of introducing bugs is too high for such a small improvement to the code. Thanks for trying, but I think this asm code is best left more or less alone unless we find actual bugs in it - or unless we can make substantial improvements to it, which would be rewriting in C, or at least converting to a fully call/return style rather than the current forest of labels. I will take patch 1 though, as that's an obvious cleanup and poses no risk (famous last words :D). cheers
Hi, On 2023-03-22 23:17:35, Michael Ellerman wrote: > Kautuk Consul <kconsul@linux.vnet.ibm.com> writes: > > kvmppc_hv_entry is called from only 2 locations within > > book3s_hv_rmhandlers.S. Both of those locations set r4 > > as HSTATE_KVM_VCPU(r13) before calling kvmppc_hv_entry. > > So, shift the r4 load instruction to kvmppc_hv_entry and > > thus modify the calling convention of this function. > > > > Signed-off-by: Kautuk Consul <kconsul@linux.vnet.ibm.com> > > --- > > arch/powerpc/kvm/book3s_hv_rmhandlers.S | 11 ++++++----- > > 1 file changed, 6 insertions(+), 5 deletions(-) > > > > diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S > > index b81ba4ee0521..b61f0b2c677b 100644 > > --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S > > +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S > > @@ -85,7 +85,7 @@ _GLOBAL_TOC(kvmppc_hv_entry_trampoline) > > RFI_TO_KERNEL > > > > kvmppc_call_hv_entry: > > - ld r4, HSTATE_KVM_VCPU(r13) > > + /* Enter guest. */ > > bl kvmppc_hv_entry > > > > /* Back from guest - restore host state and return to caller */ > > @@ -352,9 +352,7 @@ kvm_secondary_got_guest: > > mtspr SPRN_LDBAR, r0 > > isync > > 63: > > - /* Order load of vcpu after load of vcore */ > > - lwsync > > - ld r4, HSTATE_KVM_VCPU(r13) > > + /* Enter guest. */ > > bl kvmppc_hv_entry > > > > /* Back from the guest, go back to nap */ > > @@ -506,7 +504,6 @@ SYM_INNER_LABEL(kvmppc_hv_entry, SYM_L_LOCAL) > > > > /* Required state: > > * > > - * R4 = vcpu pointer (or NULL) > > * MSR = ~IR|DR > > * R13 = PACA > > * R1 = host R1 > > @@ -524,6 +521,10 @@ SYM_INNER_LABEL(kvmppc_hv_entry, SYM_L_LOCAL) > > li r6, KVM_GUEST_MODE_HOST_HV > > stb r6, HSTATE_IN_GUEST(r13) > > > > + /* Order load of vcpu after load of vcore */ > > Just copying the comment here doesn't work. It doesn't make sense on its > own here, because the VCORE is loaded (again) a few lines below (536). > So as written this comment seems backward vs the code. > > The comment would need to expand to explain that the barrier is for the > case where we came from kvm_secondary_got_guest. > Agreed. > > + lwsync > > + ld r4, HSTATE_KVM_VCPU(r13) > > + > > #ifdef CONFIG_KVM_BOOK3S_HV_P8_TIMING > > /* Store initial timestamp */ > > cmpdi r4, 0 > > > But as Nick says I don't think it's worth investing effort in small > tweaks to this code. The risk of introducing bugs is too high for such a > small improvement to the code. > > Thanks for trying, but I think this asm code is best left more or less > alone unless we find actual bugs in it - or unless we can make > substantial improvements to it, which would be rewriting in C, or at > least converting to a fully call/return style rather than the current > forest of labels. > > I will take patch 1 though, as that's an obvious cleanup and poses no > risk (famous last words :D). Thanks for taking patch 1. I completely agree that it would not be wise to introduce even small alterations to stable legacy code. But sending patches like this and conversing with this mailing list's reviewers is giving me a good picture of what you are open to accepting at this stage. Thanks again! > > cheers
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S index b81ba4ee0521..b61f0b2c677b 100644 --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S @@ -85,7 +85,7 @@ _GLOBAL_TOC(kvmppc_hv_entry_trampoline) RFI_TO_KERNEL kvmppc_call_hv_entry: - ld r4, HSTATE_KVM_VCPU(r13) + /* Enter guest. */ bl kvmppc_hv_entry /* Back from guest - restore host state and return to caller */ @@ -352,9 +352,7 @@ kvm_secondary_got_guest: mtspr SPRN_LDBAR, r0 isync 63: - /* Order load of vcpu after load of vcore */ - lwsync - ld r4, HSTATE_KVM_VCPU(r13) + /* Enter guest. */ bl kvmppc_hv_entry /* Back from the guest, go back to nap */ @@ -506,7 +504,6 @@ SYM_INNER_LABEL(kvmppc_hv_entry, SYM_L_LOCAL) /* Required state: * - * R4 = vcpu pointer (or NULL) * MSR = ~IR|DR * R13 = PACA * R1 = host R1 @@ -524,6 +521,10 @@ SYM_INNER_LABEL(kvmppc_hv_entry, SYM_L_LOCAL) li r6, KVM_GUEST_MODE_HOST_HV stb r6, HSTATE_IN_GUEST(r13) + /* Order load of vcpu after load of vcore */ + lwsync + ld r4, HSTATE_KVM_VCPU(r13) + #ifdef CONFIG_KVM_BOOK3S_HV_P8_TIMING /* Store initial timestamp */ cmpdi r4, 0
kvmppc_hv_entry is called from only 2 locations within book3s_hv_rmhandlers.S. Both of those locations set r4 as HSTATE_KVM_VCPU(r13) before calling kvmppc_hv_entry. So, shift the r4 load instruction to kvmppc_hv_entry and thus modify the calling convention of this function. Signed-off-by: Kautuk Consul <kconsul@linux.vnet.ibm.com> --- arch/powerpc/kvm/book3s_hv_rmhandlers.S | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)