Message ID | 20230306123740.3648841-3-kconsul@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | Changes Requested |
Headers | show |
Series | Improving calls to kvmppc_hv_entry | expand |
Context | Check | Description |
---|---|---|
snowpatch_ozlabs/github-powerpc_selftests | success | Successfully ran 8 jobs. |
snowpatch_ozlabs/github-powerpc_ppctests | success | Successfully ran 8 jobs. |
snowpatch_ozlabs/github-powerpc_clang | success | Successfully ran 6 jobs. |
snowpatch_ozlabs/github-powerpc_sparse | success | Successfully ran 4 jobs. |
snowpatch_ozlabs/github-powerpc_kernel_qemu | success | Successfully ran 24 jobs. |
Kautuk Consul <kconsul@linux.vnet.ibm.com> writes:
> kvmppc_hv_entry is called from only 2 locations within
> book3s_hv_rmhandlers.S. Both of those locations set r4
> as HSTATE_KVM_VCPU(r13) before calling kvmppc_hv_entry.
> So, shift the r4 load instruction to kvmppc_hv_entry and
> thus modify the calling convention of this function.
>
> Signed-off-by: Kautuk Consul <kconsul@linux.vnet.ibm.com>
> ---
>  arch/powerpc/kvm/book3s_hv_rmhandlers.S | 9 ++++-----
>  1 file changed, 4 insertions(+), 5 deletions(-)
>
> diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> index b81ba4ee0521..da9a15db12fe 100644
> --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
> @@ -85,7 +85,7 @@ _GLOBAL_TOC(kvmppc_hv_entry_trampoline)
>  	RFI_TO_KERNEL
>
>  kvmppc_call_hv_entry:
> -	ld	r4, HSTATE_KVM_VCPU(r13)
> +	/* Enter guest. */
>  	bl	kvmppc_hv_entry
>
>  	/* Back from guest - restore host state and return to caller */
> @@ -352,9 +352,7 @@ kvm_secondary_got_guest:
>  	mtspr	SPRN_LDBAR, r0
>  	isync
>  63:
> -	/* Order load of vcpu after load of vcore */
> -	lwsync

Where did this barrier go?

I don't see that it's covered by any existing barriers in
kvmppc_hv_entry, and you don't add it back anywhere.

> -	ld	r4, HSTATE_KVM_VCPU(r13)
> +	/* Enter guest. */
>  	bl	kvmppc_hv_entry
>
>  	/* Back from the guest, go back to nap */
> @@ -506,7 +504,6 @@ SYM_INNER_LABEL(kvmppc_hv_entry, SYM_L_LOCAL)
>
>  	/* Required state:
>  	 *
> -	 * R4 = vcpu pointer (or NULL)
>  	 * MSR = ~IR|DR
>  	 * R13 = PACA
>  	 * R1 = host R1
> @@ -524,6 +521,8 @@ SYM_INNER_LABEL(kvmppc_hv_entry, SYM_L_LOCAL)
>  	li	r6, KVM_GUEST_MODE_HOST_HV
>  	stb	r6, HSTATE_IN_GUEST(r13)
>
> +	ld	r4, HSTATE_KVM_VCPU(r13)
> +
>  #ifdef CONFIG_KVM_BOOK3S_HV_P8_TIMING
>  	/* Store initial timestamp */
>  	cmpdi	r4, 0

cheers
On 2023-03-15 15:48:53, Michael Ellerman wrote:
> Kautuk Consul <kconsul@linux.vnet.ibm.com> writes:
> > kvmppc_hv_entry is called from only 2 locations within
> > book3s_hv_rmhandlers.S. Both of those locations set r4
> > as HSTATE_KVM_VCPU(r13) before calling kvmppc_hv_entry.
> > So, shift the r4 load instruction to kvmppc_hv_entry and
> > thus modify the calling convention of this function.
[...]
> > @@ -352,9 +352,7 @@ kvm_secondary_got_guest:
> >  	mtspr	SPRN_LDBAR, r0
> >  	isync
> >  63:
> > -	/* Order load of vcpu after load of vcore */
> > -	lwsync
>
> Where did this barrier go?
>
> I don't see that it's covered by any existing barriers in
> kvmppc_hv_entry, and you don't add it back anywhere.

My concept about this is that since the call to kvmppc_hv_entry is
now taken before the load to r4, shouldn't the pending load in the
pipeline of the HSTATE_KVM_VCORE, as per the earlier comment, be
ordered anyway beforehand? Or do you mean to say that pending loads
may not be cleared/flushed across the "bl <funcname>" boundary?

[...]
On 2023-03-15 10:48:01, Kautuk Consul wrote:
> On 2023-03-15 15:48:53, Michael Ellerman wrote:
[...]
> > Where did this barrier go?
> >
> > I don't see that it's covered by any existing barriers in
> > kvmppc_hv_entry, and you don't add it back anywhere.
>
> My concept about this is that since the call to kvmppc_hv_entry is
> now taken before the load to r4, shouldn't the pending load in the
> pipeline of the HSTATE_KVM_VCORE, as per the earlier comment, be
> ordered anyway beforehand? Or do you mean to say that pending loads
> may not be cleared/flushed across the "bl <funcname>" boundary?

Sorry, I mean: "shouldn't the pending load in the pipeline (of the
HSTATE_KVM_VCORE) as per the earlier comment be ordered anyway
beforehand?" Forgot the parentheses.

[...]
Kautuk Consul <kconsul@linux.vnet.ibm.com> writes:
> On 2023-03-15 15:48:53, Michael Ellerman wrote:
[...]
>> Where did this barrier go?
>>
>> I don't see that it's covered by any existing barriers in
>> kvmppc_hv_entry, and you don't add it back anywhere.
>
> My concept about this is that since the call to kvmppc_hv_entry is
> now taken before the load to r4, shouldn't the pending load in the
> pipeline of the HSTATE_KVM_VCORE, as per the earlier comment, be
> ordered anyway beforehand?

No.

> Or do you mean to say that pending loads may not be
> cleared/flushed across the "bl <funcname>" boundary?

Right.

The "bl" imposes no ordering on loads before or after it.

In general nothing orders two independent loads, other than a barrier.

cheers
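The distinction drawn here maps directly onto the C11 memory model: two independent loads are not ordered with respect to each other unless a barrier (or an acquire operation) sits between them, regardless of any intervening function call. A minimal sketch of the vcore/vcpu publish-then-read pattern follows; the names are illustrative stand-ins for the HSTATE fields, not the kernel's actual definitions, and the acquire load plays the role the lwsync plays on Power:

```c
#include <stdatomic.h>
#include <stddef.h>

/* Illustrative stand-ins for the HSTATE fields; these names are
 * assumptions for the sketch, not the kernel's real structures. */
struct vcore { int id; };
struct vcpu  { int ready; };

static _Atomic(struct vcore *) hstate_kvm_vcore;
static _Atomic(struct vcpu *)  hstate_kvm_vcpu;

/* Writer: store the vcpu first, then publish the vcore.  The release
 * store is the producer-side barrier. */
void publish(struct vcpu *v, struct vcore *vc)
{
	atomic_store_explicit(&hstate_kvm_vcpu, v, memory_order_relaxed);
	atomic_store_explicit(&hstate_kvm_vcore, vc, memory_order_release);
}

/* Reader: load the vcore, then the vcpu.  Without the acquire (the
 * analogue of the lwsync under discussion), nothing orders these two
 * independent loads, and the vcpu load could observe a stale value. */
struct vcpu *consume(void)
{
	struct vcore *vc = atomic_load_explicit(&hstate_kvm_vcore,
						memory_order_acquire);
	if (vc == NULL)
		return NULL;
	return atomic_load_explicit(&hstate_kvm_vcpu, memory_order_relaxed);
}
```

Weakening the acquire to relaxed here is the C-level equivalent of deleting the lwsync without replacing it: the single-threaded behaviour is unchanged, but a second CPU may see the two loads in either order.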
Michael Ellerman <mpe@ellerman.id.au> writes:
> Kautuk Consul <kconsul@linux.vnet.ibm.com> writes:
[...]
> The "bl" imposes no ordering on loads before or after it.
>
> In general nothing orders two independent loads, other than a barrier.

There's some docs on barriers here:

  https://www.kernel.org/doc/Documentation/memory-barriers.txt

Though admittedly it is pretty dense.

cheers
Hi,

On 2023-03-16 14:39:08, Michael Ellerman wrote:
> Kautuk Consul <kconsul@linux.vnet.ibm.com> writes:
[...]
> > Or do you mean to say that pending loads may not be
> > cleared/flushed across the "bl <funcname>" boundary?
>
> Right.
>
> The "bl" imposes no ordering on loads before or after it.
>
> In general nothing orders two independent loads, other than a barrier.
>
> cheers

Okay, I will post a patch v3 with lwsync before the load to r4 in
kvmppc_hv_entry. Thanks.
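Based on the conclusion above, the promised v3 would presumably carry the barrier and its comment into kvmppc_hv_entry alongside the relocated load, something like the following sketch (this is an inference from the thread, not the actual posted v3):

```asm
SYM_INNER_LABEL(kvmppc_hv_entry, SYM_L_LOCAL)
	...
	li	r6, KVM_GUEST_MODE_HOST_HV
	stb	r6, HSTATE_IN_GUEST(r13)

	/* Order load of vcpu after load of vcore */
	lwsync
	ld	r4, HSTATE_KVM_VCPU(r13)
```

This preserves the original ordering guarantee on the secondary-thread entry path while keeping the single load site that the patch was aiming for, at the cost of executing the lwsync on both callers.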
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index b81ba4ee0521..da9a15db12fe 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -85,7 +85,7 @@ _GLOBAL_TOC(kvmppc_hv_entry_trampoline)
 	RFI_TO_KERNEL
 
 kvmppc_call_hv_entry:
-	ld	r4, HSTATE_KVM_VCPU(r13)
+	/* Enter guest. */
 	bl	kvmppc_hv_entry
 
 	/* Back from guest - restore host state and return to caller */
@@ -352,9 +352,7 @@ kvm_secondary_got_guest:
 	mtspr	SPRN_LDBAR, r0
 	isync
 63:
-	/* Order load of vcpu after load of vcore */
-	lwsync
-	ld	r4, HSTATE_KVM_VCPU(r13)
+	/* Enter guest. */
 	bl	kvmppc_hv_entry
 
 	/* Back from the guest, go back to nap */
@@ -506,7 +504,6 @@ SYM_INNER_LABEL(kvmppc_hv_entry, SYM_L_LOCAL)
 
 	/* Required state:
 	 *
-	 * R4 = vcpu pointer (or NULL)
 	 * MSR = ~IR|DR
 	 * R13 = PACA
 	 * R1 = host R1
@@ -524,6 +521,8 @@ SYM_INNER_LABEL(kvmppc_hv_entry, SYM_L_LOCAL)
 	li	r6, KVM_GUEST_MODE_HOST_HV
 	stb	r6, HSTATE_IN_GUEST(r13)
 
+	ld	r4, HSTATE_KVM_VCPU(r13)
+
 #ifdef CONFIG_KVM_BOOK3S_HV_P8_TIMING
 	/* Store initial timestamp */
 	cmpdi	r4, 0
kvmppc_hv_entry is called from only 2 locations within
book3s_hv_rmhandlers.S. Both of those locations set r4
as HSTATE_KVM_VCPU(r13) before calling kvmppc_hv_entry.
So, shift the r4 load instruction to kvmppc_hv_entry and
thus modify the calling convention of this function.

Signed-off-by: Kautuk Consul <kconsul@linux.vnet.ibm.com>
---
 arch/powerpc/kvm/book3s_hv_rmhandlers.S | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)