Message ID | 20241105132734.499506-1-gautam@linux.ibm.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | arch/powerpc/pseries: Fix KVM guest detection for disabling hardlockup detector | expand |
Gautam Menghani <gautam@linux.ibm.com> writes: > As per the kernel documentation[1], hardlockup detector should be > disabled in KVM guests as it may give false positives. On PPC, hardlockup > detector is broken inside KVM guests because disable_hardlockup_detector() Isn't it the opposite? Inside KVM guests, the hardlockup detector should be *disabled*, but it's not it's *enabled*, due to this bug. ie. it's not broken, it's working, but that's the bug. > is marked as early_initcall and it uses is_kvm_guest(), which is > initialized by check_kvm_guest() later during boot as it is a > core_initcall. check_kvm_guest() is also called in pSeries_smp_probe(), > which is called before initcalls, but it is skipped if KVM guest does > not have doorbell support or if the guest is launched with SMT=1. I'm wondering how no one has noticed. Most KVM guests have SMT=1. > Move the check_kvm_guest() call in pSeries_smp_probe() to the initial > part of function before doorbell/SMT checks so that "kvm_guest" static > key is initialized by the time disable_hardlockup_detector() runs. check_kvm_guest() is safe to be called multiple times so disable_hardlockup_detector() should just call it before it calls is_kvm_guest(). That should avoid future breakage when the order of calls changes, or someone refactors pSeries_smp_probe(). Can you identify the commit that broke this and include a Fixes: tag please. cheers > [1]: Documentation/admin-guide/sysctl/kernel.rst > > Signed-off-by: Gautam Menghani <gautam@linux.ibm.com> > --- > arch/powerpc/platforms/pseries/smp.c | 9 +++++++-- > 1 file changed, 7 insertions(+), 2 deletions(-) > > diff --git a/arch/powerpc/platforms/pseries/smp.c b/arch/powerpc/platforms/pseries/smp.c > index c597711ef20a..516c7bfec933 100644 > --- a/arch/powerpc/platforms/pseries/smp.c > +++ b/arch/powerpc/platforms/pseries/smp.c > @@ -199,6 +199,13 @@ static __init void pSeries_smp_probe(void) > else > xics_smp_probe(); > > + /* > + * Make sure this is called regardless of doorbell/SMT status, as > + * we disable hardlockup detector in an early_initcall where we need to > + * know KVM status for disabling hardlockup detector in KVM guests. > + */ > + check_kvm_guest(); > + > /* No doorbell facility, must use the interrupt controller for IPIs */ > if (!cpu_has_feature(CPU_FTR_DBELL)) > return; > @@ -207,8 +214,6 @@ static __init void pSeries_smp_probe(void) > if (!cpu_has_feature(CPU_FTR_SMT)) > return; > > - check_kvm_guest(); > - > if (is_kvm_guest()) { > /* > * KVM emulates doorbells by disabling FSCR[MSGP] so msgsndp > -- > 2.47.0
On Thu, Nov 07, 2024 at 10:54:29PM +1100, Michael Ellerman wrote: > Gautam Menghani <gautam@linux.ibm.com> writes: > > As per the kernel documentation[1], hardlockup detector should be > > disabled in KVM guests as it may give false positives. On PPC, hardlockup > > detector is broken inside KVM guests because disable_hardlockup_detector() > > Isn't it the opposite? Inside KVM guests, the hardlockup detector should > be *disabled*, but it's not it's *enabled*, due to this bug. > > ie. it's not broken, it's working, but that's the bug. Yes right, will change the description in v2. > > > is marked as early_initcall and it uses is_kvm_guest(), which is > > initialized by check_kvm_guest() later during boot as it is a > > core_initcall. check_kvm_guest() is also called in pSeries_smp_probe(), > > which is called before initcalls, but it is skipped if KVM guest does > > not have doorbell support or if the guest is launched with SMT=1. > > I'm wondering how no one has noticed. Most KVM guests have SMT=1. Looking at the commit history, code around hardlockups and pSeries_smp_probe() was changed around 2021/2022 timeframe, and I believe KVM wasn't being actively tested at the time. Even I noticed this only after coming across the documentation that said hardlockups should be disabled. So probably this feature decision isn't widely known. > > > Move the check_kvm_guest() call in pSeries_smp_probe() to the initial > > part of function before doorbell/SMT checks so that "kvm_guest" static > > key is initialized by the time disable_hardlockup_detector() runs. > > check_kvm_guest() is safe to be called multiple times so > disable_hardlockup_detector() should just call it before it calls > is_kvm_guest(). That should avoid future breakage when the order of > calls changes, or someone refactors pSeries_smp_probe(). Yeah I did that initially but in the worst case, that results in 3 calls to check_kvm_guest() - the core_initcall, pseries_smp_probe() call and then disable_hardlockup_detector(). Will that be fine? > > Can you identify the commit that broke this and include a Fixes: tag > please. Yes will do Thanks, Gautam
Gautam Menghani <gautam@linux.ibm.com> writes: > On Thu, Nov 07, 2024 at 10:54:29PM +1100, Michael Ellerman wrote: >> Gautam Menghani <gautam@linux.ibm.com> writes: >> > As per the kernel documentation[1], hardlockup detector should be >> > disabled in KVM guests as it may give false positives. On PPC, hardlockup >> > detector is broken inside KVM guests because disable_hardlockup_detector() >> >> Isn't it the opposite? Inside KVM guests, the hardlockup detector should >> be *disabled*, but it's not it's *enabled*, due to this bug. >> >> ie. it's not broken, it's working, but that's the bug. > > Yes right, will change the description in v2. Thanks. >> > is marked as early_initcall and it uses is_kvm_guest(), which is >> > initialized by check_kvm_guest() later during boot as it is a >> > core_initcall. check_kvm_guest() is also called in pSeries_smp_probe(), >> > which is called before initcalls, but it is skipped if KVM guest does >> > not have doorbell support or if the guest is launched with SMT=1. >> >> I'm wondering how no one has noticed. Most KVM guests have SMT=1. > > Looking at the commit history, code around hardlockups and > pSeries_smp_probe() was changed around 2021/2022 timeframe, and I > believe KVM wasn't being actively tested at the time. > Even I noticed this only after coming across the documentation that said > hardlockups should be disabled. So probably this feature decision isn't > widely known. I do test KVM but probably not under enough load to notice something like that. >> > Move the check_kvm_guest() call in pSeries_smp_probe() to the initial >> > part of function before doorbell/SMT checks so that "kvm_guest" static >> > key is initialized by the time disable_hardlockup_detector() runs. >> >> check_kvm_guest() is safe to be called multiple times so >> disable_hardlockup_detector() should just call it before it calls >> is_kvm_guest(). That should avoid future breakage when the order of >> calls changes, or someone refactors pSeries_smp_probe(). > > Yeah I did that initially but in the worst case, that results in 3 calls > to check_kvm_guest() - the core_initcall, pseries_smp_probe() call and > then disable_hardlockup_detector(). Will that be fine? Yeah it's fine. It's not pretty, maybe we can come up with something cleaner in future, but it's fine for a bug fix. cheers
diff --git a/arch/powerpc/platforms/pseries/smp.c b/arch/powerpc/platforms/pseries/smp.c index c597711ef20a..516c7bfec933 100644 --- a/arch/powerpc/platforms/pseries/smp.c +++ b/arch/powerpc/platforms/pseries/smp.c @@ -199,6 +199,13 @@ static __init void pSeries_smp_probe(void) else xics_smp_probe(); + /* + * Make sure this is called regardless of doorbell/SMT status, as + * we disable hardlockup detector in an early_initcall where we need to + * know KVM status for disabling hardlockup detector in KVM guests. + */ + check_kvm_guest(); + /* No doorbell facility, must use the interrupt controller for IPIs */ if (!cpu_has_feature(CPU_FTR_DBELL)) return; @@ -207,8 +214,6 @@ static __init void pSeries_smp_probe(void) if (!cpu_has_feature(CPU_FTR_SMT)) return; - check_kvm_guest(); - if (is_kvm_guest()) { /* * KVM emulates doorbells by disabling FSCR[MSGP] so msgsndp
As per the kernel documentation[1], hardlockup detector should be disabled in KVM guests as it may give false positives. On PPC, hardlockup detector is broken inside KVM guests because disable_hardlockup_detector() is marked as early_initcall and it uses is_kvm_guest(), which is initialized by check_kvm_guest() later during boot as it is a core_initcall. check_kvm_guest() is also called in pSeries_smp_probe(), which is called before initcalls, but it is skipped if KVM guest does not have doorbell support or if the guest is launched with SMT=1. Move the check_kvm_guest() call in pSeries_smp_probe() to the initial part of function before doorbell/SMT checks so that "kvm_guest" static key is initialized by the time disable_hardlockup_detector() runs. [1]: Documentation/admin-guide/sysctl/kernel.rst Signed-off-by: Gautam Menghani <gautam@linux.ibm.com> --- arch/powerpc/platforms/pseries/smp.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)