Message ID | 20230705145143.40545-1-ldufour@linux.ibm.com (mailing list archive) |
---|---|
Series | Introduce SMT level and add PowerPC support |
Hi, Laurent,

I ran into a boot hang regression with latest upstream code, and it
took me a while to bisect the offending commit and work around it.

Now I have tested this patch series on an Intel RaptorLake Hybrid
platform (4 Pcores with HT and 4 Ecores without HT), and it works as
expected.

So, for patch 1~7 in this series,

Tested-by: Zhang Rui <rui.zhang@intel.com>

thanks,
rui

On Wed, 2023-07-05 at 16:51 +0200, Laurent Dufour wrote:
> I'm taking over the series Michael sent previously [1] which is smartly
> reviewing the initial series I sent [2]. This series is addressing the
> comments sent by Thomas and me on the Michael's one.
>
> Here is a short introduction to the issue this series is addressing:
>
> When a new CPU is added, the kernel is activating all its threads. This
> leads to weird, but functional, result when adding CPU on a SMT 4 system
> for instance.
>
> Here the newly added CPU 1 has 8 threads while the other one has 4 threads
> active (system has been booted with the 'smt-enabled=4' kernel option):
>
> ltcden3-lp12:~ # ppc64_cpu --info
> Core   0:    0*    1*    2*    3*    4     5     6     7
> Core   1:    8*    9*   10*   11*   12*   13*   14*   15*
>
> This mixed SMT level may confuse end users and/or some applications.
>
> There is no SMT level recorded in the kernel (common code), neither in user
> space, as far as I know. Such a level is helpful when adding new CPU or
> when optimizing the energy efficiency (when reactivating CPUs).
>
> When SMP and HOTPLUG_SMT are defined, this series is adding a new SMT level
> (cpu_smt_num_threads) and few callbacks allowing the architecture code to
> fine control this value, setting a max and a "at boot" level, and
> controlling whether a thread should be onlined or not.
>
> v4:
>   Rebase on top of 6.5's updates
>   Remove a dependency against the X86's symbol cpu_primary_thread_mask
> v3:
>   Fix a build error in the patch 6/9
> v2:
>   As Thomas suggested,
>     Reword some commit's description
>     Remove topology_smt_supported()
>     Remove topology_smt_threads_supported()
>     Introduce CONFIG_SMT_NUM_THREADS_DYNAMIC
>     Remove switch() in __store_smt_control()
>   Update kernel-parameters.txt
>
> [1] https://lore.kernel.org/linuxppc-dev/20230524155630.794584-1-mpe@ellerman.id.au/
> [2] https://lore.kernel.org/linuxppc-dev/20230331153905.31698-1-ldufour@linux.ibm.com/
>
>
> Laurent Dufour (2):
>   cpu/hotplug: remove dependancy against cpu_primary_thread_mask
>   cpu/SMT: Remove topology_smt_supported()
>
> Michael Ellerman (8):
>   cpu/SMT: Move SMT prototypes into cpu_smt.h
>   cpu/SMT: Move smt/control simple exit cases earlier
>   cpu/SMT: Store the current/max number of threads
>   cpu/SMT: Create topology_smt_thread_allowed()
>   cpu/SMT: Allow enabling partial SMT states via sysfs
>   powerpc/pseries: Initialise CPU hotplug callbacks earlier
>   powerpc: Add HOTPLUG_SMT support
>   powerpc/pseries: Honour current SMT state when DLPAR onlining CPUs
>
>  .../ABI/testing/sysfs-devices-system-cpu      |   1 +
>  .../admin-guide/kernel-parameters.txt         |   4 +-
>  arch/Kconfig                                  |   3 +
>  arch/powerpc/Kconfig                          |   2 +
>  arch/powerpc/include/asm/topology.h           |  15 ++
>  arch/powerpc/kernel/smp.c                     |   8 +-
>  arch/powerpc/platforms/pseries/hotplug-cpu.c  |  30 ++--
>  arch/powerpc/platforms/pseries/pseries.h      |   2 +
>  arch/powerpc/platforms/pseries/setup.c        |   2 +
>  arch/x86/include/asm/topology.h               |   4 +-
>  arch/x86/kernel/cpu/common.c                  |   2 +-
>  arch/x86/kernel/smpboot.c                     |   8 -
>  include/linux/cpu.h                           |  25 +--
>  include/linux/cpu_smt.h                       |  33 ++++
>  kernel/cpu.c                                  | 142 +++++++++++++-----
>  15 files changed, 196 insertions(+), 85 deletions(-)
>  create mode 100644 include/linux/cpu_smt.h
>
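For readers following the thread, the rule the cover letter describes (a recorded SMT level plus a per-thread "may this come online?" check) can be modelled in a few lines of userspace C. This is only a sketch under stated assumptions, not code from the series: `smt_num_threads` stands in for the series' `cpu_smt_num_threads`, while `THREADS_PER_CORE`, `set_smt_level()`, `thread_in_core()`, `smt_thread_allowed()` and `count_allowed_in_core()` are hypothetical names, and logical CPUs are assumed to be numbered contiguously within a core as in the `ppc64_cpu --info` output above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: 8 hardware threads per core, as in the example above. */
#define THREADS_PER_CORE 8u

/* Userspace stand-in for the series' cpu_smt_num_threads (the SMT level). */
static unsigned int smt_num_threads = 4;

/* Hypothetical setter; the level must stay within 1..THREADS_PER_CORE. */
static void set_smt_level(unsigned int level)
{
	assert(level >= 1 && level <= THREADS_PER_CORE);
	smt_num_threads = level;
}

/* Position of a logical CPU inside its core, assuming contiguous numbering. */
static unsigned int thread_in_core(unsigned int cpu)
{
	return cpu % THREADS_PER_CORE;
}

/* Assumed policy: only the first smt_num_threads threads of a core may come online. */
static bool smt_thread_allowed(unsigned int cpu)
{
	return thread_in_core(cpu) < smt_num_threads;
}

/* Count how many threads of a given core the policy would bring online. */
static unsigned int count_allowed_in_core(unsigned int core)
{
	unsigned int t, n = 0;

	for (t = 0; t < THREADS_PER_CORE; t++)
		if (smt_thread_allowed(core * THREADS_PER_CORE + t))
			n++;
	return n;
}
```

With the level left at 4, a newly added core 1 would get CPUs 8 to 11 onlined and 12 to 15 left offline under this model, which is the uniform layout the series aims for instead of the mixed one shown in the cover letter.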
On 09/07/2023 at 17:25, Zhang, Rui wrote:
> Hi, Laurent,
>
> I ran into a boot hang regression with latest upstream code, and it
> took me a while to bisect the offending commit and work around it.
>
> Now I have tested this patch series on an Intel RaptorLake Hybrid
> platform (4 Pcores with HT and 4 Ecores without HT), and it works as
> expected.
>
> So, for patch 1~7 in this series,
>
> Tested-by: Zhang Rui <rui.zhang@intel.com>

Thanks Rui!
Rui!

On Sun, Jul 09 2023 at 15:25, Rui Zhang wrote:
> I ran into a boot hang regression with latest upstream code, and it
> took me a while to bisect the offending commit and work around it.

Where is the bug report and the analysis? And what's the workaround?

Thanks,

        tglx
Laurent, Michael!

On Wed, Jul 05 2023 at 16:51, Laurent Dufour wrote:
> I'm taking over the series Michael sent previously [1] which is smartly
> reviewing the initial series I sent [2]. This series is addressing the
> comments sent by Thomas and me on the Michael's one.

Thanks for getting this into shape.

I've merged it into:

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git smp/core

and tagged it at patch 7 for consumption into the powerpc tree, so the
powerpc specific changes can be applied there on top:

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git smp-core-for-ppc-23-07-28

Thanks,

        tglx
Hi, Thomas,

On Fri, 2023-07-28 at 09:40 +0200, Thomas Gleixner wrote:
> Rui!
>
> On Sun, Jul 09 2023 at 15:25, Rui Zhang wrote:
> > I ran into a boot hang regression with latest upstream code, and it
> > took me a while to bisect the offending commit and work around it.
>
> Where is the bug report and the analysis? And what's the workaround?

As it is an iwlwifi regression, I didn't paste the link here.

The regression was reported at
https://lore.kernel.org/all/b533071f38804247f06da9e52a04f15cce7a3836.camel@intel.com/

And it was fixed later by the commit below in 6.5-rc2.

thanks,
rui

commit 12a89f0177092dbc2a1cb1d05a9790adbcea2309
Author:     Johannes Berg <johannes.berg@intel.com>
AuthorDate: Mon Jul 10 16:50:39 2023 +0200
Commit:     Jakub Kicinski <kuba@kernel.org>
CommitDate: Tue Jul 11 20:26:06 2023 -0700

    wifi: iwlwifi: remove 'use_tfh' config to fix crash

    This is equivalent to 'gen2', and it was always confusing to have
    two identical config entries. The split config patch actually had
    been originally developed after removing 'use_tfh" and didn't add
    the use_tfh in the new configs as they'd later been copied to the
    new files. Thus the easiest way to fix the init crash here now is
    to just remove use_tfh (which is erroneously unset in most of the
    configs now) and use 'gen2' in the code instead.

    There's possibly still an unwind error in iwl_txq_gen2_init() as
    it crashes if TXQ 0 fails to initialize, but we can deal with it
    later since the original failure is due to the use_tfh confusion.

    Tested-by: Xi Ruoyao <xry111@xry111.site>
    Reported-and-tested-by: Niklāvs Koļesņikovs <pinkflames.linux@gmail.com>
    Reported-and-tested-by: Jeff Chua <jeff.chua.linux@gmail.com>
    Reported-and-tested-by: Zhang Rui <rui.zhang@intel.com>
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=217622
    Link: https://lore.kernel.org/all/9274d9bd3d080a457649ff5addcc1726f08ef5b2.camel@xry111.site/
    Link: https://lore.kernel.org/all/CAAJw_Zug6VCS5ZqTWaFSr9sd85k%3DtyPm9DEE%2BmV%3DAKoECZM%2BsQ@mail.gmail.com/
    Fixes: 19898ce9cf8a ("wifi: iwlwifi: split 22000.c into multiple files")
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Link: https://lore.kernel.org/r/20230710145038.84186-2-johannes@sipsolutions.net
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

> Thanks,
>
>         tglx
On Fri, Jul 28 2023 at 14:23, Rui Zhang wrote:
> On Fri, 2023-07-28 at 09:40 +0200, Thomas Gleixner wrote:
>> On Sun, Jul 09 2023 at 15:25, Rui Zhang wrote:
>> > I ran into a boot hang regression with latest upstream code, and it
>> > took me a while to bisect the offending commit and work around it.
>>
>> Where is the bug report and the analysis? And what's the workaround?
>
> As it is an iwlwifi regression, I didn't paste the link here.
>
> The regression was reported at
> https://lore.kernel.org/all/b533071f38804247f06da9e52a04f15cce7a3836.camel@intel.com/
>
> And it was fixed later by the commit below in 6.5-rc2.

Ah, ok. I was worried that you ran into issues with the parallel bootup
muck.
On 28/07/2023 at 09:58, Thomas Gleixner wrote:
> Laurent, Michael!
>
> On Wed, Jul 05 2023 at 16:51, Laurent Dufour wrote:
>> I'm taking over the series Michael sent previously [1] which is smartly
>> reviewing the initial series I sent [2]. This series is addressing the
>> comments sent by Thomas and me on the Michael's one.
>
> Thanks for getting this into shape.
>
> I've merged it into:
>
>    git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git smp/core
>
> and tagged it at patch 7 for consumption into the powerpc tree, so the
> powerpc specific changes can be applied there on top:
>
>    git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git smp-core-for-ppc-23-07-28

Thanks Thomas!
Thomas Gleixner <tglx@linutronix.de> writes:
> Laurent, Michael!
>
> On Wed, Jul 05 2023 at 16:51, Laurent Dufour wrote:
>> I'm taking over the series Michael sent previously [1] which is smartly
>> reviewing the initial series I sent [2]. This series is addressing the
>> comments sent by Thomas and me on the Michael's one.
>
> Thanks for getting this into shape.
>
> I've merged it into:
>
>    git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git smp/core
>
> and tagged it at patch 7 for consumption into the powerpc tree, so the
> powerpc specific changes can be applied there on top:
>
>    git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git smp-core-for-ppc-23-07-28

Thanks. I've merged this and applied the powerpc patches on top.

I've left it sitting in my topic/cpu-smt branch for the build bots to
chew on:

  https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/log/?h=topic/cpu-smt

I'll plan to merge it into my next in the next day or two.

cheers
On 10/08/2023 at 08:23, Michael Ellerman wrote:
> Thomas Gleixner <tglx@linutronix.de> writes:
>> Laurent, Michael!
>>
>> On Wed, Jul 05 2023 at 16:51, Laurent Dufour wrote:
>>> I'm taking over the series Michael sent previously [1] which is smartly
>>> reviewing the initial series I sent [2]. This series is addressing the
>>> comments sent by Thomas and me on the Michael's one.
>>
>> Thanks for getting this into shape.
>>
>> I've merged it into:
>>
>>    git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git smp/core
>>
>> and tagged it at patch 7 for consumption into the powerpc tree, so the
>> powerpc specific changes can be applied there on top:
>>
>>    git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git smp-core-for-ppc-23-07-28
>
> Thanks. I've merged this and applied the powerpc patches on top.
>
> I've left it sitting in my topic/cpu-smt branch for the build bots to
> chew on:
>
>   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/log/?h=topic/cpu-smt
>
> I'll plan to merge it into my next in the next day or two.

Thanks Michael!