[v4,00/10] Introduce SMT level and add PowerPC support

Message ID 20230705145143.40545-1-ldufour@linux.ibm.com (mailing list archive)

Message

Laurent Dufour July 5, 2023, 2:51 p.m. UTC
I'm taking over the series Michael sent previously [1], which smartly
reworks the initial series I sent [2].  This series addresses the
comments Thomas and I sent on Michael's version.

Here is a short introduction to the issue this series is addressing:

When a new CPU is added, the kernel activates all its threads. This
leads to a weird, but functional, result when adding a CPU on an SMT 4
system, for instance.

Here the newly added CPU 1 has 8 threads active while the other one has 4
threads active (the system was booted with the 'smt-enabled=4' kernel
option):

ltcden3-lp12:~ # ppc64_cpu --info
Core   0:    0*    1*    2*    3*    4     5     6     7
Core   1:    8*    9*   10*   11*   12*   13*   14*   15*

This mixed SMT level may confuse end users and/or some applications.
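
To make the mismatch concrete, here is a minimal Python sketch (not part of
the series) that counts active threads per core from a set of online thread
IDs and flags a mixed SMT level. The helper names and the 8-threads-per-core
layout are assumptions chosen to mirror the ppc64_cpu output above.

```python
from collections import Counter

THREADS_PER_CORE = 8  # assumption: matches the 8-thread cores shown above


def active_threads_per_core(online, threads_per_core=THREADS_PER_CORE):
    """Return {core_id: number of online threads} for a set of online thread IDs."""
    counts = Counter(cpu // threads_per_core for cpu in online)
    return dict(sorted(counts.items()))


def has_mixed_smt_level(online, threads_per_core=THREADS_PER_CORE):
    """True when the cores do not all run the same number of threads."""
    levels = set(active_threads_per_core(online, threads_per_core).values())
    return len(levels) > 1


# Online threads as in the example: core 0 runs 0-3, hot-added core 1 runs 8-15.
online = set(range(0, 4)) | set(range(8, 16))
print(active_threads_per_core(online))
print(has_mixed_smt_level(online))
```

With the thread map from the example, core 0 reports 4 active threads and
core 1 reports 8, so the check flags the system as mixed.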

There is no SMT level recorded in the kernel (common code), nor in user
space, as far as I know. Such a level is helpful when adding a new CPU or
when optimizing energy efficiency (when reactivating CPUs).

When SMP and HOTPLUG_SMT are defined, this series adds a new SMT level
(cpu_smt_num_threads) and a few callbacks allowing the architecture code to
finely control this value, setting a max and an "at boot" level, and
controlling whether a thread should be onlined or not.
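
The intended behaviour can be modelled outside the kernel. The Python sketch
below is an illustration only: thread_allowed() and hot_add_core() are
hypothetical helpers, and the rule "a thread may come online when its index
within the core is below the current SMT level" is one reading of the
description above, not code taken from the patches.

```python
def thread_allowed(cpu, smt_num_threads, threads_per_core):
    """Hypothetical model: allow a thread only if its index in the core is below the SMT level."""
    return cpu % threads_per_core < smt_num_threads


def hot_add_core(core, smt_num_threads, threads_per_core=8):
    """Return the thread IDs that would be onlined when `core` is added."""
    first = core * threads_per_core
    return [cpu for cpu in range(first, first + threads_per_core)
            if thread_allowed(cpu, smt_num_threads, threads_per_core)]


# With the level recorded as 4 (as with 'smt-enabled=4'), only the first
# four threads of the newly added core 1 are onlined.
print(hot_add_core(1, smt_num_threads=4))
```

Under this model the newly added core comes up with the same number of active
threads as core 0, instead of all 8.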

v4:
  Rebase on top of 6.5's updates
  Remove a dependency on the x86 symbol cpu_primary_thread_mask
v3:
  Fix a build error in patch 6/9
v2:
  As Thomas suggested,
    Reword some commit descriptions
    Remove topology_smt_supported()
    Remove topology_smt_threads_supported()
    Introduce CONFIG_SMT_NUM_THREADS_DYNAMIC
    Remove switch() in __store_smt_control()
  Update kernel-parameters.txt

[1] https://lore.kernel.org/linuxppc-dev/20230524155630.794584-1-mpe@ellerman.id.au/
[2] https://lore.kernel.org/linuxppc-dev/20230331153905.31698-1-ldufour@linux.ibm.com/


Laurent Dufour (2):
  cpu/hotplug: remove dependancy against cpu_primary_thread_mask
  cpu/SMT: Remove topology_smt_supported()

Michael Ellerman (8):
  cpu/SMT: Move SMT prototypes into cpu_smt.h
  cpu/SMT: Move smt/control simple exit cases earlier
  cpu/SMT: Store the current/max number of threads
  cpu/SMT: Create topology_smt_thread_allowed()
  cpu/SMT: Allow enabling partial SMT states via sysfs
  powerpc/pseries: Initialise CPU hotplug callbacks earlier
  powerpc: Add HOTPLUG_SMT support
  powerpc/pseries: Honour current SMT state when DLPAR onlining CPUs

 .../ABI/testing/sysfs-devices-system-cpu      |   1 +
 .../admin-guide/kernel-parameters.txt         |   4 +-
 arch/Kconfig                                  |   3 +
 arch/powerpc/Kconfig                          |   2 +
 arch/powerpc/include/asm/topology.h           |  15 ++
 arch/powerpc/kernel/smp.c                     |   8 +-
 arch/powerpc/platforms/pseries/hotplug-cpu.c  |  30 ++--
 arch/powerpc/platforms/pseries/pseries.h      |   2 +
 arch/powerpc/platforms/pseries/setup.c        |   2 +
 arch/x86/include/asm/topology.h               |   4 +-
 arch/x86/kernel/cpu/common.c                  |   2 +-
 arch/x86/kernel/smpboot.c                     |   8 -
 include/linux/cpu.h                           |  25 +--
 include/linux/cpu_smt.h                       |  33 ++++
 kernel/cpu.c                                  | 142 +++++++++++++-----
 15 files changed, 196 insertions(+), 85 deletions(-)
 create mode 100644 include/linux/cpu_smt.h

Comments

Zhang, Rui July 9, 2023, 3:25 p.m. UTC | #1
Hi, Laurent,

I ran into a boot hang regression with the latest upstream code, and it
took me a while to bisect the offending commit and work around it.

Now I have tested this patch series on an Intel RaptorLake Hybrid
platform (4 Pcores with HT and 4 Ecores without HT), and it works as
expected.

So, for patches 1-7 in this series,

Tested-by: Zhang Rui <rui.zhang@intel.com>

thanks,
rui

Laurent Dufour July 10, 2023, 9:08 a.m. UTC | #2
Le 09/07/2023 à 17:25, Zhang, Rui a écrit :
> Hi, Laurent,
> 
> I ran into a boot hang regression with latest upstream code, and it
> took me a while to bisect the offending commit and workaround it.
> 
> Now I have tested this patch series on an Intel RaptorLake Hybrid
> platform (4 Pcores with HT and 4 Ecores without HT), and it works as
> expected.
> 
> So, for patch 1~7 in this series,
> 
> Tested-by: Zhang Rui <rui.zhang@intel.com>

Thanks Rui!

Thomas Gleixner July 28, 2023, 7:40 a.m. UTC | #3
Rui!

On Sun, Jul 09 2023 at 15:25, Rui Zhang wrote:
> I ran into a boot hang regression with latest upstream code, and it
> took me a while to bisect the offending commit and workaround it.

Where is the bug report and the analysis? And what's the workaround?

Thanks,

        tglx
Thomas Gleixner July 28, 2023, 7:58 a.m. UTC | #4
Laurent, Michael!

On Wed, Jul 05 2023 at 16:51, Laurent Dufour wrote:
> I'm taking over the series Michael sent previously [1] which is smartly
> reviewing the initial series I sent [2].  This series is addressing the
> comments sent by Thomas and me on the Michael's one.

Thanks for getting this into shape.

I've merged it into:

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git smp/core

and tagged it at patch 7 for consumption into the powerpc tree, so the
powerpc specific changes can be applied there on top:

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git smp-core-for-ppc-23-07-28

Thanks,

        tglx
Zhang, Rui July 28, 2023, 2:23 p.m. UTC | #5
Hi, Thomas,

On Fri, 2023-07-28 at 09:40 +0200, Thomas Gleixner wrote:
> Rui!
> 
> On Sun, Jul 09 2023 at 15:25, Rui Zhang wrote:
> > I ran into a boot hang regression with latest upstream code, and it
> > took me a while to bisect the offending commit and workaround it.
> 
> Where is the bug report and the analysis? And what's the workaround?

As it is an iwlwifi regression, I didn't paste the link here.

The regression was reported at
https://lore.kernel.org/all/b533071f38804247f06da9e52a04f15cce7a3836.camel@intel.com/

And it was fixed later by the commit below, in 6.5-rc2.

thanks,
rui

commit 12a89f0177092dbc2a1cb1d05a9790adbcea2309
Author:     Johannes Berg <johannes.berg@intel.com>
AuthorDate: Mon Jul 10 16:50:39 2023 +0200
Commit:     Jakub Kicinski <kuba@kernel.org>
CommitDate: Tue Jul 11 20:26:06 2023 -0700

    wifi: iwlwifi: remove 'use_tfh' config to fix crash
    
    This is equivalent to 'gen2', and it was always confusing to have
    two identical config entries. The split config patch actually had
    been originally developed after removing 'use_tfh" and didn't add
    the use_tfh in the new configs as they'd later been copied to the
    new files. Thus the easiest way to fix the init crash here now is
    to just remove use_tfh (which is erroneously unset in most of the
    configs now) and use 'gen2' in the code instead.
    
    There's possibly still an unwind error in iwl_txq_gen2_init() as
    it crashes if TXQ 0 fails to initialize, but we can deal with it
    later since the original failure is due to the use_tfh confusion.
    
    Tested-by: Xi Ruoyao <xry111@xry111.site>
    Reported-and-tested-by: Niklāvs Koļesņikovs <pinkflames.linux@gmail.com>
    Reported-and-tested-by: Jeff Chua <jeff.chua.linux@gmail.com>
    Reported-and-tested-by: Zhang Rui <rui.zhang@intel.com>
    Link: https://bugzilla.kernel.org/show_bug.cgi?id=217622
    Link: https://lore.kernel.org/all/9274d9bd3d080a457649ff5addcc1726f08ef5b2.camel@xry111.site/
    Link: https://lore.kernel.org/all/CAAJw_Zug6VCS5ZqTWaFSr9sd85k%3DtyPm9DEE%2BmV%3DAKoECZM%2BsQ@mail.gmail.com/
    Fixes: 19898ce9cf8a ("wifi: iwlwifi: split 22000.c into multiple files")
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
    Link: https://lore.kernel.org/r/20230710145038.84186-2-johannes@sipsolutions.net
    Signed-off-by: Jakub Kicinski <kuba@kernel.org>

> 
> Thanks,
> 
>         tglx
Thomas Gleixner July 28, 2023, 2:51 p.m. UTC | #6
On Fri, Jul 28 2023 at 14:23, Rui Zhang wrote:
> On Fri, 2023-07-28 at 09:40 +0200, Thomas Gleixner wrote:
>> On Sun, Jul 09 2023 at 15:25, Rui Zhang wrote:
>> > I ran into a boot hang regression with latest upstream code, and it
>> > took me a while to bisect the offending commit and workaround it.
>> 
>> Where is the bug report and the analysis? And what's the workaround?
>
> As it is an iwlwifi regression, I didn't paste the link here.
>
> The regression was reported at
> https://lore.kernel.org/all/b533071f38804247f06da9e52a04f15cce7a3836.camel@intel.com/
>
> And it was fixed later by below commit in 6.5-rc2.

Ah, ok. I was worried that you ran into issues with the parallel bootup
muck.
Laurent Dufour July 31, 2023, 11:55 a.m. UTC | #7
Le 28/07/2023 à 09:58, Thomas Gleixner a écrit :
> Laurent, Michael!
> 
> On Wed, Jul 05 2023 at 16:51, Laurent Dufour wrote:
>> I'm taking over the series Michael sent previously [1] which is smartly
>> reviewing the initial series I sent [2].  This series is addressing the
>> comments sent by Thomas and me on the Michael's one.
> 
> Thanks for getting this into shape.
> 
> I've merged it into:
> 
>     git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git smp/core
> 
> and tagged it at patch 7 for consumption into the powerpc tree, so the
> powerpc specific changes can be applied there on top:
> 
>     git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git smp-core-for-ppc-23-07-28

Thanks Thomas!
Michael Ellerman Aug. 10, 2023, 6:23 a.m. UTC | #8
Thomas Gleixner <tglx@linutronix.de> writes:
> Laurent, Michael!
>
> On Wed, Jul 05 2023 at 16:51, Laurent Dufour wrote:
>> I'm taking over the series Michael sent previously [1] which is smartly
>> reviewing the initial series I sent [2].  This series is addressing the
>> comments sent by Thomas and me on the Michael's one.
>
> Thanks for getting this into shape.
>
> I've merged it into:
>
>    git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git smp/core
>
> and tagged it at patch 7 for consumption into the powerpc tree, so the
> powerpc specific changes can be applied there on top:
>
>    git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git smp-core-for-ppc-23-07-28

Thanks. I've merged this and applied the powerpc patches on top.

I've left it sitting in my topic/cpu-smt branch for the build bots to
chew on:

  https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/log/?h=topic/cpu-smt

I plan to merge it into my next branch in the next day or two.

cheers
Laurent Dufour Aug. 10, 2023, 8:51 a.m. UTC | #9
Le 10/08/2023 à 08:23, Michael Ellerman a écrit :
> Thomas Gleixner <tglx@linutronix.de> writes:
>> Laurent, Michael!
>>
>> On Wed, Jul 05 2023 at 16:51, Laurent Dufour wrote:
>>> I'm taking over the series Michael sent previously [1] which is smartly
>>> reviewing the initial series I sent [2].  This series is addressing the
>>> comments sent by Thomas and me on the Michael's one.
>>
>> Thanks for getting this into shape.
>>
>> I've merged it into:
>>
>>     git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git smp/core
>>
>> and tagged it at patch 7 for consumption into the powerpc tree, so the
>> powerpc specific changes can be applied there on top:
>>
>>     git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git smp-core-for-ppc-23-07-28
> 
> Thanks. I've merged this and applied the powerpc patches on top.
> 
> I've left it sitting in my topic/cpu-smt branch for the build bots to
> chew on:
> 
>    https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/log/?h=topic/cpu-smt
> 
> I'll plan to merge it into my next in the next day or two.

Thanks Michael!