Message ID | 20190529211340.17087-1-atish.patra@wdc.com |
---|---|
Headers | show |
Series | Unify CPU topology across ARM & RISC-V | expand |
Hi, On 5/29/19 4:13 PM, Atish Patra wrote: > The cpu-map DT entry in ARM can describe the CPU topology in much better > way compared to other existing approaches. RISC-V can easily adopt this > binding to represent its own CPU topology. Thus, both cpu-map DT > binding and topology parsing code can be moved to a common location so > that RISC-V or any other architecture can leverage that. > > The relevant discussion regarding unifying cpu topology can be found in > [1]. > > arch_topology seems to be a perfect place to move the common code. I > have not introduced any significant functional changes in the moved code. > The only downside in this approach is that the capacity code will be > executed for RISC-V as well. But, it will exit immediately after not > able to find the appropriate DT node. If the overhead is considered too > much, we can always compile out capacity related functions under a > different config for the architectures that do not support them. > > There was an opportunity to unify topology data structure for ARM32 done > by patch 3/4. But, I refrained from making any other changes as I am not > very well versed with original intention for some functions that > are present in arch_topology.c. I hope this patch series can be served > as a baseline for such changes in the future. > > The patches have been tested for RISC-V and compile tested for ARM64, > ARM32 & x86. > I applied these to 5.2rc2, along with my PPTT/MT change and verified the system & scheduler topology/etc on DAWN and ThunderX2 using ACPI on arm64. They appear to be working correctly. so for the series, Tested-by: Jeremy Linton <jeremy.linton@arm.com> The code itself looks fine to me as well: Reviewed-by: Jeremy Linton <jeremy.linton@arm.com> Thanks! > The socket change[2] is also now part of this series. > > [1] https://lkml.org/lkml/2018/11/6/19 > [2] https://lkml.org/lkml/2018/11/7/918 > > QEMU changes for RISC-V topology are available at > > https://github.com/atishp04/qemu/tree/riscv_topology_dt > > HiFive Unleashed DT with topology node is available here. > https://github.com/atishp04/opensbi/tree/HiFive_unleashed_topology > > It can be verified with OpenSBI with following additional compile time > option. > > FW_PAYLOAD_FDT="unleashed_topology.dtb" > > Changes from v5->v6 > 1. Added two more patches from Sudeep about maintainership of arch_topology.c > and Kconfig update. > 2. Added Tested-by & Reviewed-by > 3. Fixed a nit (reordering of variables) > > Changes from v4-v5 > 1. Removed the arch_topology.h header inclusion from topology.c and arch_topology.c > file. Added it in linux/topology.h. > 2. core_id is set to -1 upon reset. Otherwise, ARM topology store function does not > work. > > Changes from v3->v4 > 1. Get rid of ARM32 specific information in topology structure. > 2. Remove redundant functions from ARM32 and use common code instead. > > Changes from v2->v3 > 1. Cover letter update with experiment DT for topology changes. > 2. Added the patch for [2]. > > Changes from v1->v2 > 1. ARM32 can now use the common code as well. > > Atish Patra (4): > dt-binding: cpu-topology: Move cpu-map to a common binding. > cpu-topology: Move cpu topology code to common code. > arm: Use common cpu_topology structure and functions. > RISC-V: Parse cpu topology during boot. > > Sudeep Holla (3): > Documentation: DT: arm: add support for sockets defining package > boundaries > base: arch_topology: update Kconfig help description > MAINTAINERS: Add an entry for generic architecture topology > > .../topology.txt => cpu/cpu-topology.txt} | 134 ++++++-- > MAINTAINERS | 7 + > arch/arm/include/asm/topology.h | 20 -- > arch/arm/kernel/topology.c | 60 +--- > arch/arm64/include/asm/topology.h | 23 -- > arch/arm64/kernel/topology.c | 303 +----------------- > arch/riscv/Kconfig | 1 + > arch/riscv/kernel/smpboot.c | 3 + > drivers/base/Kconfig | 2 +- > drivers/base/arch_topology.c | 298 +++++++++++++++++ > include/linux/arch_topology.h | 26 ++ > include/linux/topology.h | 1 + > 12 files changed, 452 insertions(+), 426 deletions(-) > rename Documentation/devicetree/bindings/{arm/topology.txt => cpu/cpu-topology.txt} (66%) > > -- > 2.21.0 >
On 5/30/19 2:12 PM, Jeremy Linton wrote: > Hi, > > On 5/29/19 4:13 PM, Atish Patra wrote: >> The cpu-map DT entry in ARM can describe the CPU topology in much better >> way compared to other existing approaches. RISC-V can easily adopt this >> binding to represent its own CPU topology. Thus, both cpu-map DT >> binding and topology parsing code can be moved to a common location so >> that RISC-V or any other architecture can leverage that. >> >> The relevant discussion regarding unifying cpu topology can be found in >> [1]. >> >> arch_topology seems to be a perfect place to move the common code. I >> have not introduced any significant functional changes in the moved code. >> The only downside in this approach is that the capacity code will be >> executed for RISC-V as well. But, it will exit immediately after not >> able to find the appropriate DT node. If the overhead is considered too >> much, we can always compile out capacity related functions under a >> different config for the architectures that do not support them. >> >> There was an opportunity to unify topology data structure for ARM32 done >> by patch 3/4. But, I refrained from making any other changes as I am not >> very well versed with original intention for some functions that >> are present in arch_topology.c. I hope this patch series can be served >> as a baseline for such changes in the future. >> >> The patches have been tested for RISC-V and compile tested for ARM64, >> ARM32 & x86. >> > > I applied these to 5.2rc2, along with my PPTT/MT change and verified the > system & scheduler topology/etc on DAWN and ThunderX2 using ACPI on > arm64. They appear to be working correctly. > > so for the series, > Tested-by: Jeremy Linton <jeremy.linton@arm.com> > > The code itself looks fine to me as well: > > Reviewed-by: Jeremy Linton <jeremy.linton@arm.com> > Thank you the review and testing on arm64 server. > Thanks! > >> The socket change[2] is also now part of this series. >> >> [1] https://lkml.org/lkml/2018/11/6/19 >> [2] https://lkml.org/lkml/2018/11/7/918 >> >> QEMU changes for RISC-V topology are available at >> >> https://github.com/atishp04/qemu/tree/riscv_topology_dt >> >> HiFive Unleashed DT with topology node is available here. >> https://github.com/atishp04/opensbi/tree/HiFive_unleashed_topology >> >> It can be verified with OpenSBI with following additional compile time >> option. >> >> FW_PAYLOAD_FDT="unleashed_topology.dtb" >> >> Changes from v5->v6 >> 1. Added two more patches from Sudeep about maintainership of arch_topology.c >> and Kconfig update. >> 2. Added Tested-by & Reviewed-by >> 3. Fixed a nit (reordering of variables) >> >> Changes from v4-v5 >> 1. Removed the arch_topology.h header inclusion from topology.c and arch_topology.c >> file. Added it in linux/topology.h. >> 2. core_id is set to -1 upon reset. Otherwise, ARM topology store function does not >> work. >> >> Changes from v3->v4 >> 1. Get rid of ARM32 specific information in topology structure. >> 2. Remove redundant functions from ARM32 and use common code instead. >> >> Changes from v2->v3 >> 1. Cover letter update with experiment DT for topology changes. >> 2. Added the patch for [2]. >> >> Changes from v1->v2 >> 1. ARM32 can now use the common code as well. >> >> Atish Patra (4): >> dt-binding: cpu-topology: Move cpu-map to a common binding. >> cpu-topology: Move cpu topology code to common code. >> arm: Use common cpu_topology structure and functions. >> RISC-V: Parse cpu topology during boot. >> >> Sudeep Holla (3): >> Documentation: DT: arm: add support for sockets defining package >> boundaries >> base: arch_topology: update Kconfig help description >> MAINTAINERS: Add an entry for generic architecture topology >> >> .../topology.txt => cpu/cpu-topology.txt} | 134 ++++++-- >> MAINTAINERS | 7 + >> arch/arm/include/asm/topology.h | 20 -- >> arch/arm/kernel/topology.c | 60 +--- >> arch/arm64/include/asm/topology.h | 23 -- >> arch/arm64/kernel/topology.c | 303 +----------------- >> arch/riscv/Kconfig | 1 + >> arch/riscv/kernel/smpboot.c | 3 + >> drivers/base/Kconfig | 2 +- >> drivers/base/arch_topology.c | 298 +++++++++++++++++ >> include/linux/arch_topology.h | 26 ++ >> include/linux/topology.h | 1 + >> 12 files changed, 452 insertions(+), 426 deletions(-) >> rename Documentation/devicetree/bindings/{arm/topology.txt => cpu/cpu-topology.txt} (66%) >> >> -- >> 2.21.0 >> > >
On 5/29/19 2:15 PM, Atish Patra wrote: > Currently, ARM32 and ARM64 uses different data structures to represent > their cpu topologies. Since, we are moving the ARM64 topology to common > code to be used by other architectures, we can reuse that for ARM32 as > well. > > Take this opprtunity to remove the redundant functions from ARM32 and > reuse the common code instead. > > To: Russell King <linux@armlinux.org.uk> > Signed-off-by: Atish Patra <atish.patra@wdc.com> > Tested-by: Sudeep Holla <sudeep.holla@arm.com> (on TC2) > Reviewed-by : Sudeep Holla <sudeep.holla@arm.com> > > --- > Hi Russell, > Can we get a ACK for this patch ? We are hoping that the entire > series can be merged at one go. > --- > arch/arm/include/asm/topology.h | 20 ----------- > arch/arm/kernel/topology.c | 60 ++++----------------------------- > drivers/base/arch_topology.c | 4 ++- > include/linux/arch_topology.h | 6 ++-- > 4 files changed, 11 insertions(+), 79 deletions(-) > > diff --git a/arch/arm/include/asm/topology.h b/arch/arm/include/asm/topology.h > index 2a786f54d8b8..8a0fae94d45e 100644 > --- a/arch/arm/include/asm/topology.h > +++ b/arch/arm/include/asm/topology.h > @@ -5,26 +5,6 @@ > #ifdef CONFIG_ARM_CPU_TOPOLOGY > > #include <linux/cpumask.h> > - > -struct cputopo_arm { > - int thread_id; > - int core_id; > - int socket_id; > - cpumask_t thread_sibling; > - cpumask_t core_sibling; > -}; > - > -extern struct cputopo_arm cpu_topology[NR_CPUS]; > - > -#define topology_physical_package_id(cpu) (cpu_topology[cpu].socket_id) > -#define topology_core_id(cpu) (cpu_topology[cpu].core_id) > -#define topology_core_cpumask(cpu) (&cpu_topology[cpu].core_sibling) > -#define topology_sibling_cpumask(cpu) (&cpu_topology[cpu].thread_sibling) > - > -void init_cpu_topology(void); > -void store_cpu_topology(unsigned int cpuid); > -const struct cpumask *cpu_coregroup_mask(int cpu); > - > #include <linux/arch_topology.h> > > /* Replace task scheduler's default frequency-invariant accounting */ > diff --git a/arch/arm/kernel/topology.c b/arch/arm/kernel/topology.c > index 60e375ce1ab2..238f1da0219c 100644 > --- a/arch/arm/kernel/topology.c > +++ b/arch/arm/kernel/topology.c > @@ -177,17 +177,6 @@ static inline void parse_dt_topology(void) {} > static inline void update_cpu_capacity(unsigned int cpuid) {} > #endif > > - /* > - * cpu topology table > - */ > -struct cputopo_arm cpu_topology[NR_CPUS]; > -EXPORT_SYMBOL_GPL(cpu_topology); > - > -const struct cpumask *cpu_coregroup_mask(int cpu) > -{ > - return &cpu_topology[cpu].core_sibling; > -} > - > /* > * The current assumption is that we can power gate each core independently. > * This will be superseded by DT binding once available. > @@ -197,32 +186,6 @@ const struct cpumask *cpu_corepower_mask(int cpu) > return &cpu_topology[cpu].thread_sibling; > } > > -static void update_siblings_masks(unsigned int cpuid) > -{ > - struct cputopo_arm *cpu_topo, *cpuid_topo = &cpu_topology[cpuid]; > - int cpu; > - > - /* update core and thread sibling masks */ > - for_each_possible_cpu(cpu) { > - cpu_topo = &cpu_topology[cpu]; > - > - if (cpuid_topo->socket_id != cpu_topo->socket_id) > - continue; > - > - cpumask_set_cpu(cpuid, &cpu_topo->core_sibling); > - if (cpu != cpuid) > - cpumask_set_cpu(cpu, &cpuid_topo->core_sibling); > - > - if (cpuid_topo->core_id != cpu_topo->core_id) > - continue; > - > - cpumask_set_cpu(cpuid, &cpu_topo->thread_sibling); > - if (cpu != cpuid) > - cpumask_set_cpu(cpu, &cpuid_topo->thread_sibling); > - } > - smp_wmb(); > -} > - > /* > * store_cpu_topology is called at boot when only one cpu is running > * and with the mutex cpu_hotplug.lock locked, when several cpus have booted, > @@ -230,7 +193,7 @@ static void update_siblings_masks(unsigned int cpuid) > */ > void store_cpu_topology(unsigned int cpuid) > { > - struct cputopo_arm *cpuid_topo = &cpu_topology[cpuid]; > + struct cpu_topology *cpuid_topo = &cpu_topology[cpuid]; > unsigned int mpidr; > > /* If the cpu topology has been already set, just return */ > @@ -250,12 +213,12 @@ void store_cpu_topology(unsigned int cpuid) > /* core performance interdependency */ > cpuid_topo->thread_id = MPIDR_AFFINITY_LEVEL(mpidr, 0); > cpuid_topo->core_id = MPIDR_AFFINITY_LEVEL(mpidr, 1); > - cpuid_topo->socket_id = MPIDR_AFFINITY_LEVEL(mpidr, 2); > + cpuid_topo->package_id = MPIDR_AFFINITY_LEVEL(mpidr, 2); > } else { > /* largely independent cores */ > cpuid_topo->thread_id = -1; > cpuid_topo->core_id = MPIDR_AFFINITY_LEVEL(mpidr, 0); > - cpuid_topo->socket_id = MPIDR_AFFINITY_LEVEL(mpidr, 1); > + cpuid_topo->package_id = MPIDR_AFFINITY_LEVEL(mpidr, 1); > } > } else { > /* > @@ -265,7 +228,7 @@ void store_cpu_topology(unsigned int cpuid) > */ > cpuid_topo->thread_id = -1; > cpuid_topo->core_id = 0; > - cpuid_topo->socket_id = -1; > + cpuid_topo->package_id = -1; > } > > update_siblings_masks(cpuid); > @@ -275,7 +238,7 @@ void store_cpu_topology(unsigned int cpuid) > pr_info("CPU%u: thread %d, cpu %d, socket %d, mpidr %x\n", > cpuid, cpu_topology[cpuid].thread_id, > cpu_topology[cpuid].core_id, > - cpu_topology[cpuid].socket_id, mpidr); > + cpu_topology[cpuid].package_id, mpidr); > } > > static inline int cpu_corepower_flags(void) > @@ -298,18 +261,7 @@ static struct sched_domain_topology_level arm_topology[] = { > */ > void __init init_cpu_topology(void) > { > - unsigned int cpu; > - > - /* init core mask and capacity */ > - for_each_possible_cpu(cpu) { > - struct cputopo_arm *cpu_topo = &(cpu_topology[cpu]); > - > - cpu_topo->thread_id = -1; > - cpu_topo->core_id = -1; > - cpu_topo->socket_id = -1; > - cpumask_clear(&cpu_topo->core_sibling); > - cpumask_clear(&cpu_topo->thread_sibling); > - } > + reset_cpu_topology(); > smp_wmb(); > > parse_dt_topology(); > diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c > index 5781bb4c457c..797e3cd71bea 100644 > --- a/drivers/base/arch_topology.c > +++ b/drivers/base/arch_topology.c > @@ -426,6 +426,7 @@ static int __init parse_dt_topology(void) > of_node_put(cn); > return ret; > } > +#endif > > /* > * cpu topology table > @@ -491,7 +492,7 @@ static void clear_cpu_topology(int cpu) > cpumask_set_cpu(cpu, &cpu_topo->thread_sibling); > } > > -static void __init reset_cpu_topology(void) > +void __init reset_cpu_topology(void) > { > unsigned int cpu; > > @@ -526,6 +527,7 @@ __weak int __init parse_acpi_topology(void) > return 0; > } > > +#if defined(CONFIG_ARM64) || defined(CONFIG_RISCV) > void __init init_cpu_topology(void) > { > reset_cpu_topology(); > diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h > index d4e76e0a283f..d4311127970d 100644 > --- a/include/linux/arch_topology.h > +++ b/include/linux/arch_topology.h > @@ -54,11 +54,9 @@ extern struct cpu_topology cpu_topology[NR_CPUS]; > void init_cpu_topology(void); > void store_cpu_topology(unsigned int cpuid); > const struct cpumask *cpu_coregroup_mask(int cpu); > -#endif > - > -#if defined(CONFIG_ARM64) || defined(CONFIG_RISCV) > void update_siblings_masks(unsigned int cpu); > -#endif > void remove_cpu_topology(unsigned int cpuid); > +void reset_cpu_topology(void); > +#endif > > #endif /* _LINUX_ARCH_TOPOLOGY_H_ */ > Hi Russell, Can we get an ACK for ARM if you don't have any objection to the series ?
On 5/29/19 2:15 PM, Atish Patra wrote: > Both RISC-V & ARM64 are using cpu-map device tree to describe > their cpu topology. It's better to move the relevant code to > a common place instead of duplicate code. > > To: Will Deacon <will.deacon@arm.com> > To: Catalin Marinas <catalin.marinas@arm.com> > Signed-off-by: Atish Patra <atish.patra@wdc.com> > [Tested on QDF2400] > Tested-by: Jeffrey Hugo <jhugo@codeaurora.org> > [Tested on Juno and other embedded platforms.] > Tested-by: Sudeep Holla <sudeep.holla@arm.com> > Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> > > --- > Hi Will/Catalin, > Can we get ack for this patch ? We are hoping to get the entire series > merged at one go. > --- > arch/arm64/include/asm/topology.h | 23 --- > arch/arm64/kernel/topology.c | 303 +----------------------------- > drivers/base/arch_topology.c | 296 +++++++++++++++++++++++++++++ > include/linux/arch_topology.h | 28 +++ > include/linux/topology.h | 1 + > 5 files changed, 329 insertions(+), 322 deletions(-) > > diff --git a/arch/arm64/include/asm/topology.h b/arch/arm64/include/asm/topology.h > index 0524f2438649..a4d945db95a2 100644 > --- a/arch/arm64/include/asm/topology.h > +++ b/arch/arm64/include/asm/topology.h > @@ -4,29 +4,6 @@ > > #include <linux/cpumask.h> > > -struct cpu_topology { > - int thread_id; > - int core_id; > - int package_id; > - int llc_id; > - cpumask_t thread_sibling; > - cpumask_t core_sibling; > - cpumask_t llc_sibling; > -}; > - > -extern struct cpu_topology cpu_topology[NR_CPUS]; > - > -#define topology_physical_package_id(cpu) (cpu_topology[cpu].package_id) > -#define topology_core_id(cpu) (cpu_topology[cpu].core_id) > -#define topology_core_cpumask(cpu) (&cpu_topology[cpu].core_sibling) > -#define topology_sibling_cpumask(cpu) (&cpu_topology[cpu].thread_sibling) > -#define topology_llc_cpumask(cpu) (&cpu_topology[cpu].llc_sibling) > - > -void init_cpu_topology(void); > -void store_cpu_topology(unsigned int cpuid); > -void remove_cpu_topology(unsigned int cpuid); > -const struct cpumask *cpu_coregroup_mask(int cpu); > - > #ifdef CONFIG_NUMA > > struct pci_bus; > diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c > index 0825c4a856e3..6b95c91e7d67 100644 > --- a/arch/arm64/kernel/topology.c > +++ b/arch/arm64/kernel/topology.c > @@ -14,250 +14,13 @@ > #include <linux/acpi.h> > #include <linux/arch_topology.h> > #include <linux/cacheinfo.h> > -#include <linux/cpu.h> > -#include <linux/cpumask.h> > #include <linux/init.h> > #include <linux/percpu.h> > -#include <linux/node.h> > -#include <linux/nodemask.h> > -#include <linux/of.h> > -#include <linux/sched.h> > -#include <linux/sched/topology.h> > -#include <linux/slab.h> > -#include <linux/smp.h> > -#include <linux/string.h> > > #include <asm/cpu.h> > #include <asm/cputype.h> > #include <asm/topology.h> > > -static int __init get_cpu_for_node(struct device_node *node) > -{ > - struct device_node *cpu_node; > - int cpu; > - > - cpu_node = of_parse_phandle(node, "cpu", 0); > - if (!cpu_node) > - return -1; > - > - cpu = of_cpu_node_to_id(cpu_node); > - if (cpu >= 0) > - topology_parse_cpu_capacity(cpu_node, cpu); > - else > - pr_crit("Unable to find CPU node for %pOF\n", cpu_node); > - > - of_node_put(cpu_node); > - return cpu; > -} > - > -static int __init parse_core(struct device_node *core, int package_id, > - int core_id) > -{ > - char name[10]; > - bool leaf = true; > - int i = 0; > - int cpu; > - struct device_node *t; > - > - do { > - snprintf(name, sizeof(name), "thread%d", i); > - t = of_get_child_by_name(core, name); > - if (t) { > - leaf = false; > - cpu = get_cpu_for_node(t); > - if (cpu >= 0) { > - cpu_topology[cpu].package_id = package_id; > - cpu_topology[cpu].core_id = core_id; > - cpu_topology[cpu].thread_id = i; > - } else { > - pr_err("%pOF: Can't get CPU for thread\n", > - t); > - of_node_put(t); > - return -EINVAL; > - } > - of_node_put(t); > - } > - i++; > - } while (t); > - > - cpu = get_cpu_for_node(core); > - if (cpu >= 0) { > - if (!leaf) { > - pr_err("%pOF: Core has both threads and CPU\n", > - core); > - return -EINVAL; > - } > - > - cpu_topology[cpu].package_id = package_id; > - cpu_topology[cpu].core_id = core_id; > - } else if (leaf) { > - pr_err("%pOF: Can't get CPU for leaf core\n", core); > - return -EINVAL; > - } > - > - return 0; > -} > - > -static int __init parse_cluster(struct device_node *cluster, int depth) > -{ > - char name[10]; > - bool leaf = true; > - bool has_cores = false; > - struct device_node *c; > - static int package_id __initdata; > - int core_id = 0; > - int i, ret; > - > - /* > - * First check for child clusters; we currently ignore any > - * information about the nesting of clusters and present the > - * scheduler with a flat list of them. > - */ > - i = 0; > - do { > - snprintf(name, sizeof(name), "cluster%d", i); > - c = of_get_child_by_name(cluster, name); > - if (c) { > - leaf = false; > - ret = parse_cluster(c, depth + 1); > - of_node_put(c); > - if (ret != 0) > - return ret; > - } > - i++; > - } while (c); > - > - /* Now check for cores */ > - i = 0; > - do { > - snprintf(name, sizeof(name), "core%d", i); > - c = of_get_child_by_name(cluster, name); > - if (c) { > - has_cores = true; > - > - if (depth == 0) { > - pr_err("%pOF: cpu-map children should be clusters\n", > - c); > - of_node_put(c); > - return -EINVAL; > - } > - > - if (leaf) { > - ret = parse_core(c, package_id, core_id++); > - } else { > - pr_err("%pOF: Non-leaf cluster with core %s\n", > - cluster, name); > - ret = -EINVAL; > - } > - > - of_node_put(c); > - if (ret != 0) > - return ret; > - } > - i++; > - } while (c); > - > - if (leaf && !has_cores) > - pr_warn("%pOF: empty cluster\n", cluster); > - > - if (leaf) > - package_id++; > - > - return 0; > -} > - > -static int __init parse_dt_topology(void) > -{ > - struct device_node *cn, *map; > - int ret = 0; > - int cpu; > - > - cn = of_find_node_by_path("/cpus"); > - if (!cn) { > - pr_err("No CPU information found in DT\n"); > - return 0; > - } > - > - /* > - * When topology is provided cpu-map is essentially a root > - * cluster with restricted subnodes. > - */ > - map = of_get_child_by_name(cn, "cpu-map"); > - if (!map) > - goto out; > - > - ret = parse_cluster(map, 0); > - if (ret != 0) > - goto out_map; > - > - topology_normalize_cpu_scale(); > - > - /* > - * Check that all cores are in the topology; the SMP code will > - * only mark cores described in the DT as possible. > - */ > - for_each_possible_cpu(cpu) > - if (cpu_topology[cpu].package_id == -1) > - ret = -EINVAL; > - > -out_map: > - of_node_put(map); > -out: > - of_node_put(cn); > - return ret; > -} > - > -/* > - * cpu topology table > - */ > -struct cpu_topology cpu_topology[NR_CPUS]; > -EXPORT_SYMBOL_GPL(cpu_topology); > - > -const struct cpumask *cpu_coregroup_mask(int cpu) > -{ > - const cpumask_t *core_mask = cpumask_of_node(cpu_to_node(cpu)); > - > - /* Find the smaller of NUMA, core or LLC siblings */ > - if (cpumask_subset(&cpu_topology[cpu].core_sibling, core_mask)) { > - /* not numa in package, lets use the package siblings */ > - core_mask = &cpu_topology[cpu].core_sibling; > - } > - if (cpu_topology[cpu].llc_id != -1) { > - if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask)) > - core_mask = &cpu_topology[cpu].llc_sibling; > - } > - > - return core_mask; > -} > - > -static void update_siblings_masks(unsigned int cpuid) > -{ > - struct cpu_topology *cpu_topo, *cpuid_topo = &cpu_topology[cpuid]; > - int cpu; > - > - /* update core and thread sibling masks */ > - for_each_online_cpu(cpu) { > - cpu_topo = &cpu_topology[cpu]; > - > - if (cpuid_topo->llc_id == cpu_topo->llc_id) { > - cpumask_set_cpu(cpu, &cpuid_topo->llc_sibling); > - cpumask_set_cpu(cpuid, &cpu_topo->llc_sibling); > - } > - > - if (cpuid_topo->package_id != cpu_topo->package_id) > - continue; > - > - cpumask_set_cpu(cpuid, &cpu_topo->core_sibling); > - cpumask_set_cpu(cpu, &cpuid_topo->core_sibling); > - > - if (cpuid_topo->core_id != cpu_topo->core_id) > - continue; > - > - cpumask_set_cpu(cpuid, &cpu_topo->thread_sibling); > - cpumask_set_cpu(cpu, &cpuid_topo->thread_sibling); > - } > -} > - > void store_cpu_topology(unsigned int cpuid) > { > struct cpu_topology *cpuid_topo = &cpu_topology[cpuid]; > @@ -296,59 +59,19 @@ void store_cpu_topology(unsigned int cpuid) > update_siblings_masks(cpuid); > } > > -static void clear_cpu_topology(int cpu) > -{ > - struct cpu_topology *cpu_topo = &cpu_topology[cpu]; > - > - cpumask_clear(&cpu_topo->llc_sibling); > - cpumask_set_cpu(cpu, &cpu_topo->llc_sibling); > - > - cpumask_clear(&cpu_topo->core_sibling); > - cpumask_set_cpu(cpu, &cpu_topo->core_sibling); > - cpumask_clear(&cpu_topo->thread_sibling); > - cpumask_set_cpu(cpu, &cpu_topo->thread_sibling); > -} > - > -static void __init reset_cpu_topology(void) > -{ > - unsigned int cpu; > - > - for_each_possible_cpu(cpu) { > - struct cpu_topology *cpu_topo = &cpu_topology[cpu]; > - > - cpu_topo->thread_id = -1; > - cpu_topo->core_id = 0; > - cpu_topo->package_id = -1; > - cpu_topo->llc_id = -1; > - > - clear_cpu_topology(cpu); > - } > -} > - > -void remove_cpu_topology(unsigned int cpu) > -{ > - int sibling; > - > - for_each_cpu(sibling, topology_core_cpumask(cpu)) > - cpumask_clear_cpu(cpu, topology_core_cpumask(sibling)); > - for_each_cpu(sibling, topology_sibling_cpumask(cpu)) > - cpumask_clear_cpu(cpu, topology_sibling_cpumask(sibling)); > - for_each_cpu(sibling, topology_llc_cpumask(cpu)) > - cpumask_clear_cpu(cpu, topology_llc_cpumask(sibling)); > - > - clear_cpu_topology(cpu); > -} > - > #ifdef CONFIG_ACPI > /* > * Propagate the topology information of the processor_topology_node tree to the > * cpu_topology array. > */ > -static int __init parse_acpi_topology(void) > +int __init parse_acpi_topology(void) > { > bool is_threaded; > int cpu, topology_id; > > + if (acpi_disabled) > + return 0; > + > is_threaded = read_cpuid_mpidr() & MPIDR_MT_BITMASK; > > for_each_possible_cpu(cpu) { > @@ -384,24 +107,6 @@ static int __init parse_acpi_topology(void) > > return 0; > } > - > -#else > -static inline int __init parse_acpi_topology(void) > -{ > - return -EINVAL; > -} > #endif > > -void __init init_cpu_topology(void) > -{ > - reset_cpu_topology(); > > - /* > - * Discard anything that was parsed if we hit an error so we > - * don't use partial information. > - */ > - if (!acpi_disabled && parse_acpi_topology()) > - reset_cpu_topology(); > - else if (of_have_populated_dt() && parse_dt_topology()) > - reset_cpu_topology(); > -} > diff --git a/drivers/base/arch_topology.c b/drivers/base/arch_topology.c > index 1739d7e1952a..5781bb4c457c 100644 > --- a/drivers/base/arch_topology.c > +++ b/drivers/base/arch_topology.c > @@ -15,6 +15,11 @@ > #include <linux/string.h> > #include <linux/sched/topology.h> > #include <linux/cpuset.h> > +#include <linux/cpumask.h> > +#include <linux/init.h> > +#include <linux/percpu.h> > +#include <linux/sched.h> > +#include <linux/smp.h> > > DEFINE_PER_CPU(unsigned long, freq_scale) = SCHED_CAPACITY_SCALE; > > @@ -244,3 +249,294 @@ static void parsing_done_workfn(struct work_struct *work) > #else > core_initcall(free_raw_capacity); > #endif > + > +#if defined(CONFIG_ARM64) || defined(CONFIG_RISCV) > +static int __init get_cpu_for_node(struct device_node *node) > +{ > + struct device_node *cpu_node; > + int cpu; > + > + cpu_node = of_parse_phandle(node, "cpu", 0); > + if (!cpu_node) > + return -1; > + > + cpu = of_cpu_node_to_id(cpu_node); > + if (cpu >= 0) > + topology_parse_cpu_capacity(cpu_node, cpu); > + else > + pr_crit("Unable to find CPU node for %pOF\n", cpu_node); > + > + of_node_put(cpu_node); > + return cpu; > +} > + > +static int __init parse_core(struct device_node *core, int package_id, > + int core_id) > +{ > + char name[10]; > + bool leaf = true; > + int i = 0; > + int cpu; > + struct device_node *t; > + > + do { > + snprintf(name, sizeof(name), "thread%d", i); > + t = of_get_child_by_name(core, name); > + if (t) { > + leaf = false; > + cpu = get_cpu_for_node(t); > + if (cpu >= 0) { > + cpu_topology[cpu].package_id = package_id; > + cpu_topology[cpu].core_id = core_id; > + cpu_topology[cpu].thread_id = i; > + } else { > + pr_err("%pOF: Can't get CPU for thread\n", > + t); > + of_node_put(t); > + return -EINVAL; > + } > + of_node_put(t); > + } > + i++; > + } while (t); > + > + cpu = get_cpu_for_node(core); > + if (cpu >= 0) { > + if (!leaf) { > + pr_err("%pOF: Core has both threads and CPU\n", > + core); > + return -EINVAL; > + } > + > + cpu_topology[cpu].package_id = package_id; > + cpu_topology[cpu].core_id = core_id; > + } else if (leaf) { > + pr_err("%pOF: Can't get CPU for leaf core\n", core); > + return -EINVAL; > + } > + > + return 0; > +} > + > +static int __init parse_cluster(struct device_node *cluster, int depth) > +{ > + char name[10]; > + bool leaf = true; > + bool has_cores = false; > + struct device_node *c; > + static int package_id __initdata; > + int core_id = 0; > + int i, ret; > + > + /* > + * First check for child clusters; we currently ignore any > + * information about the nesting of clusters and present the > + * scheduler with a flat list of them. > + */ > + i = 0; > + do { > + snprintf(name, sizeof(name), "cluster%d", i); > + c = of_get_child_by_name(cluster, name); > + if (c) { > + leaf = false; > + ret = parse_cluster(c, depth + 1); > + of_node_put(c); > + if (ret != 0) > + return ret; > + } > + i++; > + } while (c); > + > + /* Now check for cores */ > + i = 0; > + do { > + snprintf(name, sizeof(name), "core%d", i); > + c = of_get_child_by_name(cluster, name); > + if (c) { > + has_cores = true; > + > + if (depth == 0) { > + pr_err("%pOF: cpu-map children should be clusters\n", > + c); > + of_node_put(c); > + return -EINVAL; > + } > + > + if (leaf) { > + ret = parse_core(c, package_id, core_id++); > + } else { > + pr_err("%pOF: Non-leaf cluster with core %s\n", > + cluster, name); > + ret = -EINVAL; > + } > + > + of_node_put(c); > + if (ret != 0) > + return ret; > + } > + i++; > + } while (c); > + > + if (leaf && !has_cores) > + pr_warn("%pOF: empty cluster\n", cluster); > + > + if (leaf) > + package_id++; > + > + return 0; > +} > + > +static int __init parse_dt_topology(void) > +{ > + struct device_node *cn, *map; > + int ret = 0; > + int cpu; > + > + cn = of_find_node_by_path("/cpus"); > + if (!cn) { > + pr_err("No CPU information found in DT\n"); > + return 0; > + } > + > + /* > + * When topology is provided cpu-map is essentially a root > + * cluster with restricted subnodes. > + */ > + map = of_get_child_by_name(cn, "cpu-map"); > + if (!map) > + goto out; > + > + ret = parse_cluster(map, 0); > + if (ret != 0) > + goto out_map; > + > + topology_normalize_cpu_scale(); > + > + /* > + * Check that all cores are in the topology; the SMP code will > + * only mark cores described in the DT as possible. > + */ > + for_each_possible_cpu(cpu) > + if (cpu_topology[cpu].package_id == -1) > + ret = -EINVAL; > + > +out_map: > + of_node_put(map); > +out: > + of_node_put(cn); > + return ret; > +} > + > +/* > + * cpu topology table > + */ > +struct cpu_topology cpu_topology[NR_CPUS]; > +EXPORT_SYMBOL_GPL(cpu_topology); > + > +const struct cpumask *cpu_coregroup_mask(int cpu) > +{ > + const cpumask_t *core_mask = cpumask_of_node(cpu_to_node(cpu)); > + > + /* Find the smaller of NUMA, core or LLC siblings */ > + if (cpumask_subset(&cpu_topology[cpu].core_sibling, core_mask)) { > + /* not numa in package, lets use the package siblings */ > + core_mask = &cpu_topology[cpu].core_sibling; > + } > + if (cpu_topology[cpu].llc_id != -1) { > + if (cpumask_subset(&cpu_topology[cpu].llc_sibling, core_mask)) > + core_mask = &cpu_topology[cpu].llc_sibling; > + } > + > + return core_mask; > +} > + > +void update_siblings_masks(unsigned int cpuid) > +{ > + struct cpu_topology *cpu_topo, *cpuid_topo = &cpu_topology[cpuid]; > + int cpu; > + > + /* update core and thread sibling masks */ > + for_each_online_cpu(cpu) { > + cpu_topo = &cpu_topology[cpu]; > + > + if (cpuid_topo->llc_id == cpu_topo->llc_id) { > + cpumask_set_cpu(cpu, &cpuid_topo->llc_sibling); > + cpumask_set_cpu(cpuid, &cpu_topo->llc_sibling); > + } > + > + if (cpuid_topo->package_id != cpu_topo->package_id) > + continue; > + > + cpumask_set_cpu(cpuid, &cpu_topo->core_sibling); > + cpumask_set_cpu(cpu, &cpuid_topo->core_sibling); > + > + if (cpuid_topo->core_id != cpu_topo->core_id) > + continue; > + > + cpumask_set_cpu(cpuid, &cpu_topo->thread_sibling); > + cpumask_set_cpu(cpu, &cpuid_topo->thread_sibling); > + } > +} > + > +static void clear_cpu_topology(int cpu) > +{ > + struct cpu_topology *cpu_topo = &cpu_topology[cpu]; > + > + cpumask_clear(&cpu_topo->llc_sibling); > + cpumask_set_cpu(cpu, &cpu_topo->llc_sibling); > + > + cpumask_clear(&cpu_topo->core_sibling); > + cpumask_set_cpu(cpu, &cpu_topo->core_sibling); > + cpumask_clear(&cpu_topo->thread_sibling); > + cpumask_set_cpu(cpu, &cpu_topo->thread_sibling); > +} > + > +static void __init reset_cpu_topology(void) > +{ > + unsigned int cpu; > + > + for_each_possible_cpu(cpu) { > + struct cpu_topology *cpu_topo = &cpu_topology[cpu]; > + > + cpu_topo->thread_id = -1; > + cpu_topo->core_id = -1; > + cpu_topo->package_id = -1; > + cpu_topo->llc_id = -1; > + > + clear_cpu_topology(cpu); > + } > +} > + > +void remove_cpu_topology(unsigned int cpu) > +{ > + int sibling; > + > + for_each_cpu(sibling, topology_core_cpumask(cpu)) > + cpumask_clear_cpu(cpu, topology_core_cpumask(sibling)); > + for_each_cpu(sibling, topology_sibling_cpumask(cpu)) > + cpumask_clear_cpu(cpu, topology_sibling_cpumask(sibling)); > + for_each_cpu(sibling, topology_llc_cpumask(cpu)) > + cpumask_clear_cpu(cpu, topology_llc_cpumask(sibling)); > + > + clear_cpu_topology(cpu); > +} > + > +__weak int __init parse_acpi_topology(void) > +{ > + return 0; > +} > + > +void __init init_cpu_topology(void) > +{ > + reset_cpu_topology(); > + > + /* > + * Discard anything that was parsed if we hit an error so we > + * don't use partial information. > + */ > + if (parse_acpi_topology()) > + reset_cpu_topology(); > + else if (of_have_populated_dt() && parse_dt_topology()) > + reset_cpu_topology(); > +} > +#endif > diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h > index d9bdc1a7f4e7..d4e76e0a283f 100644 > --- a/include/linux/arch_topology.h > +++ b/include/linux/arch_topology.h > @@ -33,4 +33,32 @@ unsigned long topology_get_freq_scale(int cpu) > return per_cpu(freq_scale, cpu); > } > > +struct cpu_topology { > + int thread_id; > + int core_id; > + int package_id; > + int llc_id; > + cpumask_t thread_sibling; > + cpumask_t core_sibling; > + cpumask_t llc_sibling; > +}; > + > +#ifdef CONFIG_GENERIC_ARCH_TOPOLOGY > +extern struct cpu_topology cpu_topology[NR_CPUS]; > + > +#define topology_physical_package_id(cpu) (cpu_topology[cpu].package_id) > +#define topology_core_id(cpu) (cpu_topology[cpu].core_id) > +#define topology_core_cpumask(cpu) (&cpu_topology[cpu].core_sibling) > +#define topology_sibling_cpumask(cpu) (&cpu_topology[cpu].thread_sibling) > +#define topology_llc_cpumask(cpu) (&cpu_topology[cpu].llc_sibling) > +void init_cpu_topology(void); > +void store_cpu_topology(unsigned int cpuid); > +const struct cpumask *cpu_coregroup_mask(int cpu); > +#endif > + > +#if defined(CONFIG_ARM64) || defined(CONFIG_RISCV) > +void update_siblings_masks(unsigned int cpu); > +#endif > +void remove_cpu_topology(unsigned int cpuid); > + > #endif /* _LINUX_ARCH_TOPOLOGY_H_ */ > diff --git a/include/linux/topology.h b/include/linux/topology.h > index cb0775e1ee4b..4b3755d65812 100644 > --- a/include/linux/topology.h > +++ b/include/linux/topology.h > @@ -27,6 +27,7 @@ > #ifndef _LINUX_TOPOLOGY_H > #define _LINUX_TOPOLOGY_H > > +#include <linux/arch_topology.h> > #include <linux/cpumask.h> > #include <linux/bitops.h> > #include <linux/mmzone.h> > Hi Will/Catalin, Can we get an ACK for ARM64 if you are okay with this series ?
On Wed, 29 May 2019, Atish Patra wrote: > Currently, there are no topology defined for RISC-V. > Parse the cpu-map node from device tree and setup the > cpu topology. > > CPU topology after applying the patch. > $cat /sys/devices/system/cpu/cpu2/topology/core_siblings_list > 0-3 > $cat /sys/devices/system/cpu/cpu3/topology/core_siblings_list > 0-3 > $cat /sys/devices/system/cpu/cpu3/topology/physical_package_id > 0 > $cat /sys/devices/system/cpu/cpu3/topology/core_id > 3 > > Signed-off-by: Atish Patra <atish.patra@wdc.com> > Acked-by: Sudeep Holla <sudeep.holla@arm.com> Looks reasonable to me. Acked-by: Paul Walmsley <paul.walmsley@sifive.com> We're assuming, on the RISC-V side, that these patches will go in via another tree. - Paul
On Wed, May 29, 2019 at 02:13:36PM -0700, Atish Patra wrote: > Both RISC-V & ARM64 are using cpu-map device tree to describe > their cpu topology. It's better to move the relevant code to > a common place instead of duplicate code. > > To: Will Deacon <will.deacon@arm.com> > To: Catalin Marinas <catalin.marinas@arm.com> > Signed-off-by: Atish Patra <atish.patra@wdc.com> > [Tested on QDF2400] > Tested-by: Jeffrey Hugo <jhugo@codeaurora.org> > [Tested on Juno and other embedded platforms.] > Tested-by: Sudeep Holla <sudeep.holla@arm.com> > Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> > > --- > Hi Will/Catalin, > Can we get ack for this patch ? We are hoping to get the entire series > merged at one go. If Sudeep is happy with it then that's good enough for me. Acked-by: Will Deacon <will.deacon@arm.com> Will