From patchwork Tue Apr 23 07:44:26 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tao Xu
X-Patchwork-Id: 1089118
From: Tao Xu
To: ehabkost@redhat.com, imammedo@redhat.com, pbonzini@redhat.com
Cc: jingqi.liu@intel.com, tao3.xu@intel.com, qemu-devel@nongnu.org
Date: Tue, 23 Apr 2019 15:44:26 +0800
Message-Id: <20190423074428.23031-2-tao3.xu@intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20190423074428.23031-1-tao3.xu@intel.com>
References: <20190423074428.23031-1-tao3.xu@intel.com>
Subject: [Qemu-devel] [PATCH v3 1/3] numa: move numa global variable nb_numa_nodes into MachineState

The aim of this patch is to add struct NumaState to MachineState and move the existing NUMA global nb_numa_nodes (renamed to "num_nodes") into NumaState. It also adds the field numa_supported to MachineClass to indicate which machine types support NUMA.

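For reviewers who want the ownership model at a glance, here is a minimal standalone sketch of the pattern this patch introduces. It is not the QEMU code: the structs are cut down to the fields discussed above, machine_init_numa() and machine_fini_numa() are hypothetical helper names standing in for the allocation and free that the patch places in machine_initfn() and machine_finalize(), and calloc/free stand in for g_new0/g_free.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Simplified stand-ins for the QEMU types; only the fields used here. */
typedef struct NumaState {
    int num_nodes;          /* Number of NUMA nodes */
} NumaState;

typedef struct MachineClass {
    bool numa_supported;    /* set by machine types that support NUMA */
} MachineClass;

typedef struct MachineState {
    NumaState *numa_state;  /* NULL when the machine type lacks NUMA support */
} MachineState;

/* Hypothetical helper: allocate NumaState only for NUMA-capable machines. */
static void machine_init_numa(MachineState *ms, const MachineClass *mc)
{
    ms->numa_state = mc->numa_supported ? calloc(1, sizeof(NumaState)) : NULL;
    assert(!mc->numa_supported || ms->numa_state != NULL);
}

/* NULL-safe accessor: machines without NUMA support report zero nodes. */
static int machine_num_numa_nodes(const MachineState *ms)
{
    return ms->numa_state ? ms->numa_state->num_nodes : 0;
}

/* Hypothetical helper: release the state; free(NULL) is a no-op. */
static void machine_fini_numa(MachineState *ms)
{
    free(ms->numa_state);
    ms->numa_state = NULL;
}
```

Callers that only read the node count can go through machine_num_numa_nodes() without checking the pointer themselves; code that writes to numa_state still has to make sure it is non-NULL first.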
Changes in v2 -> v3:
- rename "NumaState::nb_numa_nodes" to "NumaState::num_nodes" (Eduardo)
- use machine_num_numa_nodes(MachineState *ms) to check whether ms->numa_state is NULL before using NumaState::num_nodes (Eduardo)
- check whether ms->numa_state == NULL in set_numa_options to avoid using -numa on a machine type that doesn't support NUMA

Changes in v2:
- fix the mistake in numa_complete_configuration in numa.c
- add a MachineState parameter to some functions to avoid using qdev_get_machine
- add some if expressions to handle the case where NumaState is NULL

Suggested-by: Igor Mammedov
Suggested-by: Eduardo Habkost
Signed-off-by: Tao Xu
---
 exec.c | 5 ++-
 hw/acpi/aml-build.c | 3 +-
 hw/arm/boot.c | 2 ++
 hw/arm/virt-acpi-build.c | 8 +++--
 hw/arm/virt.c | 5 ++-
 hw/core/machine.c | 21 ++++++++---
 hw/i386/acpi-build.c | 2 +-
 hw/i386/pc.c | 7 +++-
 hw/mem/pc-dimm.c | 2 ++
 hw/pci-bridge/pci_expander_bridge.c | 2 ++
 hw/ppc/spapr.c | 17 +++++++--
 include/hw/acpi/aml-build.h | 2 +-
 include/hw/boards.h | 10 ++++++
 include/sysemu/numa.h | 3 +-
 monitor.c | 4 ++-
 numa.c | 54 ++++++++++++++++++-----------
 16 files changed, 108 insertions(+), 39 deletions(-)

diff --git a/exec.c b/exec.c index 6ab62f4eee..87f4da207d 100644 --- a/exec.c +++ b/exec.c @@ -1708,6 +1708,7 @@ long qemu_getrampagesize(void) long hpsize = LONG_MAX; long mainrampagesize; Object *memdev_root; + MachineState *ms = MACHINE(qdev_get_machine()); mainrampagesize = qemu_mempath_getpagesize(mem_path); @@ -1735,7 +1736,9 @@ long qemu_getrampagesize(void) * so if its page size is smaller we have got to report that size instead.
*/ if (hpsize > mainrampagesize && - (nb_numa_nodes == 0 || numa_info[0].node_memdev == NULL)) { + (ms->numa_state == NULL || + ms->numa_state->num_nodes == 0 || + numa_info[0].node_memdev == NULL)) { static bool warned; if (!warned) { error_report("Huge page support disabled (n/a for main memory)."); diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c index 555c24f21d..c67f4561a4 100644 --- a/hw/acpi/aml-build.c +++ b/hw/acpi/aml-build.c @@ -1726,10 +1726,11 @@ void build_srat_memory(AcpiSratMemoryAffinity *numamem, uint64_t base, * ACPI spec 5.2.17 System Locality Distance Information Table * (Revision 2.0 or later) */ -void build_slit(GArray *table_data, BIOSLinker *linker) +void build_slit(GArray *table_data, BIOSLinker *linker, MachineState *ms) { int slit_start, i, j; slit_start = table_data->len; + int nb_numa_nodes = machine_num_numa_nodes(ms); acpi_data_push(table_data, sizeof(AcpiTableHeader)); diff --git a/hw/arm/boot.c b/hw/arm/boot.c index a830655e1a..8ff08814fd 100644 --- a/hw/arm/boot.c +++ b/hw/arm/boot.c @@ -532,6 +532,8 @@ int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo, hwaddr mem_base, mem_len; char **node_path; Error *err = NULL; + MachineState *ms = MACHINE(qdev_get_machine()); + int nb_numa_nodes = machine_num_numa_nodes(ms); if (binfo->dtb_filename) { char *filename; diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c index bf9c0bc2f4..6805b4de51 100644 --- a/hw/arm/virt-acpi-build.c +++ b/hw/arm/virt-acpi-build.c @@ -516,7 +516,9 @@ build_srat(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms) int i, srat_start; uint64_t mem_base; MachineClass *mc = MACHINE_GET_CLASS(vms); - const CPUArchIdList *cpu_list = mc->possible_cpu_arch_ids(MACHINE(vms)); + MachineState *ms = MACHINE(vms); + const CPUArchIdList *cpu_list = mc->possible_cpu_arch_ids(ms); + int nb_numa_nodes = machine_num_numa_nodes(ms); srat_start = table_data->len; srat = acpi_data_push(table_data, sizeof(*srat)); @@ -780,6 +782,8 @@ void 
virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables) GArray *table_offsets; unsigned dsdt, xsdt; GArray *tables_blob = tables->table_data; + MachineState *ms = MACHINE(vms); + int nb_numa_nodes = machine_num_numa_nodes(ms); table_offsets = g_array_new(false, true /* clear */, sizeof(uint32_t)); @@ -813,7 +817,7 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables) build_srat(tables_blob, tables->linker, vms); if (have_numa_distance) { acpi_add_table(table_offsets, tables_blob); - build_slit(tables_blob, tables->linker); + build_slit(tables_blob, tables->linker, ms); } } diff --git a/hw/arm/virt.c b/hw/arm/virt.c index ce2664a30b..280174b1b7 100644 --- a/hw/arm/virt.c +++ b/hw/arm/virt.c @@ -195,6 +195,8 @@ static bool cpu_type_valid(const char *cpu) static void create_fdt(VirtMachineState *vms) { + MachineState *ms = MACHINE(vms); + int nb_numa_nodes = machine_num_numa_nodes(ms); void *fdt = create_device_tree(&vms->fdt_size); if (!fdt) { @@ -1780,7 +1782,7 @@ virt_cpu_index_to_props(MachineState *ms, unsigned cpu_index) static int64_t virt_get_default_cpu_node_id(const MachineState *ms, int idx) { - return idx % nb_numa_nodes; + return idx % machine_num_numa_nodes(ms); } static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms) @@ -1886,6 +1888,7 @@ static void virt_machine_class_init(ObjectClass *oc, void *data) mc->kvm_type = virt_kvm_type; assert(!mc->get_hotplug_handler); mc->get_hotplug_handler = virt_machine_get_hotplug_handler; + mc->numa_supported = true; hc->plug = virt_machine_device_plug_cb; } diff --git a/hw/core/machine.c b/hw/core/machine.c index 743fef2898..752a90199d 100644 --- a/hw/core/machine.c +++ b/hw/core/machine.c @@ -854,6 +854,11 @@ static void machine_initfn(Object *obj) NULL); } + if (mc->numa_supported) { + ms->numa_state = g_new0(NumaState, 1); + } else { + ms->numa_state = NULL; + } /* Register notifier when init is done for sysbus sanity checks */ ms->sysbus_notifier.notify = 
machine_init_notify; @@ -874,6 +879,7 @@ static void machine_finalize(Object *obj) g_free(ms->firmware); g_free(ms->device_memory); g_free(ms->nvdimms_state); + g_free(ms->numa_state); } bool machine_usb(MachineState *machine) @@ -916,6 +922,11 @@ bool machine_mem_merge(MachineState *machine) return machine->mem_merge; } +int machine_num_numa_nodes(const MachineState *machine) +{ + return machine->numa_state ? machine->numa_state->num_nodes : 0; +} + static char *cpu_slot_to_string(const CPUArchId *cpu) { GString *s = g_string_new(NULL); @@ -945,7 +956,7 @@ static void machine_numa_finish_cpu_init(MachineState *machine) MachineClass *mc = MACHINE_GET_CLASS(machine); const CPUArchIdList *possible_cpus = mc->possible_cpu_arch_ids(machine); - assert(nb_numa_nodes); + assert(machine_num_numa_nodes(machine)); for (i = 0; i < possible_cpus->len; i++) { if (possible_cpus->cpus[i].props.has_node_id) { break; @@ -991,9 +1002,11 @@ void machine_run_board_init(MachineState *machine) { MachineClass *machine_class = MACHINE_GET_CLASS(machine); - numa_complete_configuration(machine); - if (nb_numa_nodes) { - machine_numa_finish_cpu_init(machine); + if (machine_class->numa_supported) { + numa_complete_configuration(machine); + if (machine->numa_state->num_nodes) { + machine_numa_finish_cpu_init(machine); + } } /* If the machine supports the valid_cpu_types check and the user diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index 416da318ae..7d9bc88ac9 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -2687,7 +2687,7 @@ void acpi_build(AcpiBuildTables *tables, MachineState *machine) build_srat(tables_blob, tables->linker, machine); if (have_numa_distance) { acpi_add_table(table_offsets, tables_blob); - build_slit(tables_blob, tables->linker); + build_slit(tables_blob, tables->linker, machine); } } if (acpi_get_mcfg(&mcfg)) { diff --git a/hw/i386/pc.c b/hw/i386/pc.c index f2c15bf1f2..a752594df2 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -996,6 +996,8 @@ 
static FWCfgState *bochs_bios_init(AddressSpace *as, PCMachineState *pcms) int i; const CPUArchIdList *cpus; MachineClass *mc = MACHINE_GET_CLASS(pcms); + MachineState *ms = MACHINE(pcms); + int nb_numa_nodes = machine_num_numa_nodes(ms); fw_cfg = fw_cfg_init_io_dma(FW_CFG_IO_BASE, FW_CFG_IO_BASE + 4, as); fw_cfg_add_i16(fw_cfg, FW_CFG_NB_CPUS, pcms->boot_cpus); @@ -1672,6 +1674,8 @@ void pc_machine_done(Notifier *notifier, void *data) void pc_guest_info_init(PCMachineState *pcms) { int i; + MachineState *ms = MACHINE(pcms); + int nb_numa_nodes = machine_num_numa_nodes(ms); pcms->apic_xrupt_override = kvm_allows_irq0_override(); pcms->numa_nodes = nb_numa_nodes; @@ -2655,7 +2659,7 @@ static int64_t pc_get_default_cpu_node_id(const MachineState *ms, int idx) assert(idx < ms->possible_cpus->len); x86_topo_ids_from_apicid(ms->possible_cpus->cpus[idx].arch_id, smp_cores, smp_threads, &topo); - return topo.pkg_id % nb_numa_nodes; + return topo.pkg_id % machine_num_numa_nodes(ms); } static const CPUArchIdList *pc_possible_cpu_arch_ids(MachineState *ms) @@ -2749,6 +2753,7 @@ static void pc_machine_class_init(ObjectClass *oc, void *data) nc->nmi_monitor_handler = x86_nmi; mc->default_cpu_type = TARGET_DEFAULT_CPU_TYPE; mc->nvdimm_supported = true; + mc->numa_supported = true; object_class_property_add(oc, PC_MACHINE_DEVMEM_REGION_SIZE, "int", pc_machine_get_device_memory_region_size, NULL, diff --git a/hw/mem/pc-dimm.c b/hw/mem/pc-dimm.c index 152400b1fc..48cbd53e6b 100644 --- a/hw/mem/pc-dimm.c +++ b/hw/mem/pc-dimm.c @@ -160,6 +160,8 @@ static void pc_dimm_realize(DeviceState *dev, Error **errp) { PCDIMMDevice *dimm = PC_DIMM(dev); PCDIMMDeviceClass *ddc = PC_DIMM_GET_CLASS(dimm); + MachineState *ms = MACHINE(qdev_get_machine()); + int nb_numa_nodes = machine_num_numa_nodes(ms); if (!dimm->hostmem) { error_setg(errp, "'" PC_DIMM_MEMDEV_PROP "' property is not set"); diff --git a/hw/pci-bridge/pci_expander_bridge.c b/hw/pci-bridge/pci_expander_bridge.c index 
e62de4218f..d0590c0973 100644 --- a/hw/pci-bridge/pci_expander_bridge.c +++ b/hw/pci-bridge/pci_expander_bridge.c @@ -217,6 +217,8 @@ static void pxb_dev_realize_common(PCIDevice *dev, bool pcie, Error **errp) PCIBus *bus; const char *dev_name = NULL; Error *local_err = NULL; + MachineState *ms = MACHINE(qdev_get_machine()); + int nb_numa_nodes = machine_num_numa_nodes(ms); if (pxb->numa_node != NUMA_NODE_UNASSIGNED && pxb->numa_node >= nb_numa_nodes) { diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c index b52b82d298..23789256f7 100644 --- a/hw/ppc/spapr.c +++ b/hw/ppc/spapr.c @@ -290,6 +290,8 @@ static int spapr_fixup_cpu_dt(void *fdt, SpaprMachineState *spapr) CPUState *cs; char cpu_model[32]; uint32_t pft_size_prop[] = {0, cpu_to_be32(spapr->htab_shift)}; + MachineState *ms = MACHINE(spapr); + int nb_numa_nodes = machine_num_numa_nodes(ms); CPU_FOREACH(cs) { PowerPCCPU *cpu = POWERPC_CPU(cs); @@ -344,6 +346,7 @@ static int spapr_fixup_cpu_dt(void *fdt, SpaprMachineState *spapr) static hwaddr spapr_node0_size(MachineState *machine) { + int nb_numa_nodes = machine_num_numa_nodes(machine); if (nb_numa_nodes) { int i; for (i = 0; i < nb_numa_nodes; ++i) { @@ -390,6 +393,7 @@ static int spapr_populate_memory_node(void *fdt, int nodeid, hwaddr start, static int spapr_populate_memory(SpaprMachineState *spapr, void *fdt) { MachineState *machine = MACHINE(spapr); + int nb_numa_nodes = machine_num_numa_nodes(machine); hwaddr mem_start, node_size; int i, nb_nodes = nb_numa_nodes; NodeInfo *nodes = numa_info; @@ -444,6 +448,8 @@ static void spapr_populate_cpu_dt(CPUState *cs, void *fdt, int offset, PowerPCCPU *cpu = POWERPC_CPU(cs); CPUPPCState *env = &cpu->env; PowerPCCPUClass *pcc = POWERPC_CPU_GET_CLASS(cs); + MachineState *ms = MACHINE(spapr); + int nb_numa_nodes = machine_num_numa_nodes(ms); int index = spapr_get_vcpu_id(cpu); uint32_t segs[] = {cpu_to_be32(28), cpu_to_be32(40), 0xffffffff, 0xffffffff}; @@ -849,6 +855,7 @@ static int 
spapr_populate_drmem_v1(SpaprMachineState *spapr, void *fdt, static int spapr_populate_drconf_memory(SpaprMachineState *spapr, void *fdt) { MachineState *machine = MACHINE(spapr); + int nb_numa_nodes = machine_num_numa_nodes(machine); int ret, i, offset; uint64_t lmb_size = SPAPR_MEMORY_BLOCK_SIZE; uint32_t prop_lmb_size[] = {0, cpu_to_be32(lmb_size)}; @@ -1023,11 +1030,13 @@ int spapr_h_cas_compose_response(SpaprMachineState *spapr, static void spapr_dt_rtas(SpaprMachineState *spapr, void *fdt) { int rtas; + MachineState *ms = MACHINE(spapr); + int nb_numa_nodes = machine_num_numa_nodes(ms); GString *hypertas = g_string_sized_new(256); GString *qemu_hypertas = g_string_sized_new(256); uint32_t refpoints[] = { cpu_to_be32(0x4), cpu_to_be32(0x4) }; - uint64_t max_device_addr = MACHINE(spapr)->device_memory->base + - memory_region_size(&MACHINE(spapr)->device_memory->mr); + uint64_t max_device_addr = ms->device_memory->base + + memory_region_size(&ms->device_memory->mr); uint32_t lrdr_capacity[] = { cpu_to_be32(max_device_addr >> 32), cpu_to_be32(max_device_addr & 0xffffffff), @@ -2466,6 +2475,7 @@ static void spapr_create_lmb_dr_connectors(SpaprMachineState *spapr) static void spapr_validate_node_memory(MachineState *machine, Error **errp) { int i; + int nb_numa_nodes = machine_num_numa_nodes(machine); if (machine->ram_size % SPAPR_MEMORY_BLOCK_SIZE) { error_setg(errp, "Memory size 0x" RAM_ADDR_FMT @@ -4066,7 +4076,7 @@ spapr_cpu_index_to_props(MachineState *machine, unsigned cpu_index) static int64_t spapr_get_default_cpu_node_id(const MachineState *ms, int idx) { - return idx / smp_cores % nb_numa_nodes; + return idx / smp_cores % machine_num_numa_nodes(ms); } static const CPUArchIdList *spapr_possible_cpu_arch_ids(MachineState *machine) @@ -4266,6 +4276,7 @@ static void spapr_machine_class_init(ObjectClass *oc, void *data) smc->update_dt_enabled = true; mc->default_cpu_type = POWERPC_CPU_TYPE_NAME("power9_v2.0"); mc->has_hotpluggable_cpus = true; + 
mc->numa_supported = true; smc->resize_hpt_default = SPAPR_RESIZE_HPT_ENABLED; fwc->get_dev_path = spapr_get_fw_dev_path; nc->nmi_monitor_handler = spapr_nmi; diff --git a/include/hw/acpi/aml-build.h b/include/hw/acpi/aml-build.h index 1a563ad756..991cf05134 100644 --- a/include/hw/acpi/aml-build.h +++ b/include/hw/acpi/aml-build.h @@ -414,7 +414,7 @@ build_append_gas_from_struct(GArray *table, const struct AcpiGenericAddress *s) void build_srat_memory(AcpiSratMemoryAffinity *numamem, uint64_t base, uint64_t len, int node, MemoryAffinityFlags flags); -void build_slit(GArray *table_data, BIOSLinker *linker); +void build_slit(GArray *table_data, BIOSLinker *linker, MachineState *ms); void build_fadt(GArray *tbl, BIOSLinker *linker, const AcpiFadtData *f, const char *oem_id, const char *oem_table_id); diff --git a/include/hw/boards.h b/include/hw/boards.h index e231860666..579d73f6c4 100644 --- a/include/hw/boards.h +++ b/include/hw/boards.h @@ -5,6 +5,7 @@ #include "sysemu/blockdev.h" #include "sysemu/accel.h" +#include "sysemu/sysemu.h" #include "hw/qdev.h" #include "qom/object.h" #include "qom/cpu.h" @@ -69,6 +70,7 @@ int machine_kvm_shadow_mem(MachineState *machine); int machine_phandle_start(MachineState *machine); bool machine_dump_guest_core(MachineState *machine); bool machine_mem_merge(MachineState *machine); +int machine_num_numa_nodes(const MachineState *machine); HotpluggableCPUList *machine_query_hotpluggable_cpus(MachineState *machine); void machine_set_cpu_numa_node(MachineState *machine, const CpuInstanceProperties *props, @@ -211,6 +213,7 @@ struct MachineClass { bool ignore_boot_device_suffixes; bool smbus_no_migration_support; bool nvdimm_supported; + bool numa_supported; HotplugHandler *(*get_hotplug_handler)(MachineState *machine, DeviceState *dev); @@ -231,6 +234,12 @@ typedef struct DeviceMemoryState { MemoryRegion mr; } DeviceMemoryState; +typedef struct NumaState { + /* Number of NUMA nodes */ + int num_nodes; + +} NumaState; + /** * 
MachineState: */ @@ -274,6 +283,7 @@ struct MachineState { AccelState *accelerator; CPUArchIdList *possible_cpus; struct NVDIMMState *nvdimms_state; + NumaState *numa_state; }; #define DEFINE_MACHINE(namestr, machine_initfn) \ diff --git a/include/sysemu/numa.h b/include/sysemu/numa.h index b6ac7de43e..a55e2be563 100644 --- a/include/sysemu/numa.h +++ b/include/sysemu/numa.h @@ -6,7 +6,6 @@ #include "sysemu/hostmem.h" #include "hw/boards.h" -extern int nb_numa_nodes; /* Number of NUMA nodes */ extern bool have_numa_distance; struct NodeInfo { @@ -24,7 +23,7 @@ struct NumaNodeMem { extern NodeInfo numa_info[MAX_NODES]; void parse_numa_opts(MachineState *ms); void numa_complete_configuration(MachineState *ms); -void query_numa_node_mem(NumaNodeMem node_mem[]); +void query_numa_node_mem(NumaNodeMem node_mem[], MachineState *ms); extern QemuOptsList qemu_numa_opts; void numa_legacy_auto_assign_ram(MachineClass *mc, NodeInfo *nodes, int nb_nodes, ram_addr_t size); diff --git a/monitor.c b/monitor.c index 4807bbe811..80b6e01c9b 100644 --- a/monitor.c +++ b/monitor.c @@ -1908,11 +1908,13 @@ static void hmp_info_numa(Monitor *mon, const QDict *qdict) int i; NumaNodeMem *node_mem; CpuInfoList *cpu_list, *cpu; + MachineState *ms = MACHINE(qdev_get_machine()); + int nb_numa_nodes = machine_num_numa_nodes(ms); cpu_list = qmp_query_cpus(&error_abort); node_mem = g_new0(NumaNodeMem, nb_numa_nodes); - query_numa_node_mem(node_mem); + query_numa_node_mem(node_mem, ms); monitor_printf(mon, "%d nodes\n", nb_numa_nodes); for (i = 0; i < nb_numa_nodes; i++) { monitor_printf(mon, "node %d cpus:", i); diff --git a/numa.c b/numa.c index 3875e1efda..343fcaf13f 100644 --- a/numa.c +++ b/numa.c @@ -52,7 +52,6 @@ static int have_memdevs = -1; static int max_numa_nodeid; /* Highest specified NUMA node ID, plus one. 
* For all nodes, nodeid < max_numa_nodeid */ -int nb_numa_nodes; bool have_numa_distance; NodeInfo numa_info[MAX_NODES]; @@ -68,7 +67,7 @@ static void parse_numa_node(MachineState *ms, NumaNodeOptions *node, if (node->has_nodeid) { nodenr = node->nodeid; } else { - nodenr = nb_numa_nodes; + nodenr = machine_num_numa_nodes(ms); } if (nodenr >= MAX_NODES) { @@ -136,10 +135,11 @@ static void parse_numa_node(MachineState *ms, NumaNodeOptions *node, } numa_info[nodenr].present = true; max_numa_nodeid = MAX(max_numa_nodeid, nodenr + 1); - nb_numa_nodes++; + ms->numa_state->num_nodes++; } -static void parse_numa_distance(NumaDistOptions *dist, Error **errp) +static +void parse_numa_distance(MachineState *ms, NumaDistOptions *dist, Error **errp) { uint16_t src = dist->src; uint16_t dst = dist->dst; @@ -179,6 +179,11 @@ void set_numa_options(MachineState *ms, NumaOptions *object, Error **errp) { Error *err = NULL; + if (ms->numa_state == NULL) { + error_setg(errp, "NUMA is not supported by this machine-type"); + goto end; + } + switch (object->type) { case NUMA_OPTIONS_TYPE_NODE: parse_numa_node(ms, &object->u.node, &err); @@ -187,7 +192,7 @@ void set_numa_options(MachineState *ms, NumaOptions *object, Error **errp) } break; case NUMA_OPTIONS_TYPE_DIST: - parse_numa_distance(&object->u.dist, &err); + parse_numa_distance(ms, &object->u.dist, &err); if (err) { goto end; } @@ -252,10 +257,11 @@ end: * distance from a node to itself is always NUMA_DISTANCE_MIN, * so providing it is never necessary. 
*/ -static void validate_numa_distance(void) +static void validate_numa_distance(MachineState *ms) { int src, dst; bool is_asymmetrical = false; + int nb_numa_nodes = machine_num_numa_nodes(ms); for (src = 0; src < nb_numa_nodes; src++) { for (dst = src; dst < nb_numa_nodes; dst++) { @@ -293,9 +299,10 @@ static void validate_numa_distance(void) } } -static void complete_init_numa_distance(void) +static void complete_init_numa_distance(MachineState *ms) { int src, dst; + int nb_numa_nodes = machine_num_numa_nodes(ms); /* Fixup NUMA distance by symmetric policy because if it is an * asymmetric distance table, it should be a complete table and @@ -369,7 +376,7 @@ void numa_complete_configuration(MachineState *ms) * * Enable NUMA implicitly by adding a new NUMA node automatically. */ - if (ms->ram_slots > 0 && nb_numa_nodes == 0 && + if (ms->ram_slots > 0 && ms->numa_state->num_nodes == 0 && mc->auto_enable_numa_with_memhp) { NumaNodeOptions node = { }; parse_numa_node(ms, &node, &error_abort); @@ -387,30 +394,33 @@ void numa_complete_configuration(MachineState *ms) } /* This must be always true if all nodes are present: */ - assert(nb_numa_nodes == max_numa_nodeid); + assert(ms->numa_state->num_nodes == max_numa_nodeid); - if (nb_numa_nodes > 0) { + if (ms->numa_state->num_nodes > 0) { uint64_t numa_total; - if (nb_numa_nodes > MAX_NODES) { - nb_numa_nodes = MAX_NODES; + if (ms->numa_state->num_nodes > MAX_NODES) { + ms->numa_state->num_nodes = MAX_NODES; } /* If no memory size is given for any node, assume the default case * and distribute the available memory equally across all nodes */ - for (i = 0; i < nb_numa_nodes; i++) { + for (i = 0; i < ms->numa_state->num_nodes; i++) { if (numa_info[i].node_mem != 0) { break; } } - if (i == nb_numa_nodes) { + if (i == ms->numa_state->num_nodes) { assert(mc->numa_auto_assign_ram); - mc->numa_auto_assign_ram(mc, numa_info, nb_numa_nodes, ram_size); + mc->numa_auto_assign_ram(mc, + numa_info, + ms->numa_state->num_nodes, + 
ram_size); } numa_total = 0; - for (i = 0; i < nb_numa_nodes; i++) { + for (i = 0; i < ms->numa_state->num_nodes; i++) { numa_total += numa_info[i].node_mem; } if (numa_total != ram_size) { @@ -434,10 +444,10 @@ void numa_complete_configuration(MachineState *ms) */ if (have_numa_distance) { /* Validate enough NUMA distance information was provided. */ - validate_numa_distance(); + validate_numa_distance(ms); /* Validation succeeded, now fill in any missing distances. */ - complete_init_numa_distance(); + complete_init_numa_distance(ms); } } } @@ -513,6 +523,8 @@ void memory_region_allocate_system_memory(MemoryRegion *mr, Object *owner, { uint64_t addr = 0; int i; + MachineState *ms = MACHINE(qdev_get_machine()); + int nb_numa_nodes = machine_num_numa_nodes(ms); if (nb_numa_nodes == 0 || !have_memdevs) { allocate_system_memory_nonnuma(mr, owner, name, ram_size); @@ -578,16 +590,16 @@ static void numa_stat_memory_devices(NumaNodeMem node_mem[]) qapi_free_MemoryDeviceInfoList(info_list); } -void query_numa_node_mem(NumaNodeMem node_mem[]) +void query_numa_node_mem(NumaNodeMem node_mem[], MachineState *ms) { int i; - if (nb_numa_nodes <= 0) { + if (ms->numa_state == NULL || ms->numa_state->num_nodes <= 0) { return; } numa_stat_memory_devices(node_mem); - for (i = 0; i < nb_numa_nodes; i++) { + for (i = 0; i < ms->numa_state->num_nodes; i++) { node_mem[i].node_mem += numa_info[i].node_mem; } } From patchwork Tue Apr 23 07:44:27 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Tao Xu X-Patchwork-Id: 1089115 Return-Path: X-Original-To: incoming@patchwork.ozlabs.org Delivered-To: patchwork-incoming@bilbo.ozlabs.org Authentication-Results: ozlabs.org; spf=pass (mailfrom) smtp.mailfrom=nongnu.org (client-ip=209.51.188.17; helo=lists.gnu.org; envelope-from=qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org; receiver=) Authentication-Results: ozlabs.org; dmarc=fail (p=none dis=none) 
header.from=intel.com
From: Tao Xu
To: ehabkost@redhat.com, imammedo@redhat.com, pbonzini@redhat.com
Cc: jingqi.liu@intel.com, tao3.xu@intel.com, qemu-devel@nongnu.org
Date: Tue, 23 Apr 2019 15:44:27 +0800
Message-Id: <20190423074428.23031-3-tao3.xu@intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20190423074428.23031-1-tao3.xu@intel.com>
References: <20190423074428.23031-1-tao3.xu@intel.com>
Subject: [Qemu-devel] [PATCH v3 2/3] numa: move numa global variable have_numa_distance into MachineState

The aim of this patch is to move the existing NUMA global have_numa_distance into NumaState.

Suggested-by: Igor Mammedov
Suggested-by: Eduardo Habkost
Signed-off-by: Tao Xu
---
 hw/arm/virt-acpi-build.c | 2 +-
 hw/arm/virt.c | 2 +-
 hw/i386/acpi-build.c | 2 +-
 include/hw/boards.h | 2 ++
 include/sysemu/numa.h | 2 --
 numa.c | 5 ++---
 6 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c index 6805b4de51..65f070843c 100644 --- a/hw/arm/virt-acpi-build.c +++ b/hw/arm/virt-acpi-build.c @@ -815,7 +815,7 @@ void virt_acpi_build(VirtMachineState *vms, AcpiBuildTables *tables) if (nb_numa_nodes > 0) { acpi_add_table(table_offsets, tables_blob); build_srat(tables_blob, tables->linker, vms); - if (have_numa_distance) { + if (ms->numa_state->have_numa_distance) { acpi_add_table(table_offsets, tables_blob); build_slit(tables_blob, tables->linker, ms); }
diff --git a/hw/arm/virt.c b/hw/arm/virt.c index 280174b1b7..6cd5e13eb9 100644 --- a/hw/arm/virt.c +++ b/hw/arm/virt.c @@ -228,7 +228,7 @@ static void create_fdt(VirtMachineState *vms) "clk24mhz"); qemu_fdt_setprop_cell(fdt, "/apb-pclk", "phandle", vms->clock_phandle); - if (have_numa_distance) { + if (nb_numa_nodes > 0 && ms->numa_state->have_numa_distance) { int size = nb_numa_nodes * nb_numa_nodes * 3 * sizeof(uint32_t); uint32_t *matrix = g_malloc0(size); int idx, i, j;
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index 7d9bc88ac9..43a807c483 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -2685,7
+2685,7 @@ void acpi_build(AcpiBuildTables *tables, MachineState *machine)
     if (pcms->numa_nodes) {
         acpi_add_table(table_offsets, tables_blob);
         build_srat(tables_blob, tables->linker, machine);
-        if (have_numa_distance) {
+        if (machine->numa_state->have_numa_distance) {
             acpi_add_table(table_offsets, tables_blob);
             build_slit(tables_blob, tables->linker, machine);
         }
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 579d73f6c4..c3ff2dbd4a 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -238,6 +238,8 @@ typedef struct NumaState {
     /* Number of NUMA nodes */
     int num_nodes;

+    /* Allow setting NUMA distance for different NUMA nodes */
+    bool have_numa_distance;
 } NumaState;

 /**
diff --git a/include/sysemu/numa.h b/include/sysemu/numa.h
index a55e2be563..1a29408db9 100644
--- a/include/sysemu/numa.h
+++ b/include/sysemu/numa.h
@@ -6,8 +6,6 @@
 #include "sysemu/hostmem.h"
 #include "hw/boards.h"

-extern bool have_numa_distance;
-
 struct NodeInfo {
     uint64_t node_mem;
     struct HostMemoryBackend *node_memdev;
diff --git a/numa.c b/numa.c
index 343fcaf13f..d4f5ff5193 100644
--- a/numa.c
+++ b/numa.c
@@ -52,7 +52,6 @@ static int have_memdevs = -1;
 static int max_numa_nodeid; /* Highest specified NUMA node ID, plus one.
                              * For all nodes, nodeid < max_numa_nodeid
                              */
-bool have_numa_distance;
 NodeInfo numa_info[MAX_NODES];


@@ -171,7 +170,7 @@ void parse_numa_distance(MachineState *ms, NumaDistOptions *dist, Error **errp)
     }

     numa_info[src].distance[dst] = val;
-    have_numa_distance = true;
+    ms->numa_state->have_numa_distance = true;
 }

 static
@@ -442,7 +441,7 @@ void numa_complete_configuration(MachineState *ms)
          * asymmetric. In this case, the distances for both directions
          * of all node pairs are required.
          */
-        if (have_numa_distance) {
+        if (ms->numa_state->have_numa_distance) {
             /* Validate enough NUMA distance information was provided. */
             validate_numa_distance(ms);


From patchwork Tue Apr 23 07:44:28 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Tao Xu
X-Patchwork-Id: 1089117
From: Tao Xu
To: ehabkost@redhat.com, imammedo@redhat.com, pbonzini@redhat.com
Date: Tue, 23 Apr 2019 15:44:28 +0800
Message-Id: <20190423074428.23031-4-tao3.xu@intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20190423074428.23031-1-tao3.xu@intel.com>
References: <20190423074428.23031-1-tao3.xu@intel.com>
Subject: [Qemu-devel] [PATCH v3 3/3] numa: move numa global variable numa_info into MachineState
Cc: jingqi.liu@intel.com, tao3.xu@intel.com, qemu-devel@nongnu.org

The aim of this patch is to move existing numa global numa_info
(renamed as "nodes") into NumaState.
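
As a rough illustration of the access pattern this patch moves callers
to, the sketch below uses trimmed stand-ins for NodeInfo, NumaState and
MachineState (the real definitions live in include/hw/boards.h). The
helpers machine_numa_nodes() and machine_total_node_mem() are
hypothetical and not part of this series, and MAX_NODES is assumed to
be 128. The NULL check mirrors the guard used in the spapr hunks, where
numa_state may be unset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_NODES 128 /* assumed value, for illustration only */

/* Trimmed stand-in for the patch's NodeInfo (node_memdev omitted). */
typedef struct NodeInfo {
    uint64_t node_mem;
    bool present;
    uint8_t distance[MAX_NODES];
} NodeInfo;

/* Mirrors the NumaState layout after this patch. */
typedef struct NumaState {
    int num_nodes;
    bool have_numa_distance;
    NodeInfo nodes[MAX_NODES];
} NumaState;

/* Stand-in for MachineState; only the field used here is kept. */
typedef struct MachineState {
    NumaState *numa_state; /* NULL when the machine has no NUMA state */
} MachineState;

/* Hypothetical helper: per-machine replacement for the old global array. */
static NodeInfo *machine_numa_nodes(MachineState *ms)
{
    assert(ms != NULL);
    return ms->numa_state ? ms->numa_state->nodes : NULL;
}

/* Hypothetical helper: sum node_mem over the configured nodes. */
static uint64_t machine_total_node_mem(MachineState *ms)
{
    NodeInfo *nodes = machine_numa_nodes(ms);
    uint64_t total = 0;
    int i;

    if (!nodes) {
        return 0; /* no NUMA state, nothing to sum */
    }
    for (i = 0; i < ms->numa_state->num_nodes; i++) {
        total += nodes[i].node_mem;
    }
    return total;
}
```

Callers that already know NUMA state is present, such as the code in
numa.c, index ms->numa_state->nodes directly rather than going through
a guard like this.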

Changes in v2 -> v3:
- rename the "NumaState::numa_info" as "NumaState::nodes" (Eduardo)

Suggested-by: Igor Mammedov
Suggested-by: Eduardo Habkost
Signed-off-by: Tao Xu
---
 exec.c                   |  2 +-
 hw/acpi/aml-build.c      |  6 ++++--
 hw/arm/boot.c            |  2 +-
 hw/arm/virt-acpi-build.c |  7 ++++---
 hw/arm/virt.c            |  1 +
 hw/i386/pc.c             |  4 ++--
 hw/ppc/spapr.c           |  8 +++++++-
 hw/ppc/spapr_pci.c       |  2 ++
 include/hw/boards.h      | 10 ++++++++++
 include/sysemu/numa.h    |  8 --------
 numa.c                   | 15 +++++++++------
 11 files changed, 41 insertions(+), 24 deletions(-)

diff --git a/exec.c b/exec.c
index 87f4da207d..6b3bda2766 100644
--- a/exec.c
+++ b/exec.c
@@ -1738,7 +1738,7 @@ long qemu_getrampagesize(void)
     if (hpsize > mainrampagesize &&
         (ms->numa_state == NULL ||
          ms->numa_state->num_nodes == 0 ||
-         numa_info[0].node_memdev == NULL)) {
+         ms->numa_state->nodes[0].node_memdev == NULL)) {
         static bool warned;
         if (!warned) {
             error_report("Huge page support disabled (n/a for main memory).");
diff --git a/hw/acpi/aml-build.c b/hw/acpi/aml-build.c
index c67f4561a4..b53a55cb56 100644
--- a/hw/acpi/aml-build.c
+++ b/hw/acpi/aml-build.c
@@ -1737,8 +1737,10 @@ void build_slit(GArray *table_data, BIOSLinker *linker, MachineState *ms)
     build_append_int_noprefix(table_data, nb_numa_nodes, 8);
     for (i = 0; i < nb_numa_nodes; i++) {
         for (j = 0; j < nb_numa_nodes; j++) {
-            assert(numa_info[i].distance[j]);
-            build_append_int_noprefix(table_data, numa_info[i].distance[j], 1);
+            assert(ms->numa_state->nodes[i].distance[j]);
+            build_append_int_noprefix(table_data,
+                                      ms->numa_state->nodes[i].distance[j],
+                                      1);
         }
     }

diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index 8ff08814fd..845b737ab9 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -602,7 +602,7 @@ int arm_load_dtb(hwaddr addr, const struct arm_boot_info *binfo,
     if (nb_numa_nodes > 0) {
         mem_base = binfo->loader_start;
         for (i = 0; i < nb_numa_nodes; i++) {
-            mem_len = numa_info[i].node_mem;
+            mem_len = ms->numa_state->nodes[i].node_mem;
             rc = fdt_add_memory_node(fdt, acells, mem_base,
                                      scells, mem_len, i);
             if (rc < 0) {
diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index 65f070843c..b22c3d27ad 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -535,11 +535,12 @@ build_srat(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)

     mem_base = vms->memmap[VIRT_MEM].base;
     for (i = 0; i < nb_numa_nodes; ++i) {
-        if (numa_info[i].node_mem > 0) {
+        if (ms->numa_state->nodes[i].node_mem > 0) {
             numamem = acpi_data_push(table_data, sizeof(*numamem));
-            build_srat_memory(numamem, mem_base, numa_info[i].node_mem, i,
+            build_srat_memory(numamem, mem_base,
+                              ms->numa_state->nodes[i].node_mem, i,
                               MEM_AFFINITY_ENABLED);
-            mem_base += numa_info[i].node_mem;
+            mem_base += ms->numa_state->nodes[i].node_mem;
         }
     }

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 6cd5e13eb9..8a8fc0d9b4 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -232,6 +232,7 @@ static void create_fdt(VirtMachineState *vms)
         int size = nb_numa_nodes * nb_numa_nodes * 3 * sizeof(uint32_t);
         uint32_t *matrix = g_malloc0(size);
         int idx, i, j;
+        NodeInfo *numa_info = ms->numa_state->nodes;

         for (i = 0; i < nb_numa_nodes; i++) {
             for (j = 0; j < nb_numa_nodes; j++) {
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index a752594df2..4ace0ad893 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1040,7 +1040,7 @@ static FWCfgState *bochs_bios_init(AddressSpace *as, PCMachineState *pcms)
     }
     for (i = 0; i < nb_numa_nodes; i++) {
         numa_fw_cfg[pcms->apic_id_limit + 1 + i] =
-            cpu_to_le64(numa_info[i].node_mem);
+            cpu_to_le64(ms->numa_state->nodes[i].node_mem);
     }
     fw_cfg_add_bytes(fw_cfg, FW_CFG_NUMA, numa_fw_cfg,
                      (1 + pcms->apic_id_limit + nb_numa_nodes) *
@@ -1682,7 +1682,7 @@ void pc_guest_info_init(PCMachineState *pcms)
     pcms->node_mem = g_malloc0(pcms->numa_nodes *
                                     sizeof *pcms->node_mem);
     for (i = 0; i < nb_numa_nodes; i++) {
-        pcms->node_mem[i] = numa_info[i].node_mem;
+        pcms->node_mem[i] = ms->numa_state->nodes[i].node_mem;
     }

     pcms->machine_done.notify = pc_machine_done;
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 23789256f7..7701cb1dac 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -349,6 +349,7 @@ static hwaddr spapr_node0_size(MachineState *machine)
     int nb_numa_nodes = machine_num_numa_nodes(machine);
     if (nb_numa_nodes) {
         int i;
+        NodeInfo *numa_info = machine->numa_state->nodes;
         for (i = 0; i < nb_numa_nodes; ++i) {
             if (numa_info[i].node_mem) {
                 return MIN(pow2floor(numa_info[i].node_mem),
@@ -396,7 +397,9 @@ static int spapr_populate_memory(SpaprMachineState *spapr, void *fdt)
     int nb_numa_nodes = machine_num_numa_nodes(machine);
     hwaddr mem_start, node_size;
     int i, nb_nodes = nb_numa_nodes;
-    NodeInfo *nodes = numa_info;
+    NodeInfo *nodes = machine->numa_state ?
+                      machine->numa_state->nodes :
+                      NULL;
     NodeInfo ramnode;

     /* No NUMA nodes, assume there is just one node with whole RAM */
@@ -2476,6 +2479,9 @@ static void spapr_validate_node_memory(MachineState *machine, Error **errp)
 {
     int i;
     int nb_numa_nodes = machine_num_numa_nodes(machine);
+    NodeInfo *numa_info = machine->numa_state ?
+                          machine->numa_state->nodes :
+                          NULL;

     if (machine->ram_size % SPAPR_MEMORY_BLOCK_SIZE) {
         error_setg(errp, "Memory size 0x" RAM_ADDR_FMT
diff --git a/hw/ppc/spapr_pci.c b/hw/ppc/spapr_pci.c
index f62e6833b8..d4378558d3 100644
--- a/hw/ppc/spapr_pci.c
+++ b/hw/ppc/spapr_pci.c
@@ -1672,6 +1672,8 @@ static void spapr_phb_realize(DeviceState *dev, Error **errp)
     SysBusDevice *s = SYS_BUS_DEVICE(dev);
     SpaprPhbState *sphb = SPAPR_PCI_HOST_BRIDGE(s);
     PCIHostState *phb = PCI_HOST_BRIDGE(s);
+    MachineState *ms = MACHINE(spapr);
+    NodeInfo *numa_info = ms->numa_state ? ms->numa_state->nodes : NULL;
     char *namebuf;
     int i;
     PCIBus *bus;
diff --git a/include/hw/boards.h b/include/hw/boards.h
index c3ff2dbd4a..e2676009fe 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -234,12 +234,22 @@ typedef struct DeviceMemoryState {
     MemoryRegion mr;
 } DeviceMemoryState;

+struct NodeInfo {
+    uint64_t node_mem;
+    struct HostMemoryBackend *node_memdev;
+    bool present;
+    uint8_t distance[MAX_NODES];
+};
+
 typedef struct NumaState {
     /* Number of NUMA nodes */
     int num_nodes;

     /* Allow setting NUMA distance for different NUMA nodes */
     bool have_numa_distance;
+
+    /* NUMA nodes information */
+    NodeInfo nodes[MAX_NODES];
 } NumaState;

 /**
diff --git a/include/sysemu/numa.h b/include/sysemu/numa.h
index 1a29408db9..7b8011f9ea 100644
--- a/include/sysemu/numa.h
+++ b/include/sysemu/numa.h
@@ -6,19 +6,11 @@
 #include "sysemu/hostmem.h"
 #include "hw/boards.h"

-struct NodeInfo {
-    uint64_t node_mem;
-    struct HostMemoryBackend *node_memdev;
-    bool present;
-    uint8_t distance[MAX_NODES];
-};
-
 struct NumaNodeMem {
     uint64_t node_mem;
     uint64_t node_plugged_mem;
 };

-extern NodeInfo numa_info[MAX_NODES];
 void parse_numa_opts(MachineState *ms);
 void numa_complete_configuration(MachineState *ms);
 void query_numa_node_mem(NumaNodeMem node_mem[], MachineState *ms);
diff --git a/numa.c b/numa.c
index d4f5ff5193..ddea376d72 100644
--- a/numa.c
+++ b/numa.c
@@ -52,8 +52,6 @@ static int have_memdevs = -1;
 static int max_numa_nodeid; /* Highest specified NUMA node ID, plus one.
                              * For all nodes, nodeid < max_numa_nodeid
                              */
-NodeInfo numa_info[MAX_NODES];
-

 static void parse_numa_node(MachineState *ms, NumaNodeOptions *node,
                             Error **errp)
@@ -62,6 +60,7 @@ static void parse_numa_node(MachineState *ms, NumaNodeOptions *node,
     uint16_t nodenr;
     uint16List *cpus = NULL;
     MachineClass *mc = MACHINE_GET_CLASS(ms);
+    NodeInfo *numa_info = ms->numa_state->nodes;

     if (node->has_nodeid) {
         nodenr = node->nodeid;
@@ -143,6 +142,7 @@ void parse_numa_distance(MachineState *ms, NumaDistOptions *dist, Error **errp)
     uint16_t src = dist->src;
     uint16_t dst = dist->dst;
     uint8_t val = dist->val;
+    NodeInfo *numa_info = ms->numa_state->nodes;

     if (src >= MAX_NODES || dst >= MAX_NODES) {
         error_setg(errp, "Parameter '%s' expects an integer between 0 and %d",
@@ -201,7 +201,7 @@ void set_numa_options(MachineState *ms, NumaOptions *object, Error **errp)
             error_setg(&err, "Missing mandatory node-id property");
             goto end;
         }
-        if (!numa_info[object->u.cpu.node_id].present) {
+        if (!ms->numa_state->nodes[object->u.cpu.node_id].present) {
             error_setg(&err, "Invalid node-id=%" PRId64 ", NUMA node must be "
                 "defined with -numa node,nodeid=ID before it's used with "
                 "-numa cpu,node-id=ID", object->u.cpu.node_id);
@@ -261,6 +261,7 @@ static void validate_numa_distance(MachineState *ms)
     int src, dst;
     bool is_asymmetrical = false;
     int nb_numa_nodes = machine_num_numa_nodes(ms);
+    NodeInfo *numa_info = ms->numa_state->nodes;

     for (src = 0; src < nb_numa_nodes; src++) {
         for (dst = src; dst < nb_numa_nodes; dst++) {
@@ -302,6 +303,7 @@ static void complete_init_numa_distance(MachineState *ms)
 {
     int src, dst;
     int nb_numa_nodes = machine_num_numa_nodes(ms);
+    NodeInfo *numa_info = ms->numa_state->nodes;

     /* Fixup NUMA distance by symmetric policy because if it is an
      * asymmetric distance table, it should be a complete table and
@@ -361,6 +363,7 @@ void numa_complete_configuration(MachineState *ms)
 {
     int i;
     MachineClass *mc = MACHINE_GET_CLASS(ms);
+    NodeInfo *numa_info = ms->numa_state->nodes;

     /*
      * If memory hotplug is enabled (slots > 0) but without '-numa'
@@ -532,8 +535,8 @@ void memory_region_allocate_system_memory(MemoryRegion *mr, Object *owner,

     memory_region_init(mr, owner, name, ram_size);
     for (i = 0; i < nb_numa_nodes; i++) {
-        uint64_t size = numa_info[i].node_mem;
-        HostMemoryBackend *backend = numa_info[i].node_memdev;
+        uint64_t size = ms->numa_state->nodes[i].node_mem;
+        HostMemoryBackend *backend = ms->numa_state->nodes[i].node_memdev;
         if (!backend) {
             continue;
         }
@@ -599,7 +602,7 @@ void query_numa_node_mem(NumaNodeMem node_mem[], MachineState *ms)

     numa_stat_memory_devices(node_mem);
     for (i = 0; i < ms->numa_state->num_nodes; i++) {
-        node_mem[i].node_mem += numa_info[i].node_mem;
+        node_mem[i].node_mem += ms->numa_state->nodes[i].node_mem;
     }
 }