From patchwork Thu Dec 22 20:13:49 2011
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Marcelo Tosatti
X-Patchwork-Id: 132898
From: Marcelo Tosatti
To: Anthony Liguori
Date: Thu, 22 Dec 2011 18:13:49 -0200
Message-Id: <991dfefdee8f4d1405f4b3cd799e7579f54b6c9f.1324584830.git.mtosatti@redhat.com>
Cc: Vasilis Liaskovitis, Marcelo Tosatti, qemu-devel@nongnu.org, kvm@vger.kernel.org
Subject: [Qemu-devel] [PATCH 4/5] Set numa topology for max_cpus

From: Vasilis Liaskovitis

qemu-kvm passes NUMA/SRAT topology information for smp_cpus to SeaBIOS.
However, SeaBIOS always expects to set up max_cpus SRAT CPU entries (the
MaxCountCPUs variable in the build_srat function of SeaBIOS). When
qemu-kvm runs with smp_cpus != max_cpus (e.g. -smp 2,maxcpus=4), SeaBIOS
will mistakenly use memory SRAT info for setting up CPU SRAT entries for
the offline CPUs. Wrong SRAT memory entries are also created. This breaks
NUMA in a guest. Fix by setting up SRAT info for max_cpus in qemu-kvm.

Signed-off-by: Vasilis Liaskovitis
Signed-off-by: Marcelo Tosatti
---
 hw/pc.c |    8 ++++----
 vl.c    |    2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/hw/pc.c b/hw/pc.c
index 3a71992..f51afa8 100644
--- a/hw/pc.c
+++ b/hw/pc.c
@@ -624,9 +624,9 @@ static void *bochs_bios_init(void)
      * of nodes, one word for each VCPU->node and one word for each node to
      * hold the amount of memory.
      */
-    numa_fw_cfg = g_malloc0((1 + smp_cpus + nb_numa_nodes) * 8);
+    numa_fw_cfg = g_malloc0((1 + max_cpus + nb_numa_nodes) * 8);
     numa_fw_cfg[0] = cpu_to_le64(nb_numa_nodes);
-    for (i = 0; i < smp_cpus; i++) {
+    for (i = 0; i < max_cpus; i++) {
         for (j = 0; j < nb_numa_nodes; j++) {
             if (node_cpumask[j] & (1 << i)) {
                 numa_fw_cfg[i + 1] = cpu_to_le64(j);
@@ -635,10 +635,10 @@ static void *bochs_bios_init(void)
         }
     }
     for (i = 0; i < nb_numa_nodes; i++) {
-        numa_fw_cfg[smp_cpus + 1 + i] = cpu_to_le64(node_mem[i]);
+        numa_fw_cfg[max_cpus + 1 + i] = cpu_to_le64(node_mem[i]);
     }
     fw_cfg_add_bytes(fw_cfg, FW_CFG_NUMA, (uint8_t *)numa_fw_cfg,
-                     (1 + smp_cpus + nb_numa_nodes) * 8);
+                     (1 + max_cpus + nb_numa_nodes) * 8);
 
     return fw_cfg;
 }
diff --git a/vl.c b/vl.c
index c03abb6..d925424 100644
--- a/vl.c
+++ b/vl.c
@@ -3305,7 +3305,7 @@ int main(int argc, char **argv, char **envp)
          * real machines which also use this scheme.
          */
         if (i == nb_numa_nodes) {
-            for (i = 0; i < smp_cpus; i++) {
+            for (i = 0; i < max_cpus; i++) {
                 node_cpumask[i % nb_numa_nodes] |= 1 << i;
             }
         }
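
For reviewers who want to check the blob layout by hand, the following is a rough standalone sketch of the word layout this patch produces: one word for the node count, one word per CPU slot, then one word per node for its memory size. It is not QEMU or SeaBIOS code. The helper names (build_numa_blob, read_node_mem, assign_round_robin) are hypothetical, calloc stands in for g_malloc0, and the cpu_to_le64 conversion is left out on the assumption of a host-endian demo.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: round-robin CPU-to-node assignment, mirroring the
 * loop in vl.c. Uses a 64-bit shift so CPU indexes up to 63 are well
 * defined in this sketch. */
static void assign_round_robin(uint64_t *node_cpumask, unsigned nb_nodes,
                               unsigned ncpus)
{
    unsigned i;

    assert(nb_nodes > 0 && ncpus <= 64);
    for (i = 0; i < ncpus; i++) {
        node_cpumask[i % nb_nodes] |= UINT64_C(1) << i;
    }
}

/* Hypothetical helper: build the NUMA blob for ncpus CPU slots.
 * Layout: [0] = node count, [1 .. ncpus] = node of each CPU,
 * [1 + ncpus ..] = memory size of each node.
 * Assumption: host-endian words (the real code uses cpu_to_le64). */
static uint64_t *build_numa_blob(unsigned ncpus, unsigned nb_nodes,
                                 const uint64_t *node_cpumask,
                                 const uint64_t *node_mem,
                                 size_t *len_words)
{
    size_t n = 1 + (size_t)ncpus + nb_nodes;
    uint64_t *blob = calloc(n, sizeof(*blob));
    unsigned i, j;

    assert(ncpus <= 64);
    if (!blob) {
        return NULL;
    }
    blob[0] = nb_nodes;
    for (i = 0; i < ncpus; i++) {
        for (j = 0; j < nb_nodes; j++) {
            if (node_cpumask[j] & (UINT64_C(1) << i)) {
                blob[1 + i] = j;
                break;
            }
        }
    }
    for (i = 0; i < nb_nodes; i++) {
        blob[1 + ncpus + i] = node_mem[i];
    }
    *len_words = n;
    return blob;
}

/* Reader-side view: a consumer that assumes max_cpus CPU slots looks for
 * the memory size of a node at this offset. */
static uint64_t read_node_mem(const uint64_t *blob, unsigned max_cpus,
                              unsigned node)
{
    return blob[1 + max_cpus + node];
}
```

With -smp 2,maxcpus=4 and two nodes, a blob built for 4 CPU slots has 7 words and the reader finds the node memory sizes at words 5 and 6. A blob built for only 2 slots has 5 words, so the word a max_cpus-based reader treats as the node of CPU 2 actually holds the memory size of node 0, which is the mismatch the commit message describes.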