From patchwork Wed Apr 26 10:07:01 2017
X-Patchwork-Submitter: Laurent Vivier
X-Patchwork-Id: 755365
From: Laurent Vivier
To: Eduardo Habkost
Cc: Laurent Vivier, Thomas Huth, qemu-devel@nongnu.org, David Gibson
Date: Wed, 26 Apr 2017 12:07:01 +0200
Message-Id: <20170426100701.21893-1-lvivier@redhat.com>
Subject: [Qemu-devel] [PATCH] numa: equally distribute memory on nodes

When there are more nodes than the available memory can fill with the
minimum allowed memory per node, all of the memory is put on the last
node.

This is because we put

    (ram_size / nb_numa_nodes) &
        ~((1 << mc->numa_mem_align_shift) - 1);

on each node, and in this case the value is 0. This is particularly
true with pseries, as the memory must be aligned to 256MB.

To avoid this problem, this patch uses an error diffusion algorithm [1]
to distribute the memory equally across the nodes.
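
The error diffusion step described above can be sketched outside of QEMU
as follows. This is only an illustrative, standalone version of the
idea: distribute_node_mem() and its parameters are hypothetical names
chosen for this sketch, not part of numa.c, and the mask is built with a
64-bit constant here purely to keep the sketch well defined for any
shift below 64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not a QEMU function: split ram_size across
 * nb_nodes entries of node_mem[], each aligned down to
 * (1 << align_shift) bytes, carrying the rounding remainder forward
 * to the next node (error diffusion).
 */
void distribute_node_mem(uint64_t ram_size, size_t nb_nodes,
                         unsigned align_shift, uint64_t *node_mem)
{
    uint64_t mask, granularity, propagate = 0, usedmem = 0;
    size_t i;

    assert(nb_nodes > 0 && align_shift < 64);

    mask = ~((UINT64_C(1) << align_shift) - 1);
    granularity = ram_size / nb_nodes;

    for (i = 0; i < nb_nodes - 1; i++) {
        /* Ideal share plus the error left over from earlier nodes. */
        uint64_t want = granularity + propagate;

        node_mem[i] = want & mask;        /* align down */
        propagate = want - node_mem[i];   /* remainder carried forward */
        usedmem += node_mem[i];
    }

    /* The last node takes whatever has not been handed out yet. */
    node_mem[i] = ram_size - usedmem;
}
```

Called with ram_size = 1G, six nodes and align_shift = 28 (256MB), this
sketch gives 0, 256, 0, 256, 256 and 256 MB, which is the distribution
shown in the "After" output of the example.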

Example:

qemu-system-ppc64 -S -nographic -nodefaults -monitor stdio -m 1G -smp 8 \
                  -numa node -numa node -numa node \
                  -numa node -numa node -numa node

Before:

(qemu) info numa
6 nodes
node 0 cpus: 0 6
node 0 size: 0 MB
node 1 cpus: 1 7
node 1 size: 0 MB
node 2 cpus: 2
node 2 size: 0 MB
node 3 cpus: 3
node 3 size: 0 MB
node 4 cpus: 4
node 4 size: 0 MB
node 5 cpus: 5
node 5 size: 1024 MB

After:

(qemu) info numa
6 nodes
node 0 cpus: 0 6
node 0 size: 0 MB
node 1 cpus: 1 7
node 1 size: 256 MB
node 2 cpus: 2
node 2 size: 0 MB
node 3 cpus: 3
node 3 size: 256 MB
node 4 cpus: 4
node 4 size: 256 MB
node 5 cpus: 5
node 5 size: 256 MB

[1] https://en.wikipedia.org/wiki/Error_diffusion

Signed-off-by: Laurent Vivier
---
 numa.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/numa.c b/numa.c
index 6fc2393..bcf1c54 100644
--- a/numa.c
+++ b/numa.c
@@ -336,15 +336,19 @@ void parse_numa_opts(MachineClass *mc)
             }
         }
         if (i == nb_numa_nodes) {
-            uint64_t usedmem = 0;
+            uint64_t usedmem = 0, node_mem;
+            uint64_t granularity = ram_size / nb_numa_nodes;
+            uint64_t propagate = 0;
 
             /* Align each node according to the alignment
              * requirements of the machine class
              */
             for (i = 0; i < nb_numa_nodes - 1; i++) {
-                numa_info[i].node_mem = (ram_size / nb_numa_nodes) &
+                node_mem = (granularity + propagate) &
                                         ~((1 << mc->numa_mem_align_shift) - 1);
-                usedmem += numa_info[i].node_mem;
+                propagate = granularity + propagate - node_mem;
+                numa_info[i].node_mem = node_mem;
+                usedmem += node_mem;
             }
             numa_info[i].node_mem = ram_size - usedmem;
         }