[v1] vl/s390: fixup ram sizes for compat machines

Message ID	20200331153554.69341-1-borntraeger@de.ibm.com
State	New
Headers	show Return-Path: <qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org> From: Christian Borntraeger <borntraeger@de.ibm.com> To: qemu-devel <qemu-devel@nongnu.org>, Cornelia Huck <cohuck@redhat.com> Subject: [PATCH v1] vl/s390: fixup ram sizes for compat machines Date: Tue, 31 Mar 2020 11:35:54 -0400 Message-Id: <20200331153554.69341-1-borntraeger@de.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Precedence: list Cc: =?utf-8?b?THVrw6HFoSBEb2t0b3I=?= <ldoktor@redhat.com>, Thomas Huth <thuth@redhat.com>, Janosch Frank <frankja@linux.ibm.com>, David Hildenbrand <david@redhat.com>, "Dr . David Alan Gilbert" <dgilbert@redhat.com>, Halil Pasic <pasic@linux.ibm.com>, Christian Borntraeger <borntraeger@de.ibm.com>, qemu-s390x <qemu-s390x@nongnu.org>, Igor Mammedov <imammedo@redhat.com>, Richard Henderson <rth@twiddle.net> Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org Sender: "Qemu-devel" <qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org>
Series	[v1] vl/s390: fixup ram sizes for compat machines \| expand [v1] vl/s390: fixup ram sizes for compat machines

Message ID

20200331153554.69341-1-borntraeger@de.ibm.com

State

New

Headers

From: Christian Borntraeger <borntraeger@de.ibm.com>
To: qemu-devel <qemu-devel@nongnu.org>, Cornelia Huck <cohuck@redhat.com>
Subject: [PATCH v1] vl/s390: fixup ram sizes for compat machines
Date: Tue, 31 Mar 2020 11:35:54 -0400
Message-Id: <20200331153554.69341-1-borntraeger@de.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Precedence: list
Cc: =?utf-8?b?THVrw6HFoSBEb2t0b3I=?= <ldoktor@redhat.com>,
	Thomas Huth <thuth@redhat.com>, Janosch Frank <frankja@linux.ibm.com>, 
	David Hildenbrand <david@redhat.com>, 
	"Dr . David Alan Gilbert" <dgilbert@redhat.com>, Halil Pasic
	<pasic@linux.ibm.com>,  Christian Borntraeger <borntraeger@de.ibm.com>,
	qemu-s390x <qemu-s390x@nongnu.org>, Igor Mammedov <imammedo@redhat.com>, 
	Richard Henderson <rth@twiddle.net>
Errors-To: qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org
Sender: "Qemu-devel"
	<qemu-devel-bounces+incoming=patchwork.ozlabs.org@nongnu.org>

Series

[v1] vl/s390: fixup ram sizes for compat machines | expand

Commit Message

Christian Borntraeger March 31, 2020, 3:35 p.m. UTC

compat machines did fixup the ram size to match what can be reported via
sclp. We need to mimic those for machines 4.2 and older to not fail on
inbound migration. For Machines >= 5.0 we can simply use an increment
size of 1M und use the full range of increment number which allows for
all possible memory sizes. The old limitation of having a maximum of
1020 increments was added for standby memory, which we no longer
support. With that we can now support even weird memory sizes like
10001234 MB.

Fixes: 3a12fc61af5c ("390x/s390-virtio-ccw: use memdev for RAM")
Reported-by: Lukáš Doktor <ldoktor@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
---
RFC->v1:
- also fix mamram
- provide full granularity for machine 5.0

 hw/s390x/s390-skeys.c              |  2 +-
 hw/s390x/s390-stattrib-kvm.c       |  4 ++--
 hw/s390x/s390-virtio-ccw.c         | 20 ++++++++++++++++++++
 hw/s390x/sclp.c                    | 19 ++++++-------------
 include/hw/boards.h                |  1 +
 include/hw/s390x/s390-virtio-ccw.h |  4 +++-
 softmmu/vl.c                       |  3 +++
 7 files changed, 36 insertions(+), 17 deletions(-)

Comments

Cornelia Huck March 31, 2020, 4:30 p.m. UTC | #1

On Tue, 31 Mar 2020 11:35:54 -0400
Christian Borntraeger <borntraeger@de.ibm.com> wrote:

> compat machines did fixup the ram size to match what can be reported via

s/compat machines/Older QEMU versions/ ?

> sclp. We need to mimic those for machines 4.2 and older to not fail on

"We need to mimic this behavior for machine types 4.2 and older" ?

> inbound migration. For Machines >= 5.0 we can simply use an increment

s/Machines/machine types/

> size of 1M und use the full range of increment number which allows for
> all possible memory sizes. The old limitation of having a maximum of
> 1020 increments was added for standby memory, which we no longer
> support. With that we can now support even weird memory sizes like
> 10001234 MB.

FWIW, I think this approach is better than a hard fail of odd memory
sizes, even if we think now that the automatic rounding was not such a
good idea.

> 
> Fixes: 3a12fc61af5c ("390x/s390-virtio-ccw: use memdev for RAM")
> Reported-by: Lukáš Doktor <ldoktor@redhat.com>
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
> ---
> RFC->v1:
> - also fix mamram
> - provide full granularity for machine 5.0
> 
>  hw/s390x/s390-skeys.c              |  2 +-
>  hw/s390x/s390-stattrib-kvm.c       |  4 ++--
>  hw/s390x/s390-virtio-ccw.c         | 20 ++++++++++++++++++++
>  hw/s390x/sclp.c                    | 19 ++++++-------------
>  include/hw/boards.h                |  1 +
>  include/hw/s390x/s390-virtio-ccw.h |  4 +++-
>  softmmu/vl.c                       |  3 +++
>  7 files changed, 36 insertions(+), 17 deletions(-)
> 

(...)

> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
> index 3cf19c99f3..bdfd10f77d 100644
> --- a/hw/s390x/s390-virtio-ccw.c
> +++ b/hw/s390x/s390-virtio-ccw.c
> @@ -579,6 +579,16 @@ static void s390_nmi(NMIState *n, int cpu_index, Error **errp)
>      s390_cpu_restart(S390_CPU(cs));
>  }
>  
> +static ram_addr_t s390_align_ram(ram_addr_t sz)
> +{
> +    /* same logic as in sclp.c */
> +    int increment_size = 20;
> +    while ((sz >> increment_size) > 1020) {

<nitpick>Use MAX_STORAGE_INCREMENTS?</nitpick>

> +        increment_size++;
> +    }
> +    return sz >> increment_size << increment_size;
> +}
> +
>  static void ccw_machine_class_init(ObjectClass *oc, void *data)
>  {
>      MachineClass *mc = MACHINE_CLASS(oc);
> @@ -590,6 +600,7 @@ static void ccw_machine_class_init(ObjectClass *oc, void *data)
>      s390mc->cpu_model_allowed = true;
>      s390mc->css_migration_enabled = true;
>      s390mc->hpage_1m_allowed = true;
> +    s390mc->mem_inc_1020 = false;
>      mc->init = ccw_init;
>      mc->reset = s390_machine_reset;
>      mc->hot_add_cpu = s390_hot_add_cpu;
> @@ -686,6 +697,11 @@ bool hpage_1m_allowed(void)
>      return get_machine_class()->hpage_1m_allowed;
>  }
>  
> +bool mem_inc_1020(void)
> +{
> +    return get_machine_class()->mem_inc_1020;

Not sure I like that name; but cannot think of a better name, either :(

> +}
> +
>  static char *machine_get_loadparm(Object *obj, Error **errp)
>  {
>      S390CcwMachineState *ms = S390_CCW_MACHINE(obj);
> @@ -807,7 +823,11 @@ static void ccw_machine_4_2_instance_options(MachineState *machine)
>  
>  static void ccw_machine_4_2_class_options(MachineClass *mc)
>  {
> +    S390CcwMachineClass *s390mc = S390_MACHINE_CLASS(mc);
> +
>      ccw_machine_5_0_class_options(mc);
> +    mc->machine_align_ram = s390_align_ram;
> +    s390mc->mem_inc_1020 = true;
>      compat_props_add(mc->compat_props, hw_compat_4_2, hw_compat_4_2_len);
>  }
>  DEFINE_CCW_MACHINE(4_2, "4.2", false);
> diff --git a/hw/s390x/sclp.c b/hw/s390x/sclp.c
> index d8ae207731..d1fff18443 100644
> --- a/hw/s390x/sclp.c
> +++ b/hw/s390x/sclp.c
> @@ -21,6 +21,7 @@
>  #include "hw/s390x/sclp.h"
>  #include "hw/s390x/event-facility.h"
>  #include "hw/s390x/s390-pci-bus.h"
> +#include "hw/s390x/s390-virtio-ccw.h"
>  #include "hw/s390x/ipl.h"
>  
>  static inline SCLPDevice *get_sclp_device(void)
> @@ -346,7 +347,7 @@ static void sclp_realize(DeviceState *dev, Error **errp)
>       */
>      qdev_set_parent_bus(DEVICE(sclp->event_facility), sysbus_get_default());
>  
> -    ret = s390_set_memory_limit(machine->maxram_size, &hw_limit);
> +    ret = s390_set_memory_limit(machine->ram_size, &hw_limit);
>      if (ret == -E2BIG) {
>          error_setg(&err, "host supports a maximum of %" PRIu64 " GB",
>                     hw_limit / GiB);
> @@ -365,23 +366,15 @@ static void sclp_memory_init(SCLPDevice *sclp)
>      int increment_size = 20;
>  
>      /* The storage increment size is a multiple of 1M and is a power of 2.
> -     * The number of storage increments must be MAX_STORAGE_INCREMENTS or fewer.
> +     * The number of storage increments must be MAX_STORAGE_INCREMENTS or fewer
> +     * for some machine types.

I think

"For some machine types, the number of..."

would read a bit better.

>       * The variable 'increment_size' is an exponent of 2 that can be
>       * used to calculate the size (in bytes) of an increment. */
> -    while ((initial_mem >> increment_size) > MAX_STORAGE_INCREMENTS) {
> +    while (mem_inc_1020() &&
> +           (initial_mem >> increment_size) > MAX_STORAGE_INCREMENTS) {
>          increment_size++;
>      }
>      sclp->increment_size = increment_size;
> -
> -    /* The core memory area needs to be aligned with the increment size.
> -     * In effect, this can cause the user-specified memory size to be rounded
> -     * down to align with the nearest increment boundary. */
> -    initial_mem = initial_mem >> increment_size << increment_size;
> -
> -    machine->ram_size = initial_mem;
> -    machine->maxram_size = initial_mem;
> -    /* let's propagate the changed ram size into the global variable. */
> -    ram_size = initial_mem;
>  }
>  
>  static void sclp_init(Object *obj)
> diff --git a/include/hw/boards.h b/include/hw/boards.h
> index 236d239c19..e3574f4b5f 100644
> --- a/include/hw/boards.h
> +++ b/include/hw/boards.h
> @@ -218,6 +218,7 @@ struct MachineClass {
>                                                           unsigned cpu_index);
>      const CPUArchIdList *(*possible_cpu_arch_ids)(MachineState *machine);
>      int64_t (*get_default_cpu_node_id)(const MachineState *ms, int idx);
> +    ram_addr_t (*machine_align_ram)(ram_addr_t size);

/* Intended for compatibility handling. */ ?

>  };
>  
>  /**

(...)

Apart from the nits, looks sane to me.

Igor Mammedov March 31, 2020, 9:49 p.m. UTC | #2

On Tue, 31 Mar 2020 11:35:54 -0400
Christian Borntraeger <borntraeger@de.ibm.com> wrote:

> compat machines did fixup the ram size to match what can be reported via
> sclp. We need to mimic those for machines 4.2 and older to not fail on
> inbound migration.
not entirely correct, migration wouldn't fail if target was started
with correct ram size, so OLD machine with incorrect size can still
be migrated to new QEMU provided target QEMU CLI has corrected ram size.

this probably should be captured in commit message so it
would be clear that we are adding hack to keep rounding bug so users
won't have to care about correcting it on their side.

Probably also add here the table, David composed, about how much
RAM user would loose (but still pay for :/) in case of fixup kicks in.

PS:
Shall we add deprecation message along with this patch, so that
eventually we could remove fixup altogether, like we do with other
CLI breaking changes?
Something along the lines:
"
S390:
If user using -m explicitly specified ram size not aligned according
to the table below, 4.2 and older machine types will actually silently
round it down and VM will get less RAM than it was asked for on alignment
value.

Alignment table:
 VM size (<=) | Alignment
--------------------------
      1020M   |     1M
      2040M   |     2M
      4080M   |     4M
      8160M   |     8M
     16320M   |    16M
     32640M   |    32M
     65280M   |    64M
    130560M   |   128M
    261120M   |   256M
    522240M   |   512M
   1044480M   |     1G
   2088960M   |     2G
   4177920M   |     4G
   8355840M   |     8G

Suggested action is to replace unaligned -m value with a suitable aligned one,
future versions will add strict check for valid initial RAM sizes
so VM started on old QEMU with unaligned size won't be able to migrate
strait away since new QEMU won't start with incorrect size. However there is
still possibility to migrate old running VM if migration target is started with
corrected RAM size according to above table. 
"


> For Machines >= 5.0 we can simply use an increment
> size of 1M und use the full range of increment number which allows for
             ^^^ and 
> all possible memory sizes. The old limitation of having a maximum of
> 1020 increments was added for standby memory, which we no longer
> support. With that we can now support even weird memory sizes like
> 10001234 MB.
> 
> Fixes: 3a12fc61af5c ("390x/s390-virtio-ccw: use memdev for RAM")
> Reported-by: Lukáš Doktor <ldoktor@redhat.com>
> Cc: Igor Mammedov <imammedo@redhat.com>
> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
> ---
> RFC->v1:
> - also fix mamram
> - provide full granularity for machine 5.0
> 
>  hw/s390x/s390-skeys.c              |  2 +-
>  hw/s390x/s390-stattrib-kvm.c       |  4 ++--
>  hw/s390x/s390-virtio-ccw.c         | 20 ++++++++++++++++++++
>  hw/s390x/sclp.c                    | 19 ++++++-------------
>  include/hw/boards.h                |  1 +
>  include/hw/s390x/s390-virtio-ccw.h |  4 +++-
>  softmmu/vl.c                       |  3 +++
>  7 files changed, 36 insertions(+), 17 deletions(-)
> 
> diff --git a/hw/s390x/s390-skeys.c b/hw/s390x/s390-skeys.c
> index 5da6e5292f..a9a4ae7b39 100644
> --- a/hw/s390x/s390-skeys.c
> +++ b/hw/s390x/s390-skeys.c
> @@ -176,7 +176,7 @@ static void qemu_s390_skeys_init(Object *obj)
>      QEMUS390SKeysState *skeys = QEMU_S390_SKEYS(obj);
>      MachineState *machine = MACHINE(qdev_get_machine());
>  
> -    skeys->key_count = machine->maxram_size / TARGET_PAGE_SIZE;
> +    skeys->key_count = machine->ram_size / TARGET_PAGE_SIZE;
>      skeys->keydata = g_malloc0(skeys->key_count);
>  }
>  
> diff --git a/hw/s390x/s390-stattrib-kvm.c b/hw/s390x/s390-stattrib-kvm.c
> index c7e1f35524..f89d8d9d16 100644
> --- a/hw/s390x/s390-stattrib-kvm.c
> +++ b/hw/s390x/s390-stattrib-kvm.c
> @@ -85,7 +85,7 @@ static int kvm_s390_stattrib_set_stattr(S390StAttribState *sa,
>  {
>      KVMS390StAttribState *sas = KVM_S390_STATTRIB(sa);
>      MachineState *machine = MACHINE(qdev_get_machine());
> -    unsigned long max = machine->maxram_size / TARGET_PAGE_SIZE;
> +    unsigned long max = machine->ram_size / TARGET_PAGE_SIZE;
>  
>      if (start_gfn + count > max) {
>          error_report("Out of memory bounds when setting storage attributes");
> @@ -104,7 +104,7 @@ static void kvm_s390_stattrib_synchronize(S390StAttribState *sa)
>  {
>      KVMS390StAttribState *sas = KVM_S390_STATTRIB(sa);
>      MachineState *machine = MACHINE(qdev_get_machine());
> -    unsigned long max = machine->maxram_size / TARGET_PAGE_SIZE;
> +    unsigned long max = machine->ram_size / TARGET_PAGE_SIZE;
>      /* We do not need to reach the maximum buffer size allowed */
>      unsigned long cx, len = KVM_S390_SKEYS_MAX / 2;
>      int r;
> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
> index 3cf19c99f3..bdfd10f77d 100644
> --- a/hw/s390x/s390-virtio-ccw.c
> +++ b/hw/s390x/s390-virtio-ccw.c
> @@ -579,6 +579,16 @@ static void s390_nmi(NMIState *n, int cpu_index, Error **errp)
>      s390_cpu_restart(S390_CPU(cs));
>  }
>  
> +static ram_addr_t s390_align_ram(ram_addr_t sz)
> +{
> +    /* same logic as in sclp.c */
> +    int increment_size = 20;
> +    while ((sz >> increment_size) > 1020) {
> +        increment_size++;
> +    }

Print a warning here that asked for ram size is incorrect (if that's the case)?
and will be fixed up to (and maybe suggest user to fix config to valid size we calculate here and how much size were lost due to alignment)

> +    return sz >> increment_size << increment_size;
> +}
> +
>  static void ccw_machine_class_init(ObjectClass *oc, void *data)
>  {
>      MachineClass *mc = MACHINE_CLASS(oc);
> @@ -590,6 +600,7 @@ static void ccw_machine_class_init(ObjectClass *oc, void *data)
>      s390mc->cpu_model_allowed = true;
>      s390mc->css_migration_enabled = true;
>      s390mc->hpage_1m_allowed = true;
> +    s390mc->mem_inc_1020 = false;
>      mc->init = ccw_init;
>      mc->reset = s390_machine_reset;
>      mc->hot_add_cpu = s390_hot_add_cpu;
> @@ -686,6 +697,11 @@ bool hpage_1m_allowed(void)
>      return get_machine_class()->hpage_1m_allowed;
>  }
>  
> +bool mem_inc_1020(void)
> +{
> +    return get_machine_class()->mem_inc_1020;
> +}
> +
>  static char *machine_get_loadparm(Object *obj, Error **errp)
>  {
>      S390CcwMachineState *ms = S390_CCW_MACHINE(obj);
> @@ -807,7 +823,11 @@ static void ccw_machine_4_2_instance_options(MachineState *machine)
>  
>  static void ccw_machine_4_2_class_options(MachineClass *mc)
>  {
> +    S390CcwMachineClass *s390mc = S390_MACHINE_CLASS(mc);
> +
>      ccw_machine_5_0_class_options(mc);
> +    mc->machine_align_ram = s390_align_ram;
> +    s390mc->mem_inc_1020 = true;
>      compat_props_add(mc->compat_props, hw_compat_4_2, hw_compat_4_2_len);
>  }
>  DEFINE_CCW_MACHINE(4_2, "4.2", false);
> diff --git a/hw/s390x/sclp.c b/hw/s390x/sclp.c
> index d8ae207731..d1fff18443 100644
> --- a/hw/s390x/sclp.c
> +++ b/hw/s390x/sclp.c
> @@ -21,6 +21,7 @@
>  #include "hw/s390x/sclp.h"
>  #include "hw/s390x/event-facility.h"
>  #include "hw/s390x/s390-pci-bus.h"
> +#include "hw/s390x/s390-virtio-ccw.h"
>  #include "hw/s390x/ipl.h"
>  
>  static inline SCLPDevice *get_sclp_device(void)
> @@ -346,7 +347,7 @@ static void sclp_realize(DeviceState *dev, Error **errp)
>       */
>      qdev_set_parent_bus(DEVICE(sclp->event_facility), sysbus_get_default());
>  
> -    ret = s390_set_memory_limit(machine->maxram_size, &hw_limit);
> +    ret = s390_set_memory_limit(machine->ram_size, &hw_limit);
>      if (ret == -E2BIG) {
>          error_setg(&err, "host supports a maximum of %" PRIu64 " GB",
>                     hw_limit / GiB);
> @@ -365,23 +366,15 @@ static void sclp_memory_init(SCLPDevice *sclp)
>      int increment_size = 20;
>  
>      /* The storage increment size is a multiple of 1M and is a power of 2.
> -     * The number of storage increments must be MAX_STORAGE_INCREMENTS or fewer.
> +     * The number of storage increments must be MAX_STORAGE_INCREMENTS or fewer
> +     * for some machine types.
>       * The variable 'increment_size' is an exponent of 2 that can be
>       * used to calculate the size (in bytes) of an increment. */
> -    while ((initial_mem >> increment_size) > MAX_STORAGE_INCREMENTS) {

if s/machine_align_ram/fixup_ram_size/
then mem_inc_1020()&co could be replaced with

machine_class->fixup_ram_size != NULL

> +    while ( mem_inc_1020() &&
> +           (initial_mem >> increment_size) > MAX_STORAGE_INCREMENTS) {
>          increment_size++;
>      }
>      sclp->increment_size = increment_size;
> -
> -    /* The core memory area needs to be aligned with the increment size.
> -     * In effect, this can cause the user-specified memory size to be rounded
> -     * down to align with the nearest increment boundary. */
> -    initial_mem = initial_mem >> increment_size << increment_size;
> -
> -    machine->ram_size = initial_mem;
> -    machine->maxram_size = initial_mem;
> -    /* let's propagate the changed ram size into the global variable. */
> -    ram_size = initial_mem;
>  }
>  
>  static void sclp_init(Object *obj)
> diff --git a/include/hw/boards.h b/include/hw/boards.h
> index 236d239c19..e3574f4b5f 100644
> --- a/include/hw/boards.h
> +++ b/include/hw/boards.h
> @@ -218,6 +218,7 @@ struct MachineClass {
>                                                           unsigned cpu_index);
>      const CPUArchIdList *(*possible_cpu_arch_ids)(MachineState *machine);
>      int64_t (*get_default_cpu_node_id)(const MachineState *ms, int idx);
> +    ram_addr_t (*machine_align_ram)(ram_addr_t size);
I'd s/machine_//, looks redundant
also I'd call it fixup_ram_size() + 
doc comment above saying something like:

"
 amends user provided ram size (with -m option) using machine
 specific algorithm. to be used by old machine types for compat
 purposes only.
 Applies only to default memory backend (i.e. explicit memory backend
 wasn't used.
"

>  };
>  
>  /**
> diff --git a/include/hw/s390x/s390-virtio-ccw.h b/include/hw/s390x/s390-virtio-ccw.h
> index cd1dccc6e3..022ee6685b 100644
> --- a/include/hw/s390x/s390-virtio-ccw.h
> +++ b/include/hw/s390x/s390-virtio-ccw.h
> @@ -41,6 +41,7 @@ typedef struct S390CcwMachineClass {
>      bool cpu_model_allowed;
>      bool css_migration_enabled;
>      bool hpage_1m_allowed;
> +    bool mem_inc_1020;
>  } S390CcwMachineClass;
>  
>  /* runtime-instrumentation allowed by the machine */
> @@ -49,7 +50,8 @@ bool ri_allowed(void);
>  bool cpu_model_allowed(void);
>  /* 1M huge page mappings allowed by the machine */
>  bool hpage_1m_allowed(void);
> -
> +/* Machine has only 1020 memory increments */
> +bool mem_inc_1020(void);
>  /**
>   * Returns true if (vmstate based) migration of the channel subsystem
>   * is enabled, false if it is disabled.
> diff --git a/softmmu/vl.c b/softmmu/vl.c
> index 1d33a28340..12b5758d12 100644
> --- a/softmmu/vl.c
> +++ b/softmmu/vl.c
> @@ -2601,6 +2601,9 @@ static bool set_memory_options(uint64_t *ram_slots, ram_addr_t *maxram_size,
>      }
>  
>      sz = QEMU_ALIGN_UP(sz, 8192);
> +    if (mc->machine_align_ram) {
> +        sz = mc->machine_align_ram(sz);
> +    }
>      ram_size = sz;
>      if (ram_size != sz) {
>          error_report("ram size too large");

Christian Borntraeger April 1, 2020, 6:04 a.m. UTC | #3

On 31.03.20 23:49, Igor Mammedov wrote:
> On Tue, 31 Mar 2020 11:35:54 -0400
> Christian Borntraeger <borntraeger@de.ibm.com> wrote:
> 
>> compat machines did fixup the ram size to match what can be reported via
>> sclp. We need to mimic those for machines 4.2 and older to not fail on
>> inbound migration.
> not entirely correct, migration wouldn't fail if target was started
> with correct ram size, so OLD machine with incorrect size can still
> be migrated to new QEMU provided target QEMU CLI has corrected ram size.
> 
> this probably should be captured in commit message so it
> would be clear that we are adding hack to keep rounding bug so users
> won't have to care about correcting it on their side.

ack
> 
> Probably also add here the table, David composed, about how much
> RAM user would loose (but still pay for :/) in case of fixup kicks in.
> 
> PS:
> Shall we add deprecation message along with this patch, so that
> eventually we could remove fixup altogether, like we do with other
> CLI breaking changes?
> Something along the lines:
> "
> S390:
> If user using -m explicitly specified ram size not aligned according
> to the table below, 4.2 and older machine types will actually silently
> round it down and VM will get less RAM than it was asked for on alignment
> value.
> 
> Alignment table:
>  VM size (<=) | Alignment
> --------------------------
>       1020M   |     1M
>       2040M   |     2M
>       4080M   |     4M
>       8160M   |     8M
>      16320M   |    16M
>      32640M   |    32M
>      65280M   |    64M
>     130560M   |   128M
>     261120M   |   256M
>     522240M   |   512M
>    1044480M   |     1G
>    2088960M   |     2G
>    4177920M   |     4G
>    8355840M   |     8G
> 
> Suggested action is to replace unaligned -m value with a suitable aligned one,
> future versions will add strict check for valid initial RAM sizes
> so VM started on old QEMU with unaligned size won't be able to migrate
> strait away since new QEMU won't start with incorrect size. However there is
> still possibility to migrate old running VM if migration target is started with
> corrected RAM size according to above table. 


yes, that makes sense. I would like to keep the hack at least > 2 years or
so as this will allow to migrate from one Ubuntu LTS to the next.


> "
> 
> 
>> For Machines >= 5.0 we can simply use an increment
>> size of 1M und use the full range of increment number which allows for
>              ^^^ and 
>> all possible memory sizes. The old limitation of having a maximum of
>> 1020 increments was added for standby memory, which we no longer
>> support. With that we can now support even weird memory sizes like
>> 10001234 MB.
>>
>> Fixes: 3a12fc61af5c ("390x/s390-virtio-ccw: use memdev for RAM")
>> Reported-by: Lukáš Doktor <ldoktor@redhat.com>
>> Cc: Igor Mammedov <imammedo@redhat.com>
>> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
>> ---
>> RFC->v1:
>> - also fix mamram
>> - provide full granularity for machine 5.0
>>
>>  hw/s390x/s390-skeys.c              |  2 +-
>>  hw/s390x/s390-stattrib-kvm.c       |  4 ++--
>>  hw/s390x/s390-virtio-ccw.c         | 20 ++++++++++++++++++++
>>  hw/s390x/sclp.c                    | 19 ++++++-------------
>>  include/hw/boards.h                |  1 +
>>  include/hw/s390x/s390-virtio-ccw.h |  4 +++-
>>  softmmu/vl.c                       |  3 +++
>>  7 files changed, 36 insertions(+), 17 deletions(-)
>>
>> diff --git a/hw/s390x/s390-skeys.c b/hw/s390x/s390-skeys.c
>> index 5da6e5292f..a9a4ae7b39 100644
>> --- a/hw/s390x/s390-skeys.c
>> +++ b/hw/s390x/s390-skeys.c
>> @@ -176,7 +176,7 @@ static void qemu_s390_skeys_init(Object *obj)
>>      QEMUS390SKeysState *skeys = QEMU_S390_SKEYS(obj);
>>      MachineState *machine = MACHINE(qdev_get_machine());
>>  
>> -    skeys->key_count = machine->maxram_size / TARGET_PAGE_SIZE;
>> +    skeys->key_count = machine->ram_size / TARGET_PAGE_SIZE;
>>      skeys->keydata = g_malloc0(skeys->key_count);
>>  }
>>  
>> diff --git a/hw/s390x/s390-stattrib-kvm.c b/hw/s390x/s390-stattrib-kvm.c
>> index c7e1f35524..f89d8d9d16 100644
>> --- a/hw/s390x/s390-stattrib-kvm.c
>> +++ b/hw/s390x/s390-stattrib-kvm.c
>> @@ -85,7 +85,7 @@ static int kvm_s390_stattrib_set_stattr(S390StAttribState *sa,
>>  {
>>      KVMS390StAttribState *sas = KVM_S390_STATTRIB(sa);
>>      MachineState *machine = MACHINE(qdev_get_machine());
>> -    unsigned long max = machine->maxram_size / TARGET_PAGE_SIZE;
>> +    unsigned long max = machine->ram_size / TARGET_PAGE_SIZE;
>>  
>>      if (start_gfn + count > max) {
>>          error_report("Out of memory bounds when setting storage attributes");
>> @@ -104,7 +104,7 @@ static void kvm_s390_stattrib_synchronize(S390StAttribState *sa)
>>  {
>>      KVMS390StAttribState *sas = KVM_S390_STATTRIB(sa);
>>      MachineState *machine = MACHINE(qdev_get_machine());
>> -    unsigned long max = machine->maxram_size / TARGET_PAGE_SIZE;
>> +    unsigned long max = machine->ram_size / TARGET_PAGE_SIZE;
>>      /* We do not need to reach the maximum buffer size allowed */
>>      unsigned long cx, len = KVM_S390_SKEYS_MAX / 2;
>>      int r;
>> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
>> index 3cf19c99f3..bdfd10f77d 100644
>> --- a/hw/s390x/s390-virtio-ccw.c
>> +++ b/hw/s390x/s390-virtio-ccw.c
>> @@ -579,6 +579,16 @@ static void s390_nmi(NMIState *n, int cpu_index, Error **errp)
>>      s390_cpu_restart(S390_CPU(cs));
>>  }
>>  
>> +static ram_addr_t s390_align_ram(ram_addr_t sz)
>> +{
>> +    /* same logic as in sclp.c */
>> +    int increment_size = 20;
>> +    while ((sz >> increment_size) > 1020) {
>> +        increment_size++;
>> +    }
> 
> Print a warning here that asked for ram size is incorrect (if that's the case)?
> and will be fixed up to (and maybe suggest user to fix config to valid size we calculate here and how much size were lost due to alignment)
>

Yes, checking for sz != sz >> incremenent_size << increment_size and the warn makes sense.

 
>> +    return sz >> increment_size << increment_size;
>> +}
>> +
>>  static void ccw_machine_class_init(ObjectClass *oc, void *data)
>>  {
>>      MachineClass *mc = MACHINE_CLASS(oc);
>> @@ -590,6 +600,7 @@ static void ccw_machine_class_init(ObjectClass *oc, void *data)
>>      s390mc->cpu_model_allowed = true;
>>      s390mc->css_migration_enabled = true;
>>      s390mc->hpage_1m_allowed = true;
>> +    s390mc->mem_inc_1020 = false;
>>      mc->init = ccw_init;
>>      mc->reset = s390_machine_reset;
>>      mc->hot_add_cpu = s390_hot_add_cpu;
>> @@ -686,6 +697,11 @@ bool hpage_1m_allowed(void)
>>      return get_machine_class()->hpage_1m_allowed;
>>  }
>>  
>> +bool mem_inc_1020(void)
>> +{
>> +    return get_machine_class()->mem_inc_1020;
>> +}
>> +
>>  static char *machine_get_loadparm(Object *obj, Error **errp)
>>  {
>>      S390CcwMachineState *ms = S390_CCW_MACHINE(obj);
>> @@ -807,7 +823,11 @@ static void ccw_machine_4_2_instance_options(MachineState *machine)
>>  
>>  static void ccw_machine_4_2_class_options(MachineClass *mc)
>>  {
>> +    S390CcwMachineClass *s390mc = S390_MACHINE_CLASS(mc);
>> +
>>      ccw_machine_5_0_class_options(mc);
>> +    mc->machine_align_ram = s390_align_ram;
>> +    s390mc->mem_inc_1020 = true;
>>      compat_props_add(mc->compat_props, hw_compat_4_2, hw_compat_4_2_len);
>>  }
>>  DEFINE_CCW_MACHINE(4_2, "4.2", false);
>> diff --git a/hw/s390x/sclp.c b/hw/s390x/sclp.c
>> index d8ae207731..d1fff18443 100644
>> --- a/hw/s390x/sclp.c
>> +++ b/hw/s390x/sclp.c
>> @@ -21,6 +21,7 @@
>>  #include "hw/s390x/sclp.h"
>>  #include "hw/s390x/event-facility.h"
>>  #include "hw/s390x/s390-pci-bus.h"
>> +#include "hw/s390x/s390-virtio-ccw.h"
>>  #include "hw/s390x/ipl.h"
>>  
>>  static inline SCLPDevice *get_sclp_device(void)
>> @@ -346,7 +347,7 @@ static void sclp_realize(DeviceState *dev, Error **errp)
>>       */
>>      qdev_set_parent_bus(DEVICE(sclp->event_facility), sysbus_get_default());
>>  
>> -    ret = s390_set_memory_limit(machine->maxram_size, &hw_limit);
>> +    ret = s390_set_memory_limit(machine->ram_size, &hw_limit);
>>      if (ret == -E2BIG) {
>>          error_setg(&err, "host supports a maximum of %" PRIu64 " GB",
>>                     hw_limit / GiB);
>> @@ -365,23 +366,15 @@ static void sclp_memory_init(SCLPDevice *sclp)
>>      int increment_size = 20;
>>  
>>      /* The storage increment size is a multiple of 1M and is a power of 2.
>> -     * The number of storage increments must be MAX_STORAGE_INCREMENTS or fewer.
>> +     * The number of storage increments must be MAX_STORAGE_INCREMENTS or fewer
>> +     * for some machine types.
>>       * The variable 'increment_size' is an exponent of 2 that can be
>>       * used to calculate the size (in bytes) of an increment. */
>> -    while ((initial_mem >> increment_size) > MAX_STORAGE_INCREMENTS) {
> 
> if s/machine_align_ram/fixup_ram_size/
> then mem_inc_1020()&co could be replaced with
> 
> machine_class->fixup_ram_size != NULL

good idea.
> 
>> +    while ( mem_inc_1020() &&
>> +           (initial_mem >> increment_size) > MAX_STORAGE_INCREMENTS) {
>>          increment_size++;
>>      }
>>      sclp->increment_size = increment_size;
>> -
>> -    /* The core memory area needs to be aligned with the increment size.
>> -     * In effect, this can cause the user-specified memory size to be rounded
>> -     * down to align with the nearest increment boundary. */
>> -    initial_mem = initial_mem >> increment_size << increment_size;
>> -
>> -    machine->ram_size = initial_mem;
>> -    machine->maxram_size = initial_mem;
>> -    /* let's propagate the changed ram size into the global variable. */
>> -    ram_size = initial_mem;
>>  }
>>  
>>  static void sclp_init(Object *obj)
>> diff --git a/include/hw/boards.h b/include/hw/boards.h
>> index 236d239c19..e3574f4b5f 100644
>> --- a/include/hw/boards.h
>> +++ b/include/hw/boards.h
>> @@ -218,6 +218,7 @@ struct MachineClass {
>>                                                           unsigned cpu_index);
>>      const CPUArchIdList *(*possible_cpu_arch_ids)(MachineState *machine);
>>      int64_t (*get_default_cpu_node_id)(const MachineState *ms, int idx);
>> +    ram_addr_t (*machine_align_ram)(ram_addr_t size);
> I'd s/machine_//, looks redundant
> also I'd call it fixup_ram_size() + 
> doc comment above saying something like:
> 
> "
>  amends user provided ram size (with -m option) using machine
>  specific algorithm. to be used by old machine types for compat
>  purposes only.
>  Applies only to default memory backend (i.e. explicit memory backend
>  wasn't used.
> "

ack

> 
>>  };
>>  
>>  /**
>> diff --git a/include/hw/s390x/s390-virtio-ccw.h b/include/hw/s390x/s390-virtio-ccw.h
>> index cd1dccc6e3..022ee6685b 100644
>> --- a/include/hw/s390x/s390-virtio-ccw.h
>> +++ b/include/hw/s390x/s390-virtio-ccw.h
>> @@ -41,6 +41,7 @@ typedef struct S390CcwMachineClass {
>>      bool cpu_model_allowed;
>>      bool css_migration_enabled;
>>      bool hpage_1m_allowed;
>> +    bool mem_inc_1020;
>>  } S390CcwMachineClass;
>>  
>>  /* runtime-instrumentation allowed by the machine */
>> @@ -49,7 +50,8 @@ bool ri_allowed(void);
>>  bool cpu_model_allowed(void);
>>  /* 1M huge page mappings allowed by the machine */
>>  bool hpage_1m_allowed(void);
>> -
>> +/* Machine has only 1020 memory increments */
>> +bool mem_inc_1020(void);
>>  /**
>>   * Returns true if (vmstate based) migration of the channel subsystem
>>   * is enabled, false if it is disabled.
>> diff --git a/softmmu/vl.c b/softmmu/vl.c
>> index 1d33a28340..12b5758d12 100644
>> --- a/softmmu/vl.c
>> +++ b/softmmu/vl.c
>> @@ -2601,6 +2601,9 @@ static bool set_memory_options(uint64_t *ram_slots, ram_addr_t *maxram_size,
>>      }
>>  
>>      sz = QEMU_ALIGN_UP(sz, 8192);
>> +    if (mc->machine_align_ram) {
>> +        sz = mc->machine_align_ram(sz);
>> +    }
>>      ram_size = sz;
>>      if (ram_size != sz) {
>>          error_report("ram size too large");
> 


Thanks a lot for the quick (and good) feedback.

diff --git a/hw/s390x/s390-skeys.c b/hw/s390x/s390-skeys.c
index 5da6e5292f..a9a4ae7b39 100644
--- a/hw/s390x/s390-skeys.c
+++ b/hw/s390x/s390-skeys.c
@@ -176,7 +176,7 @@  static void qemu_s390_skeys_init(Object *obj)
     QEMUS390SKeysState *skeys = QEMU_S390_SKEYS(obj);
     MachineState *machine = MACHINE(qdev_get_machine());
 
-    skeys->key_count = machine->maxram_size / TARGET_PAGE_SIZE;
+    skeys->key_count = machine->ram_size / TARGET_PAGE_SIZE;
     skeys->keydata = g_malloc0(skeys->key_count);
 }
 
diff --git a/hw/s390x/s390-stattrib-kvm.c b/hw/s390x/s390-stattrib-kvm.c
index c7e1f35524..f89d8d9d16 100644
--- a/hw/s390x/s390-stattrib-kvm.c
+++ b/hw/s390x/s390-stattrib-kvm.c
@@ -85,7 +85,7 @@  static int kvm_s390_stattrib_set_stattr(S390StAttribState *sa,
 {
     KVMS390StAttribState *sas = KVM_S390_STATTRIB(sa);
     MachineState *machine = MACHINE(qdev_get_machine());
-    unsigned long max = machine->maxram_size / TARGET_PAGE_SIZE;
+    unsigned long max = machine->ram_size / TARGET_PAGE_SIZE;
 
     if (start_gfn + count > max) {
         error_report("Out of memory bounds when setting storage attributes");
@@ -104,7 +104,7 @@  static void kvm_s390_stattrib_synchronize(S390StAttribState *sa)
 {
     KVMS390StAttribState *sas = KVM_S390_STATTRIB(sa);
     MachineState *machine = MACHINE(qdev_get_machine());
-    unsigned long max = machine->maxram_size / TARGET_PAGE_SIZE;
+    unsigned long max = machine->ram_size / TARGET_PAGE_SIZE;
     /* We do not need to reach the maximum buffer size allowed */
     unsigned long cx, len = KVM_S390_SKEYS_MAX / 2;
     int r;
diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
index 3cf19c99f3..bdfd10f77d 100644
--- a/hw/s390x/s390-virtio-ccw.c
+++ b/hw/s390x/s390-virtio-ccw.c
@@ -579,6 +579,16 @@  static void s390_nmi(NMIState *n, int cpu_index, Error **errp)
     s390_cpu_restart(S390_CPU(cs));
 }
 
+static ram_addr_t s390_align_ram(ram_addr_t sz)
+{
+    /* same logic as in sclp.c */
+    int increment_size = 20;
+    while ((sz >> increment_size) > 1020) {
+        increment_size++;
+    }
+    return sz >> increment_size << increment_size;
+}
+
 static void ccw_machine_class_init(ObjectClass *oc, void *data)
 {
     MachineClass *mc = MACHINE_CLASS(oc);
@@ -590,6 +600,7 @@  static void ccw_machine_class_init(ObjectClass *oc, void *data)
     s390mc->cpu_model_allowed = true;
     s390mc->css_migration_enabled = true;
     s390mc->hpage_1m_allowed = true;
+    s390mc->mem_inc_1020 = false;
     mc->init = ccw_init;
     mc->reset = s390_machine_reset;
     mc->hot_add_cpu = s390_hot_add_cpu;
@@ -686,6 +697,11 @@  bool hpage_1m_allowed(void)
     return get_machine_class()->hpage_1m_allowed;
 }
 
+bool mem_inc_1020(void)
+{
+    return get_machine_class()->mem_inc_1020;
+}
+
 static char *machine_get_loadparm(Object *obj, Error **errp)
 {
     S390CcwMachineState *ms = S390_CCW_MACHINE(obj);
@@ -807,7 +823,11 @@  static void ccw_machine_4_2_instance_options(MachineState *machine)
 
 static void ccw_machine_4_2_class_options(MachineClass *mc)
 {
+    S390CcwMachineClass *s390mc = S390_MACHINE_CLASS(mc);
+
     ccw_machine_5_0_class_options(mc);
+    mc->machine_align_ram = s390_align_ram;
+    s390mc->mem_inc_1020 = true;
     compat_props_add(mc->compat_props, hw_compat_4_2, hw_compat_4_2_len);
 }
 DEFINE_CCW_MACHINE(4_2, "4.2", false);
diff --git a/hw/s390x/sclp.c b/hw/s390x/sclp.c
index d8ae207731..d1fff18443 100644
--- a/hw/s390x/sclp.c
+++ b/hw/s390x/sclp.c
@@ -21,6 +21,7 @@ 
 #include "hw/s390x/sclp.h"
 #include "hw/s390x/event-facility.h"
 #include "hw/s390x/s390-pci-bus.h"
+#include "hw/s390x/s390-virtio-ccw.h"
 #include "hw/s390x/ipl.h"
 
 static inline SCLPDevice *get_sclp_device(void)
@@ -346,7 +347,7 @@  static void sclp_realize(DeviceState *dev, Error **errp)
      */
     qdev_set_parent_bus(DEVICE(sclp->event_facility), sysbus_get_default());
 
-    ret = s390_set_memory_limit(machine->maxram_size, &hw_limit);
+    ret = s390_set_memory_limit(machine->ram_size, &hw_limit);
     if (ret == -E2BIG) {
         error_setg(&err, "host supports a maximum of %" PRIu64 " GB",
                    hw_limit / GiB);
@@ -365,23 +366,15 @@  static void sclp_memory_init(SCLPDevice *sclp)
     int increment_size = 20;
 
     /* The storage increment size is a multiple of 1M and is a power of 2.
-     * The number of storage increments must be MAX_STORAGE_INCREMENTS or fewer.
+     * The number of storage increments must be MAX_STORAGE_INCREMENTS or fewer
+     * for some machine types.
      * The variable 'increment_size' is an exponent of 2 that can be
      * used to calculate the size (in bytes) of an increment. */
-    while ((initial_mem >> increment_size) > MAX_STORAGE_INCREMENTS) {
+    while (mem_inc_1020() &&
+           (initial_mem >> increment_size) > MAX_STORAGE_INCREMENTS) {
         increment_size++;
     }
     sclp->increment_size = increment_size;
-
-    /* The core memory area needs to be aligned with the increment size.
-     * In effect, this can cause the user-specified memory size to be rounded
-     * down to align with the nearest increment boundary. */
-    initial_mem = initial_mem >> increment_size << increment_size;
-
-    machine->ram_size = initial_mem;
-    machine->maxram_size = initial_mem;
-    /* let's propagate the changed ram size into the global variable. */
-    ram_size = initial_mem;
 }
 
 static void sclp_init(Object *obj)
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 236d239c19..e3574f4b5f 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -218,6 +218,7 @@  struct MachineClass {
                                                          unsigned cpu_index);
     const CPUArchIdList *(*possible_cpu_arch_ids)(MachineState *machine);
     int64_t (*get_default_cpu_node_id)(const MachineState *ms, int idx);
+    ram_addr_t (*machine_align_ram)(ram_addr_t size);
 };
 
 /**
diff --git a/include/hw/s390x/s390-virtio-ccw.h b/include/hw/s390x/s390-virtio-ccw.h
index cd1dccc6e3..022ee6685b 100644
--- a/include/hw/s390x/s390-virtio-ccw.h
+++ b/include/hw/s390x/s390-virtio-ccw.h
@@ -41,6 +41,7 @@  typedef struct S390CcwMachineClass {
     bool cpu_model_allowed;
     bool css_migration_enabled;
     bool hpage_1m_allowed;
+    bool mem_inc_1020;
 } S390CcwMachineClass;
 
 /* runtime-instrumentation allowed by the machine */
@@ -49,7 +50,8 @@  bool ri_allowed(void);
 bool cpu_model_allowed(void);
 /* 1M huge page mappings allowed by the machine */
 bool hpage_1m_allowed(void);
-
+/* Machine has only 1020 memory increments */
+bool mem_inc_1020(void);
 /**
  * Returns true if (vmstate based) migration of the channel subsystem
  * is enabled, false if it is disabled.
diff --git a/softmmu/vl.c b/softmmu/vl.c
index 1d33a28340..12b5758d12 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -2601,6 +2601,9 @@  static bool set_memory_options(uint64_t *ram_slots, ram_addr_t *maxram_size,
     }
 
     sz = QEMU_ALIGN_UP(sz, 8192);
+    if (mc->machine_align_ram) {
+        sz = mc->machine_align_ram(sz);
+    }
     ram_size = sz;
     if (ram_size != sz) {
         error_report("ram size too large");

[v1] vl/s390: fixup ram sizes for compat machines

Commit Message

Comments

Patch