Message ID | baffeba4795e1ea38959ca0424a8d3709588fcc1.1346347994.git.jbaron@redhat.com |
---|---|
State | New |
Headers | show |
On Thu, Aug 30, 2012 at 01:51:10PM -0400, Jason Baron wrote: > While testing q35 live migration, I found that the migration would abort with > the following error: "Unknown savevm section type 76". > > The error is due to this check failing in 'vmstate_load_state()': > > while(field->name) { > if ((field->field_exists && > field->field_exists(opaque, version_id)) || > (!field->field_exists && > field->version_id <= version_id)) { > > The VMSTATE_PCIE_DEVICE() currently has a 'version_id' set to 2. However, > 'version_id' in the above check is 1. And thus we fail to load the pcie device > field. Further the code returns to 'qemu_loadvm_state()' which produces the > error that I saw. > > I'm proposing to fix this by simply dropping the 'version_id' field from > VMSTATE_PCIE_DEVICE(). VMSTATE_PCI_DEVICE() defines no such field and further > the vmstate_pcie_device that VMSTATE_PCI_DEVICE() refers to is already > versioned. Thus, any versioning issues could be detected at the vmsd level. > > Taking a step back, I think that the 'field->version_id' should be compared > against a saved version number for the field not the 'version_id'. Futhermore, > once vmstate_load_state() is called recursively on another vmsd, the check of: > > if (version_id > vmsd->version_id) { > return -EINVAL; > } > > Will never fail since version_id is always equal to vmsd->version_id. So I'm > wondering why we aren't storing the vmsd version id of the source in the > migration stream? > > This patch also renames the 'name' field of vmstate_pcie_device from: > PCIDevice -> PCIEDevice to differentiate it from vmstate_pci_device. > > Signed-off-by: Jason Baron <jbaron@redhat.com> I have this queued to apply already - this is just a repost or did I miss something? I have no idea about the larger issues - Juan? > --- > hw/pci.c | 2 +- > hw/pcie.h | 1 - > 2 files changed, 1 insertions(+), 2 deletions(-) > > diff --git a/hw/pci.c b/hw/pci.c > index 3727afa..5386a4f 100644 > --- a/hw/pci.c > +++ b/hw/pci.c > @@ -439,7 +439,7 @@ const VMStateDescription vmstate_pci_device = { > }; > > const VMStateDescription vmstate_pcie_device = { > - .name = "PCIDevice", > + .name = "PCIEDevice", > .version_id = 2, > .minimum_version_id = 1, > .minimum_version_id_old = 1, > diff --git a/hw/pcie.h b/hw/pcie.h > index b8ab0c7..4889194 100644 > --- a/hw/pcie.h > +++ b/hw/pcie.h > @@ -133,7 +133,6 @@ extern const VMStateDescription vmstate_pcie_device; > > #define VMSTATE_PCIE_DEVICE(_field, _state) { \ > .name = (stringify(_field)), \ > - .version_id = 2, \ > .size = sizeof(PCIDevice), \ > .vmsd = &vmstate_pcie_device, \ > .flags = VMS_STRUCT, \ > -- > 1.7.1
Jason Baron <jbaron@redhat.com> wrote: > While testing q35 live migration, I found that the migration would abort with > the following error: "Unknown savevm section type 76". Before we start, migration of PCI is ugly due to backwards compatibility. > The error is due to this check failing in 'vmstate_load_state()': > > while(field->name) { > if ((field->field_exists && > field->field_exists(opaque, version_id)) || > (!field->field_exists && > field->version_id <= version_id)) { This is a long history, but to make things short, when we started with vmstate, there were no cose where this were not true. It could potentially be true, but there were no examples. > The VMSTATE_PCIE_DEVICE() currently has a 'version_id' set to 2. However, > 'version_id' in the above check is 1. And thus we fail to load the pcie device > field. Further the code returns to 'qemu_loadvm_state()' which produces the > error that I saw. Agreed about removing the set of the value. > I'm proposing to fix this by simply dropping the 'version_id' field from > VMSTATE_PCIE_DEVICE(). VMSTATE_PCI_DEVICE() defines no such field and further > the vmstate_pcie_device that VMSTATE_PCI_DEVICE() refers to is already > versioned. Thus, any versioning issues could be detected at the vmsd level. > > Taking a step back, I think that the 'field->version_id' should be compared > against a saved version number for the field not the 'version_id'. Futhermore, > once vmstate_load_state() is called recursively on another vmsd, the check of: > > if (version_id > vmsd->version_id) { > return -EINVAL; > } This is a can of worms you don't want to open. Basically it is imposible to get it _sane_ and backward compatible. We choose on its time to maintain it backward compatible, and that is why it is not sane. I would have to dig the details, but the problems where when we were loading things that were not backward compatible. The problem was that PCI is the other "section inside a section", where we store the version_id. This is wrong for a number of reasons, between them that if we "increase" the version_id of pci, the "outermost" section id, has changed but it would not receive a new version number. Remember the part about _sane_ and _backward compatible_ :-() > Will never fail since version_id is always equal to vmsd->version_id. So I'm > wondering why we aren't storing the vmsd version id of the source in the > migration stream? > > This patch also renames the 'name' field of vmstate_pcie_device from: > PCIDevice -> PCIEDevice to differentiate it from vmstate_pci_device. > > Signed-off-by: Jason Baron <jbaron@redhat.com> Acked-by: Juan Quintela <quintela@redhat.com> I agree with the migration bits of this series. I am not ack'ing the other patch because I don't understand pci well enough to know what the code does (before or after). Later, Juan.
On Fri, Aug 31, 2012 at 11:44:54AM +0300, Michael S. Tsirkin wrote: > On Thu, Aug 30, 2012 at 01:51:10PM -0400, Jason Baron wrote: > > While testing q35 live migration, I found that the migration would abort with > > the following error: "Unknown savevm section type 76". > > > > The error is due to this check failing in 'vmstate_load_state()': > > > > while(field->name) { > > if ((field->field_exists && > > field->field_exists(opaque, version_id)) || > > (!field->field_exists && > > field->version_id <= version_id)) { > > > > The VMSTATE_PCIE_DEVICE() currently has a 'version_id' set to 2. However, > > 'version_id' in the above check is 1. And thus we fail to load the pcie device > > field. Further the code returns to 'qemu_loadvm_state()' which produces the > > error that I saw. > > > > I'm proposing to fix this by simply dropping the 'version_id' field from > > VMSTATE_PCIE_DEVICE(). VMSTATE_PCI_DEVICE() defines no such field and further > > the vmstate_pcie_device that VMSTATE_PCI_DEVICE() refers to is already > > versioned. Thus, any versioning issues could be detected at the vmsd level. > > > > Taking a step back, I think that the 'field->version_id' should be compared > > against a saved version number for the field not the 'version_id'. Futhermore, > > once vmstate_load_state() is called recursively on another vmsd, the check of: > > > > if (version_id > vmsd->version_id) { > > return -EINVAL; > > } > > > > Will never fail since version_id is always equal to vmsd->version_id. So I'm > > wondering why we aren't storing the vmsd version id of the source in the > > migration stream? > > > > This patch also renames the 'name' field of vmstate_pcie_device from: > > PCIDevice -> PCIEDevice to differentiate it from vmstate_pci_device. > > > > Signed-off-by: Jason Baron <jbaron@redhat.com> > > I have this queued to apply already - this is just a repost > or did I miss something? > just a repost...but now we have Juan's ack :)
On Fri, Aug 31, 2012 at 10:46:51AM -0400, Jason Baron wrote: > On Fri, Aug 31, 2012 at 11:44:54AM +0300, Michael S. Tsirkin wrote: > > On Thu, Aug 30, 2012 at 01:51:10PM -0400, Jason Baron wrote: > > > While testing q35 live migration, I found that the migration would abort with > > > the following error: "Unknown savevm section type 76". > > > > > > The error is due to this check failing in 'vmstate_load_state()': > > > > > > while(field->name) { > > > if ((field->field_exists && > > > field->field_exists(opaque, version_id)) || > > > (!field->field_exists && > > > field->version_id <= version_id)) { > > > > > > The VMSTATE_PCIE_DEVICE() currently has a 'version_id' set to 2. However, > > > 'version_id' in the above check is 1. And thus we fail to load the pcie device > > > field. Further the code returns to 'qemu_loadvm_state()' which produces the > > > error that I saw. > > > > > > I'm proposing to fix this by simply dropping the 'version_id' field from > > > VMSTATE_PCIE_DEVICE(). VMSTATE_PCI_DEVICE() defines no such field and further > > > the vmstate_pcie_device that VMSTATE_PCI_DEVICE() refers to is already > > > versioned. Thus, any versioning issues could be detected at the vmsd level. > > > > > > Taking a step back, I think that the 'field->version_id' should be compared > > > against a saved version number for the field not the 'version_id'. Futhermore, > > > once vmstate_load_state() is called recursively on another vmsd, the check of: > > > > > > if (version_id > vmsd->version_id) { > > > return -EINVAL; > > > } > > > > > > Will never fail since version_id is always equal to vmsd->version_id. So I'm > > > wondering why we aren't storing the vmsd version id of the source in the > > > migration stream? > > > > > > This patch also renames the 'name' field of vmstate_pcie_device from: > > > PCIDevice -> PCIEDevice to differentiate it from vmstate_pci_device. > > > > > > Signed-off-by: Jason Baron <jbaron@redhat.com> > > > > I have this queued to apply already - this is just a repost > > or did I miss something? > > > > just a repost...but now we have Juan's ack :) Right. It is preferable to tag reposts explicitly but no big deal.
diff --git a/hw/pci.c b/hw/pci.c index 3727afa..5386a4f 100644 --- a/hw/pci.c +++ b/hw/pci.c @@ -439,7 +439,7 @@ const VMStateDescription vmstate_pci_device = { }; const VMStateDescription vmstate_pcie_device = { - .name = "PCIDevice", + .name = "PCIEDevice", .version_id = 2, .minimum_version_id = 1, .minimum_version_id_old = 1, diff --git a/hw/pcie.h b/hw/pcie.h index b8ab0c7..4889194 100644 --- a/hw/pcie.h +++ b/hw/pcie.h @@ -133,7 +133,6 @@ extern const VMStateDescription vmstate_pcie_device; #define VMSTATE_PCIE_DEVICE(_field, _state) { \ .name = (stringify(_field)), \ - .version_id = 2, \ .size = sizeof(PCIDevice), \ .vmsd = &vmstate_pcie_device, \ .flags = VMS_STRUCT, \
While testing q35 live migration, I found that the migration would abort with the following error: "Unknown savevm section type 76". The error is due to this check failing in 'vmstate_load_state()': while(field->name) { if ((field->field_exists && field->field_exists(opaque, version_id)) || (!field->field_exists && field->version_id <= version_id)) { The VMSTATE_PCIE_DEVICE() currently has a 'version_id' set to 2. However, 'version_id' in the above check is 1. And thus we fail to load the pcie device field. Further the code returns to 'qemu_loadvm_state()' which produces the error that I saw. I'm proposing to fix this by simply dropping the 'version_id' field from VMSTATE_PCIE_DEVICE(). VMSTATE_PCI_DEVICE() defines no such field and further the vmstate_pcie_device that VMSTATE_PCI_DEVICE() refers to is already versioned. Thus, any versioning issues could be detected at the vmsd level. Taking a step back, I think that the 'field->version_id' should be compared against a saved version number for the field not the 'version_id'. Futhermore, once vmstate_load_state() is called recursively on another vmsd, the check of: if (version_id > vmsd->version_id) { return -EINVAL; } Will never fail since version_id is always equal to vmsd->version_id. So I'm wondering why we aren't storing the vmsd version id of the source in the migration stream? This patch also renames the 'name' field of vmstate_pcie_device from: PCIDevice -> PCIEDevice to differentiate it from vmstate_pci_device. Signed-off-by: Jason Baron <jbaron@redhat.com> --- hw/pci.c | 2 +- hw/pcie.h | 1 - 2 files changed, 1 insertions(+), 2 deletions(-)