diff mbox series

[1/2] vfio/mdev: add version field as mandatory attribute for mdev device

Message ID 20190419083505.19654-1-yan.y.zhao@intel.com
State New
Headers show
Series introduction of version attribute for VFIO live migration | expand

Commit Message

Yan Zhao April 19, 2019, 8:35 a.m. UTC
device version attribute in mdev sysfs is used by user space software
(e.g. libvirt) to query device compatibility for live migration of VFIO
mdev devices. This attribute is mandatory if a mdev device supports live
migration.

It consists of two parts: common part and vendor proprietary part.
common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
             identifies device type. e.g., for pci device, it is
             "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
vendor proprietary part: this part is varied in length. vendor driver can
             specify any string to identify a device.

When reading this attribute, it should show device version string of the
device of type <type-id>. If a device does not support live migration, it
should return errno.
When writing a string to this attribute, it returns errno for
incompatibility or returns written string length in compatibility case.
If a device does not support live migration, it always returns errno.

For user space software to use:
1.
Before starting live migration, user space software first reads source side
mdev device's version. e.g.
"#cat \
/sys/bus/pci/devices/0000\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version"
00028086-193b-i915-GVTg_V5_4

2.
Then, user space software writes the source side returned version string
to device version attribute in target side, and checks the return value.
If a negative errno is returned in the target side, then mdev devices in
source and target sides are not compatible;
If a positive number is returned and it equals to the length of written
string, then the two mdev devices in source and target side are compatible.
e.g.
(a) compatibility case
"# echo 00028086-193b-i915-GVTg_V5_4 >
/sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"

(b) incompatibility case
"#echo 00028086-193b-i915-GVTg_V5_1 >
/sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
-bash: echo: write error: Invalid argument

3. if two mdev devices are compatible, user space software can start
live migration, and vice versa.

Note: if a mdev device does not support live migration, it either does
not provide a version attribute, or always returns errno when its version
attribute is read/written.

Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: Erik Skultety <eskultet@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: "Tian, Kevin" <kevin.tian@intel.com>
Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
Cc: "Wang, Zhi A" <zhi.a.wang@intel.com>
Cc: Neo Jia <cjia@nvidia.com>
Cc: Kirti Wankhede <kwankhede@nvidia.com>

Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
---
 Documentation/vfio-mediated-device.txt | 36 ++++++++++++++++++++++++++
 samples/vfio-mdev/mbochs.c             | 17 ++++++++++++
 samples/vfio-mdev/mdpy.c               | 16 ++++++++++++
 samples/vfio-mdev/mtty.c               | 16 ++++++++++++
 4 files changed, 85 insertions(+)

Comments

Alex Williamson April 22, 2019, 2:39 p.m. UTC | #1
On Fri, 19 Apr 2019 04:35:04 -0400
Yan Zhao <yan.y.zhao@intel.com> wrote:

> device version attribute in mdev sysfs is used by user space software
> (e.g. libvirt) to query device compatibility for live migration of VFIO
> mdev devices. This attribute is mandatory if a mdev device supports live
> migration.

The Subject: doesn't quite match what's being proposed here.
 
> It consists of two parts: common part and vendor proprietary part.
> common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
>              identifies device type. e.g., for pci device, it is
>              "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).

What purpose does this serve?  If it's intended as some sort of
namespace feature, shouldn't we first assume that we can only support
migration to devices of the same type?  Therefore each type would
already have its own namespace.  Also that would make the trailing bit
of the version string listed below in the example redundant.  A vendor
is still welcome to include this in their version string if they wish,
but I think the string should be entirely vendor defined.

> vendor proprietary part: this part is varied in length. vendor driver can
>              specify any string to identify a device.
> 
> When reading this attribute, it should show device version string of the
> device of type <type-id>. If a device does not support live migration, it
> should return errno.
> When writing a string to this attribute, it returns errno for
> incompatibility or returns written string length in compatibility case.
> If a device does not support live migration, it always returns errno.
> 
> For user space software to use:
> 1.
> Before starting live migration, user space software first reads source side
> mdev device's version. e.g.
> "#cat \
> /sys/bus/pci/devices/0000\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version"
> 00028086-193b-i915-GVTg_V5_4
> 
> 2.
> Then, user space software writes the source side returned version string
> to device version attribute in target side, and checks the return value.
> If a negative errno is returned in the target side, then mdev devices in
> source and target sides are not compatible;
> If a positive number is returned and it equals to the length of written
> string, then the two mdev devices in source and target side are compatible.
> e.g.
> (a) compatibility case
> "# echo 00028086-193b-i915-GVTg_V5_4 >
> /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> 
> (b) incompatibility case
> "#echo 00028086-193b-i915-GVTg_V5_1 >
> /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> -bash: echo: write error: Invalid argument
> 
> 3. if two mdev devices are compatible, user space software can start
> live migration, and vice versa.
> 
> Note: if a mdev device does not support live migration, it either does
> not provide a version attribute, or always returns errno when its version
> attribute is read/written.

I think it would be cleaner to do the former, not supply the
attribute.  This seems to do the latter in the sample drivers.  Thanks,

Alex

> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Erik Skultety <eskultet@redhat.com>
> Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> Cc: Cornelia Huck <cohuck@redhat.com>
> Cc: "Tian, Kevin" <kevin.tian@intel.com>
> Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
> Cc: "Wang, Zhi A" <zhi.a.wang@intel.com>
> Cc: Neo Jia <cjia@nvidia.com>
> Cc: Kirti Wankhede <kwankhede@nvidia.com>
> 
> Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> ---
>  Documentation/vfio-mediated-device.txt | 36 ++++++++++++++++++++++++++
>  samples/vfio-mdev/mbochs.c             | 17 ++++++++++++
>  samples/vfio-mdev/mdpy.c               | 16 ++++++++++++
>  samples/vfio-mdev/mtty.c               | 16 ++++++++++++
>  4 files changed, 85 insertions(+)
> 
> diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt
> index c3f69bcaf96e..bc28471c0667 100644
> --- a/Documentation/vfio-mediated-device.txt
> +++ b/Documentation/vfio-mediated-device.txt
> @@ -202,6 +202,7 @@ Directories and files under the sysfs for Each Physical Device
>    |     |   |--- available_instances
>    |     |   |--- device_api
>    |     |   |--- description
> +  |     |   |--- version
>    |     |   |--- [devices]
>    |     |--- [<type-id>]
>    |     |   |--- create
> @@ -209,6 +210,7 @@ Directories and files under the sysfs for Each Physical Device
>    |     |   |--- available_instances
>    |     |   |--- device_api
>    |     |   |--- description
> +  |     |   |--- version
>    |     |   |--- [devices]
>    |     |--- [<type-id>]
>    |          |--- create
> @@ -216,6 +218,7 @@ Directories and files under the sysfs for Each Physical Device
>    |          |--- available_instances
>    |          |--- device_api
>    |          |--- description
> +  |          |--- version
>    |          |--- [devices]
>  
>  * [mdev_supported_types]
> @@ -225,6 +228,8 @@ Directories and files under the sysfs for Each Physical Device
>    [<type-id>], device_api, and available_instances are mandatory attributes
>    that should be provided by vendor driver.
>  
> +  version is a mandatory attribute if a mdev device supports live migration.
> +
>  * [<type-id>]
>  
>    The [<type-id>] name is created by adding the device driver string as a prefix
> @@ -246,6 +251,35 @@ Directories and files under the sysfs for Each Physical Device
>    This attribute should show the number of devices of type <type-id> that can be
>    created.
>  
> +* version
> +
> +  This attribute is rw. It is used to check whether two devices are compatible
> +  for live migration. If this attribute is missing, then the corresponding mdev
> +  device is regarded as not supporting live migration.
> +
> +  It consists of two parts: common part and vendor proprietary part.
> +  common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
> +               device type. e.g., for pci device, it is
> +               "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> +  vendor proprietary part: this part is varied in length. vendor driver can
> +               specify any string to identify a device.
> +
> +  When reading this attribute, it should show device version string of the device
> +  of type <type-id>. If a device does not support live migration, it should
> +  return errno.
> +  When writing a string to this attribute, it returns errno for incompatibility
> +  or returns written string length in compatibility case. If a device does not
> +  support live migration, it always returns errno.
> +
> +  for example.
> +  # cat \
> + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_2/version
> +  00028086-193b-i915-GVTg_V5_2
> +
> +  #echo 00028086-193b-i915-GVTg_V5_2 > \
> + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> + -bash: echo: write error: Invalid argument
> +
>  * [device]
>  
>    This directory contains links to the devices of type <type-id> that have been
> @@ -327,12 +361,14 @@ card.
>          |   |   |-- available_instances
>          |   |   |-- create
>          |   |   |-- device_api
> +        |   |   |-- version
>          |   |   |-- devices
>          |   |   `-- name
>          |   `-- mtty-2
>          |       |-- available_instances
>          |       |-- create
>          |       |-- device_api
> +        |       |-- version
>          |       |-- devices
>          |       `-- name
>          |-- mtty_dev
> diff --git a/samples/vfio-mdev/mbochs.c b/samples/vfio-mdev/mbochs.c
> index b038aa9f5a70..2f5ba96b91a2 100644
> --- a/samples/vfio-mdev/mbochs.c
> +++ b/samples/vfio-mdev/mbochs.c
> @@ -1391,11 +1391,28 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
>  }
>  MDEV_TYPE_ATTR_RO(device_api);
>  
> +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> +		char *buf)
> +{
> +	/* do not support live migration */
> +	return -EINVAL;
> +}
> +
> +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> +		const char *buf, size_t count)
> +{
> +	/* do not support live migration */
> +	return -EINVAL;
> +}
> +
> +static MDEV_TYPE_ATTR_RW(version);
> +
>  static struct attribute *mdev_types_attrs[] = {
>  	&mdev_type_attr_name.attr,
>  	&mdev_type_attr_description.attr,
>  	&mdev_type_attr_device_api.attr,
>  	&mdev_type_attr_available_instances.attr,
> +	&mdev_type_attr_version.attr,
>  	NULL,
>  };
>  
> diff --git a/samples/vfio-mdev/mdpy.c b/samples/vfio-mdev/mdpy.c
> index cc86bf6566e4..ff15fdfc7d46 100644
> --- a/samples/vfio-mdev/mdpy.c
> +++ b/samples/vfio-mdev/mdpy.c
> @@ -695,11 +695,27 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
>  }
>  MDEV_TYPE_ATTR_RO(device_api);
>  
> +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> +		char *buf)
> +{
> +	/* do not support live migration */
> +	return -EINVAL;
> +}
> +
> +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> +		const char *buf, size_t count)
> +{
> +	/* do not support live migration */
> +	return -EINVAL;
> +}
> +static MDEV_TYPE_ATTR_RW(version);
> +
>  static struct attribute *mdev_types_attrs[] = {
>  	&mdev_type_attr_name.attr,
>  	&mdev_type_attr_description.attr,
>  	&mdev_type_attr_device_api.attr,
>  	&mdev_type_attr_available_instances.attr,
> +	&mdev_type_attr_version.attr,
>  	NULL,
>  };
>  
> diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c
> index 1c77c370c92f..4ae3aad3474d 100644
> --- a/samples/vfio-mdev/mtty.c
> +++ b/samples/vfio-mdev/mtty.c
> @@ -1390,10 +1390,26 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
>  
>  MDEV_TYPE_ATTR_RO(device_api);
>  
> +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> +		char *buf)
> +{
> +	/* do not support live migration */
> +	return -EINVAL;
> +}
> +
> +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> +		const char *buf, size_t count)
> +{
> +	/* do not support live migration */
> +	return -EINVAL;
> +}
> +
> +static MDEV_TYPE_ATTR_RW(version);
>  static struct attribute *mdev_types_attrs[] = {
>  	&mdev_type_attr_name.attr,
>  	&mdev_type_attr_device_api.attr,
>  	&mdev_type_attr_available_instances.attr,
> +	&mdev_type_attr_version.attr,
>  	NULL,
>  };
>
Yan Zhao April 23, 2019, 1:01 a.m. UTC | #2
On Mon, Apr 22, 2019 at 10:39:50PM +0800, Alex Williamson wrote:
> On Fri, 19 Apr 2019 04:35:04 -0400
> Yan Zhao <yan.y.zhao@intel.com> wrote:
> 
> > device version attribute in mdev sysfs is used by user space software
> > (e.g. libvirt) to query device compatibility for live migration of VFIO
> > mdev devices. This attribute is mandatory if a mdev device supports live
> > migration.
> 
> The Subject: doesn't quite match what's being proposed here.
>  
> > It consists of two parts: common part and vendor proprietary part.
> > common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> >              identifies device type. e.g., for pci device, it is
> >              "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> 
> What purpose does this serve?  If it's intended as some sort of
> namespace feature, shouldn't we first assume that we can only support
> migration to devices of the same type?  Therefore each type would
> already have its own namespace.  Also that would make the trailing bit
> of the version string listed below in the example redundant.  A vendor
> is still welcome to include this in their version string if they wish,
> but I think the string should be entirely vendor defined.
>
hi Alex,
This common part is a kind of namespace.
Because if version string is entirely defined by vendors, I'm worried about
if there is a case that one vendor's version string happens to deceive and
interfere with another vendor's version checking?
e.g.
vendor A has a version string like: vendor id + device id + mdev type
vendor B has a version string like: device id + vendor id + mdev type
but vendor A's vendor id is 0x8086, device id is 0x1217
vendor B's vendor id is 0x1217, device id is 0x8086.

In this corner case, the two vendors may regard the two device is
migratable but actually they are not.

That's the reason for this common part that serve as a kind of namespace
that all vendors will comply with to avoid overlap.

> > vendor proprietary part: this part is varied in length. vendor driver can
> >              specify any string to identify a device.
> > 
> > When reading this attribute, it should show device version string of the
> > device of type <type-id>. If a device does not support live migration, it
> > should return errno.
> > When writing a string to this attribute, it returns errno for
> > incompatibility or returns written string length in compatibility case.
> > If a device does not support live migration, it always returns errno.
> > 
> > For user space software to use:
> > 1.
> > Before starting live migration, user space software first reads source side
> > mdev device's version. e.g.
> > "#cat \
> > /sys/bus/pci/devices/0000\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version"
> > 00028086-193b-i915-GVTg_V5_4
> > 
> > 2.
> > Then, user space software writes the source side returned version string
> > to device version attribute in target side, and checks the return value.
> > If a negative errno is returned in the target side, then mdev devices in
> > source and target sides are not compatible;
> > If a positive number is returned and it equals to the length of written
> > string, then the two mdev devices in source and target side are compatible.
> > e.g.
> > (a) compatibility case
> > "# echo 00028086-193b-i915-GVTg_V5_4 >
> > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > 
> > (b) incompatibility case
> > "#echo 00028086-193b-i915-GVTg_V5_1 >
> > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > -bash: echo: write error: Invalid argument
> > 
> > 3. if two mdev devices are compatible, user space software can start
> > live migration, and vice versa.
> > 
> > Note: if a mdev device does not support live migration, it either does
> > not provide a version attribute, or always returns errno when its version
> > attribute is read/written.
> 
> I think it would be cleaner to do the former, not supply the
> attribute.  This seems to do the latter in the sample drivers.  Thanks,
Ok. you are right!
what about just keep one sample driver to show how to do the latter,
and let the others do the former?

Thanks!
Yan


> > Cc: Alex Williamson <alex.williamson@redhat.com>
> > Cc: Erik Skultety <eskultet@redhat.com>
> > Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > Cc: Cornelia Huck <cohuck@redhat.com>
> > Cc: "Tian, Kevin" <kevin.tian@intel.com>
> > Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
> > Cc: "Wang, Zhi A" <zhi.a.wang@intel.com>
> > Cc: Neo Jia <cjia@nvidia.com>
> > Cc: Kirti Wankhede <kwankhede@nvidia.com>
> > 
> > Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> > ---
> >  Documentation/vfio-mediated-device.txt | 36 ++++++++++++++++++++++++++
> >  samples/vfio-mdev/mbochs.c             | 17 ++++++++++++
> >  samples/vfio-mdev/mdpy.c               | 16 ++++++++++++
> >  samples/vfio-mdev/mtty.c               | 16 ++++++++++++
> >  4 files changed, 85 insertions(+)
> > 
> > diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt
> > index c3f69bcaf96e..bc28471c0667 100644
> > --- a/Documentation/vfio-mediated-device.txt
> > +++ b/Documentation/vfio-mediated-device.txt
> > @@ -202,6 +202,7 @@ Directories and files under the sysfs for Each Physical Device
> >    |     |   |--- available_instances
> >    |     |   |--- device_api
> >    |     |   |--- description
> > +  |     |   |--- version
> >    |     |   |--- [devices]
> >    |     |--- [<type-id>]
> >    |     |   |--- create
> > @@ -209,6 +210,7 @@ Directories and files under the sysfs for Each Physical Device
> >    |     |   |--- available_instances
> >    |     |   |--- device_api
> >    |     |   |--- description
> > +  |     |   |--- version
> >    |     |   |--- [devices]
> >    |     |--- [<type-id>]
> >    |          |--- create
> > @@ -216,6 +218,7 @@ Directories and files under the sysfs for Each Physical Device
> >    |          |--- available_instances
> >    |          |--- device_api
> >    |          |--- description
> > +  |          |--- version
> >    |          |--- [devices]
> >  
> >  * [mdev_supported_types]
> > @@ -225,6 +228,8 @@ Directories and files under the sysfs for Each Physical Device
> >    [<type-id>], device_api, and available_instances are mandatory attributes
> >    that should be provided by vendor driver.
> >  
> > +  version is a mandatory attribute if a mdev device supports live migration.
> > +
> >  * [<type-id>]
> >  
> >    The [<type-id>] name is created by adding the device driver string as a prefix
> > @@ -246,6 +251,35 @@ Directories and files under the sysfs for Each Physical Device
> >    This attribute should show the number of devices of type <type-id> that can be
> >    created.
> >  
> > +* version
> > +
> > +  This attribute is rw. It is used to check whether two devices are compatible
> > +  for live migration. If this attribute is missing, then the corresponding mdev
> > +  device is regarded as not supporting live migration.
> > +
> > +  It consists of two parts: common part and vendor proprietary part.
> > +  common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
> > +               device type. e.g., for pci device, it is
> > +               "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > +  vendor proprietary part: this part is varied in length. vendor driver can
> > +               specify any string to identify a device.
> > +
> > +  When reading this attribute, it should show device version string of the device
> > +  of type <type-id>. If a device does not support live migration, it should
> > +  return errno.
> > +  When writing a string to this attribute, it returns errno for incompatibility
> > +  or returns written string length in compatibility case. If a device does not
> > +  support live migration, it always returns errno.
> > +
> > +  for example.
> > +  # cat \
> > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_2/version
> > +  00028086-193b-i915-GVTg_V5_2
> > +
> > +  #echo 00028086-193b-i915-GVTg_V5_2 > \
> > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> > + -bash: echo: write error: Invalid argument
> > +
> >  * [device]
> >  
> >    This directory contains links to the devices of type <type-id> that have been
> > @@ -327,12 +361,14 @@ card.
> >          |   |   |-- available_instances
> >          |   |   |-- create
> >          |   |   |-- device_api
> > +        |   |   |-- version
> >          |   |   |-- devices
> >          |   |   `-- name
> >          |   `-- mtty-2
> >          |       |-- available_instances
> >          |       |-- create
> >          |       |-- device_api
> > +        |       |-- version
> >          |       |-- devices
> >          |       `-- name
> >          |-- mtty_dev
> > diff --git a/samples/vfio-mdev/mbochs.c b/samples/vfio-mdev/mbochs.c
> > index b038aa9f5a70..2f5ba96b91a2 100644
> > --- a/samples/vfio-mdev/mbochs.c
> > +++ b/samples/vfio-mdev/mbochs.c
> > @@ -1391,11 +1391,28 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> >  }
> >  MDEV_TYPE_ATTR_RO(device_api);
> >  
> > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > +		char *buf)
> > +{
> > +	/* do not support live migration */
> > +	return -EINVAL;
> > +}
> > +
> > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > +		const char *buf, size_t count)
> > +{
> > +	/* do not support live migration */
> > +	return -EINVAL;
> > +}
> > +
> > +static MDEV_TYPE_ATTR_RW(version);
> > +
> >  static struct attribute *mdev_types_attrs[] = {
> >  	&mdev_type_attr_name.attr,
> >  	&mdev_type_attr_description.attr,
> >  	&mdev_type_attr_device_api.attr,
> >  	&mdev_type_attr_available_instances.attr,
> > +	&mdev_type_attr_version.attr,
> >  	NULL,
> >  };
> >  
> > diff --git a/samples/vfio-mdev/mdpy.c b/samples/vfio-mdev/mdpy.c
> > index cc86bf6566e4..ff15fdfc7d46 100644
> > --- a/samples/vfio-mdev/mdpy.c
> > +++ b/samples/vfio-mdev/mdpy.c
> > @@ -695,11 +695,27 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> >  }
> >  MDEV_TYPE_ATTR_RO(device_api);
> >  
> > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > +		char *buf)
> > +{
> > +	/* do not support live migration */
> > +	return -EINVAL;
> > +}
> > +
> > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > +		const char *buf, size_t count)
> > +{
> > +	/* do not support live migration */
> > +	return -EINVAL;
> > +}
> > +static MDEV_TYPE_ATTR_RW(version);
> > +
> >  static struct attribute *mdev_types_attrs[] = {
> >  	&mdev_type_attr_name.attr,
> >  	&mdev_type_attr_description.attr,
> >  	&mdev_type_attr_device_api.attr,
> >  	&mdev_type_attr_available_instances.attr,
> > +	&mdev_type_attr_version.attr,
> >  	NULL,
> >  };
> >  
> > diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c
> > index 1c77c370c92f..4ae3aad3474d 100644
> > --- a/samples/vfio-mdev/mtty.c
> > +++ b/samples/vfio-mdev/mtty.c
> > @@ -1390,10 +1390,26 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> >  
> >  MDEV_TYPE_ATTR_RO(device_api);
> >  
> > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > +		char *buf)
> > +{
> > +	/* do not support live migration */
> > +	return -EINVAL;
> > +}
> > +
> > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > +		const char *buf, size_t count)
> > +{
> > +	/* do not support live migration */
> > +	return -EINVAL;
> > +}
> > +
> > +static MDEV_TYPE_ATTR_RW(version);
> >  static struct attribute *mdev_types_attrs[] = {
> >  	&mdev_type_attr_name.attr,
> >  	&mdev_type_attr_device_api.attr,
> >  	&mdev_type_attr_available_instances.attr,
> > +	&mdev_type_attr_version.attr,
> >  	NULL,
> >  };
> >  
> 
> _______________________________________________
> intel-gvt-dev mailing list
> intel-gvt-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gvt-dev
Alex Williamson April 23, 2019, 1:21 a.m. UTC | #3
On Mon, 22 Apr 2019 21:01:52 -0400
Yan Zhao <yan.y.zhao@intel.com> wrote:

> On Mon, Apr 22, 2019 at 10:39:50PM +0800, Alex Williamson wrote:
> > On Fri, 19 Apr 2019 04:35:04 -0400
> > Yan Zhao <yan.y.zhao@intel.com> wrote:
> >   
> > > device version attribute in mdev sysfs is used by user space software
> > > (e.g. libvirt) to query device compatibility for live migration of VFIO
> > > mdev devices. This attribute is mandatory if a mdev device supports live
> > > migration.  
> > 
> > The Subject: doesn't quite match what's being proposed here.
> >    
> > > It consists of two parts: common part and vendor proprietary part.
> > > common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> > >              identifies device type. e.g., for pci device, it is
> > >              "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).  
> > 
> > What purpose does this serve?  If it's intended as some sort of
> > namespace feature, shouldn't we first assume that we can only support
> > migration to devices of the same type?  Therefore each type would
> > already have its own namespace.  Also that would make the trailing bit
> > of the version string listed below in the example redundant.  A vendor
> > is still welcome to include this in their version string if they wish,
> > but I think the string should be entirely vendor defined.
> >  
> hi Alex,
> This common part is a kind of namespace.
> Because if version string is entirely defined by vendors, I'm worried about
> if there is a case that one vendor's version string happens to deceive and
> interfere with another vendor's version checking?
> e.g.
> vendor A has a version string like: vendor id + device id + mdev type
> vendor B has a version string like: device id + vendor id + mdev type
> but vendor A's vendor id is 0x8086, device id is 0x1217
> vendor B's vendor id is 0x1217, device id is 0x8086.
> 
> In this corner case, the two vendors may regard the two device is
> migratable but actually they are not.
> 
> That's the reason for this common part that serve as a kind of namespace
> that all vendors will comply with to avoid overlap.

If we assume that migration can only occur between matching mdev types,
this is redundant, each type already has their own namespace.

> > > vendor proprietary part: this part is varied in length. vendor driver can
> > >              specify any string to identify a device.
> > > 
> > > When reading this attribute, it should show device version string of the
> > > device of type <type-id>. If a device does not support live migration, it
> > > should return errno.
> > > When writing a string to this attribute, it returns errno for
> > > incompatibility or returns written string length in compatibility case.
> > > If a device does not support live migration, it always returns errno.
> > > 
> > > For user space software to use:
> > > 1.
> > > Before starting live migration, user space software first reads source side
> > > mdev device's version. e.g.
> > > "#cat \
> > > /sys/bus/pci/devices/0000\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version"
> > > 00028086-193b-i915-GVTg_V5_4
> > > 
> > > 2.
> > > Then, user space software writes the source side returned version string
> > > to device version attribute in target side, and checks the return value.
> > > If a negative errno is returned in the target side, then mdev devices in
> > > source and target sides are not compatible;
> > > If a positive number is returned and it equals to the length of written
> > > string, then the two mdev devices in source and target side are compatible.
> > > e.g.
> > > (a) compatibility case
> > > "# echo 00028086-193b-i915-GVTg_V5_4 >
> > > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > > 
> > > (b) incompatibility case
> > > "#echo 00028086-193b-i915-GVTg_V5_1 >
> > > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > > -bash: echo: write error: Invalid argument
> > > 
> > > 3. if two mdev devices are compatible, user space software can start
> > > live migration, and vice versa.
> > > 
> > > Note: if a mdev device does not support live migration, it either does
> > > not provide a version attribute, or always returns errno when its version
> > > attribute is read/written.  
> > 
> > I think it would be cleaner to do the former, not supply the
> > attribute.  This seems to do the latter in the sample drivers.  Thanks,  
> Ok. you are right!
> what about just keep one sample driver to show how to do the latter,
> and let the others do the former?

I'd rather that if a vendor driver doesn't support features requiring
the version attribute that they don't implement it.  It's confusing to
developers looking at the sample driver for guidance if we have
different implementations.  Of course if you'd like to add migration
support to one of the sample drivers, that'd be very welcome.  Thanks,

Alex

> > > Cc: Alex Williamson <alex.williamson@redhat.com>
> > > Cc: Erik Skultety <eskultet@redhat.com>
> > > Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > > Cc: Cornelia Huck <cohuck@redhat.com>
> > > Cc: "Tian, Kevin" <kevin.tian@intel.com>
> > > Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
> > > Cc: "Wang, Zhi A" <zhi.a.wang@intel.com>
> > > Cc: Neo Jia <cjia@nvidia.com>
> > > Cc: Kirti Wankhede <kwankhede@nvidia.com>
> > > 
> > > Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> > > ---
> > >  Documentation/vfio-mediated-device.txt | 36 ++++++++++++++++++++++++++
> > >  samples/vfio-mdev/mbochs.c             | 17 ++++++++++++
> > >  samples/vfio-mdev/mdpy.c               | 16 ++++++++++++
> > >  samples/vfio-mdev/mtty.c               | 16 ++++++++++++
> > >  4 files changed, 85 insertions(+)
> > > 
> > > diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt
> > > index c3f69bcaf96e..bc28471c0667 100644
> > > --- a/Documentation/vfio-mediated-device.txt
> > > +++ b/Documentation/vfio-mediated-device.txt
> > > @@ -202,6 +202,7 @@ Directories and files under the sysfs for Each Physical Device
> > >    |     |   |--- available_instances
> > >    |     |   |--- device_api
> > >    |     |   |--- description
> > > +  |     |   |--- version
> > >    |     |   |--- [devices]
> > >    |     |--- [<type-id>]
> > >    |     |   |--- create
> > > @@ -209,6 +210,7 @@ Directories and files under the sysfs for Each Physical Device
> > >    |     |   |--- available_instances
> > >    |     |   |--- device_api
> > >    |     |   |--- description
> > > +  |     |   |--- version
> > >    |     |   |--- [devices]
> > >    |     |--- [<type-id>]
> > >    |          |--- create
> > > @@ -216,6 +218,7 @@ Directories and files under the sysfs for Each Physical Device
> > >    |          |--- available_instances
> > >    |          |--- device_api
> > >    |          |--- description
> > > +  |          |--- version
> > >    |          |--- [devices]
> > >  
> > >  * [mdev_supported_types]
> > > @@ -225,6 +228,8 @@ Directories and files under the sysfs for Each Physical Device
> > >    [<type-id>], device_api, and available_instances are mandatory attributes
> > >    that should be provided by vendor driver.
> > >  
> > > +  version is a mandatory attribute if a mdev device supports live migration.
> > > +
> > >  * [<type-id>]
> > >  
> > >    The [<type-id>] name is created by adding the device driver string as a prefix
> > > @@ -246,6 +251,35 @@ Directories and files under the sysfs for Each Physical Device
> > >    This attribute should show the number of devices of type <type-id> that can be
> > >    created.
> > >  
> > > +* version
> > > +
> > > +  This attribute is rw. It is used to check whether two devices are compatible
> > > +  for live migration. If this attribute is missing, then the corresponding mdev
> > > +  device is regarded as not supporting live migration.
> > > +
> > > +  It consists of two parts: common part and vendor proprietary part.
> > > +  common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
> > > +               device type. e.g., for pci device, it is
> > > +               "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > +  vendor proprietary part: this part is varied in length. vendor driver can
> > > +               specify any string to identify a device.
> > > +
> > > +  When reading this attribute, it should show device version string of the device
> > > +  of type <type-id>. If a device does not support live migration, it should
> > > +  return errno.
> > > +  When writing a string to this attribute, it returns errno for incompatibility
> > > +  or returns written string length in compatibility case. If a device does not
> > > +  support live migration, it always returns errno.
> > > +
> > > +  for example.
> > > +  # cat \
> > > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_2/version
> > > +  00028086-193b-i915-GVTg_V5_2
> > > +
> > > +  #echo 00028086-193b-i915-GVTg_V5_2 > \
> > > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> > > + -bash: echo: write error: Invalid argument
> > > +
> > >  * [device]
> > >  
> > >    This directory contains links to the devices of type <type-id> that have been
> > > @@ -327,12 +361,14 @@ card.
> > >          |   |   |-- available_instances
> > >          |   |   |-- create
> > >          |   |   |-- device_api
> > > +        |   |   |-- version
> > >          |   |   |-- devices
> > >          |   |   `-- name
> > >          |   `-- mtty-2
> > >          |       |-- available_instances
> > >          |       |-- create
> > >          |       |-- device_api
> > > +        |       |-- version
> > >          |       |-- devices
> > >          |       `-- name
> > >          |-- mtty_dev
> > > diff --git a/samples/vfio-mdev/mbochs.c b/samples/vfio-mdev/mbochs.c
> > > index b038aa9f5a70..2f5ba96b91a2 100644
> > > --- a/samples/vfio-mdev/mbochs.c
> > > +++ b/samples/vfio-mdev/mbochs.c
> > > @@ -1391,11 +1391,28 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> > >  }
> > >  MDEV_TYPE_ATTR_RO(device_api);
> > >  
> > > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > > +		char *buf)
> > > +{
> > > +	/* do not support live migration */
> > > +	return -EINVAL;
> > > +}
> > > +
> > > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > > +		const char *buf, size_t count)
> > > +{
> > > +	/* do not support live migration */
> > > +	return -EINVAL;
> > > +}
> > > +
> > > +static MDEV_TYPE_ATTR_RW(version);
> > > +
> > >  static struct attribute *mdev_types_attrs[] = {
> > >  	&mdev_type_attr_name.attr,
> > >  	&mdev_type_attr_description.attr,
> > >  	&mdev_type_attr_device_api.attr,
> > >  	&mdev_type_attr_available_instances.attr,
> > > +	&mdev_type_attr_version.attr,
> > >  	NULL,
> > >  };
> > >  
> > > diff --git a/samples/vfio-mdev/mdpy.c b/samples/vfio-mdev/mdpy.c
> > > index cc86bf6566e4..ff15fdfc7d46 100644
> > > --- a/samples/vfio-mdev/mdpy.c
> > > +++ b/samples/vfio-mdev/mdpy.c
> > > @@ -695,11 +695,27 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> > >  }
> > >  MDEV_TYPE_ATTR_RO(device_api);
> > >  
> > > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > > +		char *buf)
> > > +{
> > > +	/* do not support live migration */
> > > +	return -EINVAL;
> > > +}
> > > +
> > > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > > +		const char *buf, size_t count)
> > > +{
> > > +	/* do not support live migration */
> > > +	return -EINVAL;
> > > +}
> > > +static MDEV_TYPE_ATTR_RW(version);
> > > +
> > >  static struct attribute *mdev_types_attrs[] = {
> > >  	&mdev_type_attr_name.attr,
> > >  	&mdev_type_attr_description.attr,
> > >  	&mdev_type_attr_device_api.attr,
> > >  	&mdev_type_attr_available_instances.attr,
> > > +	&mdev_type_attr_version.attr,
> > >  	NULL,
> > >  };
> > >  
> > > diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c
> > > index 1c77c370c92f..4ae3aad3474d 100644
> > > --- a/samples/vfio-mdev/mtty.c
> > > +++ b/samples/vfio-mdev/mtty.c
> > > @@ -1390,10 +1390,26 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> > >  
> > >  MDEV_TYPE_ATTR_RO(device_api);
> > >  
> > > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > > +		char *buf)
> > > +{
> > > +	/* do not support live migration */
> > > +	return -EINVAL;
> > > +}
> > > +
> > > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > > +		const char *buf, size_t count)
> > > +{
> > > +	/* do not support live migration */
> > > +	return -EINVAL;
> > > +}
> > > +
> > > +static MDEV_TYPE_ATTR_RW(version);
> > >  static struct attribute *mdev_types_attrs[] = {
> > >  	&mdev_type_attr_name.attr,
> > >  	&mdev_type_attr_device_api.attr,
> > >  	&mdev_type_attr_available_instances.attr,
> > > +	&mdev_type_attr_version.attr,
> > >  	NULL,
> > >  };
> > >    
> > 
> > _______________________________________________
> > intel-gvt-dev mailing list
> > intel-gvt-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/intel-gvt-dev
Yan Zhao April 23, 2019, 5:41 a.m. UTC | #4
On Tue, Apr 23, 2019 at 09:21:00AM +0800, Alex Williamson wrote:
> On Mon, 22 Apr 2019 21:01:52 -0400
> Yan Zhao <yan.y.zhao@intel.com> wrote:
> 
> > On Mon, Apr 22, 2019 at 10:39:50PM +0800, Alex Williamson wrote:
> > > On Fri, 19 Apr 2019 04:35:04 -0400
> > > Yan Zhao <yan.y.zhao@intel.com> wrote:
> > >   
> > > > device version attribute in mdev sysfs is used by user space software
> > > > (e.g. libvirt) to query device compatibility for live migration of VFIO
> > > > mdev devices. This attribute is mandatory if a mdev device supports live
> > > > migration.  
> > > 
> > > The Subject: doesn't quite match what's being proposed here.
> > >    
> > > > It consists of two parts: common part and vendor proprietary part.
> > > > common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> > > >              identifies device type. e.g., for pci device, it is
> > > >              "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).  
> > > 
> > > What purpose does this serve?  If it's intended as some sort of
> > > namespace feature, shouldn't we first assume that we can only support
> > > migration to devices of the same type?  Therefore each type would
> > > already have its own namespace.  Also that would make the trailing bit
> > > of the version string listed below in the example redundant.  A vendor
> > > is still welcome to include this in their version string if they wish,
> > > but I think the string should be entirely vendor defined.
> > >  
> > hi Alex,
> > This common part is a kind of namespace.
> > Because if version string is entirely defined by vendors, I'm worried about
> > if there is a case that one vendor's version string happens to deceive and
> > interfere with another vendor's version checking?
> > e.g.
> > vendor A has a version string like: vendor id + device id + mdev type
> > vendor B has a version string like: device id + vendor id + mdev type
> > but vendor A's vendor id is 0x8086, device id is 0x1217
> > vendor B's vendor id is 0x1217, device id is 0x8086.
> > 
> > In this corner case, the two vendors may regard the two device is
> > migratable but actually they are not.
> > 
> > That's the reason for this common part that serve as a kind of namespace
> > that all vendors will comply with to avoid overlap.
> 
> If we assume that migration can only occur between matching mdev types,
> this is redundant, each type already has their own namespace.
>
hi Alex,
do you mean user space software like libvirt needs to first check whether
mdev type is matching and then check whether version is matching?

if user space software only checks version for migration, it means vendor
driver has to include mdev type in their vendor proprietary part string,
right?

Another thing is that could there be any future mdev parent driver that
applies to all mdev devices, just like vfio-pci? like Yi's vfio-pci-mdev
driver (https://lkml.org/lkml/2019/3/13/114)?

> > > > vendor proprietary part: this part is varied in length. vendor driver can
> > > >              specify any string to identify a device.
> > > > 
> > > > When reading this attribute, it should show device version string of the
> > > > device of type <type-id>. If a device does not support live migration, it
> > > > should return errno.
> > > > When writing a string to this attribute, it returns errno for
> > > > incompatibility or returns written string length in compatibility case.
> > > > If a device does not support live migration, it always returns errno.
> > > > 
> > > > For user space software to use:
> > > > 1.
> > > > Before starting live migration, user space software first reads source side
> > > > mdev device's version. e.g.
> > > > "#cat \
> > > > /sys/bus/pci/devices/0000\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version"
> > > > 00028086-193b-i915-GVTg_V5_4
> > > > 
> > > > 2.
> > > > Then, user space software writes the source side returned version string
> > > > to device version attribute in target side, and checks the return value.
> > > > If a negative errno is returned in the target side, then mdev devices in
> > > > source and target sides are not compatible;
> > > > If a positive number is returned and it equals to the length of written
> > > > string, then the two mdev devices in source and target side are compatible.
> > > > e.g.
> > > > (a) compatibility case
> > > > "# echo 00028086-193b-i915-GVTg_V5_4 >
> > > > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > > > 
> > > > (b) incompatibility case
> > > > "#echo 00028086-193b-i915-GVTg_V5_1 >
> > > > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > > > -bash: echo: write error: Invalid argument
> > > > 
> > > > 3. if two mdev devices are compatible, user space software can start
> > > > live migration, and vice versa.
> > > > 
> > > > Note: if a mdev device does not support live migration, it either does
> > > > not provide a version attribute, or always returns errno when its version
> > > > attribute is read/written.  
> > > 
> > > I think it would be cleaner to do the former, not supply the
> > > attribute.  This seems to do the latter in the sample drivers.  Thanks,  
> > Ok. you are right!
> > what about just keep one sample driver to show how to do the latter,
> > and let the others do the former?
> 
> I'd rather that if a vendor driver doesn't support features requiring
> the version attribute that they don't implement it.  It's confusing to
> developers looking at the sample driver for guidance if we have
> different implementations.  Of course if you'd like to add migration
> support to one of the sample drivers, that'd be very welcome.  Thanks,
>
Got it:)

Thanks!
Yan

> 
> > > > Cc: Alex Williamson <alex.williamson@redhat.com>
> > > > Cc: Erik Skultety <eskultet@redhat.com>
> > > > Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > > > Cc: Cornelia Huck <cohuck@redhat.com>
> > > > Cc: "Tian, Kevin" <kevin.tian@intel.com>
> > > > Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
> > > > Cc: "Wang, Zhi A" <zhi.a.wang@intel.com>
> > > > Cc: Neo Jia <cjia@nvidia.com>
> > > > Cc: Kirti Wankhede <kwankhede@nvidia.com>
> > > > 
> > > > Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> > > > ---
> > > >  Documentation/vfio-mediated-device.txt | 36 ++++++++++++++++++++++++++
> > > >  samples/vfio-mdev/mbochs.c             | 17 ++++++++++++
> > > >  samples/vfio-mdev/mdpy.c               | 16 ++++++++++++
> > > >  samples/vfio-mdev/mtty.c               | 16 ++++++++++++
> > > >  4 files changed, 85 insertions(+)
> > > > 
> > > > diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt
> > > > index c3f69bcaf96e..bc28471c0667 100644
> > > > --- a/Documentation/vfio-mediated-device.txt
> > > > +++ b/Documentation/vfio-mediated-device.txt
> > > > @@ -202,6 +202,7 @@ Directories and files under the sysfs for Each Physical Device
> > > >    |     |   |--- available_instances
> > > >    |     |   |--- device_api
> > > >    |     |   |--- description
> > > > +  |     |   |--- version
> > > >    |     |   |--- [devices]
> > > >    |     |--- [<type-id>]
> > > >    |     |   |--- create
> > > > @@ -209,6 +210,7 @@ Directories and files under the sysfs for Each Physical Device
> > > >    |     |   |--- available_instances
> > > >    |     |   |--- device_api
> > > >    |     |   |--- description
> > > > +  |     |   |--- version
> > > >    |     |   |--- [devices]
> > > >    |     |--- [<type-id>]
> > > >    |          |--- create
> > > > @@ -216,6 +218,7 @@ Directories and files under the sysfs for Each Physical Device
> > > >    |          |--- available_instances
> > > >    |          |--- device_api
> > > >    |          |--- description
> > > > +  |          |--- version
> > > >    |          |--- [devices]
> > > >  
> > > >  * [mdev_supported_types]
> > > > @@ -225,6 +228,8 @@ Directories and files under the sysfs for Each Physical Device
> > > >    [<type-id>], device_api, and available_instances are mandatory attributes
> > > >    that should be provided by vendor driver.
> > > >  
> > > > +  version is a mandatory attribute if a mdev device supports live migration.
> > > > +
> > > >  * [<type-id>]
> > > >  
> > > >    The [<type-id>] name is created by adding the device driver string as a prefix
> > > > @@ -246,6 +251,35 @@ Directories and files under the sysfs for Each Physical Device
> > > >    This attribute should show the number of devices of type <type-id> that can be
> > > >    created.
> > > >  
> > > > +* version
> > > > +
> > > > +  This attribute is rw. It is used to check whether two devices are compatible
> > > > +  for live migration. If this attribute is missing, then the corresponding mdev
> > > > +  device is regarded as not supporting live migration.
> > > > +
> > > > +  It consists of two parts: common part and vendor proprietary part.
> > > > +  common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
> > > > +               device type. e.g., for pci device, it is
> > > > +               "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > > +  vendor proprietary part: this part is varied in length. vendor driver can
> > > > +               specify any string to identify a device.
> > > > +
> > > > +  When reading this attribute, it should show device version string of the device
> > > > +  of type <type-id>. If a device does not support live migration, it should
> > > > +  return errno.
> > > > +  When writing a string to this attribute, it returns errno for incompatibility
> > > > +  or returns written string length in compatibility case. If a device does not
> > > > +  support live migration, it always returns errno.
> > > > +
> > > > +  for example.
> > > > +  # cat \
> > > > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_2/version
> > > > +  00028086-193b-i915-GVTg_V5_2
> > > > +
> > > > +  #echo 00028086-193b-i915-GVTg_V5_2 > \
> > > > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> > > > + -bash: echo: write error: Invalid argument
> > > > +
> > > >  * [device]
> > > >  
> > > >    This directory contains links to the devices of type <type-id> that have been
> > > > @@ -327,12 +361,14 @@ card.
> > > >          |   |   |-- available_instances
> > > >          |   |   |-- create
> > > >          |   |   |-- device_api
> > > > +        |   |   |-- version
> > > >          |   |   |-- devices
> > > >          |   |   `-- name
> > > >          |   `-- mtty-2
> > > >          |       |-- available_instances
> > > >          |       |-- create
> > > >          |       |-- device_api
> > > > +        |       |-- version
> > > >          |       |-- devices
> > > >          |       `-- name
> > > >          |-- mtty_dev
> > > > diff --git a/samples/vfio-mdev/mbochs.c b/samples/vfio-mdev/mbochs.c
> > > > index b038aa9f5a70..2f5ba96b91a2 100644
> > > > --- a/samples/vfio-mdev/mbochs.c
> > > > +++ b/samples/vfio-mdev/mbochs.c
> > > > @@ -1391,11 +1391,28 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> > > >  }
> > > >  MDEV_TYPE_ATTR_RO(device_api);
> > > >  
> > > > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > > > +		char *buf)
> > > > +{
> > > > +	/* do not support live migration */
> > > > +	return -EINVAL;
> > > > +}
> > > > +
> > > > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > > > +		const char *buf, size_t count)
> > > > +{
> > > > +	/* do not support live migration */
> > > > +	return -EINVAL;
> > > > +}
> > > > +
> > > > +static MDEV_TYPE_ATTR_RW(version);
> > > > +
> > > >  static struct attribute *mdev_types_attrs[] = {
> > > >  	&mdev_type_attr_name.attr,
> > > >  	&mdev_type_attr_description.attr,
> > > >  	&mdev_type_attr_device_api.attr,
> > > >  	&mdev_type_attr_available_instances.attr,
> > > > +	&mdev_type_attr_version.attr,
> > > >  	NULL,
> > > >  };
> > > >  
> > > > diff --git a/samples/vfio-mdev/mdpy.c b/samples/vfio-mdev/mdpy.c
> > > > index cc86bf6566e4..ff15fdfc7d46 100644
> > > > --- a/samples/vfio-mdev/mdpy.c
> > > > +++ b/samples/vfio-mdev/mdpy.c
> > > > @@ -695,11 +695,27 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> > > >  }
> > > >  MDEV_TYPE_ATTR_RO(device_api);
> > > >  
> > > > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > > > +		char *buf)
> > > > +{
> > > > +	/* do not support live migration */
> > > > +	return -EINVAL;
> > > > +}
> > > > +
> > > > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > > > +		const char *buf, size_t count)
> > > > +{
> > > > +	/* do not support live migration */
> > > > +	return -EINVAL;
> > > > +}
> > > > +static MDEV_TYPE_ATTR_RW(version);
> > > > +
> > > >  static struct attribute *mdev_types_attrs[] = {
> > > >  	&mdev_type_attr_name.attr,
> > > >  	&mdev_type_attr_description.attr,
> > > >  	&mdev_type_attr_device_api.attr,
> > > >  	&mdev_type_attr_available_instances.attr,
> > > > +	&mdev_type_attr_version.attr,
> > > >  	NULL,
> > > >  };
> > > >  
> > > > diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c
> > > > index 1c77c370c92f..4ae3aad3474d 100644
> > > > --- a/samples/vfio-mdev/mtty.c
> > > > +++ b/samples/vfio-mdev/mtty.c
> > > > @@ -1390,10 +1390,26 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> > > >  
> > > >  MDEV_TYPE_ATTR_RO(device_api);
> > > >  
> > > > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > > > +		char *buf)
> > > > +{
> > > > +	/* do not support live migration */
> > > > +	return -EINVAL;
> > > > +}
> > > > +
> > > > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > > > +		const char *buf, size_t count)
> > > > +{
> > > > +	/* do not support live migration */
> > > > +	return -EINVAL;
> > > > +}
> > > > +
> > > > +static MDEV_TYPE_ATTR_RW(version);
> > > >  static struct attribute *mdev_types_attrs[] = {
> > > >  	&mdev_type_attr_name.attr,
> > > >  	&mdev_type_attr_device_api.attr,
> > > >  	&mdev_type_attr_available_instances.attr,
> > > > +	&mdev_type_attr_version.attr,
> > > >  	NULL,
> > > >  };
> > > >    
> > > 
> > > _______________________________________________
> > > intel-gvt-dev mailing list
> > > intel-gvt-dev@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/intel-gvt-dev  
> 
> _______________________________________________
> intel-gvt-dev mailing list
> intel-gvt-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gvt-dev
Cornelia Huck April 23, 2019, 9:45 a.m. UTC | #5
On Tue, 23 Apr 2019 01:41:57 -0400
Yan Zhao <yan.y.zhao@intel.com> wrote:

> On Tue, Apr 23, 2019 at 09:21:00AM +0800, Alex Williamson wrote:
> > On Mon, 22 Apr 2019 21:01:52 -0400
> > Yan Zhao <yan.y.zhao@intel.com> wrote:
> >   
> > > On Mon, Apr 22, 2019 at 10:39:50PM +0800, Alex Williamson wrote:  
> > > > On Fri, 19 Apr 2019 04:35:04 -0400
> > > > Yan Zhao <yan.y.zhao@intel.com> wrote:
> > > >     
> > > > > device version attribute in mdev sysfs is used by user space software
> > > > > (e.g. libvirt) to query device compatibility for live migration of VFIO
> > > > > mdev devices. This attribute is mandatory if a mdev device supports live
> > > > > migration.    
> > > > 
> > > > The Subject: doesn't quite match what's being proposed here.
> > > >      
> > > > > It consists of two parts: common part and vendor proprietary part.
> > > > > common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> > > > >              identifies device type. e.g., for pci device, it is
> > > > >              "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).    
> > > > 
> > > > What purpose does this serve?  If it's intended as some sort of
> > > > namespace feature, shouldn't we first assume that we can only support
> > > > migration to devices of the same type?  Therefore each type would
> > > > already have its own namespace.  Also that would make the trailing bit
> > > > of the version string listed below in the example redundant.  A vendor
> > > > is still welcome to include this in their version string if they wish,
> > > > but I think the string should be entirely vendor defined.
> > > >    
> > > hi Alex,
> > > This common part is a kind of namespace.
> > > Because if version string is entirely defined by vendors, I'm worried about
> > > if there is a case that one vendor's version string happens to deceive and
> > > interfere with another vendor's version checking?
> > > e.g.
> > > vendor A has a version string like: vendor id + device id + mdev type
> > > vendor B has a version string like: device id + vendor id + mdev type
> > > but vendor A's vendor id is 0x8086, device id is 0x1217
> > > vendor B's vendor id is 0x1217, device id is 0x8086.
> > > 
> > > In this corner case, the two vendors may regard the two device is
> > > migratable but actually they are not.
> > > 
> > > That's the reason for this common part that serve as a kind of namespace
> > > that all vendors will comply with to avoid overlap.  
> > 
> > If we assume that migration can only occur between matching mdev types,
> > this is redundant, each type already has their own namespace.
> >  
> hi Alex,
> do you mean user space software like libvirt needs to first check whether
> mdev type is matching and then check whether version is matching?
> 
> if user space software only checks version for migration, it means vendor
> driver has to include mdev type in their vendor proprietary part string,
> right?

Can't userspace simply check for the driver in use and only then check
the version attribute?

> Another thing is that could there be any future mdev parent driver that
> applies to all mdev devices, just like vfio-pci? like Yi's vfio-pci-mdev
> driver (https://lkml.org/lkml/2019/3/13/114)?

Hm, I think that the vfio-pci-mdev driver then needs to expose
information regarding compatibility (and not the core).
Cornelia Huck April 23, 2019, 9:59 a.m. UTC | #6
On Fri, 19 Apr 2019 04:35:04 -0400
Yan Zhao <yan.y.zhao@intel.com> wrote:

> device version attribute in mdev sysfs is used by user space software
> (e.g. libvirt) to query device compatibility for live migration of VFIO
> mdev devices. This attribute is mandatory if a mdev device supports live
> migration.
> 
> It consists of two parts: common part and vendor proprietary part.
> common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
>              identifies device type. e.g., for pci device, it is
>              "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> vendor proprietary part: this part is varied in length. vendor driver can
>              specify any string to identify a device.
> 
> When reading this attribute, it should show device version string of the
> device of type <type-id>. If a device does not support live migration, it
> should return errno.

It might make more sense if the driver does not register the attribute
for the device in that case at all.

> When writing a string to this attribute, it returns errno for
> incompatibility or returns written string length in compatibility case.
> If a device does not support live migration, it always returns errno.
> 
> For user space software to use:
> 1.
> Before starting live migration, user space software first reads source side
> mdev device's version. e.g.
> "#cat \
> /sys/bus/pci/devices/0000\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version"
> 00028086-193b-i915-GVTg_V5_4
> 
> 2.
> Then, user space software writes the source side returned version string
> to device version attribute in target side, and checks the return value.
> If a negative errno is returned in the target side, then mdev devices in
> source and target sides are not compatible;
> If a positive number is returned and it equals to the length of written
> string, then the two mdev devices in source and target side are compatible.
> e.g.
> (a) compatibility case
> "# echo 00028086-193b-i915-GVTg_V5_4 >
> /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> 
> (b) incompatibility case
> "#echo 00028086-193b-i915-GVTg_V5_1 >
> /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> -bash: echo: write error: Invalid argument
> 
> 3. if two mdev devices are compatible, user space software can start
> live migration, and vice versa.
> 
> Note: if a mdev device does not support live migration, it either does
> not provide a version attribute, or always returns errno when its version
> attribute is read/written.
> 
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Erik Skultety <eskultet@redhat.com>
> Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> Cc: Cornelia Huck <cohuck@redhat.com>
> Cc: "Tian, Kevin" <kevin.tian@intel.com>
> Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
> Cc: "Wang, Zhi A" <zhi.a.wang@intel.com>
> Cc: Neo Jia <cjia@nvidia.com>
> Cc: Kirti Wankhede <kwankhede@nvidia.com>
> 
> Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> ---
>  Documentation/vfio-mediated-device.txt | 36 ++++++++++++++++++++++++++
>  samples/vfio-mdev/mbochs.c             | 17 ++++++++++++
>  samples/vfio-mdev/mdpy.c               | 16 ++++++++++++
>  samples/vfio-mdev/mtty.c               | 16 ++++++++++++
>  4 files changed, 85 insertions(+)
> 
> diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt
> index c3f69bcaf96e..bc28471c0667 100644
> --- a/Documentation/vfio-mediated-device.txt
> +++ b/Documentation/vfio-mediated-device.txt
> @@ -202,6 +202,7 @@ Directories and files under the sysfs for Each Physical Device
>    |     |   |--- available_instances
>    |     |   |--- device_api
>    |     |   |--- description
> +  |     |   |--- version
>    |     |   |--- [devices]
>    |     |--- [<type-id>]
>    |     |   |--- create
> @@ -209,6 +210,7 @@ Directories and files under the sysfs for Each Physical Device
>    |     |   |--- available_instances
>    |     |   |--- device_api
>    |     |   |--- description
> +  |     |   |--- version
>    |     |   |--- [devices]
>    |     |--- [<type-id>]
>    |          |--- create
> @@ -216,6 +218,7 @@ Directories and files under the sysfs for Each Physical Device
>    |          |--- available_instances
>    |          |--- device_api
>    |          |--- description
> +  |          |--- version
>    |          |--- [devices]
>  
>  * [mdev_supported_types]
> @@ -225,6 +228,8 @@ Directories and files under the sysfs for Each Physical Device
>    [<type-id>], device_api, and available_instances are mandatory attributes
>    that should be provided by vendor driver.
>  
> +  version is a mandatory attribute if a mdev device supports live migration.

What about "An mdev device wishing to support live migration must
provide the version attribute."?

> +
>  * [<type-id>]
>  
>    The [<type-id>] name is created by adding the device driver string as a prefix
> @@ -246,6 +251,35 @@ Directories and files under the sysfs for Each Physical Device
>    This attribute should show the number of devices of type <type-id> that can be
>    created.
>  
> +* version
> +
> +  This attribute is rw. It is used to check whether two devices are compatible
> +  for live migration. If this attribute is missing, then the corresponding mdev
> +  device is regarded as not supporting live migration.
> +
> +  It consists of two parts: common part and vendor proprietary part.
> +  common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
> +               device type. e.g., for pci device, it is
> +               "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> +  vendor proprietary part: this part is varied in length. vendor driver can
> +               specify any string to identify a device.
> +
> +  When reading this attribute, it should show device version string of the device
> +  of type <type-id>. If a device does not support live migration, it should
> +  return errno.
> +  When writing a string to this attribute, it returns errno for incompatibility
> +  or returns written string length in compatibility case. If a device does not
> +  support live migration, it always returns errno.

I'm not sure whether a device that does not support live migration
should expose this attribute in the first place. Or is that to cover
cases where a driver supports live migration only for some of the
devices it supports?

Also, I'm not sure if a string that has to be parsed is a good idea...
is this 'version' attribute supposed to convey some human-readable
information as well? The procedure you describe for compatibility
checking does the checking within the vendor driver which I would
expect to have a table/rules for that anyway.

I think you should also specify which errno writing an incompatible id
is supposed to return (probably best something different than if the
device does not support live migration at all, if we stick with
creating the attribute in that case.)

> +
> +  for example.
> +  # cat \
> + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_2/version
> +  00028086-193b-i915-GVTg_V5_2
> +
> +  #echo 00028086-193b-i915-GVTg_V5_2 > \
> + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> + -bash: echo: write error: Invalid argument
> +
>  * [device]
>  
>    This directory contains links to the devices of type <type-id> that have been
Daniel P. Berrangé April 23, 2019, 10:24 a.m. UTC | #7
On Tue, Apr 23, 2019 at 01:41:57AM -0400, Yan Zhao wrote:
> On Tue, Apr 23, 2019 at 09:21:00AM +0800, Alex Williamson wrote:
> > On Mon, 22 Apr 2019 21:01:52 -0400
> > Yan Zhao <yan.y.zhao@intel.com> wrote:
> > 
> > > On Mon, Apr 22, 2019 at 10:39:50PM +0800, Alex Williamson wrote:
> > > > On Fri, 19 Apr 2019 04:35:04 -0400
> > > > Yan Zhao <yan.y.zhao@intel.com> wrote:
> > > >   
> > > > > device version attribute in mdev sysfs is used by user space software
> > > > > (e.g. libvirt) to query device compatibility for live migration of VFIO
> > > > > mdev devices. This attribute is mandatory if a mdev device supports live
> > > > > migration.  
> > > > 
> > > > The Subject: doesn't quite match what's being proposed here.
> > > >    
> > > > > It consists of two parts: common part and vendor proprietary part.
> > > > > common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> > > > >              identifies device type. e.g., for pci device, it is
> > > > >              "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).  
> > > > 
> > > > What purpose does this serve?  If it's intended as some sort of
> > > > namespace feature, shouldn't we first assume that we can only support
> > > > migration to devices of the same type?  Therefore each type would
> > > > already have its own namespace.  Also that would make the trailing bit
> > > > of the version string listed below in the example redundant.  A vendor
> > > > is still welcome to include this in their version string if they wish,
> > > > but I think the string should be entirely vendor defined.
> > > >  
> > > hi Alex,
> > > This common part is a kind of namespace.
> > > Because if version string is entirely defined by vendors, I'm worried about
> > > if there is a case that one vendor's version string happens to deceive and
> > > interfere with another vendor's version checking?
> > > e.g.
> > > vendor A has a version string like: vendor id + device id + mdev type
> > > vendor B has a version string like: device id + vendor id + mdev type
> > > but vendor A's vendor id is 0x8086, device id is 0x1217
> > > vendor B's vendor id is 0x1217, device id is 0x8086.
> > > 
> > > In this corner case, the two vendors may regard the two device is
> > > migratable but actually they are not.
> > > 
> > > That's the reason for this common part that serve as a kind of namespace
> > > that all vendors will comply with to avoid overlap.
> > 
> > If we assume that migration can only occur between matching mdev types,
> > this is redundant, each type already has their own namespace.
> >
> hi Alex,
> do you mean user space software like libvirt needs to first check whether
> mdev type is matching and then check whether version is matching?

I would expect that libvirt (or other mgmt apps) will always first
check that the vendor id, device id, mdev type all match. So for the
version string it should suffice to be a "normal" numeric value.

Essentially version string just needs to be there to distinguish
revisions of the same mdev type over time.


Regards,
Daniel
Daniel P. Berrangé April 23, 2019, 10:39 a.m. UTC | #8
On Fri, Apr 19, 2019 at 04:35:04AM -0400, Yan Zhao wrote:
> device version attribute in mdev sysfs is used by user space software
> (e.g. libvirt) to query device compatibility for live migration of VFIO
> mdev devices. This attribute is mandatory if a mdev device supports live
> migration.
> 
> It consists of two parts: common part and vendor proprietary part.
> common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
>              identifies device type. e.g., for pci device, it is
>              "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> vendor proprietary part: this part is varied in length. vendor driver can
>              specify any string to identify a device.
> 
> When reading this attribute, it should show device version string of the
> device of type <type-id>. If a device does not support live migration, it
> should return errno.
> When writing a string to this attribute, it returns errno for
> incompatibility or returns written string length in compatibility case.
> If a device does not support live migration, it always returns errno.
> 
> For user space software to use:
> 1.
> Before starting live migration, user space software first reads source side
> mdev device's version. e.g.
> "#cat \
> /sys/bus/pci/devices/0000\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version"
> 00028086-193b-i915-GVTg_V5_4
> 
> 2.
> Then, user space software writes the source side returned version string
> to device version attribute in target side, and checks the return value.
> If a negative errno is returned in the target side, then mdev devices in
> source and target sides are not compatible;
> If a positive number is returned and it equals to the length of written
> string, then the two mdev devices in source and target side are compatible.
> e.g.
> (a) compatibility case
> "# echo 00028086-193b-i915-GVTg_V5_4 >
> /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> 
> (b) incompatibility case
> "#echo 00028086-193b-i915-GVTg_V5_1 >
> /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> -bash: echo: write error: Invalid argument

What you have written here seems to imply that each mdev type is able to
support many different versions at the same time. Writing a version into
this sysfs file then chooses which of the many versions to actually use.

This is good as it allows for live migration across driver software upgrades.

A mgmt application may well want to know what versions are supported for an
mdev type *before* starting a migration. A mgmt app can query all the 100's
of hosts it knows and thus figure out which are valid to use as the target
of a migration.

IOW, we want to avoid the ever hitting the incompatibility case in the
first place, by only choosing to migrate to a host that we know is going
to be compatible.

This would need some kind of way to report the full list of supported
versions against the mdev supported types on the host.


> 3. if two mdev devices are compatible, user space software can start
> live migration, and vice versa.
> 
> Note: if a mdev device does not support live migration, it either does
> not provide a version attribute, or always returns errno when its version
> attribute is read/written.
> 
> Cc: Alex Williamson <alex.williamson@redhat.com>
> Cc: Erik Skultety <eskultet@redhat.com>
> Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> Cc: Cornelia Huck <cohuck@redhat.com>
> Cc: "Tian, Kevin" <kevin.tian@intel.com>
> Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
> Cc: "Wang, Zhi A" <zhi.a.wang@intel.com>
> Cc: Neo Jia <cjia@nvidia.com>
> Cc: Kirti Wankhede <kwankhede@nvidia.com>
> 
> Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> ---
>  Documentation/vfio-mediated-device.txt | 36 ++++++++++++++++++++++++++
>  samples/vfio-mdev/mbochs.c             | 17 ++++++++++++
>  samples/vfio-mdev/mdpy.c               | 16 ++++++++++++
>  samples/vfio-mdev/mtty.c               | 16 ++++++++++++
>  4 files changed, 85 insertions(+)
> 
> diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt
> index c3f69bcaf96e..bc28471c0667 100644
> --- a/Documentation/vfio-mediated-device.txt
> +++ b/Documentation/vfio-mediated-device.txt
> @@ -202,6 +202,7 @@ Directories and files under the sysfs for Each Physical Device
>    |     |   |--- available_instances
>    |     |   |--- device_api
>    |     |   |--- description
> +  |     |   |--- version
>    |     |   |--- [devices]
>    |     |--- [<type-id>]
>    |     |   |--- create
> @@ -209,6 +210,7 @@ Directories and files under the sysfs for Each Physical Device
>    |     |   |--- available_instances
>    |     |   |--- device_api
>    |     |   |--- description
> +  |     |   |--- version
>    |     |   |--- [devices]
>    |     |--- [<type-id>]
>    |          |--- create
> @@ -216,6 +218,7 @@ Directories and files under the sysfs for Each Physical Device
>    |          |--- available_instances
>    |          |--- device_api
>    |          |--- description
> +  |          |--- version
>    |          |--- [devices]
>  
>  * [mdev_supported_types]
> @@ -225,6 +228,8 @@ Directories and files under the sysfs for Each Physical Device
>    [<type-id>], device_api, and available_instances are mandatory attributes
>    that should be provided by vendor driver.
>  
> +  version is a mandatory attribute if a mdev device supports live migration.
> +
>  * [<type-id>]
>  
>    The [<type-id>] name is created by adding the device driver string as a prefix
> @@ -246,6 +251,35 @@ Directories and files under the sysfs for Each Physical Device
>    This attribute should show the number of devices of type <type-id> that can be
>    created.
>  
> +* version
> +
> +  This attribute is rw. It is used to check whether two devices are compatible
> +  for live migration. If this attribute is missing, then the corresponding mdev
> +  device is regarded as not supporting live migration.
> +
> +  It consists of two parts: common part and vendor proprietary part.
> +  common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
> +               device type. e.g., for pci device, it is
> +               "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> +  vendor proprietary part: this part is varied in length. vendor driver can
> +               specify any string to identify a device.
> +
> +  When reading this attribute, it should show device version string of the device
> +  of type <type-id>. If a device does not support live migration, it should
> +  return errno.
> +  When writing a string to this attribute, it returns errno for incompatibility
> +  or returns written string length in compatibility case. If a device does not
> +  support live migration, it always returns errno.
> +
> +  for example.
> +  # cat \
> + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_2/version
> +  00028086-193b-i915-GVTg_V5_2
> +
> +  #echo 00028086-193b-i915-GVTg_V5_2 > \
> + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> + -bash: echo: write error: Invalid argument
> +

IIUC this path is against the physical device. IOW, the mgmt app would have
to first write to the "version" file to choose a version, and then write to
the "create" file to actually create an virtual device. This has the obvious
concurrency problem if multiple devices are being created at the same time
and distinct versions for each device are required. There would need to be
a locking scheme defined to ensure safety.

Wouldn't it be better if we can pass the desired version when we write to
the "create" file, so that we avoid any concurrent usage problems. "version"
could be just a read-only file with a *list* of supported versions.

eg

  $ cat /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
  5.0
  5.1
  5.2

  $ echo "83b8f4f2-509f-382f-3c1e-e6bfe0fa1001;version=5.2" >
      /sys/devices/virtual/mtty/mtty/mdev_supported_types/mtty-2/create


Regards,
Daniel
Alex Williamson April 23, 2019, 12:35 p.m. UTC | #9
On Tue, 23 Apr 2019 11:39:39 +0100
Daniel P. Berrangé <berrange@redhat.com> wrote:

> On Fri, Apr 19, 2019 at 04:35:04AM -0400, Yan Zhao wrote:
> > device version attribute in mdev sysfs is used by user space software
> > (e.g. libvirt) to query device compatibility for live migration of VFIO
> > mdev devices. This attribute is mandatory if a mdev device supports live
> > migration.
> > 
> > It consists of two parts: common part and vendor proprietary part.
> > common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> >              identifies device type. e.g., for pci device, it is
> >              "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > vendor proprietary part: this part is varied in length. vendor driver can
> >              specify any string to identify a device.
> > 
> > When reading this attribute, it should show device version string of the
> > device of type <type-id>. If a device does not support live migration, it
> > should return errno.
> > When writing a string to this attribute, it returns errno for
> > incompatibility or returns written string length in compatibility case.
> > If a device does not support live migration, it always returns errno.
> > 
> > For user space software to use:
> > 1.
> > Before starting live migration, user space software first reads source side
> > mdev device's version. e.g.
> > "#cat \
> > /sys/bus/pci/devices/0000\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version"
> > 00028086-193b-i915-GVTg_V5_4
> > 
> > 2.
> > Then, user space software writes the source side returned version string
> > to device version attribute in target side, and checks the return value.
> > If a negative errno is returned in the target side, then mdev devices in
> > source and target sides are not compatible;
> > If a positive number is returned and it equals to the length of written
> > string, then the two mdev devices in source and target side are compatible.
> > e.g.
> > (a) compatibility case
> > "# echo 00028086-193b-i915-GVTg_V5_4 >
> > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > 
> > (b) incompatibility case
> > "#echo 00028086-193b-i915-GVTg_V5_1 >
> > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > -bash: echo: write error: Invalid argument  
> 
> What you have written here seems to imply that each mdev type is able to
> support many different versions at the same time. Writing a version into
> this sysfs file then chooses which of the many versions to actually use.

That's not actually what's being proposed here, reading the version
attribute provides an opaque string.  Writing the version attribute
from a different system reports whether that version string is
compatible for migration.  IOW, we're not creating a device of a given
version, we're only reporting if that version is compatible with this
mdev.  The version is intentionally opaque to allow vendor specific
nuances, therefore it's rather impractical create the sort of version
list requested below.

> This is good as it allows for live migration across driver software upgrades.
> 
> A mgmt application may well want to know what versions are supported for an
> mdev type *before* starting a migration. A mgmt app can query all the 100's
> of hosts it knows and thus figure out which are valid to use as the target
> of a migration.
> 
> IOW, we want to avoid the ever hitting the incompatibility case in the
> first place, by only choosing to migrate to a host that we know is going
> to be compatible.

This is provided, the migration source reports a version string which
can be queried against the version attribute on other hosts to insure
compatibility prior to migration.

> This would need some kind of way to report the full list of supported
> versions against the mdev supported types on the host.

This is not provided, matching versions are vendor specific.

> > 3. if two mdev devices are compatible, user space software can start
> > live migration, and vice versa.
> > 
> > Note: if a mdev device does not support live migration, it either does
> > not provide a version attribute, or always returns errno when its version
> > attribute is read/written.
> > 
> > Cc: Alex Williamson <alex.williamson@redhat.com>
> > Cc: Erik Skultety <eskultet@redhat.com>
> > Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > Cc: Cornelia Huck <cohuck@redhat.com>
> > Cc: "Tian, Kevin" <kevin.tian@intel.com>
> > Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
> > Cc: "Wang, Zhi A" <zhi.a.wang@intel.com>
> > Cc: Neo Jia <cjia@nvidia.com>
> > Cc: Kirti Wankhede <kwankhede@nvidia.com>
> > 
> > Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> > ---
> >  Documentation/vfio-mediated-device.txt | 36 ++++++++++++++++++++++++++
> >  samples/vfio-mdev/mbochs.c             | 17 ++++++++++++
> >  samples/vfio-mdev/mdpy.c               | 16 ++++++++++++
> >  samples/vfio-mdev/mtty.c               | 16 ++++++++++++
> >  4 files changed, 85 insertions(+)
> > 
> > diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt
> > index c3f69bcaf96e..bc28471c0667 100644
> > --- a/Documentation/vfio-mediated-device.txt
> > +++ b/Documentation/vfio-mediated-device.txt
> > @@ -202,6 +202,7 @@ Directories and files under the sysfs for Each Physical Device
> >    |     |   |--- available_instances
> >    |     |   |--- device_api
> >    |     |   |--- description
> > +  |     |   |--- version
> >    |     |   |--- [devices]
> >    |     |--- [<type-id>]
> >    |     |   |--- create
> > @@ -209,6 +210,7 @@ Directories and files under the sysfs for Each Physical Device
> >    |     |   |--- available_instances
> >    |     |   |--- device_api
> >    |     |   |--- description
> > +  |     |   |--- version
> >    |     |   |--- [devices]
> >    |     |--- [<type-id>]
> >    |          |--- create
> > @@ -216,6 +218,7 @@ Directories and files under the sysfs for Each Physical Device
> >    |          |--- available_instances
> >    |          |--- device_api
> >    |          |--- description
> > +  |          |--- version
> >    |          |--- [devices]
> >  
> >  * [mdev_supported_types]
> > @@ -225,6 +228,8 @@ Directories and files under the sysfs for Each Physical Device
> >    [<type-id>], device_api, and available_instances are mandatory attributes
> >    that should be provided by vendor driver.
> >  
> > +  version is a mandatory attribute if a mdev device supports live migration.
> > +
> >  * [<type-id>]
> >  
> >    The [<type-id>] name is created by adding the device driver string as a prefix
> > @@ -246,6 +251,35 @@ Directories and files under the sysfs for Each Physical Device
> >    This attribute should show the number of devices of type <type-id> that can be
> >    created.
> >  
> > +* version
> > +
> > +  This attribute is rw. It is used to check whether two devices are compatible
> > +  for live migration. If this attribute is missing, then the corresponding mdev
> > +  device is regarded as not supporting live migration.
> > +
> > +  It consists of two parts: common part and vendor proprietary part.
> > +  common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
> > +               device type. e.g., for pci device, it is
> > +               "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > +  vendor proprietary part: this part is varied in length. vendor driver can
> > +               specify any string to identify a device.
> > +
> > +  When reading this attribute, it should show device version string of the device
> > +  of type <type-id>. If a device does not support live migration, it should
> > +  return errno.
> > +  When writing a string to this attribute, it returns errno for incompatibility
> > +  or returns written string length in compatibility case. If a device does not
> > +  support live migration, it always returns errno.
> > +
> > +  for example.
> > +  # cat \
> > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_2/version
> > +  00028086-193b-i915-GVTg_V5_2
> > +
> > +  #echo 00028086-193b-i915-GVTg_V5_2 > \
> > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> > + -bash: echo: write error: Invalid argument
> > +  
> 
> IIUC this path is against the physical device. IOW, the mgmt app would have
> to first write to the "version" file to choose a version, and then write to
> the "create" file to actually create an virtual device. This has the obvious
> concurrency problem if multiple devices are being created at the same time
> and distinct versions for each device are required. There would need to be
> a locking scheme defined to ensure safety.

"Create a device of a given version" is not an intended feature of this
interface aiui.  Writing the version attribute only indicates
migration compatibility with a binary result.
 
> Wouldn't it be better if we can pass the desired version when we write to
> the "create" file, so that we avoid any concurrent usage problems. "version"
> could be just a read-only file with a *list* of supported versions.
> 
> eg
> 
>   $ cat /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
>   5.0
>   5.1
>   5.2
> 
>   $ echo "83b8f4f2-509f-382f-3c1e-e6bfe0fa1001;version=5.2" >
>       /sys/devices/virtual/mtty/mtty/mdev_supported_types/mtty-2/create

This is reminiscent of the proposed aggregation support, but again,
this sort of feature is not intended here.  It's no expected that any
vendor driver would support creating device types of different
versions, but they may support migration from different versions.
Thanks,

Alex
Daniel P. Berrangé April 23, 2019, 1:44 p.m. UTC | #10
On Tue, Apr 23, 2019 at 06:35:40AM -0600, Alex Williamson wrote:
> On Tue, 23 Apr 2019 11:39:39 +0100
> Daniel P. Berrangé <berrange@redhat.com> wrote:
> 
> > On Fri, Apr 19, 2019 at 04:35:04AM -0400, Yan Zhao wrote:
> > > +* version
> > > +
> > > +  This attribute is rw. It is used to check whether two devices are compatible
> > > +  for live migration. If this attribute is missing, then the corresponding mdev
> > > +  device is regarded as not supporting live migration.
> > > +
> > > +  It consists of two parts: common part and vendor proprietary part.
> > > +  common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
> > > +               device type. e.g., for pci device, it is
> > > +               "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > +  vendor proprietary part: this part is varied in length. vendor driver can
> > > +               specify any string to identify a device.
> > > +
> > > +  When reading this attribute, it should show device version string of the device
> > > +  of type <type-id>. If a device does not support live migration, it should
> > > +  return errno.
> > > +  When writing a string to this attribute, it returns errno for incompatibility
> > > +  or returns written string length in compatibility case. If a device does not
> > > +  support live migration, it always returns errno.
> > > +
> > > +  for example.
> > > +  # cat \
> > > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_2/version
> > > +  00028086-193b-i915-GVTg_V5_2
> > > +
> > > +  #echo 00028086-193b-i915-GVTg_V5_2 > \
> > > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> > > + -bash: echo: write error: Invalid argument
> > > +  
> > 
> > IIUC this path is against the physical device. IOW, the mgmt app would have
> > to first write to the "version" file to choose a version, and then write to
> > the "create" file to actually create an virtual device. This has the obvious
> > concurrency problem if multiple devices are being created at the same time
> > and distinct versions for each device are required. There would need to be
> > a locking scheme defined to ensure safety.
> 
> "Create a device of a given version" is not an intended feature of this
> interface aiui.  Writing the version attribute only indicates
> migration compatibility with a binary result.
>  
> > Wouldn't it be better if we can pass the desired version when we write to
> > the "create" file, so that we avoid any concurrent usage problems. "version"
> > could be just a read-only file with a *list* of supported versions.
> > 
> > eg
> > 
> >   $ cat /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> >   5.0
> >   5.1
> >   5.2
> > 
> >   $ echo "83b8f4f2-509f-382f-3c1e-e6bfe0fa1001;version=5.2" >
> >       /sys/devices/virtual/mtty/mtty/mdev_supported_types/mtty-2/create
> 
> This is reminiscent of the proposed aggregation support, but again,
> this sort of feature is not intended here.  It's no expected that any
> vendor driver would support creating device types of different
> versions, but they may support migration from different versions.

Hmm, that's a subtle distinction I wasn't seeing the patch series.
IIUC from what you're saying, a device can be created with version
"C", but for an incoming migration it can (potentially) accept
serialized state matching any of versions "A", "B" or "C".

That is sufficient if migration is being used as a host upgrade
tool, to move from OS release N to N + 1.

It wouldn't cover the case where you need to support backwards
migration too. eg if you have a mixture of hosts with release
N and N + 1 and want to make sure that VMs can always move
betweeen any host.  That would require that you can create
mdevs with the lowest common denominator version, not solely
the most recent.

In QEMU this is done by mgmt applications picking a versioned
machine type for QEMU that is older than most recent.

Regards,
Daniel
Alex Williamson April 23, 2019, 2:48 p.m. UTC | #11
On Tue, 23 Apr 2019 14:44:00 +0100
Daniel P. Berrangé <berrange@redhat.com> wrote:

> On Tue, Apr 23, 2019 at 06:35:40AM -0600, Alex Williamson wrote:
> > On Tue, 23 Apr 2019 11:39:39 +0100
> > Daniel P. Berrangé <berrange@redhat.com> wrote:
> >   
> > > On Fri, Apr 19, 2019 at 04:35:04AM -0400, Yan Zhao wrote:  
> > > > +* version
> > > > +
> > > > +  This attribute is rw. It is used to check whether two devices are compatible
> > > > +  for live migration. If this attribute is missing, then the corresponding mdev
> > > > +  device is regarded as not supporting live migration.
> > > > +
> > > > +  It consists of two parts: common part and vendor proprietary part.
> > > > +  common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
> > > > +               device type. e.g., for pci device, it is
> > > > +               "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > > +  vendor proprietary part: this part is varied in length. vendor driver can
> > > > +               specify any string to identify a device.
> > > > +
> > > > +  When reading this attribute, it should show device version string of the device
> > > > +  of type <type-id>. If a device does not support live migration, it should
> > > > +  return errno.
> > > > +  When writing a string to this attribute, it returns errno for incompatibility
> > > > +  or returns written string length in compatibility case. If a device does not
> > > > +  support live migration, it always returns errno.
> > > > +
> > > > +  for example.
> > > > +  # cat \
> > > > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_2/version
> > > > +  00028086-193b-i915-GVTg_V5_2
> > > > +
> > > > +  #echo 00028086-193b-i915-GVTg_V5_2 > \
> > > > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> > > > + -bash: echo: write error: Invalid argument
> > > > +    
> > > 
> > > IIUC this path is against the physical device. IOW, the mgmt app would have
> > > to first write to the "version" file to choose a version, and then write to
> > > the "create" file to actually create an virtual device. This has the obvious
> > > concurrency problem if multiple devices are being created at the same time
> > > and distinct versions for each device are required. There would need to be
> > > a locking scheme defined to ensure safety.  
> > 
> > "Create a device of a given version" is not an intended feature of this
> > interface aiui.  Writing the version attribute only indicates
> > migration compatibility with a binary result.
> >    
> > > Wouldn't it be better if we can pass the desired version when we write to
> > > the "create" file, so that we avoid any concurrent usage problems. "version"
> > > could be just a read-only file with a *list* of supported versions.
> > > 
> > > eg
> > > 
> > >   $ cat /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> > >   5.0
> > >   5.1
> > >   5.2
> > > 
> > >   $ echo "83b8f4f2-509f-382f-3c1e-e6bfe0fa1001;version=5.2" >
> > >       /sys/devices/virtual/mtty/mtty/mdev_supported_types/mtty-2/create  
> > 
> > This is reminiscent of the proposed aggregation support, but again,
> > this sort of feature is not intended here.  It's no expected that any
> > vendor driver would support creating device types of different
> > versions, but they may support migration from different versions.  
> 
> Hmm, that's a subtle distinction I wasn't seeing the patch series.
> IIUC from what you're saying, a device can be created with version
> "C", but for an incoming migration it can (potentially) accept
> serialized state matching any of versions "A", "B" or "C".
> 
> That is sufficient if migration is being used as a host upgrade
> tool, to move from OS release N to N + 1.
> 
> It wouldn't cover the case where you need to support backwards
> migration too. eg if you have a mixture of hosts with release
> N and N + 1 and want to make sure that VMs can always move
> betweeen any host.  That would require that you can create
> mdevs with the lowest common denominator version, not solely
> the most recent.
> 
> In QEMU this is done by mgmt applications picking a versioned
> machine type for QEMU that is older than most recent.

I suppose we'd need to determine how important that is, this is just a
device after all, there are always fallback mechanisms via hotplug.
There are a lot of pieces that need to line up for backwards migration
including support for it at the individual vendor driver.  Nothing we
design into the API is going to require vendor drivers to support
backwards migration and we already have some vendors requiring host and
guest driver alignment.  Specifying a version with a create syntax as
you've provided is reasonable, but this also balloons into whole
tangential interface providing information regarding what versions a
vendor driver is able to create, because presumably management tools
wouldn't prefer a try-and-see type interface.  I believe an intentional
aspect of the proposal here is that we don't need to provide a list of
compatible versions, that's handled internally to the vendor driver.

I don't know if we could start with the proposal here and later add
these sorts of features.  Ideas?  Thanks,

Alex
Daniel P. Berrangé April 23, 2019, 2:57 p.m. UTC | #12
On Tue, Apr 23, 2019 at 08:48:52AM -0600, Alex Williamson wrote:
> On Tue, 23 Apr 2019 14:44:00 +0100
> Daniel P. Berrangé <berrange@redhat.com> wrote:
> 
> > On Tue, Apr 23, 2019 at 06:35:40AM -0600, Alex Williamson wrote:
> > > On Tue, 23 Apr 2019 11:39:39 +0100
> > > Daniel P. Berrangé <berrange@redhat.com> wrote:
> > >   
> > > > On Fri, Apr 19, 2019 at 04:35:04AM -0400, Yan Zhao wrote:  
> > > > > +* version
> > > > > +
> > > > > +  This attribute is rw. It is used to check whether two devices are compatible
> > > > > +  for live migration. If this attribute is missing, then the corresponding mdev
> > > > > +  device is regarded as not supporting live migration.
> > > > > +
> > > > > +  It consists of two parts: common part and vendor proprietary part.
> > > > > +  common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
> > > > > +               device type. e.g., for pci device, it is
> > > > > +               "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > > > +  vendor proprietary part: this part is varied in length. vendor driver can
> > > > > +               specify any string to identify a device.
> > > > > +
> > > > > +  When reading this attribute, it should show device version string of the device
> > > > > +  of type <type-id>. If a device does not support live migration, it should
> > > > > +  return errno.
> > > > > +  When writing a string to this attribute, it returns errno for incompatibility
> > > > > +  or returns written string length in compatibility case. If a device does not
> > > > > +  support live migration, it always returns errno.
> > > > > +
> > > > > +  for example.
> > > > > +  # cat \
> > > > > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_2/version
> > > > > +  00028086-193b-i915-GVTg_V5_2
> > > > > +
> > > > > +  #echo 00028086-193b-i915-GVTg_V5_2 > \
> > > > > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> > > > > + -bash: echo: write error: Invalid argument
> > > > > +    
> > > > 
> > > > IIUC this path is against the physical device. IOW, the mgmt app would have
> > > > to first write to the "version" file to choose a version, and then write to
> > > > the "create" file to actually create an virtual device. This has the obvious
> > > > concurrency problem if multiple devices are being created at the same time
> > > > and distinct versions for each device are required. There would need to be
> > > > a locking scheme defined to ensure safety.  
> > > 
> > > "Create a device of a given version" is not an intended feature of this
> > > interface aiui.  Writing the version attribute only indicates
> > > migration compatibility with a binary result.
> > >    
> > > > Wouldn't it be better if we can pass the desired version when we write to
> > > > the "create" file, so that we avoid any concurrent usage problems. "version"
> > > > could be just a read-only file with a *list* of supported versions.
> > > > 
> > > > eg
> > > > 
> > > >   $ cat /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> > > >   5.0
> > > >   5.1
> > > >   5.2
> > > > 
> > > >   $ echo "83b8f4f2-509f-382f-3c1e-e6bfe0fa1001;version=5.2" >
> > > >       /sys/devices/virtual/mtty/mtty/mdev_supported_types/mtty-2/create  
> > > 
> > > This is reminiscent of the proposed aggregation support, but again,
> > > this sort of feature is not intended here.  It's no expected that any
> > > vendor driver would support creating device types of different
> > > versions, but they may support migration from different versions.  
> > 
> > Hmm, that's a subtle distinction I wasn't seeing the patch series.
> > IIUC from what you're saying, a device can be created with version
> > "C", but for an incoming migration it can (potentially) accept
> > serialized state matching any of versions "A", "B" or "C".
> > 
> > That is sufficient if migration is being used as a host upgrade
> > tool, to move from OS release N to N + 1.
> > 
> > It wouldn't cover the case where you need to support backwards
> > migration too. eg if you have a mixture of hosts with release
> > N and N + 1 and want to make sure that VMs can always move
> > betweeen any host.  That would require that you can create
> > mdevs with the lowest common denominator version, not solely
> > the most recent.
> > 
> > In QEMU this is done by mgmt applications picking a versioned
> > machine type for QEMU that is older than most recent.
> 
> I suppose we'd need to determine how important that is, this is just a
> device after all, there are always fallback mechanisms via hotplug.
> There are a lot of pieces that need to line up for backwards migration
> including support for it at the individual vendor driver.  Nothing we
> design into the API is going to require vendor drivers to support
> backwards migration and we already have some vendors requiring host and
> guest driver alignment.  Specifying a version with a create syntax as
> you've provided is reasonable, but this also balloons into whole
> tangential interface providing information regarding what versions a
> vendor driver is able to create, because presumably management tools
> wouldn't prefer a try-and-see type interface.  I believe an intentional
> aspect of the proposal here is that we don't need to provide a list of
> compatible versions, that's handled internally to the vendor driver.

Worse is that multi-versions backwards migration also balloons the
testing matrix which is particularly time consuming.

To be clear I'm /not/ saying that multi-versions backwards migration
is a must-have. I just want to be sure we understand the limitations
of what's proposed and that we're happy with them.

Certainly forwards migration is the really big win here.

Backwards migration is more of a nice-to-have and only worth doing if
we expect people are willing to actually do enough testing to make
it reliable. Claiming to support backwards migration and then
never testing it works is even worse than not supporting it, because
it will be a rarely exercised code path comparatively speaking, so
easy to bit-rot if not well tested.

Regards,
Daniel
Alex Williamson April 23, 2019, 3:02 p.m. UTC | #13
On Tue, 23 Apr 2019 01:41:57 -0400
Yan Zhao <yan.y.zhao@intel.com> wrote:

> On Tue, Apr 23, 2019 at 09:21:00AM +0800, Alex Williamson wrote:
> > On Mon, 22 Apr 2019 21:01:52 -0400
> > Yan Zhao <yan.y.zhao@intel.com> wrote:
> >   
> > > On Mon, Apr 22, 2019 at 10:39:50PM +0800, Alex Williamson wrote:  
> > > > On Fri, 19 Apr 2019 04:35:04 -0400
> > > > Yan Zhao <yan.y.zhao@intel.com> wrote:
> > > >     
> > > > > device version attribute in mdev sysfs is used by user space software
> > > > > (e.g. libvirt) to query device compatibility for live migration of VFIO
> > > > > mdev devices. This attribute is mandatory if a mdev device supports live
> > > > > migration.    
> > > > 
> > > > The Subject: doesn't quite match what's being proposed here.
> > > >      
> > > > > It consists of two parts: common part and vendor proprietary part.
> > > > > common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> > > > >              identifies device type. e.g., for pci device, it is
> > > > >              "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).    
> > > > 
> > > > What purpose does this serve?  If it's intended as some sort of
> > > > namespace feature, shouldn't we first assume that we can only support
> > > > migration to devices of the same type?  Therefore each type would
> > > > already have its own namespace.  Also that would make the trailing bit
> > > > of the version string listed below in the example redundant.  A vendor
> > > > is still welcome to include this in their version string if they wish,
> > > > but I think the string should be entirely vendor defined.
> > > >    
> > > hi Alex,
> > > This common part is a kind of namespace.
> > > Because if version string is entirely defined by vendors, I'm worried about
> > > if there is a case that one vendor's version string happens to deceive and
> > > interfere with another vendor's version checking?
> > > e.g.
> > > vendor A has a version string like: vendor id + device id + mdev type
> > > vendor B has a version string like: device id + vendor id + mdev type
> > > but vendor A's vendor id is 0x8086, device id is 0x1217
> > > vendor B's vendor id is 0x1217, device id is 0x8086.
> > > 
> > > In this corner case, the two vendors may regard the two device is
> > > migratable but actually they are not.
> > > 
> > > That's the reason for this common part that serve as a kind of namespace
> > > that all vendors will comply with to avoid overlap.  
> > 
> > If we assume that migration can only occur between matching mdev types,
> > this is redundant, each type already has their own namespace.
> >  
> hi Alex,
> do you mean user space software like libvirt needs to first check whether
> mdev type is matching and then check whether version is matching?

Yes.
 
> if user space software only checks version for migration, it means vendor
> driver has to include mdev type in their vendor proprietary part string,
> right?

Userspace attempting to migrate an nvidia-64 to an i915-GVT_V5_4 would
be a failure on the part of the user.

> Another thing is that could there be any future mdev parent driver that
> applies to all mdev devices, just like vfio-pci? like Yi's vfio-pci-mdev
> driver (https://lkml.org/lkml/2019/3/13/114)?

For starters, this is just a sample driver from which vendor specific
mdev drivers could be forked to support these features, but
additionally, the type is defined by the vendor driver, so even a meta
driver like vfio-pci-mdev could create types like
"vfio-pci-mdev-8086_10c9_abcdef" if it wanted to provide that specific
device.  The "vfio-pci-mdev-type1" is just a sample implementation to
say "the native type of the thing bound to me" and it's going to have
limited usefulness for any sort of persistence to userspace.  Thus,
it's a sample driver.  Thanks,

Alex

> > > > > vendor proprietary part: this part is varied in length. vendor driver can
> > > > >              specify any string to identify a device.
> > > > > 
> > > > > When reading this attribute, it should show device version string of the
> > > > > device of type <type-id>. If a device does not support live migration, it
> > > > > should return errno.
> > > > > When writing a string to this attribute, it returns errno for
> > > > > incompatibility or returns written string length in compatibility case.
> > > > > If a device does not support live migration, it always returns errno.
> > > > > 
> > > > > For user space software to use:
> > > > > 1.
> > > > > Before starting live migration, user space software first reads source side
> > > > > mdev device's version. e.g.
> > > > > "#cat \
> > > > > /sys/bus/pci/devices/0000\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version"
> > > > > 00028086-193b-i915-GVTg_V5_4
> > > > > 
> > > > > 2.
> > > > > Then, user space software writes the source side returned version string
> > > > > to device version attribute in target side, and checks the return value.
> > > > > If a negative errno is returned in the target side, then mdev devices in
> > > > > source and target sides are not compatible;
> > > > > If a positive number is returned and it equals to the length of written
> > > > > string, then the two mdev devices in source and target side are compatible.
> > > > > e.g.
> > > > > (a) compatibility case
> > > > > "# echo 00028086-193b-i915-GVTg_V5_4 >
> > > > > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > > > > 
> > > > > (b) incompatibility case
> > > > > "#echo 00028086-193b-i915-GVTg_V5_1 >
> > > > > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > > > > -bash: echo: write error: Invalid argument
> > > > > 
> > > > > 3. if two mdev devices are compatible, user space software can start
> > > > > live migration, and vice versa.
> > > > > 
> > > > > Note: if a mdev device does not support live migration, it either does
> > > > > not provide a version attribute, or always returns errno when its version
> > > > > attribute is read/written.    
> > > > 
> > > > I think it would be cleaner to do the former, not supply the
> > > > attribute.  This seems to do the latter in the sample drivers.  Thanks,    
> > > Ok. you are right!
> > > what about just keep one sample driver to show how to do the latter,
> > > and let the others do the former?  
> > 
> > I'd rather that if a vendor driver doesn't support features requiring
> > the version attribute that they don't implement it.  It's confusing to
> > developers looking at the sample driver for guidance if we have
> > different implementations.  Of course if you'd like to add migration
> > support to one of the sample drivers, that'd be very welcome.  Thanks,
> >  
> Got it:)
> 
> Thanks!
> Yan
> 
> >   
> > > > > Cc: Alex Williamson <alex.williamson@redhat.com>
> > > > > Cc: Erik Skultety <eskultet@redhat.com>
> > > > > Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > > > > Cc: Cornelia Huck <cohuck@redhat.com>
> > > > > Cc: "Tian, Kevin" <kevin.tian@intel.com>
> > > > > Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
> > > > > Cc: "Wang, Zhi A" <zhi.a.wang@intel.com>
> > > > > Cc: Neo Jia <cjia@nvidia.com>
> > > > > Cc: Kirti Wankhede <kwankhede@nvidia.com>
> > > > > 
> > > > > Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> > > > > ---
> > > > >  Documentation/vfio-mediated-device.txt | 36 ++++++++++++++++++++++++++
> > > > >  samples/vfio-mdev/mbochs.c             | 17 ++++++++++++
> > > > >  samples/vfio-mdev/mdpy.c               | 16 ++++++++++++
> > > > >  samples/vfio-mdev/mtty.c               | 16 ++++++++++++
> > > > >  4 files changed, 85 insertions(+)
> > > > > 
> > > > > diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt
> > > > > index c3f69bcaf96e..bc28471c0667 100644
> > > > > --- a/Documentation/vfio-mediated-device.txt
> > > > > +++ b/Documentation/vfio-mediated-device.txt
> > > > > @@ -202,6 +202,7 @@ Directories and files under the sysfs for Each Physical Device
> > > > >    |     |   |--- available_instances
> > > > >    |     |   |--- device_api
> > > > >    |     |   |--- description
> > > > > +  |     |   |--- version
> > > > >    |     |   |--- [devices]
> > > > >    |     |--- [<type-id>]
> > > > >    |     |   |--- create
> > > > > @@ -209,6 +210,7 @@ Directories and files under the sysfs for Each Physical Device
> > > > >    |     |   |--- available_instances
> > > > >    |     |   |--- device_api
> > > > >    |     |   |--- description
> > > > > +  |     |   |--- version
> > > > >    |     |   |--- [devices]
> > > > >    |     |--- [<type-id>]
> > > > >    |          |--- create
> > > > > @@ -216,6 +218,7 @@ Directories and files under the sysfs for Each Physical Device
> > > > >    |          |--- available_instances
> > > > >    |          |--- device_api
> > > > >    |          |--- description
> > > > > +  |          |--- version
> > > > >    |          |--- [devices]
> > > > >  
> > > > >  * [mdev_supported_types]
> > > > > @@ -225,6 +228,8 @@ Directories and files under the sysfs for Each Physical Device
> > > > >    [<type-id>], device_api, and available_instances are mandatory attributes
> > > > >    that should be provided by vendor driver.
> > > > >  
> > > > > +  version is a mandatory attribute if a mdev device supports live migration.
> > > > > +
> > > > >  * [<type-id>]
> > > > >  
> > > > >    The [<type-id>] name is created by adding the device driver string as a prefix
> > > > > @@ -246,6 +251,35 @@ Directories and files under the sysfs for Each Physical Device
> > > > >    This attribute should show the number of devices of type <type-id> that can be
> > > > >    created.
> > > > >  
> > > > > +* version
> > > > > +
> > > > > +  This attribute is rw. It is used to check whether two devices are compatible
> > > > > +  for live migration. If this attribute is missing, then the corresponding mdev
> > > > > +  device is regarded as not supporting live migration.
> > > > > +
> > > > > +  It consists of two parts: common part and vendor proprietary part.
> > > > > +  common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
> > > > > +               device type. e.g., for pci device, it is
> > > > > +               "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > > > +  vendor proprietary part: this part is varied in length. vendor driver can
> > > > > +               specify any string to identify a device.
> > > > > +
> > > > > +  When reading this attribute, it should show device version string of the device
> > > > > +  of type <type-id>. If a device does not support live migration, it should
> > > > > +  return errno.
> > > > > +  When writing a string to this attribute, it returns errno for incompatibility
> > > > > +  or returns written string length in compatibility case. If a device does not
> > > > > +  support live migration, it always returns errno.
> > > > > +
> > > > > +  for example.
> > > > > +  # cat \
> > > > > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_2/version
> > > > > +  00028086-193b-i915-GVTg_V5_2
> > > > > +
> > > > > +  #echo 00028086-193b-i915-GVTg_V5_2 > \
> > > > > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> > > > > + -bash: echo: write error: Invalid argument
> > > > > +
> > > > >  * [device]
> > > > >  
> > > > >    This directory contains links to the devices of type <type-id> that have been
> > > > > @@ -327,12 +361,14 @@ card.
> > > > >          |   |   |-- available_instances
> > > > >          |   |   |-- create
> > > > >          |   |   |-- device_api
> > > > > +        |   |   |-- version
> > > > >          |   |   |-- devices
> > > > >          |   |   `-- name
> > > > >          |   `-- mtty-2
> > > > >          |       |-- available_instances
> > > > >          |       |-- create
> > > > >          |       |-- device_api
> > > > > +        |       |-- version
> > > > >          |       |-- devices
> > > > >          |       `-- name
> > > > >          |-- mtty_dev
> > > > > diff --git a/samples/vfio-mdev/mbochs.c b/samples/vfio-mdev/mbochs.c
> > > > > index b038aa9f5a70..2f5ba96b91a2 100644
> > > > > --- a/samples/vfio-mdev/mbochs.c
> > > > > +++ b/samples/vfio-mdev/mbochs.c
> > > > > @@ -1391,11 +1391,28 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> > > > >  }
> > > > >  MDEV_TYPE_ATTR_RO(device_api);
> > > > >  
> > > > > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > > > > +		char *buf)
> > > > > +{
> > > > > +	/* do not support live migration */
> > > > > +	return -EINVAL;
> > > > > +}
> > > > > +
> > > > > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > > > > +		const char *buf, size_t count)
> > > > > +{
> > > > > +	/* do not support live migration */
> > > > > +	return -EINVAL;
> > > > > +}
> > > > > +
> > > > > +static MDEV_TYPE_ATTR_RW(version);
> > > > > +
> > > > >  static struct attribute *mdev_types_attrs[] = {
> > > > >  	&mdev_type_attr_name.attr,
> > > > >  	&mdev_type_attr_description.attr,
> > > > >  	&mdev_type_attr_device_api.attr,
> > > > >  	&mdev_type_attr_available_instances.attr,
> > > > > +	&mdev_type_attr_version.attr,
> > > > >  	NULL,
> > > > >  };
> > > > >  
> > > > > diff --git a/samples/vfio-mdev/mdpy.c b/samples/vfio-mdev/mdpy.c
> > > > > index cc86bf6566e4..ff15fdfc7d46 100644
> > > > > --- a/samples/vfio-mdev/mdpy.c
> > > > > +++ b/samples/vfio-mdev/mdpy.c
> > > > > @@ -695,11 +695,27 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> > > > >  }
> > > > >  MDEV_TYPE_ATTR_RO(device_api);
> > > > >  
> > > > > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > > > > +		char *buf)
> > > > > +{
> > > > > +	/* do not support live migration */
> > > > > +	return -EINVAL;
> > > > > +}
> > > > > +
> > > > > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > > > > +		const char *buf, size_t count)
> > > > > +{
> > > > > +	/* do not support live migration */
> > > > > +	return -EINVAL;
> > > > > +}
> > > > > +static MDEV_TYPE_ATTR_RW(version);
> > > > > +
> > > > >  static struct attribute *mdev_types_attrs[] = {
> > > > >  	&mdev_type_attr_name.attr,
> > > > >  	&mdev_type_attr_description.attr,
> > > > >  	&mdev_type_attr_device_api.attr,
> > > > >  	&mdev_type_attr_available_instances.attr,
> > > > > +	&mdev_type_attr_version.attr,
> > > > >  	NULL,
> > > > >  };
> > > > >  
> > > > > diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c
> > > > > index 1c77c370c92f..4ae3aad3474d 100644
> > > > > --- a/samples/vfio-mdev/mtty.c
> > > > > +++ b/samples/vfio-mdev/mtty.c
> > > > > @@ -1390,10 +1390,26 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> > > > >  
> > > > >  MDEV_TYPE_ATTR_RO(device_api);
> > > > >  
> > > > > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > > > > +		char *buf)
> > > > > +{
> > > > > +	/* do not support live migration */
> > > > > +	return -EINVAL;
> > > > > +}
> > > > > +
> > > > > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > > > > +		const char *buf, size_t count)
> > > > > +{
> > > > > +	/* do not support live migration */
> > > > > +	return -EINVAL;
> > > > > +}
> > > > > +
> > > > > +static MDEV_TYPE_ATTR_RW(version);
> > > > >  static struct attribute *mdev_types_attrs[] = {
> > > > >  	&mdev_type_attr_name.attr,
> > > > >  	&mdev_type_attr_device_api.attr,
> > > > >  	&mdev_type_attr_available_instances.attr,
> > > > > +	&mdev_type_attr_version.attr,
> > > > >  	NULL,
> > > > >  };
> > > > >      
> > > > 
> > > > _______________________________________________
> > > > intel-gvt-dev mailing list
> > > > intel-gvt-dev@lists.freedesktop.org
> > > > https://lists.freedesktop.org/mailman/listinfo/intel-gvt-dev    
> > 
> > _______________________________________________
> > intel-gvt-dev mailing list
> > intel-gvt-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/intel-gvt-dev
Yan Zhao April 24, 2019, 3:10 a.m. UTC | #14
On Tue, Apr 23, 2019 at 05:59:32PM +0800, Cornelia Huck wrote:
> On Fri, 19 Apr 2019 04:35:04 -0400
> Yan Zhao <yan.y.zhao@intel.com> wrote:
> 
> > device version attribute in mdev sysfs is used by user space software
> > (e.g. libvirt) to query device compatibility for live migration of VFIO
> > mdev devices. This attribute is mandatory if a mdev device supports live
> > migration.
> >
> > It consists of two parts: common part and vendor proprietary part.
> > common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> >              identifies device type. e.g., for pci device, it is
> >              "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > vendor proprietary part: this part is varied in length. vendor driver can
> >              specify any string to identify a device.
> >
> > When reading this attribute, it should show device version string of the
> > device of type <type-id>. If a device does not support live migration, it
> > should return errno.
> 
> It might make more sense if the driver does not register the attribute
> for the device in that case at all.
> 
yes. what about rephrase like this:
"
When reading this attribute, it should show device version string of the
device of type <type-id>.
If a device does not support live migration, it has two choices:
1. do not register this version attribute
2. return -ENODEV on rw this version attribute
Choice 1 is preferred.
"


> > When writing a string to this attribute, it returns errno for
> > incompatibility or returns written string length in compatibility case.
> > If a device does not support live migration, it always returns errno.
> >
> > For user space software to use:
> > 1.
> > Before starting live migration, user space software first reads source side
> > mdev device's version. e.g.
> > "#cat \
> > /sys/bus/pci/devices/0000\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version"
> > 00028086-193b-i915-GVTg_V5_4
> >
> > 2.
> > Then, user space software writes the source side returned version string
> > to device version attribute in target side, and checks the return value.
> > If a negative errno is returned in the target side, then mdev devices in
> > source and target sides are not compatible;
> > If a positive number is returned and it equals to the length of written
> > string, then the two mdev devices in source and target side are compatible.
> > e.g.
> > (a) compatibility case
> > "# echo 00028086-193b-i915-GVTg_V5_4 >
> > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> >
> > (b) incompatibility case
> > "#echo 00028086-193b-i915-GVTg_V5_1 >
> > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > -bash: echo: write error: Invalid argument
> >
> > 3. if two mdev devices are compatible, user space software can start
> > live migration, and vice versa.
> >
> > Note: if a mdev device does not support live migration, it either does
> > not provide a version attribute, or always returns errno when its version
> > attribute is read/written.
> >
> > Cc: Alex Williamson <alex.williamson@redhat.com>
> > Cc: Erik Skultety <eskultet@redhat.com>
> > Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > Cc: Cornelia Huck <cohuck@redhat.com>
> > Cc: "Tian, Kevin" <kevin.tian@intel.com>
> > Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
> > Cc: "Wang, Zhi A" <zhi.a.wang@intel.com>
> > Cc: Neo Jia <cjia@nvidia.com>
> > Cc: Kirti Wankhede <kwankhede@nvidia.com>
> >
> > Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> > ---
> >  Documentation/vfio-mediated-device.txt | 36 ++++++++++++++++++++++++++
> >  samples/vfio-mdev/mbochs.c             | 17 ++++++++++++
> >  samples/vfio-mdev/mdpy.c               | 16 ++++++++++++
> >  samples/vfio-mdev/mtty.c               | 16 ++++++++++++
> >  4 files changed, 85 insertions(+)
> >
> > diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt
> > index c3f69bcaf96e..bc28471c0667 100644
> > --- a/Documentation/vfio-mediated-device.txt
> > +++ b/Documentation/vfio-mediated-device.txt
> > @@ -202,6 +202,7 @@ Directories and files under the sysfs for Each Physical Device
> >    |     |   |--- available_instances
> >    |     |   |--- device_api
> >    |     |   |--- description
> > +  |     |   |--- version
> >    |     |   |--- [devices]
> >    |     |--- [<type-id>]
> >    |     |   |--- create
> > @@ -209,6 +210,7 @@ Directories and files under the sysfs for Each Physical Device
> >    |     |   |--- available_instances
> >    |     |   |--- device_api
> >    |     |   |--- description
> > +  |     |   |--- version
> >    |     |   |--- [devices]
> >    |     |--- [<type-id>]
> >    |          |--- create
> > @@ -216,6 +218,7 @@ Directories and files under the sysfs for Each Physical Device
> >    |          |--- available_instances
> >    |          |--- device_api
> >    |          |--- description
> > +  |          |--- version
> >    |          |--- [devices]
> >
> >  * [mdev_supported_types]
> > @@ -225,6 +228,8 @@ Directories and files under the sysfs for Each Physical Device
> >    [<type-id>], device_api, and available_instances are mandatory attributes
> >    that should be provided by vendor driver.
> >
> > +  version is a mandatory attribute if a mdev device supports live migration.
> 
> What about "An mdev device wishing to support live migration must
> provide the version attribute."?
yes, I just want to keep consistent with the line above it 
" [<type-id>], device_api, and available_instances are mandatory attributes
  that should be provided by vendor driver."
what about below one?
  "version is a mandatory attribute if a mdev device wishing to support live
  migration."


> > +
> >  * [<type-id>]
> >
> >    The [<type-id>] name is created by adding the device driver string as a prefix
> > @@ -246,6 +251,35 @@ Directories and files under the sysfs for Each Physical Device
> >    This attribute should show the number of devices of type <type-id> that can be
> >    created.
> >
> > +* version
> > +
> > +  This attribute is rw. It is used to check whether two devices are compatible
> > +  for live migration. If this attribute is missing, then the corresponding mdev
> > +  device is regarded as not supporting live migration.
> > +
> > +  It consists of two parts: common part and vendor proprietary part.
> > +  common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
> > +               device type. e.g., for pci device, it is
> > +               "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > +  vendor proprietary part: this part is varied in length. vendor driver can
> > +               specify any string to identify a device.
> > +
> > +  When reading this attribute, it should show device version string of the device
> > +  of type <type-id>. If a device does not support live migration, it should
> > +  return errno.
> > +  When writing a string to this attribute, it returns errno for incompatibility
> > +  or returns written string length in compatibility case. If a device does not
> > +  support live migration, it always returns errno.
> 
> I'm not sure whether a device that does not support live migration
> should expose this attribute in the first place. Or is that to cover
> cases where a driver supports live migration only for some of the
> devices it supports?
yes, driver returning error code is to cover the cases where only part of devices it
supports can be migrated.


> Also, I'm not sure if a string that has to be parsed is a good idea...
> is this 'version' attribute supposed to convey some human-readable
> information as well? The procedure you describe for compatibility
> checking does the checking within the vendor driver which I would
> expect to have a table/rules for that anyway.
right. if a vendor driver has the confidence to migrate between devices of
diffent platform or mdev types, it can maintain a compatibility table for that
purpose. That's the reason why we would leave the compatibility check to vendor
driver. vendor driver can freely choose its own complicated way to decide
which device is migratable to which device.

> I think you should also specify which errno writing an incompatible id
> is supposed to return (probably best something different than if the
> device does not support live migration at all, if we stick with
> creating the attribute in that case.)
Agree. I'll define it clearly in next revison.

Thanks
Yan

> > +
> > +  for example.
> > +  # cat \
> > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_2/version
> > +  00028086-193b-i915-GVTg_V5_2
> > +
> > +  #echo 00028086-193b-i915-GVTg_V5_2 > \
> > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> > + -bash: echo: write error: Invalid argument
> > +
> >  * [device]
> >
> >    This directory contains links to the devices of type <type-id> that have been
> _______________________________________________
> intel-gvt-dev mailing list
> intel-gvt-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gvt-dev
Yan Zhao April 24, 2019, 3:33 a.m. UTC | #15
On Tue, Apr 23, 2019 at 06:24:19PM +0800, Daniel P. Berrangé wrote:
> On Tue, Apr 23, 2019 at 01:41:57AM -0400, Yan Zhao wrote:
> > On Tue, Apr 23, 2019 at 09:21:00AM +0800, Alex Williamson wrote:
> > > On Mon, 22 Apr 2019 21:01:52 -0400
> > > Yan Zhao <yan.y.zhao@intel.com> wrote:
> > >
> > > > On Mon, Apr 22, 2019 at 10:39:50PM +0800, Alex Williamson wrote:
> > > > > On Fri, 19 Apr 2019 04:35:04 -0400
> > > > > Yan Zhao <yan.y.zhao@intel.com> wrote:
> > > > >
> > > > > > device version attribute in mdev sysfs is used by user space software
> > > > > > (e.g. libvirt) to query device compatibility for live migration of VFIO
> > > > > > mdev devices. This attribute is mandatory if a mdev device supports live
> > > > > > migration.
> > > > >
> > > > > The Subject: doesn't quite match what's being proposed here.
> > > > >
> > > > > > It consists of two parts: common part and vendor proprietary part.
> > > > > > common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> > > > > >              identifies device type. e.g., for pci device, it is
> > > > > >              "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > > >
> > > > > What purpose does this serve?  If it's intended as some sort of
> > > > > namespace feature, shouldn't we first assume that we can only support
> > > > > migration to devices of the same type?  Therefore each type would
> > > > > already have its own namespace.  Also that would make the trailing bit
> > > > > of the version string listed below in the example redundant.  A vendor
> > > > > is still welcome to include this in their version string if they wish,
> > > > > but I think the string should be entirely vendor defined.
> > > > >
> > > > hi Alex,
> > > > This common part is a kind of namespace.
> > > > Because if version string is entirely defined by vendors, I'm worried about
> > > > if there is a case that one vendor's version string happens to deceive and
> > > > interfere with another vendor's version checking?
> > > > e.g.
> > > > vendor A has a version string like: vendor id + device id + mdev type
> > > > vendor B has a version string like: device id + vendor id + mdev type
> > > > but vendor A's vendor id is 0x8086, device id is 0x1217
> > > > vendor B's vendor id is 0x1217, device id is 0x8086.
> > > >
> > > > In this corner case, the two vendors may regard the two device is
> > > > migratable but actually they are not.
> > > >
> > > > That's the reason for this common part that serve as a kind of namespace
> > > > that all vendors will comply with to avoid overlap.
> > >
> > > If we assume that migration can only occur between matching mdev types,
> > > this is redundant, each type already has their own namespace.
> > >
> > hi Alex,
> > do you mean user space software like libvirt needs to first check whether
> > mdev type is matching and then check whether version is matching?
> 
> I would expect that libvirt (or other mgmt apps) will always first
> check that the vendor id, device id, mdev type all match. So for the
> version string it should suffice to be a "normal" numeric value.
> 
> Essentially version string just needs to be there to distinguish
> revisions of the same mdev type over time.
>
hi Daniel,
The way that user space software checks that the vendor id, device id, mdev
type all match and version string is just revisions is somewhat restrictive?
user space software could not have so much detailed information regarding to
which devices are compatible, especially when vendor id, device id,
revision id, and mdev types are not enough or no need to be exactly the same.
By moving decision making for compatibility from user space to vendor
driver, user space software can be more change-resistant.
Agree?

Thanks
Yan

> 
> Regards,
> Daniel
> --
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
Yan Zhao April 24, 2019, 3:39 a.m. UTC | #16
On Tue, Apr 23, 2019 at 11:02:56PM +0800, Alex Williamson wrote:
> On Tue, 23 Apr 2019 01:41:57 -0400
> Yan Zhao <yan.y.zhao@intel.com> wrote:
> 
> > On Tue, Apr 23, 2019 at 09:21:00AM +0800, Alex Williamson wrote:
> > > On Mon, 22 Apr 2019 21:01:52 -0400
> > > Yan Zhao <yan.y.zhao@intel.com> wrote:
> > >
> > > > On Mon, Apr 22, 2019 at 10:39:50PM +0800, Alex Williamson wrote:
> > > > > On Fri, 19 Apr 2019 04:35:04 -0400
> > > > > Yan Zhao <yan.y.zhao@intel.com> wrote:
> > > > >
> > > > > > device version attribute in mdev sysfs is used by user space software
> > > > > > (e.g. libvirt) to query device compatibility for live migration of VFIO
> > > > > > mdev devices. This attribute is mandatory if a mdev device supports live
> > > > > > migration.
> > > > >
> > > > > The Subject: doesn't quite match what's being proposed here.
> > > > >
> > > > > > It consists of two parts: common part and vendor proprietary part.
> > > > > > common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> > > > > >              identifies device type. e.g., for pci device, it is
> > > > > >              "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > > >
> > > > > What purpose does this serve?  If it's intended as some sort of
> > > > > namespace feature, shouldn't we first assume that we can only support
> > > > > migration to devices of the same type?  Therefore each type would
> > > > > already have its own namespace.  Also that would make the trailing bit
> > > > > of the version string listed below in the example redundant.  A vendor
> > > > > is still welcome to include this in their version string if they wish,
> > > > > but I think the string should be entirely vendor defined.
> > > > >
> > > > hi Alex,
> > > > This common part is a kind of namespace.
> > > > Because if version string is entirely defined by vendors, I'm worried about
> > > > if there is a case that one vendor's version string happens to deceive and
> > > > interfere with another vendor's version checking?
> > > > e.g.
> > > > vendor A has a version string like: vendor id + device id + mdev type
> > > > vendor B has a version string like: device id + vendor id + mdev type
> > > > but vendor A's vendor id is 0x8086, device id is 0x1217
> > > > vendor B's vendor id is 0x1217, device id is 0x8086.
> > > >
> > > > In this corner case, the two vendors may regard the two device is
> > > > migratable but actually they are not.
> > > >
> > > > That's the reason for this common part that serve as a kind of namespace
> > > > that all vendors will comply with to avoid overlap.
> > >
> > > If we assume that migration can only occur between matching mdev types,
> > > this is redundant, each type already has their own namespace.
> > >
> > hi Alex,
> > do you mean user space software like libvirt needs to first check whether
> > mdev type is matching and then check whether version is matching?
> 
> Yes.
>
may I know the drawback of defining the this common part?
	common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
             identifies device type. e.g., for pci device, it is
             "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
By doing so, user space software has no need to first check whether mdev type
is matching.  A confident vendor driver can even allow devices migrating
between different mdev types.


> > if user space software only checks version for migration, it means vendor
> > driver has to include mdev type in their vendor proprietary part string,
> > right?
> 
> Userspace attempting to migrate an nvidia-64 to an i915-GVT_V5_4 would
> be a failure on the part of the user.
> 
> > Another thing is that could there be any future mdev parent driver that
> > applies to all mdev devices, just like vfio-pci? like Yi's vfio-pci-mdev
> > driver (https://lkml.org/lkml/2019/3/13/114)?
> 
> For starters, this is just a sample driver from which vendor specific
> mdev drivers could be forked to support these features, but
> additionally, the type is defined by the vendor driver, so even a meta
> driver like vfio-pci-mdev could create types like
> "vfio-pci-mdev-8086_10c9_abcdef" if it wanted to provide that specific
> device.  The "vfio-pci-mdev-type1" is just a sample implementation to
> say "the native type of the thing bound to me" and it's going to have
> limited usefulness for any sort of persistence to userspace.  Thus,
> it's a sample driver.  Thanks,
Thanks for this explanation:)


> 
> > > > > > vendor proprietary part: this part is varied in length. vendor driver can
> > > > > >              specify any string to identify a device.
> > > > > >
> > > > > > When reading this attribute, it should show device version string of the
> > > > > > device of type <type-id>. If a device does not support live migration, it
> > > > > > should return errno.
> > > > > > When writing a string to this attribute, it returns errno for
> > > > > > incompatibility or returns written string length in compatibility case.
> > > > > > If a device does not support live migration, it always returns errno.
> > > > > >
> > > > > > For user space software to use:
> > > > > > 1.
> > > > > > Before starting live migration, user space software first reads source side
> > > > > > mdev device's version. e.g.
> > > > > > "#cat \
> > > > > > /sys/bus/pci/devices/0000\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version"
> > > > > > 00028086-193b-i915-GVTg_V5_4
> > > > > >
> > > > > > 2.
> > > > > > Then, user space software writes the source side returned version string
> > > > > > to device version attribute in target side, and checks the return value.
> > > > > > If a negative errno is returned in the target side, then mdev devices in
> > > > > > source and target sides are not compatible;
> > > > > > If a positive number is returned and it equals to the length of written
> > > > > > string, then the two mdev devices in source and target side are compatible.
> > > > > > e.g.
> > > > > > (a) compatibility case
> > > > > > "# echo 00028086-193b-i915-GVTg_V5_4 >
> > > > > > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > > > > >
> > > > > > (b) incompatibility case
> > > > > > "#echo 00028086-193b-i915-GVTg_V5_1 >
> > > > > > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > > > > > -bash: echo: write error: Invalid argument
> > > > > >
> > > > > > 3. if two mdev devices are compatible, user space software can start
> > > > > > live migration, and vice versa.
> > > > > >
> > > > > > Note: if a mdev device does not support live migration, it either does
> > > > > > not provide a version attribute, or always returns errno when its version
> > > > > > attribute is read/written.
> > > > >
> > > > > I think it would be cleaner to do the former, not supply the
> > > > > attribute.  This seems to do the latter in the sample drivers.  Thanks,
> > > > Ok. you are right!
> > > > what about just keep one sample driver to show how to do the latter,
> > > > and let the others do the former?
> > >
> > > I'd rather that if a vendor driver doesn't support features requiring
> > > the version attribute that they don't implement it.  It's confusing to
> > > developers looking at the sample driver for guidance if we have
> > > different implementations.  Of course if you'd like to add migration
> > > support to one of the sample drivers, that'd be very welcome.  Thanks,
> > >
> > Got it:)
> >
> > Thanks!
> > Yan
> >
> > >
> > > > > > Cc: Alex Williamson <alex.williamson@redhat.com>
> > > > > > Cc: Erik Skultety <eskultet@redhat.com>
> > > > > > Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > > > > > Cc: Cornelia Huck <cohuck@redhat.com>
> > > > > > Cc: "Tian, Kevin" <kevin.tian@intel.com>
> > > > > > Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
> > > > > > Cc: "Wang, Zhi A" <zhi.a.wang@intel.com>
> > > > > > Cc: Neo Jia <cjia@nvidia.com>
> > > > > > Cc: Kirti Wankhede <kwankhede@nvidia.com>
> > > > > >
> > > > > > Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> > > > > > ---
> > > > > >  Documentation/vfio-mediated-device.txt | 36 ++++++++++++++++++++++++++
> > > > > >  samples/vfio-mdev/mbochs.c             | 17 ++++++++++++
> > > > > >  samples/vfio-mdev/mdpy.c               | 16 ++++++++++++
> > > > > >  samples/vfio-mdev/mtty.c               | 16 ++++++++++++
> > > > > >  4 files changed, 85 insertions(+)
> > > > > >
> > > > > > diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt
> > > > > > index c3f69bcaf96e..bc28471c0667 100644
> > > > > > --- a/Documentation/vfio-mediated-device.txt
> > > > > > +++ b/Documentation/vfio-mediated-device.txt
> > > > > > @@ -202,6 +202,7 @@ Directories and files under the sysfs for Each Physical Device
> > > > > >    |     |   |--- available_instances
> > > > > >    |     |   |--- device_api
> > > > > >    |     |   |--- description
> > > > > > +  |     |   |--- version
> > > > > >    |     |   |--- [devices]
> > > > > >    |     |--- [<type-id>]
> > > > > >    |     |   |--- create
> > > > > > @@ -209,6 +210,7 @@ Directories and files under the sysfs for Each Physical Device
> > > > > >    |     |   |--- available_instances
> > > > > >    |     |   |--- device_api
> > > > > >    |     |   |--- description
> > > > > > +  |     |   |--- version
> > > > > >    |     |   |--- [devices]
> > > > > >    |     |--- [<type-id>]
> > > > > >    |          |--- create
> > > > > > @@ -216,6 +218,7 @@ Directories and files under the sysfs for Each Physical Device
> > > > > >    |          |--- available_instances
> > > > > >    |          |--- device_api
> > > > > >    |          |--- description
> > > > > > +  |          |--- version
> > > > > >    |          |--- [devices]
> > > > > >
> > > > > >  * [mdev_supported_types]
> > > > > > @@ -225,6 +228,8 @@ Directories and files under the sysfs for Each Physical Device
> > > > > >    [<type-id>], device_api, and available_instances are mandatory attributes
> > > > > >    that should be provided by vendor driver.
> > > > > >
> > > > > > +  version is a mandatory attribute if a mdev device supports live migration.
> > > > > > +
> > > > > >  * [<type-id>]
> > > > > >
> > > > > >    The [<type-id>] name is created by adding the device driver string as a prefix
> > > > > > @@ -246,6 +251,35 @@ Directories and files under the sysfs for Each Physical Device
> > > > > >    This attribute should show the number of devices of type <type-id> that can be
> > > > > >    created.
> > > > > >
> > > > > > +* version
> > > > > > +
> > > > > > +  This attribute is rw. It is used to check whether two devices are compatible
> > > > > > +  for live migration. If this attribute is missing, then the corresponding mdev
> > > > > > +  device is regarded as not supporting live migration.
> > > > > > +
> > > > > > +  It consists of two parts: common part and vendor proprietary part.
> > > > > > +  common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
> > > > > > +               device type. e.g., for pci device, it is
> > > > > > +               "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > > > > +  vendor proprietary part: this part is varied in length. vendor driver can
> > > > > > +               specify any string to identify a device.
> > > > > > +
> > > > > > +  When reading this attribute, it should show device version string of the device
> > > > > > +  of type <type-id>. If a device does not support live migration, it should
> > > > > > +  return errno.
> > > > > > +  When writing a string to this attribute, it returns errno for incompatibility
> > > > > > +  or returns written string length in compatibility case. If a device does not
> > > > > > +  support live migration, it always returns errno.
> > > > > > +
> > > > > > +  for example.
> > > > > > +  # cat \
> > > > > > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_2/version
> > > > > > +  00028086-193b-i915-GVTg_V5_2
> > > > > > +
> > > > > > +  #echo 00028086-193b-i915-GVTg_V5_2 > \
> > > > > > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> > > > > > + -bash: echo: write error: Invalid argument
> > > > > > +
> > > > > >  * [device]
> > > > > >
> > > > > >    This directory contains links to the devices of type <type-id> that have been
> > > > > > @@ -327,12 +361,14 @@ card.
> > > > > >          |   |   |-- available_instances
> > > > > >          |   |   |-- create
> > > > > >          |   |   |-- device_api
> > > > > > +        |   |   |-- version
> > > > > >          |   |   |-- devices
> > > > > >          |   |   `-- name
> > > > > >          |   `-- mtty-2
> > > > > >          |       |-- available_instances
> > > > > >          |       |-- create
> > > > > >          |       |-- device_api
> > > > > > +        |       |-- version
> > > > > >          |       |-- devices
> > > > > >          |       `-- name
> > > > > >          |-- mtty_dev
> > > > > > diff --git a/samples/vfio-mdev/mbochs.c b/samples/vfio-mdev/mbochs.c
> > > > > > index b038aa9f5a70..2f5ba96b91a2 100644
> > > > > > --- a/samples/vfio-mdev/mbochs.c
> > > > > > +++ b/samples/vfio-mdev/mbochs.c
> > > > > > @@ -1391,11 +1391,28 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> > > > > >  }
> > > > > >  MDEV_TYPE_ATTR_RO(device_api);
> > > > > >
> > > > > > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > > > > > +             char *buf)
> > > > > > +{
> > > > > > +     /* do not support live migration */
> > > > > > +     return -EINVAL;
> > > > > > +}
> > > > > > +
> > > > > > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > > > > > +             const char *buf, size_t count)
> > > > > > +{
> > > > > > +     /* do not support live migration */
> > > > > > +     return -EINVAL;
> > > > > > +}
> > > > > > +
> > > > > > +static MDEV_TYPE_ATTR_RW(version);
> > > > > > +
> > > > > >  static struct attribute *mdev_types_attrs[] = {
> > > > > >       &mdev_type_attr_name.attr,
> > > > > >       &mdev_type_attr_description.attr,
> > > > > >       &mdev_type_attr_device_api.attr,
> > > > > >       &mdev_type_attr_available_instances.attr,
> > > > > > +     &mdev_type_attr_version.attr,
> > > > > >       NULL,
> > > > > >  };
> > > > > >
> > > > > > diff --git a/samples/vfio-mdev/mdpy.c b/samples/vfio-mdev/mdpy.c
> > > > > > index cc86bf6566e4..ff15fdfc7d46 100644
> > > > > > --- a/samples/vfio-mdev/mdpy.c
> > > > > > +++ b/samples/vfio-mdev/mdpy.c
> > > > > > @@ -695,11 +695,27 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> > > > > >  }
> > > > > >  MDEV_TYPE_ATTR_RO(device_api);
> > > > > >
> > > > > > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > > > > > +             char *buf)
> > > > > > +{
> > > > > > +     /* do not support live migration */
> > > > > > +     return -EINVAL;
> > > > > > +}
> > > > > > +
> > > > > > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > > > > > +             const char *buf, size_t count)
> > > > > > +{
> > > > > > +     /* do not support live migration */
> > > > > > +     return -EINVAL;
> > > > > > +}
> > > > > > +static MDEV_TYPE_ATTR_RW(version);
> > > > > > +
> > > > > >  static struct attribute *mdev_types_attrs[] = {
> > > > > >       &mdev_type_attr_name.attr,
> > > > > >       &mdev_type_attr_description.attr,
> > > > > >       &mdev_type_attr_device_api.attr,
> > > > > >       &mdev_type_attr_available_instances.attr,
> > > > > > +     &mdev_type_attr_version.attr,
> > > > > >       NULL,
> > > > > >  };
> > > > > >
> > > > > > diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c
> > > > > > index 1c77c370c92f..4ae3aad3474d 100644
> > > > > > --- a/samples/vfio-mdev/mtty.c
> > > > > > +++ b/samples/vfio-mdev/mtty.c
> > > > > > @@ -1390,10 +1390,26 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> > > > > >
> > > > > >  MDEV_TYPE_ATTR_RO(device_api);
> > > > > >
> > > > > > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > > > > > +             char *buf)
> > > > > > +{
> > > > > > +     /* do not support live migration */
> > > > > > +     return -EINVAL;
> > > > > > +}
> > > > > > +
> > > > > > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > > > > > +             const char *buf, size_t count)
> > > > > > +{
> > > > > > +     /* do not support live migration */
> > > > > > +     return -EINVAL;
> > > > > > +}
> > > > > > +
> > > > > > +static MDEV_TYPE_ATTR_RW(version);
> > > > > >  static struct attribute *mdev_types_attrs[] = {
> > > > > >       &mdev_type_attr_name.attr,
> > > > > >       &mdev_type_attr_device_api.attr,
> > > > > >       &mdev_type_attr_available_instances.attr,
> > > > > > +     &mdev_type_attr_version.attr,
> > > > > >       NULL,
> > > > > >  };
> > > > > >
> > > > >
> > > > > _______________________________________________
> > > > > intel-gvt-dev mailing list
> > > > > intel-gvt-dev@lists.freedesktop.org
> > > > > https://lists.freedesktop.org/mailman/listinfo/intel-gvt-dev
> > >
> > > _______________________________________________
> > > intel-gvt-dev mailing list
> > > intel-gvt-dev@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/intel-gvt-dev
> 
> _______________________________________________
> intel-gvt-dev mailing list
> intel-gvt-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gvt-dev
Neo Jia April 24, 2019, 4:13 a.m. UTC | #17
On Tue, Apr 23, 2019 at 11:39:39AM +0100, Daniel P. Berrangé wrote:
> On Fri, Apr 19, 2019 at 04:35:04AM -0400, Yan Zhao wrote:
> > device version attribute in mdev sysfs is used by user space software
> > (e.g. libvirt) to query device compatibility for live migration of VFIO
> > mdev devices. This attribute is mandatory if a mdev device supports live
> > migration.
> > 
> > It consists of two parts: common part and vendor proprietary part.
> > common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> >              identifies device type. e.g., for pci device, it is
> >              "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > vendor proprietary part: this part is varied in length. vendor driver can
> >              specify any string to identify a device.
> > 
> > When reading this attribute, it should show device version string of the
> > device of type <type-id>. If a device does not support live migration, it
> > should return errno.
> > When writing a string to this attribute, it returns errno for
> > incompatibility or returns written string length in compatibility case.
> > If a device does not support live migration, it always returns errno.
> > 
> > For user space software to use:
> > 1.
> > Before starting live migration, user space software first reads source side
> > mdev device's version. e.g.
> > "#cat \
> > /sys/bus/pci/devices/0000\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version"
> > 00028086-193b-i915-GVTg_V5_4
> > 
> > 2.
> > Then, user space software writes the source side returned version string
> > to device version attribute in target side, and checks the return value.
> > If a negative errno is returned in the target side, then mdev devices in
> > source and target sides are not compatible;
> > If a positive number is returned and it equals to the length of written
> > string, then the two mdev devices in source and target side are compatible.
> > e.g.
> > (a) compatibility case
> > "# echo 00028086-193b-i915-GVTg_V5_4 >
> > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > 
> > (b) incompatibility case
> > "#echo 00028086-193b-i915-GVTg_V5_1 >
> > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > -bash: echo: write error: Invalid argument
> 
> What you have written here seems to imply that each mdev type is able to
> support many different versions at the same time. Writing a version into
> this sysfs file then chooses which of the many versions to actually use.
> 
> This is good as it allows for live migration across driver software upgrades.
> 
> A mgmt application may well want to know what versions are supported for an
> mdev type *before* starting a migration. A mgmt app can query all the 100's
> of hosts it knows and thus figure out which are valid to use as the target
> of a migration.
> 
> IOW, we want to avoid the ever hitting the incompatibility case in the
> first place, by only choosing to migrate to a host that we know is going
> to be compatible.
> 
> This would need some kind of way to report the full list of supported
> versions against the mdev supported types on the host.

What would be the typical scenario / use case for mgmt layer to query the version
information? Do they expect this get done completely offline as long as the
vendor driver installed on each host?

Thanks,
Neo

> 
>
Cornelia Huck April 24, 2019, 7:56 a.m. UTC | #18
On Tue, 23 Apr 2019 23:10:37 -0400
Yan Zhao <yan.y.zhao@intel.com> wrote:

> On Tue, Apr 23, 2019 at 05:59:32PM +0800, Cornelia Huck wrote:
> > On Fri, 19 Apr 2019 04:35:04 -0400
> > Yan Zhao <yan.y.zhao@intel.com> wrote:
> >   
> > > device version attribute in mdev sysfs is used by user space software
> > > (e.g. libvirt) to query device compatibility for live migration of VFIO
> > > mdev devices. This attribute is mandatory if a mdev device supports live
> > > migration.
> > >
> > > It consists of two parts: common part and vendor proprietary part.
> > > common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> > >              identifies device type. e.g., for pci device, it is
> > >              "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > vendor proprietary part: this part is varied in length. vendor driver can
> > >              specify any string to identify a device.
> > >
> > > When reading this attribute, it should show device version string of the
> > > device of type <type-id>. If a device does not support live migration, it
> > > should return errno.  
> > 
> > It might make more sense if the driver does not register the attribute
> > for the device in that case at all.
> >   
> yes. what about rephrase like this:
> "
> When reading this attribute, it should show device version string of the
> device of type <type-id>.
> If a device does not support live migration, it has two choices:
> 1. do not register this version attribute
> 2. return -ENODEV on rw this version attribute

"return -ENODEV when accessing the version attribute" ?

> Choice 1 is preferred.
> "
> 
> 
> > > When writing a string to this attribute, it returns errno for
> > > incompatibility or returns written string length in compatibility case.
> > > If a device does not support live migration, it always returns errno.
> > >
> > > For user space software to use:
> > > 1.
> > > Before starting live migration, user space software first reads source side
> > > mdev device's version. e.g.
> > > "#cat \
> > > /sys/bus/pci/devices/0000\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version"
> > > 00028086-193b-i915-GVTg_V5_4
> > >
> > > 2.
> > > Then, user space software writes the source side returned version string
> > > to device version attribute in target side, and checks the return value.
> > > If a negative errno is returned in the target side, then mdev devices in
> > > source and target sides are not compatible;
> > > If a positive number is returned and it equals to the length of written
> > > string, then the two mdev devices in source and target side are compatible.
> > > e.g.
> > > (a) compatibility case
> > > "# echo 00028086-193b-i915-GVTg_V5_4 >
> > > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > >
> > > (b) incompatibility case
> > > "#echo 00028086-193b-i915-GVTg_V5_1 >
> > > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > > -bash: echo: write error: Invalid argument
> > >
> > > 3. if two mdev devices are compatible, user space software can start
> > > live migration, and vice versa.
> > >
> > > Note: if a mdev device does not support live migration, it either does
> > > not provide a version attribute, or always returns errno when its version
> > > attribute is read/written.
> > >
> > > Cc: Alex Williamson <alex.williamson@redhat.com>
> > > Cc: Erik Skultety <eskultet@redhat.com>
> > > Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > > Cc: Cornelia Huck <cohuck@redhat.com>
> > > Cc: "Tian, Kevin" <kevin.tian@intel.com>
> > > Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
> > > Cc: "Wang, Zhi A" <zhi.a.wang@intel.com>
> > > Cc: Neo Jia <cjia@nvidia.com>
> > > Cc: Kirti Wankhede <kwankhede@nvidia.com>
> > >
> > > Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> > > ---
> > >  Documentation/vfio-mediated-device.txt | 36 ++++++++++++++++++++++++++
> > >  samples/vfio-mdev/mbochs.c             | 17 ++++++++++++
> > >  samples/vfio-mdev/mdpy.c               | 16 ++++++++++++
> > >  samples/vfio-mdev/mtty.c               | 16 ++++++++++++
> > >  4 files changed, 85 insertions(+)
> > >
> > > diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt
> > > index c3f69bcaf96e..bc28471c0667 100644
> > > --- a/Documentation/vfio-mediated-device.txt
> > > +++ b/Documentation/vfio-mediated-device.txt
> > > @@ -202,6 +202,7 @@ Directories and files under the sysfs for Each Physical Device
> > >    |     |   |--- available_instances
> > >    |     |   |--- device_api
> > >    |     |   |--- description
> > > +  |     |   |--- version
> > >    |     |   |--- [devices]
> > >    |     |--- [<type-id>]
> > >    |     |   |--- create
> > > @@ -209,6 +210,7 @@ Directories and files under the sysfs for Each Physical Device
> > >    |     |   |--- available_instances
> > >    |     |   |--- device_api
> > >    |     |   |--- description
> > > +  |     |   |--- version
> > >    |     |   |--- [devices]
> > >    |     |--- [<type-id>]
> > >    |          |--- create
> > > @@ -216,6 +218,7 @@ Directories and files under the sysfs for Each Physical Device
> > >    |          |--- available_instances
> > >    |          |--- device_api
> > >    |          |--- description
> > > +  |          |--- version
> > >    |          |--- [devices]
> > >
> > >  * [mdev_supported_types]
> > > @@ -225,6 +228,8 @@ Directories and files under the sysfs for Each Physical Device
> > >    [<type-id>], device_api, and available_instances are mandatory attributes
> > >    that should be provided by vendor driver.
> > >
> > > +  version is a mandatory attribute if a mdev device supports live migration.  
> > 
> > What about "An mdev device wishing to support live migration must
> > provide the version attribute."?  
> yes, I just want to keep consistent with the line above it 
> " [<type-id>], device_api, and available_instances are mandatory attributes
>   that should be provided by vendor driver."
> what about below one?
>   "version is a mandatory attribute if a mdev device wishing to support live
>   migration."

My point is that an attribute is not mandatory if it can be left out :)
(I'm not a native speaker, though; maybe this makes perfect sense
after all?)

Maybe "version is a required attribute if live migration is supported
for an mdev device"?

> 
> 
> > > +
> > >  * [<type-id>]
> > >
> > >    The [<type-id>] name is created by adding the device driver string as a prefix
> > > @@ -246,6 +251,35 @@ Directories and files under the sysfs for Each Physical Device
> > >    This attribute should show the number of devices of type <type-id> that can be
> > >    created.
> > >
> > > +* version
> > > +
> > > +  This attribute is rw. It is used to check whether two devices are compatible
> > > +  for live migration. If this attribute is missing, then the corresponding mdev
> > > +  device is regarded as not supporting live migration.
> > > +
> > > +  It consists of two parts: common part and vendor proprietary part.
> > > +  common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
> > > +               device type. e.g., for pci device, it is
> > > +               "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > +  vendor proprietary part: this part is varied in length. vendor driver can
> > > +               specify any string to identify a device.
> > > +
> > > +  When reading this attribute, it should show device version string of the device
> > > +  of type <type-id>. If a device does not support live migration, it should
> > > +  return errno.
> > > +  When writing a string to this attribute, it returns errno for incompatibility
> > > +  or returns written string length in compatibility case. If a device does not
> > > +  support live migration, it always returns errno.  
> > 
> > I'm not sure whether a device that does not support live migration
> > should expose this attribute in the first place. Or is that to cover
> > cases where a driver supports live migration only for some of the
> > devices it supports?  
> yes, driver returning error code is to cover the cases where only part of devices it
> supports can be migrated.
> 
> 
> > Also, I'm not sure if a string that has to be parsed is a good idea...
> > is this 'version' attribute supposed to convey some human-readable
> > information as well? The procedure you describe for compatibility
> > checking does the checking within the vendor driver which I would
> > expect to have a table/rules for that anyway.  
> right. if a vendor driver has the confidence to migrate between devices of
> diffent platform or mdev types, it can maintain a compatibility table for that
> purpose. That's the reason why we would leave the compatibility check to vendor
> driver. vendor driver can freely choose its own complicated way to decide
> which device is migratable to which device.

I think there are two scenarios here:
- Migrating between different device types, which is unlikely to work,
  except in special cases.
- Migrating between different versions of the same device type, which
  may work for some drivers/devices (and at least migrating to a newer
  version looks quite reasonable).

But both should be something that is decided by the individual driver;
I hope we don't want to support migration between different drivers :-O

Can we make this a driver-defined format?

> 
> > I think you should also specify which errno writing an incompatible id
> > is supposed to return (probably best something different than if the
> > device does not support live migration at all, if we stick with
> > creating the attribute in that case.)  
> Agree. I'll define it clearly in next revison.

Thanks!
Yan Zhao April 24, 2019, 8:15 a.m. UTC | #19
On Wed, Apr 24, 2019 at 03:56:24PM +0800, Cornelia Huck wrote:
> On Tue, 23 Apr 2019 23:10:37 -0400
> Yan Zhao <yan.y.zhao@intel.com> wrote:
> 
> > On Tue, Apr 23, 2019 at 05:59:32PM +0800, Cornelia Huck wrote:
> > > On Fri, 19 Apr 2019 04:35:04 -0400
> > > Yan Zhao <yan.y.zhao@intel.com> wrote:
> > >   
> > > > device version attribute in mdev sysfs is used by user space software
> > > > (e.g. libvirt) to query device compatibility for live migration of VFIO
> > > > mdev devices. This attribute is mandatory if a mdev device supports live
> > > > migration.
> > > >
> > > > It consists of two parts: common part and vendor proprietary part.
> > > > common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> > > >              identifies device type. e.g., for pci device, it is
> > > >              "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > > vendor proprietary part: this part is varied in length. vendor driver can
> > > >              specify any string to identify a device.
> > > >
> > > > When reading this attribute, it should show device version string of the
> > > > device of type <type-id>. If a device does not support live migration, it
> > > > should return errno.  
> > > 
> > > It might make more sense if the driver does not register the attribute
> > > for the device in that case at all.
> > >   
> > yes. what about rephrase like this:
> > "
> > When reading this attribute, it should show device version string of the
> > device of type <type-id>.
> > If a device does not support live migration, it has two choices:
> > 1. do not register this version attribute
> > 2. return -ENODEV on rw this version attribute
> 
> "return -ENODEV when accessing the version attribute" ?
Yeah, this one is better, thanks :)

> > Choice 1 is preferred.
> > "
> > 
> > 
> > > > When writing a string to this attribute, it returns errno for
> > > > incompatibility or returns written string length in compatibility case.
> > > > If a device does not support live migration, it always returns errno.
> > > >
> > > > For user space software to use:
> > > > 1.
> > > > Before starting live migration, user space software first reads source side
> > > > mdev device's version. e.g.
> > > > "#cat \
> > > > /sys/bus/pci/devices/0000\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version"
> > > > 00028086-193b-i915-GVTg_V5_4
> > > >
> > > > 2.
> > > > Then, user space software writes the source side returned version string
> > > > to device version attribute in target side, and checks the return value.
> > > > If a negative errno is returned in the target side, then mdev devices in
> > > > source and target sides are not compatible;
> > > > If a positive number is returned and it equals to the length of written
> > > > string, then the two mdev devices in source and target side are compatible.
> > > > e.g.
> > > > (a) compatibility case
> > > > "# echo 00028086-193b-i915-GVTg_V5_4 >
> > > > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > > >
> > > > (b) incompatibility case
> > > > "#echo 00028086-193b-i915-GVTg_V5_1 >
> > > > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > > > -bash: echo: write error: Invalid argument
> > > >
> > > > 3. if two mdev devices are compatible, user space software can start
> > > > live migration, and vice versa.
> > > >
> > > > Note: if a mdev device does not support live migration, it either does
> > > > not provide a version attribute, or always returns errno when its version
> > > > attribute is read/written.
> > > >
> > > > Cc: Alex Williamson <alex.williamson@redhat.com>
> > > > Cc: Erik Skultety <eskultet@redhat.com>
> > > > Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > > > Cc: Cornelia Huck <cohuck@redhat.com>
> > > > Cc: "Tian, Kevin" <kevin.tian@intel.com>
> > > > Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
> > > > Cc: "Wang, Zhi A" <zhi.a.wang@intel.com>
> > > > Cc: Neo Jia <cjia@nvidia.com>
> > > > Cc: Kirti Wankhede <kwankhede@nvidia.com>
> > > >
> > > > Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> > > > ---
> > > >  Documentation/vfio-mediated-device.txt | 36 ++++++++++++++++++++++++++
> > > >  samples/vfio-mdev/mbochs.c             | 17 ++++++++++++
> > > >  samples/vfio-mdev/mdpy.c               | 16 ++++++++++++
> > > >  samples/vfio-mdev/mtty.c               | 16 ++++++++++++
> > > >  4 files changed, 85 insertions(+)
> > > >
> > > > diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt
> > > > index c3f69bcaf96e..bc28471c0667 100644
> > > > --- a/Documentation/vfio-mediated-device.txt
> > > > +++ b/Documentation/vfio-mediated-device.txt
> > > > @@ -202,6 +202,7 @@ Directories and files under the sysfs for Each Physical Device
> > > >    |     |   |--- available_instances
> > > >    |     |   |--- device_api
> > > >    |     |   |--- description
> > > > +  |     |   |--- version
> > > >    |     |   |--- [devices]
> > > >    |     |--- [<type-id>]
> > > >    |     |   |--- create
> > > > @@ -209,6 +210,7 @@ Directories and files under the sysfs for Each Physical Device
> > > >    |     |   |--- available_instances
> > > >    |     |   |--- device_api
> > > >    |     |   |--- description
> > > > +  |     |   |--- version
> > > >    |     |   |--- [devices]
> > > >    |     |--- [<type-id>]
> > > >    |          |--- create
> > > > @@ -216,6 +218,7 @@ Directories and files under the sysfs for Each Physical Device
> > > >    |          |--- available_instances
> > > >    |          |--- device_api
> > > >    |          |--- description
> > > > +  |          |--- version
> > > >    |          |--- [devices]
> > > >
> > > >  * [mdev_supported_types]
> > > > @@ -225,6 +228,8 @@ Directories and files under the sysfs for Each Physical Device
> > > >    [<type-id>], device_api, and available_instances are mandatory attributes
> > > >    that should be provided by vendor driver.
> > > >
> > > > +  version is a mandatory attribute if a mdev device supports live migration.  
> > > 
> > > What about "An mdev device wishing to support live migration must
> > > provide the version attribute."?  
> > yes, I just want to keep consistent with the line above it 
> > " [<type-id>], device_api, and available_instances are mandatory attributes
> >   that should be provided by vendor driver."
> > what about below one?
> >   "version is a mandatory attribute if a mdev device wishing to support live
> >   migration."
> 
> My point is that an attribute is not mandatory if it can be left out :)
> (I'm not a native speaker, though; maybe this makes perfect sense
> after all?)
> 
> Maybe "version is a required attribute if live migration is supported
> for an mdev device"?
> 
you are right, "mandatory" may bring some confusion.
Maybe
"vendor driver must provide version attribute for an mdev device wishing to
support live migration." ?
based on your first version :)

> > 
> > 
> > > > +
> > > >  * [<type-id>]
> > > >
> > > >    The [<type-id>] name is created by adding the device driver string as a prefix
> > > > @@ -246,6 +251,35 @@ Directories and files under the sysfs for Each Physical Device
> > > >    This attribute should show the number of devices of type <type-id> that can be
> > > >    created.
> > > >
> > > > +* version
> > > > +
> > > > +  This attribute is rw. It is used to check whether two devices are compatible
> > > > +  for live migration. If this attribute is missing, then the corresponding mdev
> > > > +  device is regarded as not supporting live migration.
> > > > +
> > > > +  It consists of two parts: common part and vendor proprietary part.
> > > > +  common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
> > > > +               device type. e.g., for pci device, it is
> > > > +               "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > > +  vendor proprietary part: this part is varied in length. vendor driver can
> > > > +               specify any string to identify a device.
> > > > +
> > > > +  When reading this attribute, it should show device version string of the device
> > > > +  of type <type-id>. If a device does not support live migration, it should
> > > > +  return errno.
> > > > +  When writing a string to this attribute, it returns errno for incompatibility
> > > > +  or returns written string length in compatibility case. If a device does not
> > > > +  support live migration, it always returns errno.  
> > > 
> > > I'm not sure whether a device that does not support live migration
> > > should expose this attribute in the first place. Or is that to cover
> > > cases where a driver supports live migration only for some of the
> > > devices it supports?  
> > yes, driver returning error code is to cover the cases where only part of devices it
> > supports can be migrated.
> > 
> > 
> > > Also, I'm not sure if a string that has to be parsed is a good idea...
> > > is this 'version' attribute supposed to convey some human-readable
> > > information as well? The procedure you describe for compatibility
> > > checking does the checking within the vendor driver which I would
> > > expect to have a table/rules for that anyway.  
> > right. if a vendor driver has the confidence to migrate between devices of
> > diffent platform or mdev types, it can maintain a compatibility table for that
> > purpose. That's the reason why we would leave the compatibility check to vendor
> > driver. vendor driver can freely choose its own complicated way to decide
> > which device is migratable to which device.
> 
> I think there are two scenarios here:
> - Migrating between different device types, which is unlikely to work,
>   except in special cases.
> - Migrating between different versions of the same device type, which
>   may work for some drivers/devices (and at least migrating to a newer
>   version looks quite reasonable).
> 
> But both should be something that is decided by the individual driver;
> I hope we don't want to support migration between different drivers :-O
> 
> Can we make this a driver-defined format?
>
yes, this is indeed driver-defined format.
Actually we define it into two parts: common part and vendor proprietary part.
common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
             identifies device type. e.g., for pci device, it is
             "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
vendor proprietary part: this part is varied in length. vendor driver can
             specify any string to identify a device.

vendor proprietary part is defined by vendor driver. vendor driver can
define any format it wishes to use. Also it is its own responsibility to
ensure backward compatibility if it wants to update format definition in this
part.

So user space only needs to get source side's version string, and asks
target side whether the two are compatible. The decision maker is the
vendor driver:)


> > 
> > > I think you should also specify which errno writing an incompatible id
> > > is supposed to return (probably best something different than if the
> > > device does not support live migration at all, if we stick with
> > > creating the attribute in that case.)  
> > Agree. I'll define it clearly in next revison.
> 
> Thanks!
Christophe de Dinechin April 24, 2019, 9:10 a.m. UTC | #20
> On 23 Apr 2019, at 12:39, Daniel P. Berrangé <berrange@redhat.com> wrote:
> 
> On Fri, Apr 19, 2019 at 04:35:04AM -0400, Yan Zhao wrote:
>> device version attribute in mdev sysfs is used by user space software
>> (e.g. libvirt) to query device compatibility for live migration of VFIO
>> mdev devices. This attribute is mandatory if a mdev device supports live
>> migration.
>> 
>> It consists of two parts: common part and vendor proprietary part.
>> common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
>>             identifies device type. e.g., for pci device, it is
>>             "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
>> vendor proprietary part: this part is varied in length. vendor driver can
>>             specify any string to identify a device.
>> 
>> When reading this attribute, it should show device version string of the
>> device of type <type-id>. If a device does not support live migration, it
>> should return errno.
>> When writing a string to this attribute, it returns errno for
>> incompatibility or returns written string length in compatibility case.
>> If a device does not support live migration, it always returns errno.
>> 
>> For user space software to use:
>> 1.
>> Before starting live migration, user space software first reads source side
>> mdev device's version. e.g.
>> "#cat \
>> /sys/bus/pci/devices/0000\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version"
>> 00028086-193b-i915-GVTg_V5_4
>> 
>> 2.
>> Then, user space software writes the source side returned version string
>> to device version attribute in target side, and checks the return value.
>> If a negative errno is returned in the target side, then mdev devices in
>> source and target sides are not compatible;
>> If a positive number is returned and it equals to the length of written
>> string, then the two mdev devices in source and target side are compatible.
>> e.g.
>> (a) compatibility case
>> "# echo 00028086-193b-i915-GVTg_V5_4 >
>> /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
>> 
>> (b) incompatibility case
>> "#echo 00028086-193b-i915-GVTg_V5_1 >
>> /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
>> -bash: echo: write error: Invalid argument
> 
> What you have written here seems to imply that each mdev type is able to
> support many different versions at the same time. Writing a version into
> this sysfs file then chooses which of the many versions to actually use.
> 
> This is good as it allows for live migration across driver software upgrades.
> 
> A mgmt application may well want to know what versions are supported for an
> mdev type *before* starting a migration. A mgmt app can query all the 100's
> of hosts it knows and thus figure out which are valid to use as the target
> of a migration.
> 
> IOW, we want to avoid the ever hitting the incompatibility case in the
> first place, by only choosing to migrate to a host that we know is going
> to be compatible.
> 
> This would need some kind of way to report the full list of supported
> versions against the mdev supported types on the host.
> 
> 
>> 3. if two mdev devices are compatible, user space software can start
>> live migration, and vice versa.
>> 
>> Note: if a mdev device does not support live migration, it either does
>> not provide a version attribute, or always returns errno when its version
>> attribute is read/written.
>> 
>> Cc: Alex Williamson <alex.williamson@redhat.com>
>> Cc: Erik Skultety <eskultet@redhat.com>
>> Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>> Cc: Cornelia Huck <cohuck@redhat.com>
>> Cc: "Tian, Kevin" <kevin.tian@intel.com>
>> Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
>> Cc: "Wang, Zhi A" <zhi.a.wang@intel.com>
>> Cc: Neo Jia <cjia@nvidia.com>
>> Cc: Kirti Wankhede <kwankhede@nvidia.com>
>> 
>> Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
>> ---
>> Documentation/vfio-mediated-device.txt | 36 ++++++++++++++++++++++++++
>> samples/vfio-mdev/mbochs.c             | 17 ++++++++++++
>> samples/vfio-mdev/mdpy.c               | 16 ++++++++++++
>> samples/vfio-mdev/mtty.c               | 16 ++++++++++++
>> 4 files changed, 85 insertions(+)
>> 
>> diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt
>> index c3f69bcaf96e..bc28471c0667 100644
>> --- a/Documentation/vfio-mediated-device.txt
>> +++ b/Documentation/vfio-mediated-device.txt
>> @@ -202,6 +202,7 @@ Directories and files under the sysfs for Each Physical Device
>>   |     |   |--- available_instances
>>   |     |   |--- device_api
>>   |     |   |--- description
>> +  |     |   |--- version
>>   |     |   |--- [devices]
>>   |     |--- [<type-id>]
>>   |     |   |--- create
>> @@ -209,6 +210,7 @@ Directories and files under the sysfs for Each Physical Device
>>   |     |   |--- available_instances
>>   |     |   |--- device_api
>>   |     |   |--- description
>> +  |     |   |--- version
>>   |     |   |--- [devices]
>>   |     |--- [<type-id>]
>>   |          |--- create
>> @@ -216,6 +218,7 @@ Directories and files under the sysfs for Each Physical Device
>>   |          |--- available_instances
>>   |          |--- device_api
>>   |          |--- description
>> +  |          |--- version
>>   |          |--- [devices]
>> 
>> * [mdev_supported_types]
>> @@ -225,6 +228,8 @@ Directories and files under the sysfs for Each Physical Device
>>   [<type-id>], device_api, and available_instances are mandatory attributes
>>   that should be provided by vendor driver.
>> 
>> +  version is a mandatory attribute if a mdev device supports live migration.
>> +
>> * [<type-id>]
>> 
>>   The [<type-id>] name is created by adding the device driver string as a prefix
>> @@ -246,6 +251,35 @@ Directories and files under the sysfs for Each Physical Device
>>   This attribute should show the number of devices of type <type-id> that can be
>>   created.
>> 
>> +* version
>> +
>> +  This attribute is rw. It is used to check whether two devices are compatible
>> +  for live migration. If this attribute is missing, then the corresponding mdev
>> +  device is regarded as not supporting live migration.
>> +
>> +  It consists of two parts: common part and vendor proprietary part.
>> +  common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
>> +               device type. e.g., for pci device, it is
>> +               "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
>> +  vendor proprietary part: this part is varied in length. vendor driver can
>> +               specify any string to identify a device.
>> +
>> +  When reading this attribute, it should show device version string of the device
>> +  of type <type-id>. If a device does not support live migration, it should
>> +  return errno.
>> +  When writing a string to this attribute, it returns errno for incompatibility
>> +  or returns written string length in compatibility case. If a device does not
>> +  support live migration, it always returns errno.
>> +
>> +  for example.
>> +  # cat \
>> + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_2/version
>> +  00028086-193b-i915-GVTg_V5_2
>> +
>> +  #echo 00028086-193b-i915-GVTg_V5_2 > \
>> + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
>> + -bash: echo: write error: Invalid argument
>> +
> 
> IIUC this path is against the physical device. IOW, the mgmt app would have
> to first write to the "version" file to choose a version, and then write to
> the "create" file to actually create an virtual device. This has the obvious
> concurrency problem if multiple devices are being created at the same time
> and distinct versions for each device are required. There would need to be
> a locking scheme defined to ensure safety.
> 
> Wouldn't it be better if we can pass the desired version when we write to
> the "create" file, so that we avoid any concurrent usage problems. "version"
> could be just a read-only file with a *list* of supported versions.
> 
> eg
> 
>  $ cat /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
>  5.0
>  5.1
>  5.2
> 
>  $ echo "83b8f4f2-509f-382f-3c1e-e6bfe0fa1001;version=5.2" >
>      /sys/devices/virtual/mtty/mtty/mdev_supported_types/mtty-2/create
> 

I read Alex’s comment that creating an mdev with a specific version is not the intent here. However…

- If the goal is just to check compatibility with migration data, then I think the name should be more explicit, e.g. migration_version instead of version. Otherwise, I read the intent the way Daniel did.

- If we want to provide a list of versions and make it possible to create instances based on a version, I would rather use a directory structure for that, i.e. (replacing cat with ls):

$ ls /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
 5.0
 5.1
 5.2

where each version is a directory that has its own available_instances, device_api, description, create, …

$ echo 83b8f4f2-509f-382f-3c1e-e6bfe0fa1001 > /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version/5.1/create

In addition to not having the problem of two consecutive writes to distinct files, I can imagine contrived cases where available_instances might depend on the version…


Thanks
Christophe

> 
> Regards,
> Daniel
> -- 
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
Alex Williamson April 24, 2019, 2:14 p.m. UTC | #21
On Tue, 23 Apr 2019 23:39:34 -0400
Yan Zhao <yan.y.zhao@intel.com> wrote:

> On Tue, Apr 23, 2019 at 11:02:56PM +0800, Alex Williamson wrote:
> > On Tue, 23 Apr 2019 01:41:57 -0400
> > Yan Zhao <yan.y.zhao@intel.com> wrote:
> >   
> > > On Tue, Apr 23, 2019 at 09:21:00AM +0800, Alex Williamson wrote:  
> > > > On Mon, 22 Apr 2019 21:01:52 -0400
> > > > Yan Zhao <yan.y.zhao@intel.com> wrote:
> > > >  
> > > > > On Mon, Apr 22, 2019 at 10:39:50PM +0800, Alex Williamson wrote:  
> > > > > > On Fri, 19 Apr 2019 04:35:04 -0400
> > > > > > Yan Zhao <yan.y.zhao@intel.com> wrote:
> > > > > >  
> > > > > > > device version attribute in mdev sysfs is used by user space software
> > > > > > > (e.g. libvirt) to query device compatibility for live migration of VFIO
> > > > > > > mdev devices. This attribute is mandatory if a mdev device supports live
> > > > > > > migration.  
> > > > > >
> > > > > > The Subject: doesn't quite match what's being proposed here.
> > > > > >  
> > > > > > > It consists of two parts: common part and vendor proprietary part.
> > > > > > > common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> > > > > > >              identifies device type. e.g., for pci device, it is
> > > > > > >              "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).  
> > > > > >
> > > > > > What purpose does this serve?  If it's intended as some sort of
> > > > > > namespace feature, shouldn't we first assume that we can only support
> > > > > > migration to devices of the same type?  Therefore each type would
> > > > > > already have its own namespace.  Also that would make the trailing bit
> > > > > > of the version string listed below in the example redundant.  A vendor
> > > > > > is still welcome to include this in their version string if they wish,
> > > > > > but I think the string should be entirely vendor defined.
> > > > > >  
> > > > > hi Alex,
> > > > > This common part is a kind of namespace.
> > > > > Because if version string is entirely defined by vendors, I'm worried about
> > > > > if there is a case that one vendor's version string happens to deceive and
> > > > > interfere with another vendor's version checking?
> > > > > e.g.
> > > > > vendor A has a version string like: vendor id + device id + mdev type
> > > > > vendor B has a version string like: device id + vendor id + mdev type
> > > > > but vendor A's vendor id is 0x8086, device id is 0x1217
> > > > > vendor B's vendor id is 0x1217, device id is 0x8086.
> > > > >
> > > > > In this corner case, the two vendors may regard the two device is
> > > > > migratable but actually they are not.
> > > > >
> > > > > That's the reason for this common part that serve as a kind of namespace
> > > > > that all vendors will comply with to avoid overlap.  
> > > >
> > > > If we assume that migration can only occur between matching mdev types,
> > > > this is redundant, each type already has their own namespace.
> > > >  
> > > hi Alex,
> > > do you mean user space software like libvirt needs to first check whether
> > > mdev type is matching and then check whether version is matching?  
> > 
> > Yes.
> >  
> may I know the drawback of defining the this common part?
> 	common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
>              identifies device type. e.g., for pci device, it is
>              "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> By doing so, user space software has no need to first check whether mdev type
> is matching.  A confident vendor driver can even allow devices migrating
> between different mdev types.


It's not practical to expect userspace to test the version of every
mdev type in the system to determine a match, let alone across a data
center.  Additionally, in order to be migration compatible the mdev
types must be software compatible to the VM, which is the basic
definition of the differences between mdev types.  Therefore let me
flip the question around, why would a vendor driver choose to create a
new mdev type for software compatible devices?  If the vendor wants to
maintain compatibility, indicate basic compatibility using the same
mdev type.  The original intention of the version attribute is to be
entirely opaque to userspace, introducing "common" parts is unnecessary
as above, degrades the original concept, and only defines a solution for
PCI devices. Thanks,

Alex

> > > if user space software only checks version for migration, it means vendor
> > > driver has to include mdev type in their vendor proprietary part string,
> > > right?  
> > 
> > Userspace attempting to migrate an nvidia-64 to an i915-GVT_V5_4 would
> > be a failure on the part of the user.
> >   
> > > Another thing is that could there be any future mdev parent driver that
> > > applies to all mdev devices, just like vfio-pci? like Yi's vfio-pci-mdev
> > > driver (https://lkml.org/lkml/2019/3/13/114)?  
> > 
> > For starters, this is just a sample driver from which vendor specific
> > mdev drivers could be forked to support these features, but
> > additionally, the type is defined by the vendor driver, so even a meta
> > driver like vfio-pci-mdev could create types like
> > "vfio-pci-mdev-8086_10c9_abcdef" if it wanted to provide that specific
> > device.  The "vfio-pci-mdev-type1" is just a sample implementation to
> > say "the native type of the thing bound to me" and it's going to have
> > limited usefulness for any sort of persistence to userspace.  Thus,
> > it's a sample driver.  Thanks,  
> Thanks for this explanation:)
> 
> 
> >   
> > > > > > > vendor proprietary part: this part is varied in length. vendor driver can
> > > > > > >              specify any string to identify a device.
> > > > > > >
> > > > > > > When reading this attribute, it should show device version string of the
> > > > > > > device of type <type-id>. If a device does not support live migration, it
> > > > > > > should return errno.
> > > > > > > When writing a string to this attribute, it returns errno for
> > > > > > > incompatibility or returns written string length in compatibility case.
> > > > > > > If a device does not support live migration, it always returns errno.
> > > > > > >
> > > > > > > For user space software to use:
> > > > > > > 1.
> > > > > > > Before starting live migration, user space software first reads source side
> > > > > > > mdev device's version. e.g.
> > > > > > > "#cat \
> > > > > > > /sys/bus/pci/devices/0000\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version"
> > > > > > > 00028086-193b-i915-GVTg_V5_4
> > > > > > >
> > > > > > > 2.
> > > > > > > Then, user space software writes the source side returned version string
> > > > > > > to device version attribute in target side, and checks the return value.
> > > > > > > If a negative errno is returned in the target side, then mdev devices in
> > > > > > > source and target sides are not compatible;
> > > > > > > If a positive number is returned and it equals to the length of written
> > > > > > > string, then the two mdev devices in source and target side are compatible.
> > > > > > > e.g.
> > > > > > > (a) compatibility case
> > > > > > > "# echo 00028086-193b-i915-GVTg_V5_4 >
> > > > > > > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > > > > > >
> > > > > > > (b) incompatibility case
> > > > > > > "#echo 00028086-193b-i915-GVTg_V5_1 >
> > > > > > > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > > > > > > -bash: echo: write error: Invalid argument
> > > > > > >
> > > > > > > 3. if two mdev devices are compatible, user space software can start
> > > > > > > live migration, and vice versa.
> > > > > > >
> > > > > > > Note: if a mdev device does not support live migration, it either does
> > > > > > > not provide a version attribute, or always returns errno when its version
> > > > > > > attribute is read/written.  
> > > > > >
> > > > > > I think it would be cleaner to do the former, not supply the
> > > > > > attribute.  This seems to do the latter in the sample drivers.  Thanks,  
> > > > > Ok. you are right!
> > > > > what about just keep one sample driver to show how to do the latter,
> > > > > and let the others do the former?  
> > > >
> > > > I'd rather that if a vendor driver doesn't support features requiring
> > > > the version attribute that they don't implement it.  It's confusing to
> > > > developers looking at the sample driver for guidance if we have
> > > > different implementations.  Of course if you'd like to add migration
> > > > support to one of the sample drivers, that'd be very welcome.  Thanks,
> > > >  
> > > Got it:)
> > >
> > > Thanks!
> > > Yan
> > >  
> > > >  
> > > > > > > Cc: Alex Williamson <alex.williamson@redhat.com>
> > > > > > > Cc: Erik Skultety <eskultet@redhat.com>
> > > > > > > Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > > > > > > Cc: Cornelia Huck <cohuck@redhat.com>
> > > > > > > Cc: "Tian, Kevin" <kevin.tian@intel.com>
> > > > > > > Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
> > > > > > > Cc: "Wang, Zhi A" <zhi.a.wang@intel.com>
> > > > > > > Cc: Neo Jia <cjia@nvidia.com>
> > > > > > > Cc: Kirti Wankhede <kwankhede@nvidia.com>
> > > > > > >
> > > > > > > Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> > > > > > > ---
> > > > > > >  Documentation/vfio-mediated-device.txt | 36 ++++++++++++++++++++++++++
> > > > > > >  samples/vfio-mdev/mbochs.c             | 17 ++++++++++++
> > > > > > >  samples/vfio-mdev/mdpy.c               | 16 ++++++++++++
> > > > > > >  samples/vfio-mdev/mtty.c               | 16 ++++++++++++
> > > > > > >  4 files changed, 85 insertions(+)
> > > > > > >
> > > > > > > diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt
> > > > > > > index c3f69bcaf96e..bc28471c0667 100644
> > > > > > > --- a/Documentation/vfio-mediated-device.txt
> > > > > > > +++ b/Documentation/vfio-mediated-device.txt
> > > > > > > @@ -202,6 +202,7 @@ Directories and files under the sysfs for Each Physical Device
> > > > > > >    |     |   |--- available_instances
> > > > > > >    |     |   |--- device_api
> > > > > > >    |     |   |--- description
> > > > > > > +  |     |   |--- version
> > > > > > >    |     |   |--- [devices]
> > > > > > >    |     |--- [<type-id>]
> > > > > > >    |     |   |--- create
> > > > > > > @@ -209,6 +210,7 @@ Directories and files under the sysfs for Each Physical Device
> > > > > > >    |     |   |--- available_instances
> > > > > > >    |     |   |--- device_api
> > > > > > >    |     |   |--- description
> > > > > > > +  |     |   |--- version
> > > > > > >    |     |   |--- [devices]
> > > > > > >    |     |--- [<type-id>]
> > > > > > >    |          |--- create
> > > > > > > @@ -216,6 +218,7 @@ Directories and files under the sysfs for Each Physical Device
> > > > > > >    |          |--- available_instances
> > > > > > >    |          |--- device_api
> > > > > > >    |          |--- description
> > > > > > > +  |          |--- version
> > > > > > >    |          |--- [devices]
> > > > > > >
> > > > > > >  * [mdev_supported_types]
> > > > > > > @@ -225,6 +228,8 @@ Directories and files under the sysfs for Each Physical Device
> > > > > > >    [<type-id>], device_api, and available_instances are mandatory attributes
> > > > > > >    that should be provided by vendor driver.
> > > > > > >
> > > > > > > +  version is a mandatory attribute if a mdev device supports live migration.
> > > > > > > +
> > > > > > >  * [<type-id>]
> > > > > > >
> > > > > > >    The [<type-id>] name is created by adding the device driver string as a prefix
> > > > > > > @@ -246,6 +251,35 @@ Directories and files under the sysfs for Each Physical Device
> > > > > > >    This attribute should show the number of devices of type <type-id> that can be
> > > > > > >    created.
> > > > > > >
> > > > > > > +* version
> > > > > > > +
> > > > > > > +  This attribute is rw. It is used to check whether two devices are compatible
> > > > > > > +  for live migration. If this attribute is missing, then the corresponding mdev
> > > > > > > +  device is regarded as not supporting live migration.
> > > > > > > +
> > > > > > > +  It consists of two parts: common part and vendor proprietary part.
> > > > > > > +  common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
> > > > > > > +               device type. e.g., for pci device, it is
> > > > > > > +               "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > > > > > +  vendor proprietary part: this part is varied in length. vendor driver can
> > > > > > > +               specify any string to identify a device.
> > > > > > > +
> > > > > > > +  When reading this attribute, it should show device version string of the device
> > > > > > > +  of type <type-id>. If a device does not support live migration, it should
> > > > > > > +  return errno.
> > > > > > > +  When writing a string to this attribute, it returns errno for incompatibility
> > > > > > > +  or returns written string length in compatibility case. If a device does not
> > > > > > > +  support live migration, it always returns errno.
> > > > > > > +
> > > > > > > +  for example.
> > > > > > > +  # cat \
> > > > > > > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_2/version
> > > > > > > +  00028086-193b-i915-GVTg_V5_2
> > > > > > > +
> > > > > > > +  #echo 00028086-193b-i915-GVTg_V5_2 > \
> > > > > > > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> > > > > > > + -bash: echo: write error: Invalid argument
> > > > > > > +
> > > > > > >  * [device]
> > > > > > >
> > > > > > >    This directory contains links to the devices of type <type-id> that have been
> > > > > > > @@ -327,12 +361,14 @@ card.
> > > > > > >          |   |   |-- available_instances
> > > > > > >          |   |   |-- create
> > > > > > >          |   |   |-- device_api
> > > > > > > +        |   |   |-- version
> > > > > > >          |   |   |-- devices
> > > > > > >          |   |   `-- name
> > > > > > >          |   `-- mtty-2
> > > > > > >          |       |-- available_instances
> > > > > > >          |       |-- create
> > > > > > >          |       |-- device_api
> > > > > > > +        |       |-- version
> > > > > > >          |       |-- devices
> > > > > > >          |       `-- name
> > > > > > >          |-- mtty_dev
> > > > > > > diff --git a/samples/vfio-mdev/mbochs.c b/samples/vfio-mdev/mbochs.c
> > > > > > > index b038aa9f5a70..2f5ba96b91a2 100644
> > > > > > > --- a/samples/vfio-mdev/mbochs.c
> > > > > > > +++ b/samples/vfio-mdev/mbochs.c
> > > > > > > @@ -1391,11 +1391,28 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> > > > > > >  }
> > > > > > >  MDEV_TYPE_ATTR_RO(device_api);
> > > > > > >
> > > > > > > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > > > > > > +             char *buf)
> > > > > > > +{
> > > > > > > +     /* do not support live migration */
> > > > > > > +     return -EINVAL;
> > > > > > > +}
> > > > > > > +
> > > > > > > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > > > > > > +             const char *buf, size_t count)
> > > > > > > +{
> > > > > > > +     /* do not support live migration */
> > > > > > > +     return -EINVAL;
> > > > > > > +}
> > > > > > > +
> > > > > > > +static MDEV_TYPE_ATTR_RW(version);
> > > > > > > +
> > > > > > >  static struct attribute *mdev_types_attrs[] = {
> > > > > > >       &mdev_type_attr_name.attr,
> > > > > > >       &mdev_type_attr_description.attr,
> > > > > > >       &mdev_type_attr_device_api.attr,
> > > > > > >       &mdev_type_attr_available_instances.attr,
> > > > > > > +     &mdev_type_attr_version.attr,
> > > > > > >       NULL,
> > > > > > >  };
> > > > > > >
> > > > > > > diff --git a/samples/vfio-mdev/mdpy.c b/samples/vfio-mdev/mdpy.c
> > > > > > > index cc86bf6566e4..ff15fdfc7d46 100644
> > > > > > > --- a/samples/vfio-mdev/mdpy.c
> > > > > > > +++ b/samples/vfio-mdev/mdpy.c
> > > > > > > @@ -695,11 +695,27 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> > > > > > >  }
> > > > > > >  MDEV_TYPE_ATTR_RO(device_api);
> > > > > > >
> > > > > > > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > > > > > > +             char *buf)
> > > > > > > +{
> > > > > > > +     /* do not support live migration */
> > > > > > > +     return -EINVAL;
> > > > > > > +}
> > > > > > > +
> > > > > > > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > > > > > > +             const char *buf, size_t count)
> > > > > > > +{
> > > > > > > +     /* do not support live migration */
> > > > > > > +     return -EINVAL;
> > > > > > > +}
> > > > > > > +static MDEV_TYPE_ATTR_RW(version);
> > > > > > > +
> > > > > > >  static struct attribute *mdev_types_attrs[] = {
> > > > > > >       &mdev_type_attr_name.attr,
> > > > > > >       &mdev_type_attr_description.attr,
> > > > > > >       &mdev_type_attr_device_api.attr,
> > > > > > >       &mdev_type_attr_available_instances.attr,
> > > > > > > +     &mdev_type_attr_version.attr,
> > > > > > >       NULL,
> > > > > > >  };
> > > > > > >
> > > > > > > diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c
> > > > > > > index 1c77c370c92f..4ae3aad3474d 100644
> > > > > > > --- a/samples/vfio-mdev/mtty.c
> > > > > > > +++ b/samples/vfio-mdev/mtty.c
> > > > > > > @@ -1390,10 +1390,26 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> > > > > > >
> > > > > > >  MDEV_TYPE_ATTR_RO(device_api);
> > > > > > >
> > > > > > > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > > > > > > +             char *buf)
> > > > > > > +{
> > > > > > > +     /* do not support live migration */
> > > > > > > +     return -EINVAL;
> > > > > > > +}
> > > > > > > +
> > > > > > > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > > > > > > +             const char *buf, size_t count)
> > > > > > > +{
> > > > > > > +     /* do not support live migration */
> > > > > > > +     return -EINVAL;
> > > > > > > +}
> > > > > > > +
> > > > > > > +static MDEV_TYPE_ATTR_RW(version);
> > > > > > >  static struct attribute *mdev_types_attrs[] = {
> > > > > > >       &mdev_type_attr_name.attr,
> > > > > > >       &mdev_type_attr_device_api.attr,
> > > > > > >       &mdev_type_attr_available_instances.attr,
> > > > > > > +     &mdev_type_attr_version.attr,
> > > > > > >       NULL,
> > > > > > >  };
> > > > > > >  
> > > > > >
> > > > > > _______________________________________________
> > > > > > intel-gvt-dev mailing list
> > > > > > intel-gvt-dev@lists.freedesktop.org
> > > > > > https://lists.freedesktop.org/mailman/listinfo/intel-gvt-dev  
> > > >
> > > > _______________________________________________
> > > > intel-gvt-dev mailing list
> > > > intel-gvt-dev@lists.freedesktop.org
> > > > https://lists.freedesktop.org/mailman/listinfo/intel-gvt-dev  
> > 
> > _______________________________________________
> > intel-gvt-dev mailing list
> > intel-gvt-dev@lists.freedesktop.org
> > https://lists.freedesktop.org/mailman/listinfo/intel-gvt-dev
Yan Zhao April 26, 2019, 1:44 a.m. UTC | #22
On Wed, Apr 24, 2019 at 10:14:50PM +0800, Alex Williamson wrote:
> On Tue, 23 Apr 2019 23:39:34 -0400
> Yan Zhao <yan.y.zhao@intel.com> wrote:
> 
> > On Tue, Apr 23, 2019 at 11:02:56PM +0800, Alex Williamson wrote:
> > > On Tue, 23 Apr 2019 01:41:57 -0400
> > > Yan Zhao <yan.y.zhao@intel.com> wrote:
> > >   
> > > > On Tue, Apr 23, 2019 at 09:21:00AM +0800, Alex Williamson wrote:  
> > > > > On Mon, 22 Apr 2019 21:01:52 -0400
> > > > > Yan Zhao <yan.y.zhao@intel.com> wrote:
> > > > >  
> > > > > > On Mon, Apr 22, 2019 at 10:39:50PM +0800, Alex Williamson wrote:  
> > > > > > > On Fri, 19 Apr 2019 04:35:04 -0400
> > > > > > > Yan Zhao <yan.y.zhao@intel.com> wrote:
> > > > > > >  
> > > > > > > > device version attribute in mdev sysfs is used by user space software
> > > > > > > > (e.g. libvirt) to query device compatibility for live migration of VFIO
> > > > > > > > mdev devices. This attribute is mandatory if a mdev device supports live
> > > > > > > > migration.  
> > > > > > >
> > > > > > > The Subject: doesn't quite match what's being proposed here.
> > > > > > >  
> > > > > > > > It consists of two parts: common part and vendor proprietary part.
> > > > > > > > common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> > > > > > > >              identifies device type. e.g., for pci device, it is
> > > > > > > >              "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).  
> > > > > > >
> > > > > > > What purpose does this serve?  If it's intended as some sort of
> > > > > > > namespace feature, shouldn't we first assume that we can only support
> > > > > > > migration to devices of the same type?  Therefore each type would
> > > > > > > already have its own namespace.  Also that would make the trailing bit
> > > > > > > of the version string listed below in the example redundant.  A vendor
> > > > > > > is still welcome to include this in their version string if they wish,
> > > > > > > but I think the string should be entirely vendor defined.
> > > > > > >  
> > > > > > hi Alex,
> > > > > > This common part is a kind of namespace.
> > > > > > Because if version string is entirely defined by vendors, I'm worried about
> > > > > > if there is a case that one vendor's version string happens to deceive and
> > > > > > interfere with another vendor's version checking?
> > > > > > e.g.
> > > > > > vendor A has a version string like: vendor id + device id + mdev type
> > > > > > vendor B has a version string like: device id + vendor id + mdev type
> > > > > > but vendor A's vendor id is 0x8086, device id is 0x1217
> > > > > > vendor B's vendor id is 0x1217, device id is 0x8086.
> > > > > >
> > > > > > In this corner case, the two vendors may regard the two device is
> > > > > > migratable but actually they are not.
> > > > > >
> > > > > > That's the reason for this common part that serve as a kind of namespace
> > > > > > that all vendors will comply with to avoid overlap.  
> > > > >
> > > > > If we assume that migration can only occur between matching mdev types,
> > > > > this is redundant, each type already has their own namespace.
> > > > >  
> > > > hi Alex,
> > > > do you mean user space software like libvirt needs to first check whether
> > > > mdev type is matching and then check whether version is matching?  
> > > 
> > > Yes.
> > >  
> > may I know the drawback of defining the this common part?
> > 	common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> >              identifies device type. e.g., for pci device, it is
> >              "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > By doing so, user space software has no need to first check whether mdev type
> > is matching.  A confident vendor driver can even allow devices migrating
> > between different mdev types.
> 
> 
> It's not practical to expect userspace to test the version of every
> mdev type in the system to determine a match, let alone across a data
> center.  Additionally, in order to be migration compatible the mdev
> types must be software compatible to the VM, which is the basic
> definition of the differences between mdev types.  Therefore let me
> flip the question around, why would a vendor driver choose to create a
> new mdev type for software compatible devices?  If the vendor wants to
> maintain compatibility, indicate basic compatibility using the same
> mdev type.  The original intention of the version attribute is to be
> entirely opaque to userspace, introducing "common" parts is unnecessary
> as above, degrades the original concept, and only defines a solution for
> PCI devices. Thanks,
> 
ok. Got it. Thanks for explanation.
I'll remove this common part in next revision.


> 
> > > > if user space software only checks version for migration, it means vendor
> > > > driver has to include mdev type in their vendor proprietary part string,
> > > > right?  
> > > 
> > > Userspace attempting to migrate an nvidia-64 to an i915-GVT_V5_4 would
> > > be a failure on the part of the user.
> > >   
> > > > Another thing is that could there be any future mdev parent driver that
> > > > applies to all mdev devices, just like vfio-pci? like Yi's vfio-pci-mdev
> > > > driver (https://lkml.org/lkml/2019/3/13/114)?  
> > > 
> > > For starters, this is just a sample driver from which vendor specific
> > > mdev drivers could be forked to support these features, but
> > > additionally, the type is defined by the vendor driver, so even a meta
> > > driver like vfio-pci-mdev could create types like
> > > "vfio-pci-mdev-8086_10c9_abcdef" if it wanted to provide that specific
> > > device.  The "vfio-pci-mdev-type1" is just a sample implementation to
> > > say "the native type of the thing bound to me" and it's going to have
> > > limited usefulness for any sort of persistence to userspace.  Thus,
> > > it's a sample driver.  Thanks,  
> > Thanks for this explanation:)
> > 
> > 
> > >   
> > > > > > > > vendor proprietary part: this part is varied in length. vendor driver can
> > > > > > > >              specify any string to identify a device.
> > > > > > > >
> > > > > > > > When reading this attribute, it should show device version string of the
> > > > > > > > device of type <type-id>. If a device does not support live migration, it
> > > > > > > > should return errno.
> > > > > > > > When writing a string to this attribute, it returns errno for
> > > > > > > > incompatibility or returns written string length in compatibility case.
> > > > > > > > If a device does not support live migration, it always returns errno.
> > > > > > > >
> > > > > > > > For user space software to use:
> > > > > > > > 1.
> > > > > > > > Before starting live migration, user space software first reads source side
> > > > > > > > mdev device's version. e.g.
> > > > > > > > "#cat \
> > > > > > > > /sys/bus/pci/devices/0000\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version"
> > > > > > > > 00028086-193b-i915-GVTg_V5_4
> > > > > > > >
> > > > > > > > 2.
> > > > > > > > Then, user space software writes the source side returned version string
> > > > > > > > to device version attribute in target side, and checks the return value.
> > > > > > > > If a negative errno is returned in the target side, then mdev devices in
> > > > > > > > source and target sides are not compatible;
> > > > > > > > If a positive number is returned and it equals to the length of written
> > > > > > > > string, then the two mdev devices in source and target side are compatible.
> > > > > > > > e.g.
> > > > > > > > (a) compatibility case
> > > > > > > > "# echo 00028086-193b-i915-GVTg_V5_4 >
> > > > > > > > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > > > > > > >
> > > > > > > > (b) incompatibility case
> > > > > > > > "#echo 00028086-193b-i915-GVTg_V5_1 >
> > > > > > > > /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> > > > > > > > -bash: echo: write error: Invalid argument
> > > > > > > >
> > > > > > > > 3. if two mdev devices are compatible, user space software can start
> > > > > > > > live migration, and vice versa.
> > > > > > > >
> > > > > > > > Note: if a mdev device does not support live migration, it either does
> > > > > > > > not provide a version attribute, or always returns errno when its version
> > > > > > > > attribute is read/written.  
> > > > > > >
> > > > > > > I think it would be cleaner to do the former, not supply the
> > > > > > > attribute.  This seems to do the latter in the sample drivers.  Thanks,  
> > > > > > Ok. you are right!
> > > > > > what about just keep one sample driver to show how to do the latter,
> > > > > > and let the others do the former?  
> > > > >
> > > > > I'd rather that if a vendor driver doesn't support features requiring
> > > > > the version attribute that they don't implement it.  It's confusing to
> > > > > developers looking at the sample driver for guidance if we have
> > > > > different implementations.  Of course if you'd like to add migration
> > > > > support to one of the sample drivers, that'd be very welcome.  Thanks,
> > > > >  
> > > > Got it:)
> > > >
> > > > Thanks!
> > > > Yan
> > > >  
> > > > >  
> > > > > > > > Cc: Alex Williamson <alex.williamson@redhat.com>
> > > > > > > > Cc: Erik Skultety <eskultet@redhat.com>
> > > > > > > > Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > > > > > > > Cc: Cornelia Huck <cohuck@redhat.com>
> > > > > > > > Cc: "Tian, Kevin" <kevin.tian@intel.com>
> > > > > > > > Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
> > > > > > > > Cc: "Wang, Zhi A" <zhi.a.wang@intel.com>
> > > > > > > > Cc: Neo Jia <cjia@nvidia.com>
> > > > > > > > Cc: Kirti Wankhede <kwankhede@nvidia.com>
> > > > > > > >
> > > > > > > > Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> > > > > > > > ---
> > > > > > > >  Documentation/vfio-mediated-device.txt | 36 ++++++++++++++++++++++++++
> > > > > > > >  samples/vfio-mdev/mbochs.c             | 17 ++++++++++++
> > > > > > > >  samples/vfio-mdev/mdpy.c               | 16 ++++++++++++
> > > > > > > >  samples/vfio-mdev/mtty.c               | 16 ++++++++++++
> > > > > > > >  4 files changed, 85 insertions(+)
> > > > > > > >
> > > > > > > > diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt
> > > > > > > > index c3f69bcaf96e..bc28471c0667 100644
> > > > > > > > --- a/Documentation/vfio-mediated-device.txt
> > > > > > > > +++ b/Documentation/vfio-mediated-device.txt
> > > > > > > > @@ -202,6 +202,7 @@ Directories and files under the sysfs for Each Physical Device
> > > > > > > >    |     |   |--- available_instances
> > > > > > > >    |     |   |--- device_api
> > > > > > > >    |     |   |--- description
> > > > > > > > +  |     |   |--- version
> > > > > > > >    |     |   |--- [devices]
> > > > > > > >    |     |--- [<type-id>]
> > > > > > > >    |     |   |--- create
> > > > > > > > @@ -209,6 +210,7 @@ Directories and files under the sysfs for Each Physical Device
> > > > > > > >    |     |   |--- available_instances
> > > > > > > >    |     |   |--- device_api
> > > > > > > >    |     |   |--- description
> > > > > > > > +  |     |   |--- version
> > > > > > > >    |     |   |--- [devices]
> > > > > > > >    |     |--- [<type-id>]
> > > > > > > >    |          |--- create
> > > > > > > > @@ -216,6 +218,7 @@ Directories and files under the sysfs for Each Physical Device
> > > > > > > >    |          |--- available_instances
> > > > > > > >    |          |--- device_api
> > > > > > > >    |          |--- description
> > > > > > > > +  |          |--- version
> > > > > > > >    |          |--- [devices]
> > > > > > > >
> > > > > > > >  * [mdev_supported_types]
> > > > > > > > @@ -225,6 +228,8 @@ Directories and files under the sysfs for Each Physical Device
> > > > > > > >    [<type-id>], device_api, and available_instances are mandatory attributes
> > > > > > > >    that should be provided by vendor driver.
> > > > > > > >
> > > > > > > > +  version is a mandatory attribute if a mdev device supports live migration.
> > > > > > > > +
> > > > > > > >  * [<type-id>]
> > > > > > > >
> > > > > > > >    The [<type-id>] name is created by adding the device driver string as a prefix
> > > > > > > > @@ -246,6 +251,35 @@ Directories and files under the sysfs for Each Physical Device
> > > > > > > >    This attribute should show the number of devices of type <type-id> that can be
> > > > > > > >    created.
> > > > > > > >
> > > > > > > > +* version
> > > > > > > > +
> > > > > > > > +  This attribute is rw. It is used to check whether two devices are compatible
> > > > > > > > +  for live migration. If this attribute is missing, then the corresponding mdev
> > > > > > > > +  device is regarded as not supporting live migration.
> > > > > > > > +
> > > > > > > > +  It consists of two parts: common part and vendor proprietary part.
> > > > > > > > +  common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
> > > > > > > > +               device type. e.g., for pci device, it is
> > > > > > > > +               "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > > > > > > +  vendor proprietary part: this part is varied in length. vendor driver can
> > > > > > > > +               specify any string to identify a device.
> > > > > > > > +
> > > > > > > > +  When reading this attribute, it should show device version string of the device
> > > > > > > > +  of type <type-id>. If a device does not support live migration, it should
> > > > > > > > +  return errno.
> > > > > > > > +  When writing a string to this attribute, it returns errno for incompatibility
> > > > > > > > +  or returns written string length in compatibility case. If a device does not
> > > > > > > > +  support live migration, it always returns errno.
> > > > > > > > +
> > > > > > > > +  for example.
> > > > > > > > +  # cat \
> > > > > > > > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_2/version
> > > > > > > > +  00028086-193b-i915-GVTg_V5_2
> > > > > > > > +
> > > > > > > > +  #echo 00028086-193b-i915-GVTg_V5_2 > \
> > > > > > > > + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> > > > > > > > + -bash: echo: write error: Invalid argument
> > > > > > > > +
> > > > > > > >  * [device]
> > > > > > > >
> > > > > > > >    This directory contains links to the devices of type <type-id> that have been
> > > > > > > > @@ -327,12 +361,14 @@ card.
> > > > > > > >          |   |   |-- available_instances
> > > > > > > >          |   |   |-- create
> > > > > > > >          |   |   |-- device_api
> > > > > > > > +        |   |   |-- version
> > > > > > > >          |   |   |-- devices
> > > > > > > >          |   |   `-- name
> > > > > > > >          |   `-- mtty-2
> > > > > > > >          |       |-- available_instances
> > > > > > > >          |       |-- create
> > > > > > > >          |       |-- device_api
> > > > > > > > +        |       |-- version
> > > > > > > >          |       |-- devices
> > > > > > > >          |       `-- name
> > > > > > > >          |-- mtty_dev
> > > > > > > > diff --git a/samples/vfio-mdev/mbochs.c b/samples/vfio-mdev/mbochs.c
> > > > > > > > index b038aa9f5a70..2f5ba96b91a2 100644
> > > > > > > > --- a/samples/vfio-mdev/mbochs.c
> > > > > > > > +++ b/samples/vfio-mdev/mbochs.c
> > > > > > > > @@ -1391,11 +1391,28 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> > > > > > > >  }
> > > > > > > >  MDEV_TYPE_ATTR_RO(device_api);
> > > > > > > >
> > > > > > > > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > > > > > > > +             char *buf)
> > > > > > > > +{
> > > > > > > > +     /* do not support live migration */
> > > > > > > > +     return -EINVAL;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > > > > > > > +             const char *buf, size_t count)
> > > > > > > > +{
> > > > > > > > +     /* do not support live migration */
> > > > > > > > +     return -EINVAL;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +static MDEV_TYPE_ATTR_RW(version);
> > > > > > > > +
> > > > > > > >  static struct attribute *mdev_types_attrs[] = {
> > > > > > > >       &mdev_type_attr_name.attr,
> > > > > > > >       &mdev_type_attr_description.attr,
> > > > > > > >       &mdev_type_attr_device_api.attr,
> > > > > > > >       &mdev_type_attr_available_instances.attr,
> > > > > > > > +     &mdev_type_attr_version.attr,
> > > > > > > >       NULL,
> > > > > > > >  };
> > > > > > > >
> > > > > > > > diff --git a/samples/vfio-mdev/mdpy.c b/samples/vfio-mdev/mdpy.c
> > > > > > > > index cc86bf6566e4..ff15fdfc7d46 100644
> > > > > > > > --- a/samples/vfio-mdev/mdpy.c
> > > > > > > > +++ b/samples/vfio-mdev/mdpy.c
> > > > > > > > @@ -695,11 +695,27 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> > > > > > > >  }
> > > > > > > >  MDEV_TYPE_ATTR_RO(device_api);
> > > > > > > >
> > > > > > > > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > > > > > > > +             char *buf)
> > > > > > > > +{
> > > > > > > > +     /* do not support live migration */
> > > > > > > > +     return -EINVAL;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > > > > > > > +             const char *buf, size_t count)
> > > > > > > > +{
> > > > > > > > +     /* do not support live migration */
> > > > > > > > +     return -EINVAL;
> > > > > > > > +}
> > > > > > > > +static MDEV_TYPE_ATTR_RW(version);
> > > > > > > > +
> > > > > > > >  static struct attribute *mdev_types_attrs[] = {
> > > > > > > >       &mdev_type_attr_name.attr,
> > > > > > > >       &mdev_type_attr_description.attr,
> > > > > > > >       &mdev_type_attr_device_api.attr,
> > > > > > > >       &mdev_type_attr_available_instances.attr,
> > > > > > > > +     &mdev_type_attr_version.attr,
> > > > > > > >       NULL,
> > > > > > > >  };
> > > > > > > >
> > > > > > > > diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c
> > > > > > > > index 1c77c370c92f..4ae3aad3474d 100644
> > > > > > > > --- a/samples/vfio-mdev/mtty.c
> > > > > > > > +++ b/samples/vfio-mdev/mtty.c
> > > > > > > > @@ -1390,10 +1390,26 @@ static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
> > > > > > > >
> > > > > > > >  MDEV_TYPE_ATTR_RO(device_api);
> > > > > > > >
> > > > > > > > +static ssize_t version_show(struct kobject *kobj, struct device *dev,
> > > > > > > > +             char *buf)
> > > > > > > > +{
> > > > > > > > +     /* do not support live migration */
> > > > > > > > +     return -EINVAL;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +static ssize_t version_store(struct kobject *kobj, struct device *dev,
> > > > > > > > +             const char *buf, size_t count)
> > > > > > > > +{
> > > > > > > > +     /* do not support live migration */
> > > > > > > > +     return -EINVAL;
> > > > > > > > +}
> > > > > > > > +
> > > > > > > > +static MDEV_TYPE_ATTR_RW(version);
> > > > > > > >  static struct attribute *mdev_types_attrs[] = {
> > > > > > > >       &mdev_type_attr_name.attr,
> > > > > > > >       &mdev_type_attr_device_api.attr,
> > > > > > > >       &mdev_type_attr_available_instances.attr,
> > > > > > > > +     &mdev_type_attr_version.attr,
> > > > > > > >       NULL,
> > > > > > > >  };
> > > > > > > >  
> > > > > > >
> > > > > > > _______________________________________________
> > > > > > > intel-gvt-dev mailing list
> > > > > > > intel-gvt-dev@lists.freedesktop.org
> > > > > > > https://lists.freedesktop.org/mailman/listinfo/intel-gvt-dev  
> > > > >
> > > > > _______________________________________________
> > > > > intel-gvt-dev mailing list
> > > > > intel-gvt-dev@lists.freedesktop.org
> > > > > https://lists.freedesktop.org/mailman/listinfo/intel-gvt-dev  
> > > 
> > > _______________________________________________
> > > intel-gvt-dev mailing list
> > > intel-gvt-dev@lists.freedesktop.org
> > > https://lists.freedesktop.org/mailman/listinfo/intel-gvt-dev  
> 
> _______________________________________________
> intel-gvt-dev mailing list
> intel-gvt-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gvt-dev
Yan Zhao April 26, 2019, 2:01 a.m. UTC | #23
On Wed, Apr 24, 2019 at 05:10:43PM +0800, Christophe de Dinechin wrote:
> 
> 
> > On 23 Apr 2019, at 12:39, Daniel P. Berrangé <berrange@redhat.com> wrote:
> > 
> > On Fri, Apr 19, 2019 at 04:35:04AM -0400, Yan Zhao wrote:
> >> device version attribute in mdev sysfs is used by user space software
> >> (e.g. libvirt) to query device compatibility for live migration of VFIO
> >> mdev devices. This attribute is mandatory if a mdev device supports live
> >> migration.
> >> 
> >> It consists of two parts: common part and vendor proprietary part.
> >> common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> >>             identifies device type. e.g., for pci device, it is
> >>             "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> >> vendor proprietary part: this part is varied in length. vendor driver can
> >>             specify any string to identify a device.
> >> 
> >> When reading this attribute, it should show device version string of the
> >> device of type <type-id>. If a device does not support live migration, it
> >> should return errno.
> >> When writing a string to this attribute, it returns errno for
> >> incompatibility or returns written string length in compatibility case.
> >> If a device does not support live migration, it always returns errno.
> >> 
> >> For user space software to use:
> >> 1.
> >> Before starting live migration, user space software first reads source side
> >> mdev device's version. e.g.
> >> "#cat \
> >> /sys/bus/pci/devices/0000\:00\:02.0/5ac1fb20-2bbf-4842-bb7e-36c58c3be9cd/mdev_type/version"
> >> 00028086-193b-i915-GVTg_V5_4
> >> 
> >> 2.
> >> Then, user space software writes the source side returned version string
> >> to device version attribute in target side, and checks the return value.
> >> If a negative errno is returned in the target side, then mdev devices in
> >> source and target sides are not compatible;
> >> If a positive number is returned and it equals to the length of written
> >> string, then the two mdev devices in source and target side are compatible.
> >> e.g.
> >> (a) compatibility case
> >> "# echo 00028086-193b-i915-GVTg_V5_4 >
> >> /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> >> 
> >> (b) incompatibility case
> >> "#echo 00028086-193b-i915-GVTg_V5_1 >
> >> /sys/bus/pci/devices/0000\:00\:02.0/882cc4da-dede-11e7-9180-078a62063ab1/mdev_type/version"
> >> -bash: echo: write error: Invalid argument
> > 
> > What you have written here seems to imply that each mdev type is able to
> > support many different versions at the same time. Writing a version into
> > this sysfs file then chooses which of the many versions to actually use.
> > 
> > This is good as it allows for live migration across driver software upgrades.
> > 
> > A mgmt application may well want to know what versions are supported for an
> > mdev type *before* starting a migration. A mgmt app can query all the 100's
> > of hosts it knows and thus figure out which are valid to use as the target
> > of a migration.
> > 
> > IOW, we want to avoid the ever hitting the incompatibility case in the
> > first place, by only choosing to migrate to a host that we know is going
> > to be compatible.
> > 
> > This would need some kind of way to report the full list of supported
> > versions against the mdev supported types on the host.
> > 
> > 
> >> 3. if two mdev devices are compatible, user space software can start
> >> live migration, and vice versa.
> >> 
> >> Note: if a mdev device does not support live migration, it either does
> >> not provide a version attribute, or always returns errno when its version
> >> attribute is read/written.
> >> 
> >> Cc: Alex Williamson <alex.williamson@redhat.com>
> >> Cc: Erik Skultety <eskultet@redhat.com>
> >> Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> >> Cc: Cornelia Huck <cohuck@redhat.com>
> >> Cc: "Tian, Kevin" <kevin.tian@intel.com>
> >> Cc: Zhenyu Wang <zhenyuw@linux.intel.com>
> >> Cc: "Wang, Zhi A" <zhi.a.wang@intel.com>
> >> Cc: Neo Jia <cjia@nvidia.com>
> >> Cc: Kirti Wankhede <kwankhede@nvidia.com>
> >> 
> >> Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
> >> ---
> >> Documentation/vfio-mediated-device.txt | 36 ++++++++++++++++++++++++++
> >> samples/vfio-mdev/mbochs.c             | 17 ++++++++++++
> >> samples/vfio-mdev/mdpy.c               | 16 ++++++++++++
> >> samples/vfio-mdev/mtty.c               | 16 ++++++++++++
> >> 4 files changed, 85 insertions(+)
> >> 
> >> diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt
> >> index c3f69bcaf96e..bc28471c0667 100644
> >> --- a/Documentation/vfio-mediated-device.txt
> >> +++ b/Documentation/vfio-mediated-device.txt
> >> @@ -202,6 +202,7 @@ Directories and files under the sysfs for Each Physical Device
> >>   |     |   |--- available_instances
> >>   |     |   |--- device_api
> >>   |     |   |--- description
> >> +  |     |   |--- version
> >>   |     |   |--- [devices]
> >>   |     |--- [<type-id>]
> >>   |     |   |--- create
> >> @@ -209,6 +210,7 @@ Directories and files under the sysfs for Each Physical Device
> >>   |     |   |--- available_instances
> >>   |     |   |--- device_api
> >>   |     |   |--- description
> >> +  |     |   |--- version
> >>   |     |   |--- [devices]
> >>   |     |--- [<type-id>]
> >>   |          |--- create
> >> @@ -216,6 +218,7 @@ Directories and files under the sysfs for Each Physical Device
> >>   |          |--- available_instances
> >>   |          |--- device_api
> >>   |          |--- description
> >> +  |          |--- version
> >>   |          |--- [devices]
> >> 
> >> * [mdev_supported_types]
> >> @@ -225,6 +228,8 @@ Directories and files under the sysfs for Each Physical Device
> >>   [<type-id>], device_api, and available_instances are mandatory attributes
> >>   that should be provided by vendor driver.
> >> 
> >> +  version is a mandatory attribute if a mdev device supports live migration.
> >> +
> >> * [<type-id>]
> >> 
> >>   The [<type-id>] name is created by adding the device driver string as a prefix
> >> @@ -246,6 +251,35 @@ Directories and files under the sysfs for Each Physical Device
> >>   This attribute should show the number of devices of type <type-id> that can be
> >>   created.
> >> 
> >> +* version
> >> +
> >> +  This attribute is rw. It is used to check whether two devices are compatible
> >> +  for live migration. If this attribute is missing, then the corresponding mdev
> >> +  device is regarded as not supporting live migration.
> >> +
> >> +  It consists of two parts: common part and vendor proprietary part.
> >> +  common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
> >> +               device type. e.g., for pci device, it is
> >> +               "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> >> +  vendor proprietary part: this part is varied in length. vendor driver can
> >> +               specify any string to identify a device.
> >> +
> >> +  When reading this attribute, it should show device version string of the device
> >> +  of type <type-id>. If a device does not support live migration, it should
> >> +  return errno.
> >> +  When writing a string to this attribute, it returns errno for incompatibility
> >> +  or returns written string length in compatibility case. If a device does not
> >> +  support live migration, it always returns errno.
> >> +
> >> +  for example.
> >> +  # cat \
> >> + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_2/version
> >> +  00028086-193b-i915-GVTg_V5_2
> >> +
> >> +  #echo 00028086-193b-i915-GVTg_V5_2 > \
> >> + /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> >> + -bash: echo: write error: Invalid argument
> >> +
> > 
> > IIUC this path is against the physical device. IOW, the mgmt app would have
> > to first write to the "version" file to choose a version, and then write to
> > the "create" file to actually create an virtual device. This has the obvious
> > concurrency problem if multiple devices are being created at the same time
> > and distinct versions for each device are required. There would need to be
> > a locking scheme defined to ensure safety.
> > 
> > Wouldn't it be better if we can pass the desired version when we write to
> > the "create" file, so that we avoid any concurrent usage problems. "version"
> > could be just a read-only file with a *list* of supported versions.
> > 
> > eg
> > 
> >  $ cat /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
> >  5.0
> >  5.1
> >  5.2
> > 
> >  $ echo "83b8f4f2-509f-382f-3c1e-e6bfe0fa1001;version=5.2" >
> >      /sys/devices/virtual/mtty/mtty/mdev_supported_types/mtty-2/create
> > 
> 
> I read Alex’s comment that creating an mdev with a specific version is not the intent here. However…
> 
> - If the goal is just to check compatibility with migration data, then I think the name should be more explicit, e.g. migration_version instead of version. Otherwise, I read the intent the way Daniel did.
>
hi Christophe,
you are right, this "version" is not for creating mdev device. It is to
describe a mdev clearly (therefore it may include information about an mdev's parent device).
As a result, it can be leveraged by migration -- before starting migration,
user space software can check if two mdev devices are compatible for migration.
(of course, this check is shifting to vendor driver by writing one mdev's version
string to another mdev's version node and checking the result).
Compared to renaming it to migration_version, I would like to describe its usage
more clearly in doc.

Thanks for your input:)

Yan


> - If we want to provide a list of versions and make it possible to create instances based on a version, I would rather use a directory structure for that, i.e. (replacing cat with ls):
> 
> $ ls /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
>  5.0
>  5.1
>  5.2
> 
> where each version is a directory that has its own available_instances, device_api, description, create, …
> 
> $ echo 83b8f4f2-509f-382f-3c1e-e6bfe0fa1001 > /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version/5.1/create
> 
> In addition to not having the problem of two consecutive writes to distinct files, I can imagine contrived cases where available_instances might depend on the version…
> 
> 
> Thanks
> Christophe
> 
> > 
> > Regards,
> > Daniel
> > -- 
> > |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> > |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> > |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
>
Cornelia Huck April 30, 2019, 3:29 p.m. UTC | #24
On Wed, 24 Apr 2019 04:15:58 -0400
Yan Zhao <yan.y.zhao@intel.com> wrote:

> On Wed, Apr 24, 2019 at 03:56:24PM +0800, Cornelia Huck wrote:
> > On Tue, 23 Apr 2019 23:10:37 -0400
> > Yan Zhao <yan.y.zhao@intel.com> wrote:
> >   
> > > On Tue, Apr 23, 2019 at 05:59:32PM +0800, Cornelia Huck wrote:  
> > > > On Fri, 19 Apr 2019 04:35:04 -0400
> > > > Yan Zhao <yan.y.zhao@intel.com> wrote:

> > > > > @@ -225,6 +228,8 @@ Directories and files under the sysfs for Each Physical Device
> > > > >    [<type-id>], device_api, and available_instances are mandatory attributes
> > > > >    that should be provided by vendor driver.
> > > > >
> > > > > +  version is a mandatory attribute if a mdev device supports live migration.    
> > > > 
> > > > What about "An mdev device wishing to support live migration must
> > > > provide the version attribute."?    
> > > yes, I just want to keep consistent with the line above it 
> > > " [<type-id>], device_api, and available_instances are mandatory attributes
> > >   that should be provided by vendor driver."
> > > what about below one?
> > >   "version is a mandatory attribute if a mdev device wishing to support live
> > >   migration."  
> > 
> > My point is that an attribute is not mandatory if it can be left out :)
> > (I'm not a native speaker, though; maybe this makes perfect sense
> > after all?)
> > 
> > Maybe "version is a required attribute if live migration is supported
> > for an mdev device"?
> >   
> you are right, "mandatory" may bring some confusion.
> Maybe
> "vendor driver must provide version attribute for an mdev device wishing to
> support live migration." ?
> based on your first version :)

"The vendor driver must provide the version attribute for any mdev
device it wishes to support live migration for." ?

> 
> > > 
> > >   
> > > > > +
> > > > >  * [<type-id>]
> > > > >
> > > > >    The [<type-id>] name is created by adding the device driver string as a prefix
> > > > > @@ -246,6 +251,35 @@ Directories and files under the sysfs for Each Physical Device
> > > > >    This attribute should show the number of devices of type <type-id> that can be
> > > > >    created.
> > > > >
> > > > > +* version
> > > > > +
> > > > > +  This attribute is rw. It is used to check whether two devices are compatible
> > > > > +  for live migration. If this attribute is missing, then the corresponding mdev
> > > > > +  device is regarded as not supporting live migration.
> > > > > +
> > > > > +  It consists of two parts: common part and vendor proprietary part.
> > > > > +  common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
> > > > > +               device type. e.g., for pci device, it is
> > > > > +               "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > > > +  vendor proprietary part: this part is varied in length. vendor driver can
> > > > > +               specify any string to identify a device.
> > > > > +
> > > > > +  When reading this attribute, it should show device version string of the device
> > > > > +  of type <type-id>. If a device does not support live migration, it should
> > > > > +  return errno.
> > > > > +  When writing a string to this attribute, it returns errno for incompatibility
> > > > > +  or returns written string length in compatibility case. If a device does not
> > > > > +  support live migration, it always returns errno.    
> > > > 
> > > > I'm not sure whether a device that does not support live migration
> > > > should expose this attribute in the first place. Or is that to cover
> > > > cases where a driver supports live migration only for some of the
> > > > devices it supports?    
> > > yes, driver returning error code is to cover the cases where only part of devices it
> > > supports can be migrated.
> > > 
> > >   
> > > > Also, I'm not sure if a string that has to be parsed is a good idea...
> > > > is this 'version' attribute supposed to convey some human-readable
> > > > information as well? The procedure you describe for compatibility
> > > > checking does the checking within the vendor driver which I would
> > > > expect to have a table/rules for that anyway.    
> > > right. if a vendor driver has the confidence to migrate between devices of
> > > diffent platform or mdev types, it can maintain a compatibility table for that
> > > purpose. That's the reason why we would leave the compatibility check to vendor
> > > driver. vendor driver can freely choose its own complicated way to decide
> > > which device is migratable to which device.  
> > 
> > I think there are two scenarios here:
> > - Migrating between different device types, which is unlikely to work,
> >   except in special cases.
> > - Migrating between different versions of the same device type, which
> >   may work for some drivers/devices (and at least migrating to a newer
> >   version looks quite reasonable).
> > 
> > But both should be something that is decided by the individual driver;
> > I hope we don't want to support migration between different drivers :-O
> > 
> > Can we make this a driver-defined format?
> >  
> yes, this is indeed driver-defined format.
> Actually we define it into two parts: common part and vendor proprietary part.
> common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
>              identifies device type. e.g., for pci device, it is
>              "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> vendor proprietary part: this part is varied in length. vendor driver can
>              specify any string to identify a device.
> 
> vendor proprietary part is defined by vendor driver. vendor driver can
> define any format it wishes to use. Also it is its own responsibility to
> ensure backward compatibility if it wants to update format definition in this
> part.
> 
> So user space only needs to get source side's version string, and asks
> target side whether the two are compatible. The decision maker is the
> vendor driver:)

If I followed the discussion correctly, I think you plan to drop this
format, don't you? I'd be happy if a vendor driver can use a simple
number without any prefixes if it so chooses.

I also like the idea of renaming this "migration_version" so that it is
clear we're dealing with versioning of the migration capability (and
not a version of the device or so).
Yan Zhao May 7, 2019, 5:39 a.m. UTC | #25
On Tue, Apr 30, 2019 at 11:29:08PM +0800, Cornelia Huck wrote:
> On Wed, 24 Apr 2019 04:15:58 -0400
> Yan Zhao <yan.y.zhao@intel.com> wrote:
> 
> > On Wed, Apr 24, 2019 at 03:56:24PM +0800, Cornelia Huck wrote:
> > > On Tue, 23 Apr 2019 23:10:37 -0400
> > > Yan Zhao <yan.y.zhao@intel.com> wrote:
> > >
> > > > On Tue, Apr 23, 2019 at 05:59:32PM +0800, Cornelia Huck wrote:
> > > > > On Fri, 19 Apr 2019 04:35:04 -0400
> > > > > Yan Zhao <yan.y.zhao@intel.com> wrote:
> 
> > > > > > @@ -225,6 +228,8 @@ Directories and files under the sysfs for Each Physical Device
> > > > > >    [<type-id>], device_api, and available_instances are mandatory attributes
> > > > > >    that should be provided by vendor driver.
> > > > > >
> > > > > > +  version is a mandatory attribute if a mdev device supports live migration.
> > > > >
> > > > > What about "An mdev device wishing to support live migration must
> > > > > provide the version attribute."?
> > > > yes, I just want to keep consistent with the line above it
> > > > " [<type-id>], device_api, and available_instances are mandatory attributes
> > > >   that should be provided by vendor driver."
> > > > what about below one?
> > > >   "version is a mandatory attribute if a mdev device wishing to support live
> > > >   migration."
> > >
> > > My point is that an attribute is not mandatory if it can be left out :)
> > > (I'm not a native speaker, though; maybe this makes perfect sense
> > > after all?)
> > >
> > > Maybe "version is a required attribute if live migration is supported
> > > for an mdev device"?
> > >
> > you are right, "mandatory" may bring some confusion.
> > Maybe
> > "vendor driver must provide version attribute for an mdev device wishing to
> > support live migration." ?
> > based on your first version :)
> 
> "The vendor driver must provide the version attribute for any mdev
> device it wishes to support live migration for." ?
> 
> >
> > > >
> > > >
> > > > > > +
> > > > > >  * [<type-id>]
> > > > > >
> > > > > >    The [<type-id>] name is created by adding the device driver string as a prefix
> > > > > > @@ -246,6 +251,35 @@ Directories and files under the sysfs for Each Physical Device
> > > > > >    This attribute should show the number of devices of type <type-id> that can be
> > > > > >    created.
> > > > > >
> > > > > > +* version
> > > > > > +
> > > > > > +  This attribute is rw. It is used to check whether two devices are compatible
> > > > > > +  for live migration. If this attribute is missing, then the corresponding mdev
> > > > > > +  device is regarded as not supporting live migration.
> > > > > > +
> > > > > > +  It consists of two parts: common part and vendor proprietary part.
> > > > > > +  common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
> > > > > > +               device type. e.g., for pci device, it is
> > > > > > +               "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > > > > > +  vendor proprietary part: this part is varied in length. vendor driver can
> > > > > > +               specify any string to identify a device.
> > > > > > +
> > > > > > +  When reading this attribute, it should show device version string of the device
> > > > > > +  of type <type-id>. If a device does not support live migration, it should
> > > > > > +  return errno.
> > > > > > +  When writing a string to this attribute, it returns errno for incompatibility
> > > > > > +  or returns written string length in compatibility case. If a device does not
> > > > > > +  support live migration, it always returns errno.
> > > > >
> > > > > I'm not sure whether a device that does not support live migration
> > > > > should expose this attribute in the first place. Or is that to cover
> > > > > cases where a driver supports live migration only for some of the
> > > > > devices it supports?
> > > > yes, driver returning error code is to cover the cases where only part of devices it
> > > > supports can be migrated.
> > > >
> > > >
> > > > > Also, I'm not sure if a string that has to be parsed is a good idea...
> > > > > is this 'version' attribute supposed to convey some human-readable
> > > > > information as well? The procedure you describe for compatibility
> > > > > checking does the checking within the vendor driver which I would
> > > > > expect to have a table/rules for that anyway.
> > > > right. if a vendor driver has the confidence to migrate between devices of
> > > > diffent platform or mdev types, it can maintain a compatibility table for that
> > > > purpose. That's the reason why we would leave the compatibility check to vendor
> > > > driver. vendor driver can freely choose its own complicated way to decide
> > > > which device is migratable to which device.
> > >
> > > I think there are two scenarios here:
> > > - Migrating between different device types, which is unlikely to work,
> > >   except in special cases.
> > > - Migrating between different versions of the same device type, which
> > >   may work for some drivers/devices (and at least migrating to a newer
> > >   version looks quite reasonable).
> > >
> > > But both should be something that is decided by the individual driver;
> > > I hope we don't want to support migration between different drivers :-O
> > >
> > > Can we make this a driver-defined format?
> > >
> > yes, this is indeed driver-defined format.
> > Actually we define it into two parts: common part and vendor proprietary part.
> > common part: 32 bit. lower 16 bits is vendor id and higher 16 bits
> >              identifies device type. e.g., for pci device, it is
> >              "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
> > vendor proprietary part: this part is varied in length. vendor driver can
> >              specify any string to identify a device.
> >
> > vendor proprietary part is defined by vendor driver. vendor driver can
> > define any format it wishes to use. Also it is its own responsibility to
> > ensure backward compatibility if it wants to update format definition in this
> > part.
> >
> > So user space only needs to get source side's version string, and asks
> > target side whether the two are compatible. The decision maker is the
> > vendor driver:)
> 
> If I followed the discussion correctly, I think you plan to drop this
> format, don't you? I'd be happy if a vendor driver can use a simple
> number without any prefixes if it so chooses.
> 
> I also like the idea of renaming this "migration_version" so that it is
> clear we're dealing with versioning of the migration capability (and
> not a version of the device or so).
hi Cornelia,
sorry I just saw this mail after sending v2 of this patch set...
yes, I dropped the common part and vendor driver now can define whatever it
wishes to identify a device version.
However, I don't agree to rename it to "migration_version", as it still may
bring some kind of confusing with the migration version a vendor driver is
using, e.g. vendor driver changes migration code and increases that migration
version.
In fact, what info we want to get from this attribute is whether this mdev
device is compatible with another mdev device, which is tied to device, and not
necessarily bound to migration.

do you think so?

Thanks
Yan
> _______________________________________________
> intel-gvt-dev mailing list
> intel-gvt-dev@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/intel-gvt-dev
Cornelia Huck May 7, 2019, 8:51 a.m. UTC | #26
On Tue, 7 May 2019 01:39:13 -0400
Yan Zhao <yan.y.zhao@intel.com> wrote:

> On Tue, Apr 30, 2019 at 11:29:08PM +0800, Cornelia Huck wrote:

> > If I followed the discussion correctly, I think you plan to drop this
> > format, don't you? I'd be happy if a vendor driver can use a simple
> > number without any prefixes if it so chooses.
> > 
> > I also like the idea of renaming this "migration_version" so that it is
> > clear we're dealing with versioning of the migration capability (and
> > not a version of the device or so).  
> hi Cornelia,
> sorry I just saw this mail after sending v2 of this patch set...
> yes, I dropped the common part and vendor driver now can define whatever it
> wishes to identify a device version.

Ok, I'll look at v2.

> However, I don't agree to rename it to "migration_version", as it still may
> bring some kind of confusing with the migration version a vendor driver is
> using, e.g. vendor driver changes migration code and increases that migration
> version.
> In fact, what info we want to get from this attribute is whether this mdev
> device is compatible with another mdev device, which is tied to device, and not
> necessarily bound to migration.
> 
> do you think so?

I'm not 100% convinced; but we can continue the discussion on v2.
diff mbox series

Patch

diff --git a/Documentation/vfio-mediated-device.txt b/Documentation/vfio-mediated-device.txt
index c3f69bcaf96e..bc28471c0667 100644
--- a/Documentation/vfio-mediated-device.txt
+++ b/Documentation/vfio-mediated-device.txt
@@ -202,6 +202,7 @@  Directories and files under the sysfs for Each Physical Device
   |     |   |--- available_instances
   |     |   |--- device_api
   |     |   |--- description
+  |     |   |--- version
   |     |   |--- [devices]
   |     |--- [<type-id>]
   |     |   |--- create
@@ -209,6 +210,7 @@  Directories and files under the sysfs for Each Physical Device
   |     |   |--- available_instances
   |     |   |--- device_api
   |     |   |--- description
+  |     |   |--- version
   |     |   |--- [devices]
   |     |--- [<type-id>]
   |          |--- create
@@ -216,6 +218,7 @@  Directories and files under the sysfs for Each Physical Device
   |          |--- available_instances
   |          |--- device_api
   |          |--- description
+  |          |--- version
   |          |--- [devices]
 
 * [mdev_supported_types]
@@ -225,6 +228,8 @@  Directories and files under the sysfs for Each Physical Device
   [<type-id>], device_api, and available_instances are mandatory attributes
   that should be provided by vendor driver.
 
+  version is a mandatory attribute if a mdev device supports live migration.
+
 * [<type-id>]
 
   The [<type-id>] name is created by adding the device driver string as a prefix
@@ -246,6 +251,35 @@  Directories and files under the sysfs for Each Physical Device
   This attribute should show the number of devices of type <type-id> that can be
   created.
 
+* version
+
+  This attribute is rw. It is used to check whether two devices are compatible
+  for live migration. If this attribute is missing, then the corresponding mdev
+  device is regarded as not supporting live migration.
+
+  It consists of two parts: common part and vendor proprietary part.
+  common part: 32 bit. lower 16 bits is vendor id and higher 16 bits identifies
+               device type. e.g., for pci device, it is
+               "pci vendor id" | (VFIO_DEVICE_FLAGS_PCI << 16).
+  vendor proprietary part: this part is varied in length. vendor driver can
+               specify any string to identify a device.
+
+  When reading this attribute, it should show device version string of the device
+  of type <type-id>. If a device does not support live migration, it should
+  return errno.
+  When writing a string to this attribute, it returns errno for incompatibility
+  or returns written string length in compatibility case. If a device does not
+  support live migration, it always returns errno.
+
+  for example.
+  # cat \
+ /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_2/version
+  00028086-193b-i915-GVTg_V5_2
+
+  #echo 00028086-193b-i915-GVTg_V5_2 > \
+ /sys/bus/pci/devices/0000\:00\:02.0/mdev_supported_types/i915-GVTg_V5_4/version
+ -bash: echo: write error: Invalid argument
+
 * [device]
 
   This directory contains links to the devices of type <type-id> that have been
@@ -327,12 +361,14 @@  card.
         |   |   |-- available_instances
         |   |   |-- create
         |   |   |-- device_api
+        |   |   |-- version
         |   |   |-- devices
         |   |   `-- name
         |   `-- mtty-2
         |       |-- available_instances
         |       |-- create
         |       |-- device_api
+        |       |-- version
         |       |-- devices
         |       `-- name
         |-- mtty_dev
diff --git a/samples/vfio-mdev/mbochs.c b/samples/vfio-mdev/mbochs.c
index b038aa9f5a70..2f5ba96b91a2 100644
--- a/samples/vfio-mdev/mbochs.c
+++ b/samples/vfio-mdev/mbochs.c
@@ -1391,11 +1391,28 @@  static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
 }
 MDEV_TYPE_ATTR_RO(device_api);
 
+static ssize_t version_show(struct kobject *kobj, struct device *dev,
+		char *buf)
+{
+	/* do not support live migration */
+	return -EINVAL;
+}
+
+static ssize_t version_store(struct kobject *kobj, struct device *dev,
+		const char *buf, size_t count)
+{
+	/* do not support live migration */
+	return -EINVAL;
+}
+
+static MDEV_TYPE_ATTR_RW(version);
+
 static struct attribute *mdev_types_attrs[] = {
 	&mdev_type_attr_name.attr,
 	&mdev_type_attr_description.attr,
 	&mdev_type_attr_device_api.attr,
 	&mdev_type_attr_available_instances.attr,
+	&mdev_type_attr_version.attr,
 	NULL,
 };
 
diff --git a/samples/vfio-mdev/mdpy.c b/samples/vfio-mdev/mdpy.c
index cc86bf6566e4..ff15fdfc7d46 100644
--- a/samples/vfio-mdev/mdpy.c
+++ b/samples/vfio-mdev/mdpy.c
@@ -695,11 +695,27 @@  static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
 }
 MDEV_TYPE_ATTR_RO(device_api);
 
+static ssize_t version_show(struct kobject *kobj, struct device *dev,
+		char *buf)
+{
+	/* do not support live migration */
+	return -EINVAL;
+}
+
+static ssize_t version_store(struct kobject *kobj, struct device *dev,
+		const char *buf, size_t count)
+{
+	/* do not support live migration */
+	return -EINVAL;
+}
+static MDEV_TYPE_ATTR_RW(version);
+
 static struct attribute *mdev_types_attrs[] = {
 	&mdev_type_attr_name.attr,
 	&mdev_type_attr_description.attr,
 	&mdev_type_attr_device_api.attr,
 	&mdev_type_attr_available_instances.attr,
+	&mdev_type_attr_version.attr,
 	NULL,
 };
 
diff --git a/samples/vfio-mdev/mtty.c b/samples/vfio-mdev/mtty.c
index 1c77c370c92f..4ae3aad3474d 100644
--- a/samples/vfio-mdev/mtty.c
+++ b/samples/vfio-mdev/mtty.c
@@ -1390,10 +1390,26 @@  static ssize_t device_api_show(struct kobject *kobj, struct device *dev,
 
 MDEV_TYPE_ATTR_RO(device_api);
 
+static ssize_t version_show(struct kobject *kobj, struct device *dev,
+		char *buf)
+{
+	/* do not support live migration */
+	return -EINVAL;
+}
+
+static ssize_t version_store(struct kobject *kobj, struct device *dev,
+		const char *buf, size_t count)
+{
+	/* do not support live migration */
+	return -EINVAL;
+}
+
+static MDEV_TYPE_ATTR_RW(version);
 static struct attribute *mdev_types_attrs[] = {
 	&mdev_type_attr_name.attr,
 	&mdev_type_attr_device_api.attr,
 	&mdev_type_attr_available_instances.attr,
+	&mdev_type_attr_version.attr,
 	NULL,
 };