Message ID | 1487910561-17825-1-git-send-email-peterx@redhat.com |
---|---|
State | New |
Headers | show |
Hello Peter, This solution looks to check dependency of 'vfio-pci' over 'intel-iommu' before 'intel-iommu' is not initialized. Overall it looks good to me, just a small nit below. > > Intel vIOMMU devices are created with "-device" parameter, while here > actually we need to make sure the dmar device be created before other > PCI devices (like vfio-pci) so that we know iommu_fn will be setup > correctly before realizations of those PCI devices (it is sensible that > PCI device fetch these info during its realization). Now this ordering > yet cannot be achieved elsewhere, and devices will be created in the > order that user specified. That might be dangerous. > > Here we add one more function to detect this kind of misordering issue, > then report to guest. Currently, the only known device that is affected > by this VT-d defect is the vfio-pci typed devices. So for now we just > check against it to make sure we are safe. > > Signed-off-by: Peter Xu <peterx@redhat.com> > --- > hw/i386/intel_iommu.c | 22 ++++++++++++++++++++++ > 1 file changed, 22 insertions(+) > > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c > index 22d8226..b723ece 100644 > --- a/hw/i386/intel_iommu.c > +++ b/hw/i386/intel_iommu.c > @@ -2560,6 +2560,24 @@ static bool vtd_decide_config(IntelIOMMUState *s, > Error **errp) > return true; > } > > +/* > + * TODO: we should have a better way to achieve the ordering rather > + * than this misorder check explicitly against vfio-pci. After all, no > + * one should be blamed for this, and vfio-pci did nothing wrong. > + */ > +static bool vtd_detected_misorder_init(Error **errp) > +{ > + Object *dev = object_resolve_path_type("", "vfio-pci", NULL); > + > + if (dev) { > + error_setg(errp, "Please specify \"intel-iommu\" before all the rest "before all the rest" does not give much clue to user. Do you think better error message would help? just a thought. > " > + "of the devices."); > + return true; > + } > + > + return false; > +} > + > static void vtd_realize(DeviceState *dev, Error **errp) > { > PCMachineState *pcms = PC_MACHINE(qdev_get_machine()); > @@ -2567,6 +2585,10 @@ static void vtd_realize(DeviceState *dev, Error > **errp) > IntelIOMMUState *s = INTEL_IOMMU_DEVICE(dev); > X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(dev); > > + if (vtd_detected_misorder_init(errp)) { > + return; > + } > + > VTD_DPRINTF(GENERAL, ""); > x86_iommu->type = TYPE_INTEL; > > -- > 2.7.4 > > >
On Fri, Feb 24, 2017 at 12:07:33AM -0500, Pankaj Gupta wrote: > Hello Peter, Hi, Pankaj! > > This solution looks to check dependency of 'vfio-pci' over 'intel-iommu' > before 'intel-iommu' is not initialized. > > Overall it looks good to me, just a small nit below. > > > > > Intel vIOMMU devices are created with "-device" parameter, while here > > actually we need to make sure the dmar device be created before other > > PCI devices (like vfio-pci) so that we know iommu_fn will be setup > > correctly before realizations of those PCI devices (it is sensible that > > PCI device fetch these info during its realization). Now this ordering > > yet cannot be achieved elsewhere, and devices will be created in the > > order that user specified. That might be dangerous. > > > > Here we add one more function to detect this kind of misordering issue, > > then report to guest. Currently, the only known device that is affected > > by this VT-d defect is the vfio-pci typed devices. So for now we just > > check against it to make sure we are safe. > > > > Signed-off-by: Peter Xu <peterx@redhat.com> > > --- > > hw/i386/intel_iommu.c | 22 ++++++++++++++++++++++ > > 1 file changed, 22 insertions(+) > > > > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c > > index 22d8226..b723ece 100644 > > --- a/hw/i386/intel_iommu.c > > +++ b/hw/i386/intel_iommu.c > > @@ -2560,6 +2560,24 @@ static bool vtd_decide_config(IntelIOMMUState *s, > > Error **errp) > > return true; > > } > > > > +/* > > + * TODO: we should have a better way to achieve the ordering rather > > + * than this misorder check explicitly against vfio-pci. After all, no > > + * one should be blamed for this, and vfio-pci did nothing wrong. > > + */ > > +static bool vtd_detected_misorder_init(Error **errp) > > +{ > > + Object *dev = object_resolve_path_type("", "vfio-pci", NULL); > > + > > + if (dev) { > > + error_setg(errp, "Please specify \"intel-iommu\" before all the rest > > "before all the rest" does not give much clue to user. Do you think better > error message would help? just a thought. Yes this is my intention to emphasize that we should possibly put intel-iommu even before all the PCI devices. As mentioned above (and also in the commit message), although vfio-pci is the only one that is affected by this, we should probably put intel-iommu even before all the rest of PCI devices. E.g., in the future we can have new devices that need this dependency as well (that vt-d being inited before that device), which is still legal imho. Meanwhile, I intended to avoid naming "vfio-pci" here since it'll let user think of "something bad with vfio-pci" but actually vfio-pci did nothing wrong. Thanks, -- peterx
> > On Fri, Feb 24, 2017 at 12:07:33AM -0500, Pankaj Gupta wrote: > > Hello Peter, > > Hi, Pankaj! > > > > > This solution looks to check dependency of 'vfio-pci' over 'intel-iommu' > > before 'intel-iommu' is not initialized. > > > > Overall it looks good to me, just a small nit below. > > > > > > > > Intel vIOMMU devices are created with "-device" parameter, while here > > > actually we need to make sure the dmar device be created before other > > > PCI devices (like vfio-pci) so that we know iommu_fn will be setup > > > correctly before realizations of those PCI devices (it is sensible that > > > PCI device fetch these info during its realization). Now this ordering > > > yet cannot be achieved elsewhere, and devices will be created in the > > > order that user specified. That might be dangerous. > > > > > > Here we add one more function to detect this kind of misordering issue, > > > then report to guest. Currently, the only known device that is affected > > > by this VT-d defect is the vfio-pci typed devices. So for now we just > > > check against it to make sure we are safe. > > > > > > Signed-off-by: Peter Xu <peterx@redhat.com> > > > --- > > > hw/i386/intel_iommu.c | 22 ++++++++++++++++++++++ > > > 1 file changed, 22 insertions(+) > > > > > > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c > > > index 22d8226..b723ece 100644 > > > --- a/hw/i386/intel_iommu.c > > > +++ b/hw/i386/intel_iommu.c > > > @@ -2560,6 +2560,24 @@ static bool vtd_decide_config(IntelIOMMUState *s, > > > Error **errp) > > > return true; > > > } > > > > > > +/* > > > + * TODO: we should have a better way to achieve the ordering rather > > > + * than this misorder check explicitly against vfio-pci. After all, no > > > + * one should be blamed for this, and vfio-pci did nothing wrong. > > > + */ > > > +static bool vtd_detected_misorder_init(Error **errp) > > > +{ > > > + Object *dev = object_resolve_path_type("", "vfio-pci", NULL); > > > + > > > + if (dev) { > > > + error_setg(errp, "Please specify \"intel-iommu\" before all the > > > rest > > > > "before all the rest" does not give much clue to user. Do you > > think better > > error message would help? just a thought. > > Yes this is my intention to emphasize that we should possibly put > intel-iommu even before all the PCI devices. As mentioned above (and > also in the commit message), although vfio-pci is the only one that is > affected by this, we should probably put intel-iommu even before all > the rest of PCI devices. E.g., in the future we can have new devices > that need this dependency as well (that vt-d being inited before that > device), which is still legal imho. yes, something like this can help "intel-iommu should be specified as first device"? Or we can just check for "intel-iommu" as first device at the start in place of checking for "vfio-pci". This will also help us to check for all other devices. Thanks. > > Meanwhile, I intended to avoid naming "vfio-pci" here since it'll let > user think of "something bad with vfio-pci" but actually vfio-pci did > nothing wrong. o.k > > Thanks, > > -- peterx >
On Fri, Feb 24, 2017 at 01:35:10AM -0500, Pankaj Gupta wrote: > > > > > On Fri, Feb 24, 2017 at 12:07:33AM -0500, Pankaj Gupta wrote: > > > Hello Peter, > > > > Hi, Pankaj! > > > > > > > > This solution looks to check dependency of 'vfio-pci' over 'intel-iommu' > > > before 'intel-iommu' is not initialized. > > > > > > Overall it looks good to me, just a small nit below. > > > > > > > > > > > Intel vIOMMU devices are created with "-device" parameter, while here > > > > actually we need to make sure the dmar device be created before other > > > > PCI devices (like vfio-pci) so that we know iommu_fn will be setup > > > > correctly before realizations of those PCI devices (it is sensible that > > > > PCI device fetch these info during its realization). Now this ordering > > > > yet cannot be achieved elsewhere, and devices will be created in the > > > > order that user specified. That might be dangerous. > > > > > > > > Here we add one more function to detect this kind of misordering issue, > > > > then report to guest. Currently, the only known device that is affected > > > > by this VT-d defect is the vfio-pci typed devices. So for now we just > > > > check against it to make sure we are safe. > > > > > > > > Signed-off-by: Peter Xu <peterx@redhat.com> > > > > --- > > > > hw/i386/intel_iommu.c | 22 ++++++++++++++++++++++ > > > > 1 file changed, 22 insertions(+) > > > > > > > > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c > > > > index 22d8226..b723ece 100644 > > > > --- a/hw/i386/intel_iommu.c > > > > +++ b/hw/i386/intel_iommu.c > > > > @@ -2560,6 +2560,24 @@ static bool vtd_decide_config(IntelIOMMUState *s, > > > > Error **errp) > > > > return true; > > > > } > > > > > > > > +/* > > > > + * TODO: we should have a better way to achieve the ordering rather > > > > + * than this misorder check explicitly against vfio-pci. After all, no > > > > + * one should be blamed for this, and vfio-pci did nothing wrong. > > > > + */ > > > > +static bool vtd_detected_misorder_init(Error **errp) > > > > +{ > > > > + Object *dev = object_resolve_path_type("", "vfio-pci", NULL); > > > > + > > > > + if (dev) { > > > > + error_setg(errp, "Please specify \"intel-iommu\" before all the > > > > rest > > > > > > "before all the rest" does not give much clue to user. Do you > > > think better > > > error message would help? just a thought. > > > > Yes this is my intention to emphasize that we should possibly put > > intel-iommu even before all the PCI devices. As mentioned above (and > > also in the commit message), although vfio-pci is the only one that is > > affected by this, we should probably put intel-iommu even before all > > the rest of PCI devices. E.g., in the future we can have new devices > > that need this dependency as well (that vt-d being inited before that > > device), which is still legal imho. > > yes, something like this can help "intel-iommu should be specified as first device"? I can switch this description if you prefer this one. :) Not sure whether this can be touched up during merge if this patch is good for 2.9, or I'll just repost with yours if there is further comments. > Or we can just check for "intel-iommu" as first device at the start in place of > checking for "vfio-pci". This will also help us to check for all other devices. Actually your suggestion is just what I did in version 1. The problem is that, it's not easy to let vt-d really be the first device to be inited... Please check pc_q35_init(), within that we have: pc_vga_init(isa_bus, host_bus); pc_nic_init(isa_bus, host_bus); These are integrated devices along with ICH9. Such devices are possibly pci devices as well, but they are created much earlier than vt-d device, since that's still during machine init. Thanks, -- peterx
> On Fri, Feb 24, 2017 at 01:35:10AM -0500, Pankaj Gupta wrote: > > > > > > > > On Fri, Feb 24, 2017 at 12:07:33AM -0500, Pankaj Gupta wrote: > > > > Hello Peter, > > > > > > Hi, Pankaj! > > > > > > > > > > > This solution looks to check dependency of 'vfio-pci' over > > > > 'intel-iommu' > > > > before 'intel-iommu' is not initialized. > > > > > > > > Overall it looks good to me, just a small nit below. > > > > > > > > > > > > > > Intel vIOMMU devices are created with "-device" parameter, while here > > > > > actually we need to make sure the dmar device be created before other > > > > > PCI devices (like vfio-pci) so that we know iommu_fn will be setup > > > > > correctly before realizations of those PCI devices (it is sensible > > > > > that > > > > > PCI device fetch these info during its realization). Now this > > > > > ordering > > > > > yet cannot be achieved elsewhere, and devices will be created in the > > > > > order that user specified. That might be dangerous. > > > > > > > > > > Here we add one more function to detect this kind of misordering > > > > > issue, > > > > > then report to guest. Currently, the only known device that is > > > > > affected > > > > > by this VT-d defect is the vfio-pci typed devices. So for now we just > > > > > check against it to make sure we are safe. > > > > > > > > > > Signed-off-by: Peter Xu <peterx@redhat.com> > > > > > --- > > > > > hw/i386/intel_iommu.c | 22 ++++++++++++++++++++++ > > > > > 1 file changed, 22 insertions(+) > > > > > > > > > > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c > > > > > index 22d8226..b723ece 100644 > > > > > --- a/hw/i386/intel_iommu.c > > > > > +++ b/hw/i386/intel_iommu.c > > > > > @@ -2560,6 +2560,24 @@ static bool vtd_decide_config(IntelIOMMUState > > > > > *s, > > > > > Error **errp) > > > > > return true; > > > > > } > > > > > > > > > > +/* > > > > > + * TODO: we should have a better way to achieve the ordering rather > > > > > + * than this misorder check explicitly against vfio-pci. After all, > > > > > no > > > > > + * one should be blamed for this, and vfio-pci did nothing wrong. > > > > > + */ > > > > > +static bool vtd_detected_misorder_init(Error **errp) > > > > > +{ > > > > > + Object *dev = object_resolve_path_type("", "vfio-pci", NULL); > > > > > + > > > > > + if (dev) { > > > > > + error_setg(errp, "Please specify \"intel-iommu\" before all > > > > > the > > > > > rest > > > > > > > > "before all the rest" does not give much clue to user. Do > > > > you > > > > think better > > > > error message would help? just a thought. > > > > > > Yes this is my intention to emphasize that we should possibly put > > > intel-iommu even before all the PCI devices. As mentioned above (and > > > also in the commit message), although vfio-pci is the only one that is > > > affected by this, we should probably put intel-iommu even before all > > > the rest of PCI devices. E.g., in the future we can have new devices > > > that need this dependency as well (that vt-d being inited before that > > > device), which is still legal imho. > > > > yes, something like this can help "intel-iommu should be specified as first > > device"? > > I can switch this description if you prefer this one. :) Not sure > whether this can be touched up during merge if this patch is good for > 2.9, or I'll just repost with yours if there is further comments. Sure. > > > Or we can just check for "intel-iommu" as first device at the start in > > place of > > checking for "vfio-pci". This will also help us to check for all other > > devices. > > Actually your suggestion is just what I did in version 1. The problem > is that, it's not easy to let vt-d really be the first device to be > inited... Please check pc_q35_init(), within that we have: > > pc_vga_init(isa_bus, host_bus); > pc_nic_init(isa_bus, host_bus); > > These are integrated devices along with ICH9. Such devices are > possibly pci devices as well, but they are created much earlier than > vt-d device, since that's still during machine init. o.k. Thanks, Pankaj > > Thanks, > > -- peterx >
On 02/24/2017 06:29 AM, Peter Xu wrote: > Intel vIOMMU devices are created with "-device" parameter, while here > actually we need to make sure the dmar device be created before other > PCI devices (like vfio-pci) so that we know iommu_fn will be setup > correctly before realizations of those PCI devices (it is sensible that > PCI device fetch these info during its realization). Now this ordering > yet cannot be achieved elsewhere, and devices will be created in the > order that user specified. That might be dangerous. > > Here we add one more function to detect this kind of misordering issue, > then report to guest. Currently, the only known device that is affected > by this VT-d defect is the vfio-pci typed devices. So for now we just > check against it to make sure we are safe. > Hi, I can't say that I like it but if we want it for 2.9 maybe we don't have a choice. I mentioned in another thread other idea: Maybe we should follow the same "template" as disk/drive, nic/netdev ? I mean something like -device iommu,id=i1, -device vfio-pci,iommu=e1 . You are able to change the order, I didn't look how it is done. A nice side effect is that you can: 1. Limit the iommu scope only to the devices you want to protect (tweaking APCI tables) 2. In the future we can support multiple iommu devices. Thanks, Marcel > Signed-off-by: Peter Xu <peterx@redhat.com> > --- > hw/i386/intel_iommu.c | 22 ++++++++++++++++++++++ > 1 file changed, 22 insertions(+) > > diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c > index 22d8226..b723ece 100644 > --- a/hw/i386/intel_iommu.c > +++ b/hw/i386/intel_iommu.c > @@ -2560,6 +2560,24 @@ static bool vtd_decide_config(IntelIOMMUState *s, Error **errp) > return true; > } > > +/* > + * TODO: we should have a better way to achieve the ordering rather > + * than this misorder check explicitly against vfio-pci. After all, no > + * one should be blamed for this, and vfio-pci did nothing wrong. > + */ > +static bool vtd_detected_misorder_init(Error **errp) > +{ > + Object *dev = object_resolve_path_type("", "vfio-pci", NULL); > + > + if (dev) { > + error_setg(errp, "Please specify \"intel-iommu\" before all the rest " > + "of the devices."); > + return true; > + } > + > + return false; > +} > + > static void vtd_realize(DeviceState *dev, Error **errp) > { > PCMachineState *pcms = PC_MACHINE(qdev_get_machine()); > @@ -2567,6 +2585,10 @@ static void vtd_realize(DeviceState *dev, Error **errp) > IntelIOMMUState *s = INTEL_IOMMU_DEVICE(dev); > X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(dev); > > + if (vtd_detected_misorder_init(errp)) { > + return; > + } > + > VTD_DPRINTF(GENERAL, ""); > x86_iommu->type = TYPE_INTEL; > >
On Tue, Feb 28, 2017 at 04:42:25PM +0200, Marcel Apfelbaum wrote: > On 02/24/2017 06:29 AM, Peter Xu wrote: > >Intel vIOMMU devices are created with "-device" parameter, while here > >actually we need to make sure the dmar device be created before other > >PCI devices (like vfio-pci) so that we know iommu_fn will be setup > >correctly before realizations of those PCI devices (it is sensible that > >PCI device fetch these info during its realization). Now this ordering > >yet cannot be achieved elsewhere, and devices will be created in the > >order that user specified. That might be dangerous. > > > >Here we add one more function to detect this kind of misordering issue, > >then report to guest. Currently, the only known device that is affected > >by this VT-d defect is the vfio-pci typed devices. So for now we just > >check against it to make sure we are safe. > > > > Hi, > I can't say that I like it but if we want it for 2.9 maybe we don't have a choice. > > I mentioned in another thread other idea: > Maybe we should follow the same "template" as disk/drive, nic/netdev ? > I mean something like -device iommu,id=i1, -device vfio-pci,iommu=e1 . > You are able to change the order, I didn't look how it is done. I suppose that's done by using different keywords. For example: -netdev user,id=net0 -device e1000,netdev=net0 Here we are using "netdev" for the backend and "device" for the frontend. Since we have this difference, we can just make sure the ordering by init netdev first (in net_init_clients()), then the devices (in device_init_func()). > > A nice side effect is that you can: > 1. Limit the iommu scope only to the devices you want to protect (tweaking APCI tables) > 2. In the future we can support multiple iommu devices. Yes. Thanks, -- peterx
On Wed, Mar 01, 2017 at 10:36:35AM +0800, Peter Xu wrote: > On Tue, Feb 28, 2017 at 04:42:25PM +0200, Marcel Apfelbaum wrote: > > On 02/24/2017 06:29 AM, Peter Xu wrote: > > >Intel vIOMMU devices are created with "-device" parameter, while here > > >actually we need to make sure the dmar device be created before other > > >PCI devices (like vfio-pci) so that we know iommu_fn will be setup > > >correctly before realizations of those PCI devices (it is sensible that > > >PCI device fetch these info during its realization). Now this ordering > > >yet cannot be achieved elsewhere, and devices will be created in the > > >order that user specified. That might be dangerous. > > > > > >Here we add one more function to detect this kind of misordering issue, > > >then report to guest. Currently, the only known device that is affected > > >by this VT-d defect is the vfio-pci typed devices. So for now we just > > >check against it to make sure we are safe. > > > > > > > Hi, > > I can't say that I like it but if we want it for 2.9 maybe we don't have a choice. > > > > I mentioned in another thread other idea: > > Maybe we should follow the same "template" as disk/drive, nic/netdev ? > > I mean something like -device iommu,id=i1, -device vfio-pci,iommu=e1 . > > You are able to change the order, I didn't look how it is done. > > I suppose that's done by using different keywords. For example: > > -netdev user,id=net0 -device e1000,netdev=net0 > > Here we are using "netdev" for the backend and "device" for the > frontend. > > Since we have this difference, we can just make sure the ordering by > init netdev first (in net_init_clients()), then the devices (in > device_init_func()). > > > > > A nice side effect is that you can: > > 1. Limit the iommu scope only to the devices you want to protect (tweaking APCI tables) > > 2. In the future we can support multiple iommu devices. Unfortunately AFAIK this needs a bunch of work in practice: host and guest side. So this opens a can of worms: you can all too easily create configurations that we don't support. > > Yes. Thanks, > > -- peterx While I agree this fixes the specific problem, we have the ordering issue in many other places. For example, this is one of the reasons we don't create built-in PC devices using QOM composition. So I think that support for dependencies does make a lot of sense generally.
On 2017年03月01日 11:23, Michael S. Tsirkin wrote: > On Wed, Mar 01, 2017 at 10:36:35AM +0800, Peter Xu wrote: >> On Tue, Feb 28, 2017 at 04:42:25PM +0200, Marcel Apfelbaum wrote: >>> On 02/24/2017 06:29 AM, Peter Xu wrote: >>>> Intel vIOMMU devices are created with "-device" parameter, while here >>>> actually we need to make sure the dmar device be created before other >>>> PCI devices (like vfio-pci) so that we know iommu_fn will be setup >>>> correctly before realizations of those PCI devices (it is sensible that >>>> PCI device fetch these info during its realization). Now this ordering >>>> yet cannot be achieved elsewhere, and devices will be created in the >>>> order that user specified. That might be dangerous. >>>> >>>> Here we add one more function to detect this kind of misordering issue, >>>> then report to guest. Currently, the only known device that is affected >>>> by this VT-d defect is the vfio-pci typed devices. So for now we just >>>> check against it to make sure we are safe. >>>> >>> Hi, >>> I can't say that I like it but if we want it for 2.9 maybe we don't have a choice. >>> >>> I mentioned in another thread other idea: >>> Maybe we should follow the same "template" as disk/drive, nic/netdev ? >>> I mean something like -device iommu,id=i1, -device vfio-pci,iommu=e1 . >>> You are able to change the order, I didn't look how it is done. >> I suppose that's done by using different keywords. For example: >> >> -netdev user,id=net0 -device e1000,netdev=net0 >> >> Here we are using "netdev" for the backend and "device" for the >> frontend. >> >> Since we have this difference, we can just make sure the ordering by >> init netdev first (in net_init_clients()), then the devices (in >> device_init_func()). >> >>> A nice side effect is that you can: >>> 1. Limit the iommu scope only to the devices you want to protect (tweaking APCI tables) >>> 2. In the future we can support multiple iommu devices. > Unfortunately AFAIK this needs a bunch of work in practice: > host and guest side. > So this opens a can of worms: you can all too easily create > configurations that we don't support. > > >> Yes. Thanks, >> >> -- peterx > While I agree this fixes the specific problem, we have the ordering > issue in many other places. Yes, just post a fix for virtio-pci with IOMMU. With this fix, we meet the exact issue as this patch, IOMMU needs to be created before any virtio-pci devices. Thanks > For example, this is one of the reasons we > don't create built-in PC devices using QOM composition. > > So I think that support for dependencies does make a lot of sense > generally. > >
On 03/01/2017 06:14 AM, Jason Wang wrote: > > > On 2017年03月01日 11:23, Michael S. Tsirkin wrote: >> On Wed, Mar 01, 2017 at 10:36:35AM +0800, Peter Xu wrote: >>> On Tue, Feb 28, 2017 at 04:42:25PM +0200, Marcel Apfelbaum wrote: >>>> On 02/24/2017 06:29 AM, Peter Xu wrote: >>>>> Intel vIOMMU devices are created with "-device" parameter, while here >>>>> actually we need to make sure the dmar device be created before other >>>>> PCI devices (like vfio-pci) so that we know iommu_fn will be setup >>>>> correctly before realizations of those PCI devices (it is sensible that >>>>> PCI device fetch these info during its realization). Now this ordering >>>>> yet cannot be achieved elsewhere, and devices will be created in the >>>>> order that user specified. That might be dangerous. >>>>> >>>>> Here we add one more function to detect this kind of misordering issue, >>>>> then report to guest. Currently, the only known device that is affected >>>>> by this VT-d defect is the vfio-pci typed devices. So for now we just >>>>> check against it to make sure we are safe. >>>>> >>>> Hi, >>>> I can't say that I like it but if we want it for 2.9 maybe we don't have a choice. >>>> >>>> I mentioned in another thread other idea: >>>> Maybe we should follow the same "template" as disk/drive, nic/netdev ? >>>> I mean something like -device iommu,id=i1, -device vfio-pci,iommu=e1 . >>>> You are able to change the order, I didn't look how it is done. >>> I suppose that's done by using different keywords. For example: >>> >>> -netdev user,id=net0 -device e1000,netdev=net0 >>> >>> Here we are using "netdev" for the backend and "device" for the >>> frontend. >>> >>> Since we have this difference, we can just make sure the ordering by >>> init netdev first (in net_init_clients()), then the devices (in >>> device_init_func()). >>> >>>> A nice side effect is that you can: >>>> 1. Limit the iommu scope only to the devices you want to protect (tweaking APCI tables) >>>> 2. In the future we can support multiple iommu devices. >> Unfortunately AFAIK this needs a bunch of work in practice: >> host and guest side. >> So this opens a can of worms: you can all too easily create >> configurations that we don't support. >> >> >>> Yes. Thanks, >>> >>> -- peterx >> While I agree this fixes the specific problem, we have the ordering >> issue in many other places. > > Yes, just post a fix for virtio-pci with IOMMU. With this fix, we meet the exact issue as this patch, IOMMU needs to be created before any virtio-pci devices. > Hi Jason, I am not saying we don't need to create the IOMMU before some other device, I just don't think that the command line order should matter to user. BTW, are you working on a way to limit the IOMMU scope to a specific set of devices? If yes, this approach will also help you work. Thanks, Marcel > Thanks > >> For example, this is one of the reasons we >> don't create built-in PC devices using QOM composition. >> >> So I think that support for dependencies does make a lot of sense >> generally. >> >> >
On 2017年03月01日 15:03, Marcel Apfelbaum wrote: > On 03/01/2017 06:14 AM, Jason Wang wrote: >> >> >> On 2017年03月01日 11:23, Michael S. Tsirkin wrote: >>> On Wed, Mar 01, 2017 at 10:36:35AM +0800, Peter Xu wrote: >>>> On Tue, Feb 28, 2017 at 04:42:25PM +0200, Marcel Apfelbaum wrote: >>>>> On 02/24/2017 06:29 AM, Peter Xu wrote: >>>>>> Intel vIOMMU devices are created with "-device" parameter, while >>>>>> here >>>>>> actually we need to make sure the dmar device be created before >>>>>> other >>>>>> PCI devices (like vfio-pci) so that we know iommu_fn will be setup >>>>>> correctly before realizations of those PCI devices (it is >>>>>> sensible that >>>>>> PCI device fetch these info during its realization). Now this >>>>>> ordering >>>>>> yet cannot be achieved elsewhere, and devices will be created in the >>>>>> order that user specified. That might be dangerous. >>>>>> >>>>>> Here we add one more function to detect this kind of misordering >>>>>> issue, >>>>>> then report to guest. Currently, the only known device that is >>>>>> affected >>>>>> by this VT-d defect is the vfio-pci typed devices. So for now we >>>>>> just >>>>>> check against it to make sure we are safe. >>>>>> >>>>> Hi, >>>>> I can't say that I like it but if we want it for 2.9 maybe we >>>>> don't have a choice. >>>>> >>>>> I mentioned in another thread other idea: >>>>> Maybe we should follow the same "template" as disk/drive, >>>>> nic/netdev ? >>>>> I mean something like -device iommu,id=i1, -device >>>>> vfio-pci,iommu=e1 . >>>>> You are able to change the order, I didn't look how it is done. >>>> I suppose that's done by using different keywords. For example: >>>> >>>> -netdev user,id=net0 -device e1000,netdev=net0 >>>> >>>> Here we are using "netdev" for the backend and "device" for the >>>> frontend. >>>> >>>> Since we have this difference, we can just make sure the ordering by >>>> init netdev first (in net_init_clients()), then the devices (in >>>> device_init_func()). >>>> >>>>> A nice side effect is that you can: >>>>> 1. Limit the iommu scope only to the devices you want to protect >>>>> (tweaking APCI tables) >>>>> 2. In the future we can support multiple iommu devices. >>> Unfortunately AFAIK this needs a bunch of work in practice: >>> host and guest side. >>> So this opens a can of worms: you can all too easily create >>> configurations that we don't support. >>> >>> >>>> Yes. Thanks, >>>> >>>> -- peterx >>> While I agree this fixes the specific problem, we have the ordering >>> issue in many other places. >> >> Yes, just post a fix for virtio-pci with IOMMU. With this fix, we >> meet the exact issue as this patch, IOMMU needs to be created before >> any virtio-pci devices. >> > > Hi Jason, > > I am not saying we don't need to create the IOMMU before some other > device, > I just don't think that the command line order should matter to user. > > BTW, are you working on a way to limit the IOMMU scope to a specific > set of devices? > If yes, this approach will also help you work. Not yet, this may work (after 2.9) but I believe we still need a fix for 2.9. Thanks > > Thanks, > Marcel
On 03/01/2017 10:43 AM, Jason Wang wrote: > > > On 2017年03月01日 15:03, Marcel Apfelbaum wrote: >> On 03/01/2017 06:14 AM, Jason Wang wrote: >>> >>> >>> On 2017年03月01日 11:23, Michael S. Tsirkin wrote: >>>> On Wed, Mar 01, 2017 at 10:36:35AM +0800, Peter Xu wrote: >>>>> On Tue, Feb 28, 2017 at 04:42:25PM +0200, Marcel Apfelbaum wrote: >>>>>> On 02/24/2017 06:29 AM, Peter Xu wrote: >>>>>>> Intel vIOMMU devices are created with "-device" parameter, while here >>>>>>> actually we need to make sure the dmar device be created before other >>>>>>> PCI devices (like vfio-pci) so that we know iommu_fn will be setup >>>>>>> correctly before realizations of those PCI devices (it is sensible that >>>>>>> PCI device fetch these info during its realization). Now this ordering >>>>>>> yet cannot be achieved elsewhere, and devices will be created in the >>>>>>> order that user specified. That might be dangerous. >>>>>>> >>>>>>> Here we add one more function to detect this kind of misordering issue, >>>>>>> then report to guest. Currently, the only known device that is affected >>>>>>> by this VT-d defect is the vfio-pci typed devices. So for now we just >>>>>>> check against it to make sure we are safe. >>>>>>> >>>>>> Hi, >>>>>> I can't say that I like it but if we want it for 2.9 maybe we don't have a choice. >>>>>> >>>>>> I mentioned in another thread other idea: >>>>>> Maybe we should follow the same "template" as disk/drive, nic/netdev ? >>>>>> I mean something like -device iommu,id=i1, -device vfio-pci,iommu=e1 . >>>>>> You are able to change the order, I didn't look how it is done. >>>>> I suppose that's done by using different keywords. For example: >>>>> >>>>> -netdev user,id=net0 -device e1000,netdev=net0 >>>>> >>>>> Here we are using "netdev" for the backend and "device" for the >>>>> frontend. >>>>> >>>>> Since we have this difference, we can just make sure the ordering by >>>>> init netdev first (in net_init_clients()), then the devices (in >>>>> device_init_func()). >>>>> >>>>>> A nice side effect is that you can: >>>>>> 1. Limit the iommu scope only to the devices you want to protect (tweaking APCI tables) >>>>>> 2. In the future we can support multiple iommu devices. >>>> Unfortunately AFAIK this needs a bunch of work in practice: >>>> host and guest side. >>>> So this opens a can of worms: you can all too easily create >>>> configurations that we don't support. >>>> >>>> >>>>> Yes. Thanks, >>>>> >>>>> -- peterx >>>> While I agree this fixes the specific problem, we have the ordering >>>> issue in many other places. >>> >>> Yes, just post a fix for virtio-pci with IOMMU. With this fix, we meet the exact issue as this patch, IOMMU needs to be created before any virtio-pci devices. >>> >> >> Hi Jason, >> >> I am not saying we don't need to create the IOMMU before some other device, >> I just don't think that the command line order should matter to user. >> >> BTW, are you working on a way to limit the IOMMU scope to a specific set of devices? >> If yes, this approach will also help you work. > > Not yet, this may work (after 2.9) but I believe we still need a fix for 2.9. > I agree, I am thinking about sending a patch that at least takes the "vfio-pci" check outside the iommu code in a separate header file. We can then add other devices that need to be crated before the iommu. Thanks, Marcel > Thanks > >> >> Thanks, >> Marcel >
On Wed, Mar 01, 2017 at 09:03:47AM +0200, Marcel Apfelbaum wrote: > On 03/01/2017 06:14 AM, Jason Wang wrote: [...] > Hi Jason, > > I am not saying we don't need to create the IOMMU before some other device, > I just don't think that the command line order should matter to user. > > BTW, are you working on a way to limit the IOMMU scope to a specific set of devices? > If yes, this approach will also help you work. Marcel, Do you have any plan after 2.9 to re-arrange the init order thing a bit? In general, maybe we need an ordering support for devices. Besides that, I am thinking whether it would be wise that we just init the IOMMUs during machine init just like before, but keep the "-device" interface? After all, at least the root IOMMU device should be with root complex, and it feels a little strange we just split the init process apart (we delay the IOMMU init into device initializations). Not sure whether above makes sense though. Thanks, -- peterx
On 03/01/2017 11:18 AM, Peter Xu wrote: > On Wed, Mar 01, 2017 at 09:03:47AM +0200, Marcel Apfelbaum wrote: >> On 03/01/2017 06:14 AM, Jason Wang wrote: > > [...] > >> Hi Jason, >> >> I am not saying we don't need to create the IOMMU before some other device, >> I just don't think that the command line order should matter to user. >> >> BTW, are you working on a way to limit the IOMMU scope to a specific set of devices? >> If yes, this approach will also help you work. > > Marcel, > > Do you have any plan after 2.9 to re-arrange the init order thing a > bit? In general, maybe we need an ordering support for devices. I don;t know when I'll start it, but yes, I am thinking about taking this project. We need the ordering in order to be able to have less "built-in" devices. > Besides that, I am thinking whether it would be wise that we just init > the IOMMUs during machine init just like before, but keep the > "-device" interface? After all, at least the root IOMMU device should > be with root complex, and it feels a little strange we just split the > init process apart (we delay the IOMMU init into device > initializations). > Keeping the IOMMU creation as part of the Root Complex is problematic. What happens if we want to limit the IOMMU scope to a subset of devices? How will the command line look? Also we may want/need multiple iommu devices per system. It will not happen tomorrow, but we don't want to loose this possibility. The device re-ordering is not 2.9 material of course, and we need your patch. The only thing we can do better is to take out the "vfio-pci" in another header file and and have it in a array of devices that should be checked (in order to avoid polluting the IOMMU code with a not related device) I can try and send a patch for it if you prefer. Thanks, Marcel > Not sure whether above makes sense though. Thanks, > > -- peterx >
On Wed, Mar 01, 2017 at 11:29:47AM +0200, Marcel Apfelbaum wrote: > On 03/01/2017 11:18 AM, Peter Xu wrote: > >On Wed, Mar 01, 2017 at 09:03:47AM +0200, Marcel Apfelbaum wrote: > >>On 03/01/2017 06:14 AM, Jason Wang wrote: > > > >[...] > > > >>Hi Jason, > >> > >>I am not saying we don't need to create the IOMMU before some other device, > >>I just don't think that the command line order should matter to user. > >> > >>BTW, are you working on a way to limit the IOMMU scope to a specific set of devices? > >>If yes, this approach will also help you work. > > > >Marcel, > > > >Do you have any plan after 2.9 to re-arrange the init order thing a > >bit? In general, maybe we need an ordering support for devices. > > I don;t know when I'll start it, but yes, I am thinking about > taking this project. We need the ordering in order to be able > to have less "built-in" devices. That'll be great! > > >Besides that, I am thinking whether it would be wise that we just init > >the IOMMUs during machine init just like before, but keep the > >"-device" interface? After all, at least the root IOMMU device should > >be with root complex, and it feels a little strange we just split the > >init process apart (we delay the IOMMU init into device > >initializations). > > > > Keeping the IOMMU creation as part of the Root Complex is problematic. > What happens if we want to limit the IOMMU scope to a subset of devices? > How will the command line look? Also we may want/need multiple iommu > devices per system. It will not happen tomorrow, but we don't want to loose > this possibility. I think that won't be a big problem. E.g., I don't see big problem if we just create all these vIOMMUs along with machine init. Is there? IMHO we can just pick these vIOMMU "devices" out of the device list, and we can avoid doing that again in general device_init_func(). > > The device re-ordering is not 2.9 material of course, and we need your patch. > The only thing we can do better is to take out the "vfio-pci" in another header > file and and have it in a array of devices that should be checked > (in order to avoid polluting the IOMMU code with a not related device) > > I can try and send a patch for it if you prefer. IMHO it'll be okay we "pollute" vtd code for a short while. We can revert my patch (or any workarounds, like Jason's just posted one) when we have device ordering support, and when we have a correct bus_master_as when device init. Thanks, -- peterx
On 03/01/2017 11:59 AM, Peter Xu wrote: > On Wed, Mar 01, 2017 at 11:29:47AM +0200, Marcel Apfelbaum wrote: >> On 03/01/2017 11:18 AM, Peter Xu wrote: >>> On Wed, Mar 01, 2017 at 09:03:47AM +0200, Marcel Apfelbaum wrote: >>>> On 03/01/2017 06:14 AM, Jason Wang wrote: >>> >>> [...] >>> >>>> Hi Jason, >>>> >>>> I am not saying we don't need to create the IOMMU before some other device, >>>> I just don't think that the command line order should matter to user. >>>> >>>> BTW, are you working on a way to limit the IOMMU scope to a specific set of devices? >>>> If yes, this approach will also help you work. >>> >>> Marcel, >>> >>> Do you have any plan after 2.9 to re-arrange the init order thing a >>> bit? In general, maybe we need an ordering support for devices. >> >> I don;t know when I'll start it, but yes, I am thinking about >> taking this project. We need the ordering in order to be able >> to have less "built-in" devices. > > That'll be great! > >> >>> Besides that, I am thinking whether it would be wise that we just init >>> the IOMMUs during machine init just like before, but keep the >>> "-device" interface? After all, at least the root IOMMU device should >>> be with root complex, and it feels a little strange we just split the >>> init process apart (we delay the IOMMU init into device >>> initializations). >>> >> >> Keeping the IOMMU creation as part of the Root Complex is problematic. >> What happens if we want to limit the IOMMU scope to a subset of devices? >> How will the command line look? Also we may want/need multiple iommu >> devices per system. It will not happen tomorrow, but we don't want to loose >> this possibility. > > I think that won't be a big problem. E.g., I don't see big problem if > we just create all these vIOMMUs along with machine init. Is there? > How would you assign them do devices? The "normal" QEMU cmd line would look like I mentioned earlier: -device iommu,id=iommu1 -device pcie-root,iommu=iommu1 \ -device pcie-root \ -device iommu,id=iommu2 -device pcie-root,iommu=iommu2 \ How do you propose to do it otherwise? > IMHO we can just pick these vIOMMU "devices" out of the device list, > and we can avoid doing that again in general device_init_func(). > >> >> The device re-ordering is not 2.9 material of course, and we need your patch. >> The only thing we can do better is to take out the "vfio-pci" in another header >> file and and have it in a array of devices that should be checked >> (in order to avoid polluting the IOMMU code with a not related device) >> >> I can try and send a patch for it if you prefer. > > IMHO it'll be okay we "pollute" vtd code for a short while. We can > revert my patch (or any workarounds, like Jason's just posted one) Can you please send me a link to Jason's patch? If nobody else objects I am OK with it. Thanks, Marcel > when we have device ordering support, and when we have a correct > bus_master_as when device init. > > Thanks, > > -- peterx >
On Wed, Mar 01, 2017 at 12:14:21PM +0800, Jason Wang wrote: > On 2017年03月01日 11:23, Michael S. Tsirkin wrote: [...] > >While I agree this fixes the specific problem, we have the ordering > >issue in many other places. > > Yes, just post a fix for virtio-pci with IOMMU. With this fix, we meet the > exact issue as this patch, IOMMU needs to be created before any virtio-pci > devices. I posted v3 to include detection of virtio-pci devices. Hope that's what we needed for 2.9. Thanks, -- peterx
On Wed, Mar 01, 2017 at 02:32:34PM +0200, Marcel Apfelbaum wrote: > On 03/01/2017 11:59 AM, Peter Xu wrote: > >On Wed, Mar 01, 2017 at 11:29:47AM +0200, Marcel Apfelbaum wrote: > >>On 03/01/2017 11:18 AM, Peter Xu wrote: > >>>On Wed, Mar 01, 2017 at 09:03:47AM +0200, Marcel Apfelbaum wrote: > >>>>On 03/01/2017 06:14 AM, Jason Wang wrote: > >>> > >>>[...] > >>> > >>>>Hi Jason, > >>>> > >>>>I am not saying we don't need to create the IOMMU before some other device, > >>>>I just don't think that the command line order should matter to user. > >>>> > >>>>BTW, are you working on a way to limit the IOMMU scope to a specific set of devices? > >>>>If yes, this approach will also help you work. > >>> > >>>Marcel, > >>> > >>>Do you have any plan after 2.9 to re-arrange the init order thing a > >>>bit? In general, maybe we need an ordering support for devices. > >> > >>I don;t know when I'll start it, but yes, I am thinking about > >>taking this project. We need the ordering in order to be able > >>to have less "built-in" devices. > > > >That'll be great! > > > >> > >>>Besides that, I am thinking whether it would be wise that we just init > >>>the IOMMUs during machine init just like before, but keep the > >>>"-device" interface? After all, at least the root IOMMU device should > >>>be with root complex, and it feels a little strange we just split the > >>>init process apart (we delay the IOMMU init into device > >>>initializations). > >>> > >> > >>Keeping the IOMMU creation as part of the Root Complex is problematic. > >>What happens if we want to limit the IOMMU scope to a subset of devices? > >>How will the command line look? Also we may want/need multiple iommu > >>devices per system. It will not happen tomorrow, but we don't want to loose > >>this possibility. > > > >I think that won't be a big problem. E.g., I don't see big problem if > >we just create all these vIOMMUs along with machine init. Is there? > > > > How would you assign them do devices? > > The "normal" QEMU cmd line would look like I mentioned earlier: > > -device iommu,id=iommu1 -device pcie-root,iommu=iommu1 \ > -device pcie-root \ > -device iommu,id=iommu2 -device pcie-root,iommu=iommu2 \ > > How do you propose to do it otherwise? I totally agree that we can use the way above when we wants to bind device with specific vIOMMU device. I was just wondering whether we can move the init of "-device intel-iommu,..." to machine init phase, no matter whether it's the vIOMMU on the root complex or not. > > >IMHO we can just pick these vIOMMU "devices" out of the device list, > >and we can avoid doing that again in general device_init_func(). > > > >> > >>The device re-ordering is not 2.9 material of course, and we need your patch. > >>The only thing we can do better is to take out the "vfio-pci" in another header > >>file and and have it in a array of devices that should be checked > >>(in order to avoid polluting the IOMMU code with a not related device) > >> > >>I can try and send a patch for it if you prefer. > > > >IMHO it'll be okay we "pollute" vtd code for a short while. We can > >revert my patch (or any workarounds, like Jason's just posted one) > > Can you please send me a link to Jason's patch? Sorry for the unclearness! This one: https://lists.gnu.org/archive/html/qemu-devel/2017-02/msg07844.html I just posted v3 for this patch to co-op with Jason's patch. Thanks, -- peterx
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c index 22d8226..b723ece 100644 --- a/hw/i386/intel_iommu.c +++ b/hw/i386/intel_iommu.c @@ -2560,6 +2560,24 @@ static bool vtd_decide_config(IntelIOMMUState *s, Error **errp) return true; } +/* + * TODO: we should have a better way to achieve the ordering rather + * than this misorder check explicitly against vfio-pci. After all, no + * one should be blamed for this, and vfio-pci did nothing wrong. + */ +static bool vtd_detected_misorder_init(Error **errp) +{ + Object *dev = object_resolve_path_type("", "vfio-pci", NULL); + + if (dev) { + error_setg(errp, "Please specify \"intel-iommu\" before all the rest " + "of the devices."); + return true; + } + + return false; +} + static void vtd_realize(DeviceState *dev, Error **errp) { PCMachineState *pcms = PC_MACHINE(qdev_get_machine()); @@ -2567,6 +2585,10 @@ static void vtd_realize(DeviceState *dev, Error **errp) IntelIOMMUState *s = INTEL_IOMMU_DEVICE(dev); X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(dev); + if (vtd_detected_misorder_init(errp)) { + return; + } + VTD_DPRINTF(GENERAL, ""); x86_iommu->type = TYPE_INTEL;
Intel vIOMMU devices are created with "-device" parameter, while here actually we need to make sure the dmar device be created before other PCI devices (like vfio-pci) so that we know iommu_fn will be setup correctly before realizations of those PCI devices (it is sensible that PCI device fetch these info during its realization). Now this ordering yet cannot be achieved elsewhere, and devices will be created in the order that user specified. That might be dangerous. Here we add one more function to detect this kind of misordering issue, then report to guest. Currently, the only known device that is affected by this VT-d defect is the vfio-pci typed devices. So for now we just check against it to make sure we are safe. Signed-off-by: Peter Xu <peterx@redhat.com> --- hw/i386/intel_iommu.c | 22 ++++++++++++++++++++++ 1 file changed, 22 insertions(+)