Message ID | 20230629040707.115656-6-anisinha@redhat.com |
---|---|
State | New |
Headers | show |
Series | test and QEMU fixes to ensure proper PCIE device usage | expand |
On 2023/06/29 13:07, Ani Sinha wrote: > PCI Express ports only have one slot, so PCI Express devices can only be > plugged into slot 0 on a PCIE port. Enforce it. > > The change has been tested to not break ARI by instantiating seven vfs on an > emulated igb device (the maximum number of vfs the linux igb driver supports). > The vfs are seen to have non-zero device/slot numbers in the conventional > PCI BDF representation. > > CC: jusual@redhat.com > CC: imammedo@redhat.com > CC: akihiko.odaki@daynix.com > > Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2128929 > Signed-off-by: Ani Sinha <anisinha@redhat.com> > Reviewed-by: Julia Suvorova <jusual@redhat.com> > --- > hw/pci/pci.c | 15 +++++++++++++++ > 1 file changed, 15 insertions(+) > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c > index e2eb4c3b4a..0320ac2bb3 100644 > --- a/hw/pci/pci.c > +++ b/hw/pci/pci.c > @@ -65,6 +65,7 @@ bool pci_available = true; > static char *pcibus_get_dev_path(DeviceState *dev); > static char *pcibus_get_fw_dev_path(DeviceState *dev); > static void pcibus_reset(BusState *qbus); > +static bool pcie_has_upstream_port(PCIDevice *dev); > > static Property pci_props[] = { > DEFINE_PROP_PCI_DEVFN("addr", PCIDevice, devfn, -1), > @@ -1190,6 +1191,20 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, > name); > > return NULL; > + } /* > + * With SRIOV and ARI, vfs can have non-zero slot in the conventional > + * PCI interpretation as all five bits reserved for slot addresses are > + * also used for function bits for the various vfs. Ignore that case. > + * It is too early here to check for ARI capabilities in the PCI config > + * space. Hence, we check for a vf device instead. > + */ Why don't just perform this check after the capabilities are set? Regards, Akihiko Odaki > + else if (!pci_is_vf(pci_dev) && > + pcie_has_upstream_port(pci_dev) && > + PCI_SLOT(devfn)) { > + error_setg(errp, "PCI: slot %d is not valid for %s," > + " parent device only allows plugging into slot 0.", > + PCI_SLOT(devfn), name); > + return NULL; > } > > pci_dev->devfn = devfn;
On Thu, 29 Jun, 2023, 12:17 pm Akihiko Odaki, <akihiko.odaki@daynix.com> wrote: > On 2023/06/29 13:07, Ani Sinha wrote: > > PCI Express ports only have one slot, so PCI Express devices can only be > > plugged into slot 0 on a PCIE port. Enforce it. > > > > The change has been tested to not break ARI by instantiating seven vfs > on an > > emulated igb device (the maximum number of vfs the linux igb driver > supports). > > The vfs are seen to have non-zero device/slot numbers in the conventional > > PCI BDF representation. > > > > CC: jusual@redhat.com > > CC: imammedo@redhat.com > > CC: akihiko.odaki@daynix.com > > > > Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2128929 > > Signed-off-by: Ani Sinha <anisinha@redhat.com> > > Reviewed-by: Julia Suvorova <jusual@redhat.com> > > --- > > hw/pci/pci.c | 15 +++++++++++++++ > > 1 file changed, 15 insertions(+) > > > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c > > index e2eb4c3b4a..0320ac2bb3 100644 > > --- a/hw/pci/pci.c > > +++ b/hw/pci/pci.c > > @@ -65,6 +65,7 @@ bool pci_available = true; > > static char *pcibus_get_dev_path(DeviceState *dev); > > static char *pcibus_get_fw_dev_path(DeviceState *dev); > > static void pcibus_reset(BusState *qbus); > > +static bool pcie_has_upstream_port(PCIDevice *dev); > > > > static Property pci_props[] = { > > DEFINE_PROP_PCI_DEVFN("addr", PCIDevice, devfn, -1), > > @@ -1190,6 +1191,20 @@ static PCIDevice > *do_pci_register_device(PCIDevice *pci_dev, > > name); > > > > return NULL; > > + } /* > > + * With SRIOV and ARI, vfs can have non-zero slot in the > conventional > > + * PCI interpretation as all five bits reserved for slot > addresses are > > + * also used for function bits for the various vfs. Ignore that > case. > > + * It is too early here to check for ARI capabilities in the PCI > config > > + * space. Hence, we check for a vf device instead. > > + */ > > Why don't just perform this check after the capabilities are set? > We don't want to allocate resources for wrong device parameters. We want to error out early. Other checks also are performed at the same place . Show quoted text > > > > Regards, > Akihiko Odaki > > > + else if (!pci_is_vf(pci_dev) && > > + pcie_has_upstream_port(pci_dev) && > > + PCI_SLOT(devfn)) { > > + error_setg(errp, "PCI: slot %d is not valid for %s," > > + " parent device only allows plugging into slot 0.", > > + PCI_SLOT(devfn), name); > > + return NULL; > > } > > > > pci_dev->devfn = devfn; > >
On 2023/06/29 17:05, Ani Sinha wrote: > > > On Thu, 29 Jun, 2023, 12:17 pm Akihiko Odaki, <akihiko.odaki@daynix.com > <mailto:akihiko.odaki@daynix.com>> wrote: > > On 2023/06/29 13:07, Ani Sinha wrote: > > PCI Express ports only have one slot, so PCI Express devices can > only be > > plugged into slot 0 on a PCIE port. Enforce it. > > > > The change has been tested to not break ARI by instantiating > seven vfs on an > > emulated igb device (the maximum number of vfs the linux igb > driver supports). > > The vfs are seen to have non-zero device/slot numbers in the > conventional > > PCI BDF representation. > > > > CC: jusual@redhat.com <mailto:jusual@redhat.com> > > CC: imammedo@redhat.com <mailto:imammedo@redhat.com> > > CC: akihiko.odaki@daynix.com <mailto:akihiko.odaki@daynix.com> > > > > Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2128929 > <https://bugzilla.redhat.com/show_bug.cgi?id=2128929> > > Signed-off-by: Ani Sinha <anisinha@redhat.com > <mailto:anisinha@redhat.com>> > > Reviewed-by: Julia Suvorova <jusual@redhat.com > <mailto:jusual@redhat.com>> > > --- > > hw/pci/pci.c | 15 +++++++++++++++ > > 1 file changed, 15 insertions(+) > > > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c > > index e2eb4c3b4a..0320ac2bb3 100644 > > --- a/hw/pci/pci.c > > +++ b/hw/pci/pci.c > > @@ -65,6 +65,7 @@ bool pci_available = true; > > static char *pcibus_get_dev_path(DeviceState *dev); > > static char *pcibus_get_fw_dev_path(DeviceState *dev); > > static void pcibus_reset(BusState *qbus); > > +static bool pcie_has_upstream_port(PCIDevice *dev); > > > > static Property pci_props[] = { > > DEFINE_PROP_PCI_DEVFN("addr", PCIDevice, devfn, -1), > > @@ -1190,6 +1191,20 @@ static PCIDevice > *do_pci_register_device(PCIDevice *pci_dev, > > name); > > > > return NULL; > > + } /* > > + * With SRIOV and ARI, vfs can have non-zero slot in the > conventional > > + * PCI interpretation as all five bits reserved for slot > addresses are > > + * also used for function bits for the various vfs. Ignore > that case. > > + * It is too early here to check for ARI capabilities in > the PCI config > > + * space. Hence, we check for a vf device instead. > > + */ > > Why don't just perform this check after the capabilities are set? > > > We don't want to allocate resources for wrong device parameters. We want > to error out early. Other checks also are performed at the same place . It is indeed better to raise an error as early as possible so that we can avoid allocation and other operations that will be reverted and may go wrong due to the invalid condition. That should be the reason why other checks for the address are performed here. However, in this particular case, we cannot confidently perform the check here because it is unknown if the ARI capability will be advertised until the device realization code runs. This can justify delaying the check after the device realization, unlike the other checks. > Show quoted text > > > > > Regards, > Akihiko Odaki > > > + else if (!pci_is_vf(pci_dev) && > > + pcie_has_upstream_port(pci_dev) && > > + PCI_SLOT(devfn)) { > > + error_setg(errp, "PCI: slot %d is not valid for %s," > > + " parent device only allows plugging into > slot 0.", > > + PCI_SLOT(devfn), name); > > + return NULL; > > } > > > > pci_dev->devfn = devfn; >
> On 29-Jun-2023, at 2:19 PM, Akihiko Odaki <akihiko.odaki@daynix.com> wrote: > > On 2023/06/29 17:05, Ani Sinha wrote: >> On Thu, 29 Jun, 2023, 12:17 pm Akihiko Odaki, <akihiko.odaki@daynix.com <mailto:akihiko.odaki@daynix.com>> wrote: >> On 2023/06/29 13:07, Ani Sinha wrote: >> > PCI Express ports only have one slot, so PCI Express devices can >> only be >> > plugged into slot 0 on a PCIE port. Enforce it. >> > >> > The change has been tested to not break ARI by instantiating >> seven vfs on an >> > emulated igb device (the maximum number of vfs the linux igb >> driver supports). >> > The vfs are seen to have non-zero device/slot numbers in the >> conventional >> > PCI BDF representation. >> > >> > CC: jusual@redhat.com <mailto:jusual@redhat.com> >> > CC: imammedo@redhat.com <mailto:imammedo@redhat.com> >> > CC: akihiko.odaki@daynix.com <mailto:akihiko.odaki@daynix.com> >> > >> > Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2128929 >> <https://bugzilla.redhat.com/show_bug.cgi?id=2128929> >> > Signed-off-by: Ani Sinha <anisinha@redhat.com >> <mailto:anisinha@redhat.com>> >> > Reviewed-by: Julia Suvorova <jusual@redhat.com >> <mailto:jusual@redhat.com>> >> > --- >> > hw/pci/pci.c | 15 +++++++++++++++ >> > 1 file changed, 15 insertions(+) >> > >> > diff --git a/hw/pci/pci.c b/hw/pci/pci.c >> > index e2eb4c3b4a..0320ac2bb3 100644 >> > --- a/hw/pci/pci.c >> > +++ b/hw/pci/pci.c >> > @@ -65,6 +65,7 @@ bool pci_available = true; >> > static char *pcibus_get_dev_path(DeviceState *dev); >> > static char *pcibus_get_fw_dev_path(DeviceState *dev); >> > static void pcibus_reset(BusState *qbus); >> > +static bool pcie_has_upstream_port(PCIDevice *dev); >> > >> > static Property pci_props[] = { >> > DEFINE_PROP_PCI_DEVFN("addr", PCIDevice, devfn, -1), >> > @@ -1190,6 +1191,20 @@ static PCIDevice >> *do_pci_register_device(PCIDevice *pci_dev, >> > name); >> > >> > return NULL; >> > + } /* >> > + * With SRIOV and ARI, vfs can have non-zero slot in the >> conventional >> > + * PCI interpretation as all five bits reserved for slot >> addresses are >> > + * also used for function bits for the various vfs. Ignore >> that case. >> > + * It is too early here to check for ARI capabilities in >> the PCI config >> > + * space. Hence, we check for a vf device instead. >> > + */ >> Why don't just perform this check after the capabilities are set? >> We don't want to allocate resources for wrong device parameters. We want to error out early. Other checks also are performed at the same place . > > It is indeed better to raise an error as early as possible so that we can avoid allocation and other operations that will be reverted and may go wrong due to the invalid condition. That should be the reason why other checks for the address are performed here. > > However, in this particular case, we cannot confidently perform the check here because it is unknown if the ARI capability will be advertised until the device realization code runs. This can justify delaying the check after the device realization, unlike the other checks. Ok so are you proposing that the check we have right before (the check for unoccupied function 0) be also moved? It also uses the same vf approximation for seemingly to support ARI. Also where do you propose we move the check? > >> Show quoted text >> Regards, >> Akihiko Odaki >> > + else if (!pci_is_vf(pci_dev) && >> > + pcie_has_upstream_port(pci_dev) && >> > + PCI_SLOT(devfn)) { >> > + error_setg(errp, "PCI: slot %d is not valid for %s," >> > + " parent device only allows plugging into >> slot 0.", >> > + PCI_SLOT(devfn), name); >> > + return NULL; >> > } >> > >> > pci_dev->devfn = devfn;
On Thu, Jun 29, 2023 at 09:37:07AM +0530, Ani Sinha wrote: > PCI Express ports only have one slot, so PCI Express devices can only be > plugged into slot 0 on a PCIE port. Enforce it. > > The change has been tested to not break ARI by instantiating seven vfs on an > emulated igb device (the maximum number of vfs the linux igb driver supports). I guess we need to test with some other device then? 7 VFs is same slot so hardly a good test. > The vfs are seen to have non-zero device/slot numbers in the conventional > PCI BDF representation. > > CC: jusual@redhat.com > CC: imammedo@redhat.com > CC: akihiko.odaki@daynix.com > > Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2128929 > Signed-off-by: Ani Sinha <anisinha@redhat.com> > Reviewed-by: Julia Suvorova <jusual@redhat.com> > --- > hw/pci/pci.c | 15 +++++++++++++++ > 1 file changed, 15 insertions(+) > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c > index e2eb4c3b4a..0320ac2bb3 100644 > --- a/hw/pci/pci.c > +++ b/hw/pci/pci.c > @@ -65,6 +65,7 @@ bool pci_available = true; > static char *pcibus_get_dev_path(DeviceState *dev); > static char *pcibus_get_fw_dev_path(DeviceState *dev); > static void pcibus_reset(BusState *qbus); > +static bool pcie_has_upstream_port(PCIDevice *dev); > > static Property pci_props[] = { > DEFINE_PROP_PCI_DEVFN("addr", PCIDevice, devfn, -1), > @@ -1190,6 +1191,20 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, > name); > > return NULL; > + } /* > + * With SRIOV and ARI, vfs can have non-zero slot in the conventional > + * PCI interpretation as all five bits reserved for slot addresses are > + * also used for function bits for the various vfs. Ignore that case. > + * It is too early here to check for ARI capabilities in the PCI config > + * space. Hence, we check for a vf device instead. > + */ > + else if (!pci_is_vf(pci_dev) && > + pcie_has_upstream_port(pci_dev) && > + PCI_SLOT(devfn)) { > + error_setg(errp, "PCI: slot %d is not valid for %s," > + " parent device only allows plugging into slot 0.", > + PCI_SLOT(devfn), name); > + return NULL; > } > > pci_dev->devfn = devfn; > -- > 2.39.1
> On 29-Jun-2023, at 7:54 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > > On Thu, Jun 29, 2023 at 09:37:07AM +0530, Ani Sinha wrote: >> PCI Express ports only have one slot, so PCI Express devices can only be >> plugged into slot 0 on a PCIE port. Enforce it. >> >> The change has been tested to not break ARI by instantiating seven vfs on an >> emulated igb device (the maximum number of vfs the linux igb driver supports). > > I guess we need to test with some other device then? 7 VFs is same > slot so hardly a good test. No its not the same slot. Its using different slots/device numbers. I checked that. The same patch was failing without the vf check. > >> The vfs are seen to have non-zero device/slot numbers in the conventional >> PCI BDF representation. >> >> CC: jusual@redhat.com >> CC: imammedo@redhat.com >> CC: akihiko.odaki@daynix.com >> >> Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2128929 >> Signed-off-by: Ani Sinha <anisinha@redhat.com> >> Reviewed-by: Julia Suvorova <jusual@redhat.com> >> --- >> hw/pci/pci.c | 15 +++++++++++++++ >> 1 file changed, 15 insertions(+) >> >> diff --git a/hw/pci/pci.c b/hw/pci/pci.c >> index e2eb4c3b4a..0320ac2bb3 100644 >> --- a/hw/pci/pci.c >> +++ b/hw/pci/pci.c >> @@ -65,6 +65,7 @@ bool pci_available = true; >> static char *pcibus_get_dev_path(DeviceState *dev); >> static char *pcibus_get_fw_dev_path(DeviceState *dev); >> static void pcibus_reset(BusState *qbus); >> +static bool pcie_has_upstream_port(PCIDevice *dev); >> >> static Property pci_props[] = { >> DEFINE_PROP_PCI_DEVFN("addr", PCIDevice, devfn, -1), >> @@ -1190,6 +1191,20 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, >> name); >> >> return NULL; >> + } /* >> + * With SRIOV and ARI, vfs can have non-zero slot in the conventional >> + * PCI interpretation as all five bits reserved for slot addresses are >> + * also used for function bits for the various vfs. Ignore that case. >> + * It is too early here to check for ARI capabilities in the PCI config >> + * space. Hence, we check for a vf device instead. >> + */ >> + else if (!pci_is_vf(pci_dev) && >> + pcie_has_upstream_port(pci_dev) && >> + PCI_SLOT(devfn)) { >> + error_setg(errp, "PCI: slot %d is not valid for %s," >> + " parent device only allows plugging into slot 0.", >> + PCI_SLOT(devfn), name); >> + return NULL; >> } >> >> pci_dev->devfn = devfn; >> -- >> 2.39.1 >
On Thu, Jun 29, 2023 at 08:07:57PM +0530, Ani Sinha wrote: > > > > On 29-Jun-2023, at 7:54 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > > > > On Thu, Jun 29, 2023 at 09:37:07AM +0530, Ani Sinha wrote: > >> PCI Express ports only have one slot, so PCI Express devices can only be > >> plugged into slot 0 on a PCIE port. Enforce it. > >> > >> The change has been tested to not break ARI by instantiating seven vfs on an > >> emulated igb device (the maximum number of vfs the linux igb driver supports). > > > > I guess we need to test with some other device then? 7 VFs is same > > slot so hardly a good test. > > No its not the same slot. Its using different slots/device numbers. I checked that. > The same patch was failing without the vf check. Ah, playing with VF stride? Could you show the command line please? > > > >> The vfs are seen to have non-zero device/slot numbers in the conventional > >> PCI BDF representation. > >> > >> CC: jusual@redhat.com > >> CC: imammedo@redhat.com > >> CC: akihiko.odaki@daynix.com > >> > >> Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2128929 > >> Signed-off-by: Ani Sinha <anisinha@redhat.com> > >> Reviewed-by: Julia Suvorova <jusual@redhat.com> > >> --- > >> hw/pci/pci.c | 15 +++++++++++++++ > >> 1 file changed, 15 insertions(+) > >> > >> diff --git a/hw/pci/pci.c b/hw/pci/pci.c > >> index e2eb4c3b4a..0320ac2bb3 100644 > >> --- a/hw/pci/pci.c > >> +++ b/hw/pci/pci.c > >> @@ -65,6 +65,7 @@ bool pci_available = true; > >> static char *pcibus_get_dev_path(DeviceState *dev); > >> static char *pcibus_get_fw_dev_path(DeviceState *dev); > >> static void pcibus_reset(BusState *qbus); > >> +static bool pcie_has_upstream_port(PCIDevice *dev); > >> > >> static Property pci_props[] = { > >> DEFINE_PROP_PCI_DEVFN("addr", PCIDevice, devfn, -1), > >> @@ -1190,6 +1191,20 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, > >> name); > >> > >> return NULL; > >> + } /* > >> + * With SRIOV and ARI, vfs can have non-zero slot in the conventional > >> + * PCI interpretation as all five bits reserved for slot addresses are > >> + * also used for function bits for the various vfs. Ignore that case. > >> + * It is too early here to check for ARI capabilities in the PCI config > >> + * space. Hence, we check for a vf device instead. > >> + */ > >> + else if (!pci_is_vf(pci_dev) && > >> + pcie_has_upstream_port(pci_dev) && > >> + PCI_SLOT(devfn)) { > >> + error_setg(errp, "PCI: slot %d is not valid for %s," > >> + " parent device only allows plugging into slot 0.", > >> + PCI_SLOT(devfn), name); > >> + return NULL; > >> } > >> > >> pci_dev->devfn = devfn; > >> -- > >> 2.39.1 > >
> On 29-Jun-2023, at 9:02 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > > On Thu, Jun 29, 2023 at 08:07:57PM +0530, Ani Sinha wrote: >> >> >>> On 29-Jun-2023, at 7:54 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >>> >>> On Thu, Jun 29, 2023 at 09:37:07AM +0530, Ani Sinha wrote: >>>> PCI Express ports only have one slot, so PCI Express devices can only be >>>> plugged into slot 0 on a PCIE port. Enforce it. >>>> >>>> The change has been tested to not break ARI by instantiating seven vfs on an >>>> emulated igb device (the maximum number of vfs the linux igb driver supports). >>> >>> I guess we need to test with some other device then? 7 VFs is same >>> slot so hardly a good test. >> >> No its not the same slot. Its using different slots/device numbers. I checked that. >> The same patch was failing without the vf check. > > Ah, playing with VF stride? Could you show the command line please? Akhido mentioned this in the other thread. Basically For QEMU: -device pcie-root-port,id=p -device igb,bus=p Then from within the guest (in my case RHEL 9.2): $ echo 7 > /sys/bus/pci/devices/0000\:01\:00.0/sriov_numvfs You’ll find that if you use something more than 7 there will be ERANGE from the guest kernel because the driver can create maximum 7 vfs. This above command line will fail if we do not check for !vfs in the patch with the following error from QEMU: (qemu) qemu-system-x86_64: PCI: slot 16 is not valid for igbvf, parent device only allows plugging into slot 0. and an IO error on the write from the guest kernel. In the current version of the patch with the vf check, you will find the vfs created with the addresses: 01:10.{2,4,6,8} and 01.11.{2,4,6} , that is bus 1 for the root port, devices 10 and 11, functions 2,4,6,8 etc. There would be no error from QEMU. > >>> >>>> The vfs are seen to have non-zero device/slot numbers in the conventional >>>> PCI BDF representation. >>>> >>>> CC: jusual@redhat.com >>>> CC: imammedo@redhat.com >>>> CC: akihiko.odaki@daynix.com >>>> >>>> Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2128929 >>>> Signed-off-by: Ani Sinha <anisinha@redhat.com> >>>> Reviewed-by: Julia Suvorova <jusual@redhat.com> >>>> --- >>>> hw/pci/pci.c | 15 +++++++++++++++ >>>> 1 file changed, 15 insertions(+) >>>> >>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c >>>> index e2eb4c3b4a..0320ac2bb3 100644 >>>> --- a/hw/pci/pci.c >>>> +++ b/hw/pci/pci.c >>>> @@ -65,6 +65,7 @@ bool pci_available = true; >>>> static char *pcibus_get_dev_path(DeviceState *dev); >>>> static char *pcibus_get_fw_dev_path(DeviceState *dev); >>>> static void pcibus_reset(BusState *qbus); >>>> +static bool pcie_has_upstream_port(PCIDevice *dev); >>>> >>>> static Property pci_props[] = { >>>> DEFINE_PROP_PCI_DEVFN("addr", PCIDevice, devfn, -1), >>>> @@ -1190,6 +1191,20 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, >>>> name); >>>> >>>> return NULL; >>>> + } /* >>>> + * With SRIOV and ARI, vfs can have non-zero slot in the conventional >>>> + * PCI interpretation as all five bits reserved for slot addresses are >>>> + * also used for function bits for the various vfs. Ignore that case. >>>> + * It is too early here to check for ARI capabilities in the PCI config >>>> + * space. Hence, we check for a vf device instead. >>>> + */ >>>> + else if (!pci_is_vf(pci_dev) && >>>> + pcie_has_upstream_port(pci_dev) && >>>> + PCI_SLOT(devfn)) { >>>> + error_setg(errp, "PCI: slot %d is not valid for %s," >>>> + " parent device only allows plugging into slot 0.", >>>> + PCI_SLOT(devfn), name); >>>> + return NULL; >>>> } >>>> >>>> pci_dev->devfn = devfn; >>>> -- >>>> 2.39.1
> On 29-Jun-2023, at 9:27 PM, Ani Sinha <anisinha@redhat.com> wrote: > > > >> On 29-Jun-2023, at 9:02 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >> >> On Thu, Jun 29, 2023 at 08:07:57PM +0530, Ani Sinha wrote: >>> >>> >>>> On 29-Jun-2023, at 7:54 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >>>> >>>> On Thu, Jun 29, 2023 at 09:37:07AM +0530, Ani Sinha wrote: >>>>> PCI Express ports only have one slot, so PCI Express devices can only be >>>>> plugged into slot 0 on a PCIE port. Enforce it. >>>>> >>>>> The change has been tested to not break ARI by instantiating seven vfs on an >>>>> emulated igb device (the maximum number of vfs the linux igb driver supports). >>>> >>>> I guess we need to test with some other device then? 7 VFs is same >>>> slot so hardly a good test. >>> >>> No its not the same slot. Its using different slots/device numbers. I checked that. >>> The same patch was failing without the vf check. >> >> Ah, playing with VF stride? Indeed. You’ll see IGB_VF_STRIDE is 2. pcie_sriov_pf_init() uses this to initialise the PCIE config space attributes. register_vfs() uses this to increment the devfn values :-) >> Could you show the command line please? > > Akhido mentioned this in the other thread. Basically For QEMU: > > -device pcie-root-port,id=p -device igb,bus=p > > Then from within the guest (in my case RHEL 9.2): > > $ echo 7 > /sys/bus/pci/devices/0000\:01\:00.0/sriov_numvfs > > You’ll find that if you use something more than 7 there will be ERANGE from the guest kernel because the driver can create maximum 7 vfs. > This above command line will fail if we do not check for !vfs in the patch with the following error from QEMU: > > (qemu) qemu-system-x86_64: PCI: slot 16 is not valid for igbvf, parent device only allows plugging into slot 0. > > and an IO error on the write from the guest kernel. > > In the current version of the patch with the vf check, you will find the vfs created with the addresses: > > 01:10.{2,4,6,8} and 01.11.{2,4,6} , that is bus 1 for the root port, devices 10 and 11, functions 2,4,6,8 etc. > > There would be no error from QEMU. > >> >>>> >>>>> The vfs are seen to have non-zero device/slot numbers in the conventional >>>>> PCI BDF representation. >>>>> >>>>> CC: jusual@redhat.com >>>>> CC: imammedo@redhat.com >>>>> CC: akihiko.odaki@daynix.com >>>>> >>>>> Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2128929 >>>>> Signed-off-by: Ani Sinha <anisinha@redhat.com> >>>>> Reviewed-by: Julia Suvorova <jusual@redhat.com> >>>>> --- >>>>> hw/pci/pci.c | 15 +++++++++++++++ >>>>> 1 file changed, 15 insertions(+) >>>>> >>>>> diff --git a/hw/pci/pci.c b/hw/pci/pci.c >>>>> index e2eb4c3b4a..0320ac2bb3 100644 >>>>> --- a/hw/pci/pci.c >>>>> +++ b/hw/pci/pci.c >>>>> @@ -65,6 +65,7 @@ bool pci_available = true; >>>>> static char *pcibus_get_dev_path(DeviceState *dev); >>>>> static char *pcibus_get_fw_dev_path(DeviceState *dev); >>>>> static void pcibus_reset(BusState *qbus); >>>>> +static bool pcie_has_upstream_port(PCIDevice *dev); >>>>> >>>>> static Property pci_props[] = { >>>>> DEFINE_PROP_PCI_DEVFN("addr", PCIDevice, devfn, -1), >>>>> @@ -1190,6 +1191,20 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, >>>>> name); >>>>> >>>>> return NULL; >>>>> + } /* >>>>> + * With SRIOV and ARI, vfs can have non-zero slot in the conventional >>>>> + * PCI interpretation as all five bits reserved for slot addresses are >>>>> + * also used for function bits for the various vfs. Ignore that case. >>>>> + * It is too early here to check for ARI capabilities in the PCI config >>>>> + * space. Hence, we check for a vf device instead. >>>>> + */ >>>>> + else if (!pci_is_vf(pci_dev) && >>>>> + pcie_has_upstream_port(pci_dev) && >>>>> + PCI_SLOT(devfn)) { >>>>> + error_setg(errp, "PCI: slot %d is not valid for %s," >>>>> + " parent device only allows plugging into slot 0.", >>>>> + PCI_SLOT(devfn), name); >>>>> + return NULL; >>>>> } >>>>> >>>>> pci_dev->devfn = devfn; >>>>> -- >>>>> 2.39.1
On 2023/06/29 23:18, Ani Sinha wrote: > > >> On 29-Jun-2023, at 2:19 PM, Akihiko Odaki <akihiko.odaki@daynix.com> wrote: >> >> On 2023/06/29 17:05, Ani Sinha wrote: >>> On Thu, 29 Jun, 2023, 12:17 pm Akihiko Odaki, <akihiko.odaki@daynix.com <mailto:akihiko.odaki@daynix.com>> wrote: >>> On 2023/06/29 13:07, Ani Sinha wrote: >>> > PCI Express ports only have one slot, so PCI Express devices can >>> only be >>> > plugged into slot 0 on a PCIE port. Enforce it. >>> > >>> > The change has been tested to not break ARI by instantiating >>> seven vfs on an >>> > emulated igb device (the maximum number of vfs the linux igb >>> driver supports). >>> > The vfs are seen to have non-zero device/slot numbers in the >>> conventional >>> > PCI BDF representation. >>> > >>> > CC: jusual@redhat.com <mailto:jusual@redhat.com> >>> > CC: imammedo@redhat.com <mailto:imammedo@redhat.com> >>> > CC: akihiko.odaki@daynix.com <mailto:akihiko.odaki@daynix.com> >>> > >>> > Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2128929 >>> <https://bugzilla.redhat.com/show_bug.cgi?id=2128929> >>> > Signed-off-by: Ani Sinha <anisinha@redhat.com >>> <mailto:anisinha@redhat.com>> >>> > Reviewed-by: Julia Suvorova <jusual@redhat.com >>> <mailto:jusual@redhat.com>> >>> > --- >>> > hw/pci/pci.c | 15 +++++++++++++++ >>> > 1 file changed, 15 insertions(+) >>> > >>> > diff --git a/hw/pci/pci.c b/hw/pci/pci.c >>> > index e2eb4c3b4a..0320ac2bb3 100644 >>> > --- a/hw/pci/pci.c >>> > +++ b/hw/pci/pci.c >>> > @@ -65,6 +65,7 @@ bool pci_available = true; >>> > static char *pcibus_get_dev_path(DeviceState *dev); >>> > static char *pcibus_get_fw_dev_path(DeviceState *dev); >>> > static void pcibus_reset(BusState *qbus); >>> > +static bool pcie_has_upstream_port(PCIDevice *dev); >>> > >>> > static Property pci_props[] = { >>> > DEFINE_PROP_PCI_DEVFN("addr", PCIDevice, devfn, -1), >>> > @@ -1190,6 +1191,20 @@ static PCIDevice >>> *do_pci_register_device(PCIDevice *pci_dev, >>> > name); >>> > >>> > return NULL; >>> > + } /* >>> > + * With SRIOV and ARI, vfs can have non-zero slot in the >>> conventional >>> > + * PCI interpretation as all five bits reserved for slot >>> addresses are >>> > + * also used for function bits for the various vfs. Ignore >>> that case. >>> > + * It is too early here to check for ARI capabilities in >>> the PCI config >>> > + * space. Hence, we check for a vf device instead. >>> > + */ >>> Why don't just perform this check after the capabilities are set? >>> We don't want to allocate resources for wrong device parameters. We want to error out early. Other checks also are performed at the same place . >> >> It is indeed better to raise an error as early as possible so that we can avoid allocation and other operations that will be reverted and may go wrong due to the invalid condition. That should be the reason why other checks for the address are performed here. >> >> However, in this particular case, we cannot confidently perform the check here because it is unknown if the ARI capability will be advertised until the device realization code runs. This can justify delaying the check after the device realization, unlike the other checks. > > Ok so are you proposing that the check we have right before (the check for unoccupied function 0) be also moved? It also uses the same vf approximation for seemingly to support ARI. No, I don't think the check for function 0 is required to be disabled because of the change of addressing caused by ARI, but it is required because SR-IOV VF can be added and removed while the PF (function 0) remains. I think this check should be performed also when SR-IOV is disabled and ARI is enabled. Thus the check for unoccupied function 0 needs to use pci_is_vf() instead of checking ARI capability, and that can happen in do_pci_register_device(). > Also where do you propose we move the check? In pci_qdev_realize(), somewhere after pc->realize() and before option ROM loading. See the check for failover pair as an example. I guess it's also placed in this region because it needs capability information. > >> >>> Show quoted text >>> Regards, >>> Akihiko Odaki >>> > + else if (!pci_is_vf(pci_dev) && >>> > + pcie_has_upstream_port(pci_dev) && >>> > + PCI_SLOT(devfn)) { >>> > + error_setg(errp, "PCI: slot %d is not valid for %s," >>> > + " parent device only allows plugging into >>> slot 0.", >>> > + PCI_SLOT(devfn), name); >>> > + return NULL; >>> > } >>> > >>> > pci_dev->devfn = devfn; >
On Fri, Jun 30, 2023 at 11:43:15AM +0900, Akihiko Odaki wrote: > On 2023/06/29 23:18, Ani Sinha wrote: > > > > > > > On 29-Jun-2023, at 2:19 PM, Akihiko Odaki <akihiko.odaki@daynix.com> wrote: > > > > > > On 2023/06/29 17:05, Ani Sinha wrote: > > > > On Thu, 29 Jun, 2023, 12:17 pm Akihiko Odaki, <akihiko.odaki@daynix.com <mailto:akihiko.odaki@daynix.com>> wrote: > > > > On 2023/06/29 13:07, Ani Sinha wrote: > > > > > PCI Express ports only have one slot, so PCI Express devices can > > > > only be > > > > > plugged into slot 0 on a PCIE port. Enforce it. > > > > > > > > > > The change has been tested to not break ARI by instantiating > > > > seven vfs on an > > > > > emulated igb device (the maximum number of vfs the linux igb > > > > driver supports). > > > > > The vfs are seen to have non-zero device/slot numbers in the > > > > conventional > > > > > PCI BDF representation. > > > > > > > > > > CC: jusual@redhat.com <mailto:jusual@redhat.com> > > > > > CC: imammedo@redhat.com <mailto:imammedo@redhat.com> > > > > > CC: akihiko.odaki@daynix.com <mailto:akihiko.odaki@daynix.com> > > > > > > > > > > Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2128929 > > > > <https://bugzilla.redhat.com/show_bug.cgi?id=2128929> > > > > > Signed-off-by: Ani Sinha <anisinha@redhat.com > > > > <mailto:anisinha@redhat.com>> > > > > > Reviewed-by: Julia Suvorova <jusual@redhat.com > > > > <mailto:jusual@redhat.com>> > > > > > --- > > > > > hw/pci/pci.c | 15 +++++++++++++++ > > > > > 1 file changed, 15 insertions(+) > > > > > > > > > > diff --git a/hw/pci/pci.c b/hw/pci/pci.c > > > > > index e2eb4c3b4a..0320ac2bb3 100644 > > > > > --- a/hw/pci/pci.c > > > > > +++ b/hw/pci/pci.c > > > > > @@ -65,6 +65,7 @@ bool pci_available = true; > > > > > static char *pcibus_get_dev_path(DeviceState *dev); > > > > > static char *pcibus_get_fw_dev_path(DeviceState *dev); > > > > > static void pcibus_reset(BusState *qbus); > > > > > +static bool pcie_has_upstream_port(PCIDevice *dev); > > > > > > > > > > static Property pci_props[] = { > > > > > DEFINE_PROP_PCI_DEVFN("addr", PCIDevice, devfn, -1), > > > > > @@ -1190,6 +1191,20 @@ static PCIDevice > > > > *do_pci_register_device(PCIDevice *pci_dev, > > > > > name); > > > > > > > > > > return NULL; > > > > > + } /* > > > > > + * With SRIOV and ARI, vfs can have non-zero slot in the > > > > conventional > > > > > + * PCI interpretation as all five bits reserved for slot > > > > addresses are > > > > > + * also used for function bits for the various vfs. Ignore > > > > that case. > > > > > + * It is too early here to check for ARI capabilities in > > > > the PCI config > > > > > + * space. Hence, we check for a vf device instead. > > > > > + */ > > > > Why don't just perform this check after the capabilities are set? > > > > We don't want to allocate resources for wrong device parameters. We want to error out early. Other checks also are performed at the same place . > > > > > > It is indeed better to raise an error as early as possible so that we can avoid allocation and other operations that will be reverted and may go wrong due to the invalid condition. That should be the reason why other checks for the address are performed here. > > > > > > However, in this particular case, we cannot confidently perform the check here because it is unknown if the ARI capability will be advertised until the device realization code runs. This can justify delaying the check after the device realization, unlike the other checks. > > > > Ok so are you proposing that the check we have right before (the check for unoccupied function 0) be also moved? It also uses the same vf approximation for seemingly to support ARI. > > No, I don't think the check for function 0 is required to be disabled > because of the change of addressing caused by ARI, but it is required > because SR-IOV VF can be added and removed while the PF (function 0) > remains. I think this check should be performed also when SR-IOV is disabled > and ARI is enabled. > > Thus the check for unoccupied function 0 needs to use pci_is_vf() instead of > checking ARI capability, and that can happen in do_pci_register_device(). > > > Also where do you propose we move the check? > > In pci_qdev_realize(), somewhere after pc->realize() and before option ROM > loading. See the check for failover pair as an example. I guess it's also > placed in this region because it needs capability information. How about instead of spending so much time on working around incomplete ARI support we actually complete ARI support? > > > > > > > > > Show quoted text > > > > Regards, > > > > Akihiko Odaki > > > > > + else if (!pci_is_vf(pci_dev) && > > > > > + pcie_has_upstream_port(pci_dev) && > > > > > + PCI_SLOT(devfn)) { > > > > > + error_setg(errp, "PCI: slot %d is not valid for %s," > > > > > + " parent device only allows plugging into > > > > slot 0.", > > > > > + PCI_SLOT(devfn), name); > > > > > + return NULL; > > > > > } > > > > > > > > > > pci_dev->devfn = devfn; > >
> On 30-Jun-2023, at 8:13 AM, Akihiko Odaki <akihiko.odaki@daynix.com> wrote: > > On 2023/06/29 23:18, Ani Sinha wrote: >>> On 29-Jun-2023, at 2:19 PM, Akihiko Odaki <akihiko.odaki@daynix.com> wrote: >>> >>> On 2023/06/29 17:05, Ani Sinha wrote: >>>> On Thu, 29 Jun, 2023, 12:17 pm Akihiko Odaki, <akihiko.odaki@daynix.com <mailto:akihiko.odaki@daynix.com>> wrote: >>>> On 2023/06/29 13:07, Ani Sinha wrote: >>>> > PCI Express ports only have one slot, so PCI Express devices can >>>> only be >>>> > plugged into slot 0 on a PCIE port. Enforce it. >>>> > >>>> > The change has been tested to not break ARI by instantiating >>>> seven vfs on an >>>> > emulated igb device (the maximum number of vfs the linux igb >>>> driver supports). >>>> > The vfs are seen to have non-zero device/slot numbers in the >>>> conventional >>>> > PCI BDF representation. >>>> > >>>> > CC: jusual@redhat.com <mailto:jusual@redhat.com> >>>> > CC: imammedo@redhat.com <mailto:imammedo@redhat.com> >>>> > CC: akihiko.odaki@daynix.com <mailto:akihiko.odaki@daynix.com> >>>> > >>>> > Resolves: https://bugzilla.redhat.com/show_bug.cgi?id=2128929 >>>> <https://bugzilla.redhat.com/show_bug.cgi?id=2128929> >>>> > Signed-off-by: Ani Sinha <anisinha@redhat.com >>>> <mailto:anisinha@redhat.com>> >>>> > Reviewed-by: Julia Suvorova <jusual@redhat.com >>>> <mailto:jusual@redhat.com>> >>>> > --- >>>> > hw/pci/pci.c | 15 +++++++++++++++ >>>> > 1 file changed, 15 insertions(+) >>>> > >>>> > diff --git a/hw/pci/pci.c b/hw/pci/pci.c >>>> > index e2eb4c3b4a..0320ac2bb3 100644 >>>> > --- a/hw/pci/pci.c >>>> > +++ b/hw/pci/pci.c >>>> > @@ -65,6 +65,7 @@ bool pci_available = true; >>>> > static char *pcibus_get_dev_path(DeviceState *dev); >>>> > static char *pcibus_get_fw_dev_path(DeviceState *dev); >>>> > static void pcibus_reset(BusState *qbus); >>>> > +static bool pcie_has_upstream_port(PCIDevice *dev); >>>> > >>>> > static Property pci_props[] = { >>>> > DEFINE_PROP_PCI_DEVFN("addr", PCIDevice, devfn, -1), >>>> > @@ -1190,6 +1191,20 @@ static PCIDevice >>>> *do_pci_register_device(PCIDevice *pci_dev, >>>> > name); >>>> > >>>> > return NULL; >>>> > + } /* >>>> > + * With SRIOV and ARI, vfs can have non-zero slot in the >>>> conventional >>>> > + * PCI interpretation as all five bits reserved for slot >>>> addresses are >>>> > + * also used for function bits for the various vfs. Ignore >>>> that case. >>>> > + * It is too early here to check for ARI capabilities in >>>> the PCI config >>>> > + * space. Hence, we check for a vf device instead. >>>> > + */ >>>> Why don't just perform this check after the capabilities are set? >>>> We don't want to allocate resources for wrong device parameters. We want to error out early. Other checks also are performed at the same place . >>> >>> It is indeed better to raise an error as early as possible so that we can avoid allocation and other operations that will be reverted and may go wrong due to the invalid condition. That should be the reason why other checks for the address are performed here. >>> >>> However, in this particular case, we cannot confidently perform the check here because it is unknown if the ARI capability will be advertised until the device realization code runs. This can justify delaying the check after the device realization, unlike the other checks. >> Ok so are you proposing that the check we have right before (the check for unoccupied function 0) be also moved? It also uses the same vf approximation for seemingly to support ARI. > > No, I don't think the check for function 0 is required to be disabled because of the change of addressing caused by ARI, but it is required because SR-IOV VF can be added and removed while the PF (function 0) remains. I think this check should be performed also when SR-IOV is disabled and ARI is enabled. OK I am a little confused with your explanation here. I understand the non-ARI case - to detect the PF we need to have function 0 available. Looking at the comment in pci_init_multifunction() it seems in ARI case, we can simply have a vf in function 0 (conventional interpretation) as well. Hence we need to ignore vfs both in ARI and non-ARI cases. This is what I understood. > > Thus the check for unoccupied function 0 needs to use pci_is_vf() instead of checking ARI capability, and that can happen in do_pci_register_device(). > >> Also where do you propose we move the check? > > In pci_qdev_realize(), somewhere after pc->realize() and before option ROM loading. Hmm, I tried this. The issue here is something like this would be now allowed since the PF has ARI capability: -device pcie-root-port,id=p -device igb,bus=p,addr=0x2.0x0 The above should not be allowed and when used, we do not see the igb ethernet device from the guest OS. > See the check for failover pair as an example. I guess it's also placed in this region because it needs capability information. > >>> >>>> Show quoted text >>>> Regards, >>>> Akihiko Odaki >>>> > + else if (!pci_is_vf(pci_dev) && >>>> > + pcie_has_upstream_port(pci_dev) && >>>> > + PCI_SLOT(devfn)) { >>>> > + error_setg(errp, "PCI: slot %d is not valid for %s," >>>> > + " parent device only allows plugging into >>>> slot 0.", >>>> > + PCI_SLOT(devfn), name); >>>> > + return NULL; >>>> > } >>>> > >>>> > pci_dev->devfn = devfn;
On Fri, Jun 30, 2023 at 01:11:33PM +0530, Ani Sinha wrote: > > > > Thus the check for unoccupied function 0 needs to use pci_is_vf() instead of checking ARI capability, and that can happen in do_pci_register_device(). > > > >> Also where do you propose we move the check? > > > > In pci_qdev_realize(), somewhere after pc->realize() and before option ROM loading. > > Hmm, I tried this. The issue here is something like this would be now allowed since the PF has ARI capability: > > -device pcie-root-port,id=p -device igb,bus=p,addr=0x2.0x0 > > The above should not be allowed and when used, we do not see the igb ethernet device from the guest OS. I think it's allowed because it expects you to hotplug function 0 later, no? I am quite worried about all this work going into blocking what we think is disallowed configurations. We should have maybe blocked them originally, but now that we didn't there's a non zero chance of regressions, and the benefit is not guaranteed.
> On 30-Jun-2023, at 2:02 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > > On Fri, Jun 30, 2023 at 01:11:33PM +0530, Ani Sinha wrote: >>> >>> Thus the check for unoccupied function 0 needs to use pci_is_vf() instead of checking ARI capability, and that can happen in do_pci_register_device(). >>> >>>> Also where do you propose we move the check? >>> >>> In pci_qdev_realize(), somewhere after pc->realize() and before option ROM loading. >> >> Hmm, I tried this. The issue here is something like this would be now allowed since the PF has ARI capability: >> >> -device pcie-root-port,id=p -device igb,bus=p,addr=0x2.0x0 >> >> The above should not be allowed and when used, we do not see the igb ethernet device from the guest OS. > > I think it's allowed because it expects you to hotplug function 0 later, This is about the igb device being plugged into the non-zero slot of the pci-root-port. The guest OS ignores it. > no? > > I am quite worried about all this work going into blocking > what we think is disallowed configurations. We should have > maybe blocked them originally, but now that we didn't > there's a non zero chance of regressions, Sigh, no medals here for being brave :-) > and the benefit > is not guaranteed. > > -- > MST >
On Fri, Jun 30, 2023 at 02:06:59PM +0530, Ani Sinha wrote: > > > > On 30-Jun-2023, at 2:02 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > > > > On Fri, Jun 30, 2023 at 01:11:33PM +0530, Ani Sinha wrote: > >>> > >>> Thus the check for unoccupied function 0 needs to use pci_is_vf() instead of checking ARI capability, and that can happen in do_pci_register_device(). > >>> > >>>> Also where do you propose we move the check? > >>> > >>> In pci_qdev_realize(), somewhere after pc->realize() and before option ROM loading. > >> > >> Hmm, I tried this. The issue here is something like this would be now allowed since the PF has ARI capability: > >> > >> -device pcie-root-port,id=p -device igb,bus=p,addr=0x2.0x0 > >> > >> The above should not be allowed and when used, we do not see the igb ethernet device from the guest OS. > > > > I think it's allowed because it expects you to hotplug function 0 later, > > This is about the igb device being plugged into the non-zero slot of the pci-root-port. The guest OS ignores it. yes but if you later add a device with ARI and with next field pointing slot 2 guest will suddently find both. > > no? > > > > I am quite worried about all this work going into blocking > > what we think is disallowed configurations. We should have > > maybe blocked them originally, but now that we didn't > > there's a non zero chance of regressions, > > Sigh, There's value in patches 1-4 I think - the last patch helped you find these. so there's value in this work. > no medals here for being brave :-) Try removing support for a 3.5mm jack next. Oh wait ... > > and the benefit > > is not guaranteed. > > > > -- > > MST > >
> On 30-Jun-2023, at 2:13 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > > On Fri, Jun 30, 2023 at 02:06:59PM +0530, Ani Sinha wrote: >> >> >>> On 30-Jun-2023, at 2:02 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >>> >>> On Fri, Jun 30, 2023 at 01:11:33PM +0530, Ani Sinha wrote: >>>>> >>>>> Thus the check for unoccupied function 0 needs to use pci_is_vf() instead of checking ARI capability, and that can happen in do_pci_register_device(). >>>>> >>>>>> Also where do you propose we move the check? >>>>> >>>>> In pci_qdev_realize(), somewhere after pc->realize() and before option ROM loading. >>>> >>>> Hmm, I tried this. The issue here is something like this would be now allowed since the PF has ARI capability: >>>> >>>> -device pcie-root-port,id=p -device igb,bus=p,addr=0x2.0x0 >>>> >>>> The above should not be allowed and when used, we do not see the igb ethernet device from the guest OS. >>> >>> I think it's allowed because it expects you to hotplug function 0 later, >> >> This is about the igb device being plugged into the non-zero slot of the pci-root-port. The guest OS ignores it. > > yes but if you later add a device with ARI and with next field pointing > slot 2 guest will suddently find both. Hmm, I tried this: -device pcie-root-port,id=p \ -device igb,bus=p,addr=0x2.0x0 \ -device igb,bus=p,addr=0x0.0x0 \ The guest only found the second igb device not the first. You can try too. > >>> no? >>> >>> I am quite worried about all this work going into blocking >>> what we think is disallowed configurations. We should have >>> maybe blocked them originally, but now that we didn't >>> there's a non zero chance of regressions, >> >> Sigh, > > There's value in patches 1-4 I think - the last patch helped you find > these. so there's value in this work. > >> no medals here for being brave :-) > > Try removing support for a 3.5mm jack next. Oh wait ... Indeed. Everyone uses bluetooth these days. I for one is happy that the jack is gone (and they were bold enough to do it while Samsung and others still carry the useless port ) :-) > >>> and the benefit >>> is not guaranteed. >>> >>> -- >>> MST
On Fri, Jun 30, 2023 at 02:52:52PM +0530, Ani Sinha wrote: > > > > On 30-Jun-2023, at 2:13 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > > > > On Fri, Jun 30, 2023 at 02:06:59PM +0530, Ani Sinha wrote: > >> > >> > >>> On 30-Jun-2023, at 2:02 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > >>> > >>> On Fri, Jun 30, 2023 at 01:11:33PM +0530, Ani Sinha wrote: > >>>>> > >>>>> Thus the check for unoccupied function 0 needs to use pci_is_vf() instead of checking ARI capability, and that can happen in do_pci_register_device(). > >>>>> > >>>>>> Also where do you propose we move the check? > >>>>> > >>>>> In pci_qdev_realize(), somewhere after pc->realize() and before option ROM loading. > >>>> > >>>> Hmm, I tried this. The issue here is something like this would be now allowed since the PF has ARI capability: > >>>> > >>>> -device pcie-root-port,id=p -device igb,bus=p,addr=0x2.0x0 > >>>> > >>>> The above should not be allowed and when used, we do not see the igb ethernet device from the guest OS. > >>> > >>> I think it's allowed because it expects you to hotplug function 0 later, > >> > >> This is about the igb device being plugged into the non-zero slot of the pci-root-port. The guest OS ignores it. > > > > yes but if you later add a device with ARI and with next field pointing > > slot 2 guest will suddently find both. > > Hmm, I tried this: > > -device pcie-root-port,id=p \ > -device igb,bus=p,addr=0x2.0x0 \ > -device igb,bus=p,addr=0x0.0x0 \ > > The guest only found the second igb device not the first. You can try too. Because next parameter in pcie_ari_init does not match. > > > >>> no? > >>> > >>> I am quite worried about all this work going into blocking > >>> what we think is disallowed configurations. We should have > >>> maybe blocked them originally, but now that we didn't > >>> there's a non zero chance of regressions, > >> > >> Sigh, > > > > There's value in patches 1-4 I think - the last patch helped you find > > these. so there's value in this work. > > > >> no medals here for being brave :-) > > > > Try removing support for a 3.5mm jack next. Oh wait ... > > Indeed. Everyone uses bluetooth these days. I for one is happy that the jack is gone (and they were bold enough to do it while Samsung and others still carry the useless port ) :-) > > > > >>> and the benefit > >>> is not guaranteed. > >>> > >>> -- > >>> MST
On Fri, Jun 30, 2023 at 02:52:52PM +0530, Ani Sinha wrote: > Indeed. Everyone uses bluetooth these days. I for one is happy that > the jack is gone (and they were bold enough to do it while Samsung and > others still carry the useless port ) :-) 🤦🏻
> On 30-Jun-2023, at 3:30 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > > On Fri, Jun 30, 2023 at 02:52:52PM +0530, Ani Sinha wrote: >> >> >>> On 30-Jun-2023, at 2:13 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >>> >>> On Fri, Jun 30, 2023 at 02:06:59PM +0530, Ani Sinha wrote: >>>> >>>> >>>>> On 30-Jun-2023, at 2:02 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >>>>> >>>>> On Fri, Jun 30, 2023 at 01:11:33PM +0530, Ani Sinha wrote: >>>>>>> >>>>>>> Thus the check for unoccupied function 0 needs to use pci_is_vf() instead of checking ARI capability, and that can happen in do_pci_register_device(). >>>>>>> >>>>>>>> Also where do you propose we move the check? >>>>>>> >>>>>>> In pci_qdev_realize(), somewhere after pc->realize() and before option ROM loading. >>>>>> >>>>>> Hmm, I tried this. The issue here is something like this would be now allowed since the PF has ARI capability: >>>>>> >>>>>> -device pcie-root-port,id=p -device igb,bus=p,addr=0x2.0x0 >>>>>> >>>>>> The above should not be allowed and when used, we do not see the igb ethernet device from the guest OS. >>>>> >>>>> I think it's allowed because it expects you to hotplug function 0 later, >>>> >>>> This is about the igb device being plugged into the non-zero slot of the pci-root-port. The guest OS ignores it. >>> >>> yes but if you later add a device with ARI and with next field pointing >>> slot 2 guest will suddently find both. >> >> Hmm, I tried this: >> >> -device pcie-root-port,id=p \ >> -device igb,bus=p,addr=0x2.0x0 \ >> -device igb,bus=p,addr=0x0.0x0 \ >> >> The guest only found the second igb device not the first. You can try too. > > Because next parameter in pcie_ari_init does not match. OK send me a command line that I can test it with. I can’t come up with a case that actually works in practice. > > >>> >>>>> no? >>>>> >>>>> I am quite worried about all this work going into blocking >>>>> what we think is disallowed configurations. We should have >>>>> maybe blocked them originally, but now that we didn't >>>>> there's a non zero chance of regressions, >>>> >>>> Sigh, >>> >>> There's value in patches 1-4 I think - the last patch helped you find >>> these. so there's value in this work. >>> >>>> no medals here for being brave :-) >>> >>> Try removing support for a 3.5mm jack next. Oh wait ... >> >> Indeed. Everyone uses bluetooth these days. I for one is happy that the jack is gone (and they were bold enough to do it while Samsung and others still carry the useless port ) :-) >> >>> >>>>> and the benefit >>>>> is not guaranteed. >>>>> >>>>> -- >>>>> MST
On Fri, Jun 30, 2023 at 04:07:32PM +0530, Ani Sinha wrote: > > > > On 30-Jun-2023, at 3:30 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > > > > On Fri, Jun 30, 2023 at 02:52:52PM +0530, Ani Sinha wrote: > >> > >> > >>> On 30-Jun-2023, at 2:13 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > >>> > >>> On Fri, Jun 30, 2023 at 02:06:59PM +0530, Ani Sinha wrote: > >>>> > >>>> > >>>>> On 30-Jun-2023, at 2:02 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > >>>>> > >>>>> On Fri, Jun 30, 2023 at 01:11:33PM +0530, Ani Sinha wrote: > >>>>>>> > >>>>>>> Thus the check for unoccupied function 0 needs to use pci_is_vf() instead of checking ARI capability, and that can happen in do_pci_register_device(). > >>>>>>> > >>>>>>>> Also where do you propose we move the check? > >>>>>>> > >>>>>>> In pci_qdev_realize(), somewhere after pc->realize() and before option ROM loading. > >>>>>> > >>>>>> Hmm, I tried this. The issue here is something like this would be now allowed since the PF has ARI capability: > >>>>>> > >>>>>> -device pcie-root-port,id=p -device igb,bus=p,addr=0x2.0x0 > >>>>>> > >>>>>> The above should not be allowed and when used, we do not see the igb ethernet device from the guest OS. > >>>>> > >>>>> I think it's allowed because it expects you to hotplug function 0 later, > >>>> > >>>> This is about the igb device being plugged into the non-zero slot of the pci-root-port. The guest OS ignores it. > >>> > >>> yes but if you later add a device with ARI and with next field pointing > >>> slot 2 guest will suddently find both. > >> > >> Hmm, I tried this: > >> > >> -device pcie-root-port,id=p \ > >> -device igb,bus=p,addr=0x2.0x0 \ > >> -device igb,bus=p,addr=0x0.0x0 \ > >> > >> The guest only found the second igb device not the first. You can try too. > > > > Because next parameter in pcie_ari_init does not match. > > OK send me a command line that I can test it with. I can’t come up with a case that actually works in practice. you need to patch igb to pass 2 as next parameter. maybe add a property to make it easier to play with. > > > > > >>> > >>>>> no? > >>>>> > >>>>> I am quite worried about all this work going into blocking > >>>>> what we think is disallowed configurations. We should have > >>>>> maybe blocked them originally, but now that we didn't > >>>>> there's a non zero chance of regressions, > >>>> > >>>> Sigh, > >>> > >>> There's value in patches 1-4 I think - the last patch helped you find > >>> these. so there's value in this work. > >>> > >>>> no medals here for being brave :-) > >>> > >>> Try removing support for a 3.5mm jack next. Oh wait ... > >> > >> Indeed. Everyone uses bluetooth these days. I for one is happy that the jack is gone (and they were bold enough to do it while Samsung and others still carry the useless port ) :-) > >> > >>> > >>>>> and the benefit > >>>>> is not guaranteed. > >>>>> > >>>>> -- > >>>>> MST
> On 30-Jun-2023, at 4:10 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > > On Fri, Jun 30, 2023 at 04:07:32PM +0530, Ani Sinha wrote: >> >> >>> On 30-Jun-2023, at 3:30 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >>> >>> On Fri, Jun 30, 2023 at 02:52:52PM +0530, Ani Sinha wrote: >>>> >>>> >>>>> On 30-Jun-2023, at 2:13 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >>>>> >>>>> On Fri, Jun 30, 2023 at 02:06:59PM +0530, Ani Sinha wrote: >>>>>> >>>>>> >>>>>>> On 30-Jun-2023, at 2:02 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >>>>>>> >>>>>>> On Fri, Jun 30, 2023 at 01:11:33PM +0530, Ani Sinha wrote: >>>>>>>>> >>>>>>>>> Thus the check for unoccupied function 0 needs to use pci_is_vf() instead of checking ARI capability, and that can happen in do_pci_register_device(). >>>>>>>>> >>>>>>>>>> Also where do you propose we move the check? >>>>>>>>> >>>>>>>>> In pci_qdev_realize(), somewhere after pc->realize() and before option ROM loading. >>>>>>>> >>>>>>>> Hmm, I tried this. The issue here is something like this would be now allowed since the PF has ARI capability: >>>>>>>> >>>>>>>> -device pcie-root-port,id=p -device igb,bus=p,addr=0x2.0x0 >>>>>>>> >>>>>>>> The above should not be allowed and when used, we do not see the igb ethernet device from the guest OS. >>>>>>> >>>>>>> I think it's allowed because it expects you to hotplug function 0 later, >>>>>> >>>>>> This is about the igb device being plugged into the non-zero slot of the pci-root-port. The guest OS ignores it. >>>>> >>>>> yes but if you later add a device with ARI and with next field pointing >>>>> slot 2 guest will suddently find both. >>>> >>>> Hmm, I tried this: >>>> >>>> -device pcie-root-port,id=p \ >>>> -device igb,bus=p,addr=0x2.0x0 \ >>>> -device igb,bus=p,addr=0x0.0x0 \ >>>> >>>> The guest only found the second igb device not the first. You can try too. >>> >>> Because next parameter in pcie_ari_init does not match. >> >> OK send me a command line that I can test it with. I can’t come up with a case that actually works in practice. > > > you need to patch igb to pass 2 as next parameter. > maybe add a property to make it easier to play with. Yes but without patching, could I not change the device parameters like -device pcie-root-port,id=p \ -device igb,bus=p,addr=0x1.0x0 \ -device igb,bus=p,addr=0x0.0x0 \ I tried the above and it did not work. > >>> >>> >>>>> >>>>>>> no? >>>>>>> >>>>>>> I am quite worried about all this work going into blocking >>>>>>> what we think is disallowed configurations. We should have >>>>>>> maybe blocked them originally, but now that we didn't >>>>>>> there's a non zero chance of regressions, >>>>>> >>>>>> Sigh, >>>>> >>>>> There's value in patches 1-4 I think - the last patch helped you find >>>>> these. so there's value in this work. >>>>> >>>>>> no medals here for being brave :-) >>>>> >>>>> Try removing support for a 3.5mm jack next. Oh wait ... >>>> >>>> Indeed. Everyone uses bluetooth these days. I for one is happy that the jack is gone (and they were bold enough to do it while Samsung and others still carry the useless port ) :-) >>>> >>>>> >>>>>>> and the benefit >>>>>>> is not guaranteed. >>>>>>> >>>>>>> -- >>>>>>> MST
> On 30-Jun-2023, at 4:10 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > > On Fri, Jun 30, 2023 at 04:07:32PM +0530, Ani Sinha wrote: >> >> >>> On 30-Jun-2023, at 3:30 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >>> >>> On Fri, Jun 30, 2023 at 02:52:52PM +0530, Ani Sinha wrote: >>>> >>>> >>>>> On 30-Jun-2023, at 2:13 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >>>>> >>>>> On Fri, Jun 30, 2023 at 02:06:59PM +0530, Ani Sinha wrote: >>>>>> >>>>>> >>>>>>> On 30-Jun-2023, at 2:02 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >>>>>>> >>>>>>> On Fri, Jun 30, 2023 at 01:11:33PM +0530, Ani Sinha wrote: >>>>>>>>> >>>>>>>>> Thus the check for unoccupied function 0 needs to use pci_is_vf() instead of checking ARI capability, and that can happen in do_pci_register_device(). >>>>>>>>> >>>>>>>>>> Also where do you propose we move the check? >>>>>>>>> >>>>>>>>> In pci_qdev_realize(), somewhere after pc->realize() and before option ROM loading. >>>>>>>> >>>>>>>> Hmm, I tried this. The issue here is something like this would be now allowed since the PF has ARI capability: >>>>>>>> >>>>>>>> -device pcie-root-port,id=p -device igb,bus=p,addr=0x2.0x0 >>>>>>>> >>>>>>>> The above should not be allowed and when used, we do not see the igb ethernet device from the guest OS. >>>>>>> >>>>>>> I think it's allowed because it expects you to hotplug function 0 later, >>>>>> >>>>>> This is about the igb device being plugged into the non-zero slot of the pci-root-port. The guest OS ignores it. >>>>> >>>>> yes but if you later add a device with ARI and with next field pointing >>>>> slot 2 guest will suddently find both. >>>> >>>> Hmm, I tried this: >>>> >>>> -device pcie-root-port,id=p \ >>>> -device igb,bus=p,addr=0x2.0x0 \ >>>> -device igb,bus=p,addr=0x0.0x0 \ >>>> >>>> The guest only found the second igb device not the first. You can try too. >>> >>> Because next parameter in pcie_ari_init does not match. >> >> OK send me a command line that I can test it with. I can’t come up with a case that actually works in practice. > > > you need to patch igb to pass 2 as next parameter. OK I tried this too along with -device pcie-root-port,id=p \ -device igb,bus=p,addr=0x2.0x0 \ -device igb,bus=p,addr=0x0.0x0 \ Still same result. The guest only detects the last one. > maybe add a property to make it easier to play with. > >>> >>> >>>>> >>>>>>> no? >>>>>>> >>>>>>> I am quite worried about all this work going into blocking >>>>>>> what we think is disallowed configurations. We should have >>>>>>> maybe blocked them originally, but now that we didn't >>>>>>> there's a non zero chance of regressions, >>>>>> >>>>>> Sigh, >>>>> >>>>> There's value in patches 1-4 I think - the last patch helped you find >>>>> these. so there's value in this work. >>>>> >>>>>> no medals here for being brave :-) >>>>> >>>>> Try removing support for a 3.5mm jack next. Oh wait ... >>>> >>>> Indeed. Everyone uses bluetooth these days. I for one is happy that the jack is gone (and they were bold enough to do it while Samsung and others still carry the useless port ) :-) >>>> >>>>> >>>>>>> and the benefit >>>>>>> is not guaranteed. >>>>>>> >>>>>>> -- >>>>>>> MST
On 2023/06/30 19:37, Ani Sinha wrote: > > >> On 30-Jun-2023, at 3:30 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >> >> On Fri, Jun 30, 2023 at 02:52:52PM +0530, Ani Sinha wrote: >>> >>> >>>> On 30-Jun-2023, at 2:13 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >>>> >>>> On Fri, Jun 30, 2023 at 02:06:59PM +0530, Ani Sinha wrote: >>>>> >>>>> >>>>>> On 30-Jun-2023, at 2:02 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >>>>>> >>>>>> On Fri, Jun 30, 2023 at 01:11:33PM +0530, Ani Sinha wrote: >>>>>>>> >>>>>>>> Thus the check for unoccupied function 0 needs to use pci_is_vf() instead of checking ARI capability, and that can happen in do_pci_register_device(). >>>>>>>> >>>>>>>>> Also where do you propose we move the check? >>>>>>>> >>>>>>>> In pci_qdev_realize(), somewhere after pc->realize() and before option ROM loading. >>>>>>> >>>>>>> Hmm, I tried this. The issue here is something like this would be now allowed since the PF has ARI capability: >>>>>>> >>>>>>> -device pcie-root-port,id=p -device igb,bus=p,addr=0x2.0x0 >>>>>>> >>>>>>> The above should not be allowed and when used, we do not see the igb ethernet device from the guest OS. >>>>>> >>>>>> I think it's allowed because it expects you to hotplug function 0 later, >>>>> >>>>> This is about the igb device being plugged into the non-zero slot of the pci-root-port. The guest OS ignores it. >>>> >>>> yes but if you later add a device with ARI and with next field pointing >>>> slot 2 guest will suddently find both. >>> >>> Hmm, I tried this: >>> >>> -device pcie-root-port,id=p \ >>> -device igb,bus=p,addr=0x2.0x0 \ >>> -device igb,bus=p,addr=0x0.0x0 \ >>> >>> The guest only found the second igb device not the first. You can try too. >> >> Because next parameter in pcie_ari_init does not match. > > OK send me a command line that I can test it with. I can’t come up with a case that actually works in practice. I don't think there is one because the code for PCI multifunction does not care ARI. In my opinion, we need yet another check to make non-SR-IOV multifunction and ARI capability mutually exclusive; if a function has the ARI capability and it is not a VF, an attempt to assign non-zero function number for it should fail. But it should be a distinct check as it will need to check the function number bits. > >> >> >>>> >>>>>> no? >>>>>> >>>>>> I am quite worried about all this work going into blocking >>>>>> what we think is disallowed configurations. We should have >>>>>> maybe blocked them originally, but now that we didn't >>>>>> there's a non zero chance of regressions, >>>>> >>>>> Sigh, >>>> >>>> There's value in patches 1-4 I think - the last patch helped you find >>>> these. so there's value in this work. >>>> >>>>> no medals here for being brave :-) >>>> >>>> Try removing support for a 3.5mm jack next. Oh wait ... >>> >>> Indeed. Everyone uses bluetooth these days. I for one is happy that the jack is gone (and they were bold enough to do it while Samsung and others still carry the useless port ) :-) Hello from a guy using a shiny M2 Macbook Air carrying the legacy jack with a 100-yen earphone. Even people who ported Linux to this machine spent efforts to get the jack to work on Linux ;) >>> >>>> >>>>>> and the benefit >>>>>> is not guaranteed. >>>>>> >>>>>> -- >>>>>> MST >
> On 30-Jun-2023, at 5:06 PM, Akihiko Odaki <akihiko.odaki@daynix.com> wrote: > > On 2023/06/30 19:37, Ani Sinha wrote: >>> On 30-Jun-2023, at 3:30 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >>> >>> On Fri, Jun 30, 2023 at 02:52:52PM +0530, Ani Sinha wrote: >>>> >>>> >>>>> On 30-Jun-2023, at 2:13 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >>>>> >>>>> On Fri, Jun 30, 2023 at 02:06:59PM +0530, Ani Sinha wrote: >>>>>> >>>>>> >>>>>>> On 30-Jun-2023, at 2:02 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >>>>>>> >>>>>>> On Fri, Jun 30, 2023 at 01:11:33PM +0530, Ani Sinha wrote: >>>>>>>>> >>>>>>>>> Thus the check for unoccupied function 0 needs to use pci_is_vf() instead of checking ARI capability, and that can happen in do_pci_register_device(). >>>>>>>>> >>>>>>>>>> Also where do you propose we move the check? >>>>>>>>> >>>>>>>>> In pci_qdev_realize(), somewhere after pc->realize() and before option ROM loading. >>>>>>>> >>>>>>>> Hmm, I tried this. The issue here is something like this would be now allowed since the PF has ARI capability: >>>>>>>> >>>>>>>> -device pcie-root-port,id=p -device igb,bus=p,addr=0x2.0x0 >>>>>>>> >>>>>>>> The above should not be allowed and when used, we do not see the igb ethernet device from the guest OS. >>>>>>> >>>>>>> I think it's allowed because it expects you to hotplug function 0 later, >>>>>> >>>>>> This is about the igb device being plugged into the non-zero slot of the pci-root-port. The guest OS ignores it. >>>>> >>>>> yes but if you later add a device with ARI and with next field pointing >>>>> slot 2 guest will suddently find both. >>>> >>>> Hmm, I tried this: >>>> >>>> -device pcie-root-port,id=p \ >>>> -device igb,bus=p,addr=0x2.0x0 \ >>>> -device igb,bus=p,addr=0x0.0x0 \ >>>> >>>> The guest only found the second igb device not the first. You can try too. >>> >>> Because next parameter in pcie_ari_init does not match. >> OK send me a command line that I can test it with. I can’t come up with a case that actually works in practice. > > I don't think there is one because the code for PCI multifunction does not care ARI. In my opinion, we need yet another check to make non-SR-IOV multifunction and ARI capability mutually exclusive; if a function has the ARI capability and it is not a VF, an attempt to assign non-zero function number for it should fail. > > But it should be a distinct check as it will need to check the function number bits. I will leave this for mst to comment. > >>> >>> >>>>> >>>>>>> no? >>>>>>> >>>>>>> I am quite worried about all this work going into blocking >>>>>>> what we think is disallowed configurations. We should have >>>>>>> maybe blocked them originally, but now that we didn't >>>>>>> there's a non zero chance of regressions, >>>>>> >>>>>> Sigh, >>>>> >>>>> There's value in patches 1-4 I think - the last patch helped you find >>>>> these. so there's value in this work. >>>>> >>>>>> no medals here for being brave :-) >>>>> >>>>> Try removing support for a 3.5mm jack next. Oh wait ... >>>> >>>> Indeed. Everyone uses bluetooth these days. I for one is happy that the jack is gone (and they were bold enough to do it while Samsung and others still carry the useless port ) :-) > > Hello from a guy using a shiny M2 Macbook Air carrying the legacy jack with a 100-yen earphone. Even people who ported Linux to this machine spent efforts to get the jack to work on Linux ;) Yes that is because the jack exists. If they did not make the jack work, it would be a broken device. With bluetooth being so pervasive, its mostly a matter of time before most devices stop shipping with a 3.5mm. All I am saying is sometimes one needs to be bold to bring changes. Otherwise changes never come. Status quo is not an option IMHO. > >>>> >>>>> >>>>>>> and the benefit >>>>>>> is not guaranteed. >>>>>>> >>>>>>> -- >>>>>>> MST
On 2023/06/30 20:36, Akihiko Odaki wrote: > On 2023/06/30 19:37, Ani Sinha wrote: >> >> >>> On 30-Jun-2023, at 3:30 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >>> >>> On Fri, Jun 30, 2023 at 02:52:52PM +0530, Ani Sinha wrote: >>>> >>>> >>>>> On 30-Jun-2023, at 2:13 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >>>>> >>>>> On Fri, Jun 30, 2023 at 02:06:59PM +0530, Ani Sinha wrote: >>>>>> >>>>>> >>>>>>> On 30-Jun-2023, at 2:02 PM, Michael S. Tsirkin <mst@redhat.com> >>>>>>> wrote: >>>>>>> >>>>>>> On Fri, Jun 30, 2023 at 01:11:33PM +0530, Ani Sinha wrote: >>>>>>>>> >>>>>>>>> Thus the check for unoccupied function 0 needs to use >>>>>>>>> pci_is_vf() instead of checking ARI capability, and that can >>>>>>>>> happen in do_pci_register_device(). >>>>>>>>> >>>>>>>>>> Also where do you propose we move the check? >>>>>>>>> >>>>>>>>> In pci_qdev_realize(), somewhere after pc->realize() and before >>>>>>>>> option ROM loading. >>>>>>>> >>>>>>>> Hmm, I tried this. The issue here is something like this would >>>>>>>> be now allowed since the PF has ARI capability: >>>>>>>> >>>>>>>> -device pcie-root-port,id=p -device igb,bus=p,addr=0x2.0x0 >>>>>>>> >>>>>>>> The above should not be allowed and when used, we do not see the >>>>>>>> igb ethernet device from the guest OS. >>>>>>> >>>>>>> I think it's allowed because it expects you to hotplug function 0 >>>>>>> later, >>>>>> >>>>>> This is about the igb device being plugged into the non-zero slot >>>>>> of the pci-root-port. The guest OS ignores it. >>>>> >>>>> yes but if you later add a device with ARI and with next field >>>>> pointing >>>>> slot 2 guest will suddently find both. >>>> >>>> Hmm, I tried this: >>>> >>>> -device pcie-root-port,id=p \ >>>> -device igb,bus=p,addr=0x2.0x0 \ >>>> -device igb,bus=p,addr=0x0.0x0 \ >>>> >>>> The guest only found the second igb device not the first. You can >>>> try too. >>> >>> Because next parameter in pcie_ari_init does not match. >> >> OK send me a command line that I can test it with. I can’t come up >> with a case that actually works in practice. > > I don't think there is one because the code for PCI multifunction does > not care ARI. In my opinion, we need yet another check to make > non-SR-IOV multifunction and ARI capability mutually exclusive; if a > function has the ARI capability and it is not a VF, an attempt to assign > non-zero function number for it should fail. No, the more straightforward way to fix this problem is to check the device function number is less than the next function number advertised with ARI. > > But it should be a distinct check as it will need to check the function > number bits.
> On 30-Jun-2023, at 5:25 PM, Akihiko Odaki <akihiko.odaki@daynix.com> wrote: > > On 2023/06/30 20:36, Akihiko Odaki wrote: >> On 2023/06/30 19:37, Ani Sinha wrote: >>> >>> >>>> On 30-Jun-2023, at 3:30 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >>>> >>>> On Fri, Jun 30, 2023 at 02:52:52PM +0530, Ani Sinha wrote: >>>>> >>>>> >>>>>> On 30-Jun-2023, at 2:13 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >>>>>> >>>>>> On Fri, Jun 30, 2023 at 02:06:59PM +0530, Ani Sinha wrote: >>>>>>> >>>>>>> >>>>>>>> On 30-Jun-2023, at 2:02 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >>>>>>>> >>>>>>>> On Fri, Jun 30, 2023 at 01:11:33PM +0530, Ani Sinha wrote: >>>>>>>>>> >>>>>>>>>> Thus the check for unoccupied function 0 needs to use pci_is_vf() instead of checking ARI capability, and that can happen in do_pci_register_device(). >>>>>>>>>> >>>>>>>>>>> Also where do you propose we move the check? >>>>>>>>>> >>>>>>>>>> In pci_qdev_realize(), somewhere after pc->realize() and before option ROM loading. >>>>>>>>> >>>>>>>>> Hmm, I tried this. The issue here is something like this would be now allowed since the PF has ARI capability: >>>>>>>>> >>>>>>>>> -device pcie-root-port,id=p -device igb,bus=p,addr=0x2.0x0 >>>>>>>>> >>>>>>>>> The above should not be allowed and when used, we do not see the igb ethernet device from the guest OS. >>>>>>>> >>>>>>>> I think it's allowed because it expects you to hotplug function 0 later, >>>>>>> >>>>>>> This is about the igb device being plugged into the non-zero slot of the pci-root-port. The guest OS ignores it. >>>>>> >>>>>> yes but if you later add a device with ARI and with next field pointing >>>>>> slot 2 guest will suddently find both. >>>>> >>>>> Hmm, I tried this: >>>>> >>>>> -device pcie-root-port,id=p \ >>>>> -device igb,bus=p,addr=0x2.0x0 \ >>>>> -device igb,bus=p,addr=0x0.0x0 \ >>>>> >>>>> The guest only found the second igb device not the first. You can try too. >>>> >>>> Because next parameter in pcie_ari_init does not match. >>> >>> OK send me a command line that I can test it with. I can’t come up with a case that actually works in practice. >> I don't think there is one because the code for PCI multifunction does not care ARI. In my opinion, we need yet another check to make non-SR-IOV multifunction and ARI capability mutually exclusive; if a function has the ARI capability and it is not a VF, an attempt to assign non-zero function number for it should fail. > > No, the more straightforward way to fix this problem is to check the device function number is less than the next function number advertised with ARI. > Personally I would leave the check for ARI capable devices to someone else. I am ok with moving the check to pci_qdev_realize() and I verified that both unit tests and the breaking test case for BZ 2128929 is caught. I have also verified that the change does not break igb vf generation. If there is any interest to push this change, I will spin a new version with tags for test fixes added and the rework for this patch with the check moved to the new location as you had suggested. >> But it should be a distinct check as it will need to check the function number bits.
On Fri, Jun 30, 2023 at 08:36:38PM +0900, Akihiko Odaki wrote: > On 2023/06/30 19:37, Ani Sinha wrote: > > > > > > > On 30-Jun-2023, at 3:30 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > > > > > > On Fri, Jun 30, 2023 at 02:52:52PM +0530, Ani Sinha wrote: > > > > > > > > > > > > > On 30-Jun-2023, at 2:13 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > > > > > > > > > > On Fri, Jun 30, 2023 at 02:06:59PM +0530, Ani Sinha wrote: > > > > > > > > > > > > > > > > > > > On 30-Jun-2023, at 2:02 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > > > > > > > > > > > > > > On Fri, Jun 30, 2023 at 01:11:33PM +0530, Ani Sinha wrote: > > > > > > > > > > > > > > > > > > Thus the check for unoccupied function 0 needs to use pci_is_vf() instead of checking ARI capability, and that can happen in do_pci_register_device(). > > > > > > > > > > > > > > > > > > > Also where do you propose we move the check? > > > > > > > > > > > > > > > > > > In pci_qdev_realize(), somewhere after pc->realize() and before option ROM loading. > > > > > > > > > > > > > > > > Hmm, I tried this. The issue here is something like this would be now allowed since the PF has ARI capability: > > > > > > > > > > > > > > > > -device pcie-root-port,id=p -device igb,bus=p,addr=0x2.0x0 > > > > > > > > > > > > > > > > The above should not be allowed and when used, we do not see the igb ethernet device from the guest OS. > > > > > > > > > > > > > > I think it's allowed because it expects you to hotplug function 0 later, > > > > > > > > > > > > This is about the igb device being plugged into the non-zero slot of the pci-root-port. The guest OS ignores it. > > > > > > > > > > yes but if you later add a device with ARI and with next field pointing > > > > > slot 2 guest will suddently find both. > > > > > > > > Hmm, I tried this: > > > > > > > > -device pcie-root-port,id=p \ > > > > -device igb,bus=p,addr=0x2.0x0 \ > > > > -device igb,bus=p,addr=0x0.0x0 \ > > > > > > > > The guest only found the second igb device not the first. You can try too. > > > > > > Because next parameter in pcie_ari_init does not match. > > > > OK send me a command line that I can test it with. I can’t come up with a case that actually works in practice. > > I don't think there is one because the code for PCI multifunction does not > care ARI. In my opinion, we need yet another check to make non-SR-IOV > multifunction and ARI capability mutually exclusive; if a function has the > ARI capability and it is not a VF, an attempt to assign non-zero function > number for it should fail. Why is that? My understanding is that ARI capable devices should also set the multifunction bit in the header. It's not terribly clear from the spec though. > But it should be a distinct check as it will need to check the function > number bits. > > > > > > > > > > > > > > > > > > > > > no? > > > > > > > > > > > > > > I am quite worried about all this work going into blocking > > > > > > > what we think is disallowed configurations. We should have > > > > > > > maybe blocked them originally, but now that we didn't > > > > > > > there's a non zero chance of regressions, > > > > > > > > > > > > Sigh, > > > > > > > > > > There's value in patches 1-4 I think - the last patch helped you find > > > > > these. so there's value in this work. > > > > > > > > > > > no medals here for being brave :-) > > > > > > > > > > Try removing support for a 3.5mm jack next. Oh wait ... > > > > > > > > Indeed. Everyone uses bluetooth these days. I for one is happy that the jack is gone (and they were bold enough to do it while Samsung and others still carry the useless port ) :-) > > Hello from a guy using a shiny M2 Macbook Air carrying the legacy jack with > a 100-yen earphone. Even people who ported Linux to this machine spent > efforts to get the jack to work on Linux ;) > > > > > > > > > > > > > > > > > and the benefit > > > > > > > is not guaranteed. > > > > > > > > > > > > > > -- > > > > > > > MST > >
On 2023/06/30 22:56, Ani Sinha wrote: > > >> On 30-Jun-2023, at 5:25 PM, Akihiko Odaki <akihiko.odaki@daynix.com> wrote: >> >> On 2023/06/30 20:36, Akihiko Odaki wrote: >>> On 2023/06/30 19:37, Ani Sinha wrote: >>>> >>>> >>>>> On 30-Jun-2023, at 3:30 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >>>>> >>>>> On Fri, Jun 30, 2023 at 02:52:52PM +0530, Ani Sinha wrote: >>>>>> >>>>>> >>>>>>> On 30-Jun-2023, at 2:13 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >>>>>>> >>>>>>> On Fri, Jun 30, 2023 at 02:06:59PM +0530, Ani Sinha wrote: >>>>>>>> >>>>>>>> >>>>>>>>> On 30-Jun-2023, at 2:02 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >>>>>>>>> >>>>>>>>> On Fri, Jun 30, 2023 at 01:11:33PM +0530, Ani Sinha wrote: >>>>>>>>>>> >>>>>>>>>>> Thus the check for unoccupied function 0 needs to use pci_is_vf() instead of checking ARI capability, and that can happen in do_pci_register_device(). >>>>>>>>>>> >>>>>>>>>>>> Also where do you propose we move the check? >>>>>>>>>>> >>>>>>>>>>> In pci_qdev_realize(), somewhere after pc->realize() and before option ROM loading. >>>>>>>>>> >>>>>>>>>> Hmm, I tried this. The issue here is something like this would be now allowed since the PF has ARI capability: >>>>>>>>>> >>>>>>>>>> -device pcie-root-port,id=p -device igb,bus=p,addr=0x2.0x0 >>>>>>>>>> >>>>>>>>>> The above should not be allowed and when used, we do not see the igb ethernet device from the guest OS. >>>>>>>>> >>>>>>>>> I think it's allowed because it expects you to hotplug function 0 later, >>>>>>>> >>>>>>>> This is about the igb device being plugged into the non-zero slot of the pci-root-port. The guest OS ignores it. >>>>>>> >>>>>>> yes but if you later add a device with ARI and with next field pointing >>>>>>> slot 2 guest will suddently find both. >>>>>> >>>>>> Hmm, I tried this: >>>>>> >>>>>> -device pcie-root-port,id=p \ >>>>>> -device igb,bus=p,addr=0x2.0x0 \ >>>>>> -device igb,bus=p,addr=0x0.0x0 \ >>>>>> >>>>>> The guest only found the second igb device not the first. You can try too. >>>>> >>>>> Because next parameter in pcie_ari_init does not match. >>>> >>>> OK send me a command line that I can test it with. I can’t come up with a case that actually works in practice. >>> I don't think there is one because the code for PCI multifunction does not care ARI. In my opinion, we need yet another check to make non-SR-IOV multifunction and ARI capability mutually exclusive; if a function has the ARI capability and it is not a VF, an attempt to assign non-zero function number for it should fail. >> >> No, the more straightforward way to fix this problem is to check the device function number is less than the next function number advertised with ARI. >> > > Personally I would leave the check for ARI capable devices to someone else. I am ok with moving the check to pci_qdev_realize() and I verified that both unit tests and the breaking test case for BZ 2128929 is caught. I have also verified that the change does not break igb vf generation. > If there is any interest to push this change, I will spin a new version with tags for test fixes added and the rework for this patch with the check moved to the new location as you had suggested. I sent a patch series to add ARI next function number check: https://lore.kernel.org/qemu-devel/20230701070133.24877-1-akihiko.odaki@daynix.com/ Yes, I want the slot number restriction to be enforced. If it worries you too much for regressions, you may implement it as a warning first and then turn it a hard error when the next development phase starts.
On 2023/07/01 0:29, Michael S. Tsirkin wrote: > On Fri, Jun 30, 2023 at 08:36:38PM +0900, Akihiko Odaki wrote: >> On 2023/06/30 19:37, Ani Sinha wrote: >>> >>> >>>> On 30-Jun-2023, at 3:30 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >>>> >>>> On Fri, Jun 30, 2023 at 02:52:52PM +0530, Ani Sinha wrote: >>>>> >>>>> >>>>>> On 30-Jun-2023, at 2:13 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >>>>>> >>>>>> On Fri, Jun 30, 2023 at 02:06:59PM +0530, Ani Sinha wrote: >>>>>>> >>>>>>> >>>>>>>> On 30-Jun-2023, at 2:02 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >>>>>>>> >>>>>>>> On Fri, Jun 30, 2023 at 01:11:33PM +0530, Ani Sinha wrote: >>>>>>>>>> >>>>>>>>>> Thus the check for unoccupied function 0 needs to use pci_is_vf() instead of checking ARI capability, and that can happen in do_pci_register_device(). >>>>>>>>>> >>>>>>>>>>> Also where do you propose we move the check? >>>>>>>>>> >>>>>>>>>> In pci_qdev_realize(), somewhere after pc->realize() and before option ROM loading. >>>>>>>>> >>>>>>>>> Hmm, I tried this. The issue here is something like this would be now allowed since the PF has ARI capability: >>>>>>>>> >>>>>>>>> -device pcie-root-port,id=p -device igb,bus=p,addr=0x2.0x0 >>>>>>>>> >>>>>>>>> The above should not be allowed and when used, we do not see the igb ethernet device from the guest OS. >>>>>>>> >>>>>>>> I think it's allowed because it expects you to hotplug function 0 later, >>>>>>> >>>>>>> This is about the igb device being plugged into the non-zero slot of the pci-root-port. The guest OS ignores it. >>>>>> >>>>>> yes but if you later add a device with ARI and with next field pointing >>>>>> slot 2 guest will suddently find both. >>>>> >>>>> Hmm, I tried this: >>>>> >>>>> -device pcie-root-port,id=p \ >>>>> -device igb,bus=p,addr=0x2.0x0 \ >>>>> -device igb,bus=p,addr=0x0.0x0 \ >>>>> >>>>> The guest only found the second igb device not the first. You can try too. >>>> >>>> Because next parameter in pcie_ari_init does not match. >>> >>> OK send me a command line that I can test it with. I can’t come up with a case that actually works in practice. >> >> I don't think there is one because the code for PCI multifunction does not >> care ARI. In my opinion, we need yet another check to make non-SR-IOV >> multifunction and ARI capability mutually exclusive; if a function has the >> ARI capability and it is not a VF, an attempt to assign non-zero function >> number for it should fail. > > Why is that? My understanding is that ARI capable devices should also > set the multifunction bit in the header. It's not terribly clear from > the spec though. Something like the following will not work properly with ARI-capable device (think of a as an ARI-capable device): -device a,addr=0x1.0x0,multifunction=on -device a,addr=0x1.0x1 This is because the next function numbers advertised with ARI are not updated with the multifunction configuration, but they are hardcoded in the device implementation. In this sense, the traditional (non-SR/IOV) multifunction mechanism QEMU has will not work with ARI-capable devices. > >> But it should be a distinct check as it will need to check the function >> number bits. >> >>> >>>> >>>> >>>>>> >>>>>>>> no? >>>>>>>> >>>>>>>> I am quite worried about all this work going into blocking >>>>>>>> what we think is disallowed configurations. We should have >>>>>>>> maybe blocked them originally, but now that we didn't >>>>>>>> there's a non zero chance of regressions, >>>>>>> >>>>>>> Sigh, >>>>>> >>>>>> There's value in patches 1-4 I think - the last patch helped you find >>>>>> these. so there's value in this work. >>>>>> >>>>>>> no medals here for being brave :-) >>>>>> >>>>>> Try removing support for a 3.5mm jack next. Oh wait ... >>>>> >>>>> Indeed. Everyone uses bluetooth these days. I for one is happy that the jack is gone (and they were bold enough to do it while Samsung and others still carry the useless port ) :-) >> >> Hello from a guy using a shiny M2 Macbook Air carrying the legacy jack with >> a 100-yen earphone. Even people who ported Linux to this machine spent >> efforts to get the jack to work on Linux ;) >> >>>>> >>>>>> >>>>>>>> and the benefit >>>>>>>> is not guaranteed. >>>>>>>> >>>>>>>> -- >>>>>>>> MST >>> >
On Sat, Jul 01, 2023 at 04:09:31PM +0900, Akihiko Odaki wrote: > Yes, I want the slot number restriction to be enforced. If it worries you > too much for regressions, you may implement it as a warning first and then > turn it a hard error when the next development phase starts. That's not a bad idea.
> On 02-Jul-2023, at 10:29 AM, Michael S. Tsirkin <mst@redhat.com> wrote: > > On Sat, Jul 01, 2023 at 04:09:31PM +0900, Akihiko Odaki wrote: >> Yes, I want the slot number restriction to be enforced. If it worries you >> too much for regressions, you may implement it as a warning first and then >> turn it a hard error when the next development phase starts. > > That's not a bad idea. If we had not enforced the check strongly, the tests that we fixed would not get noticed.
> On 02-Jul-2023, at 10:29 AM, Michael S. Tsirkin <mst@redhat.com> wrote: > > On Sat, Jul 01, 2023 at 04:09:31PM +0900, Akihiko Odaki wrote: >> Yes, I want the slot number restriction to be enforced. If it worries you >> too much for regressions, you may implement it as a warning first and then >> turn it a hard error when the next development phase starts. > > That's not a bad idea. This is with just the warning - the device still gets added: (qemu) netdev_add socket,id=hostnet1,listen=:1234 (qemu) device_add e1000e,netdev=hostnet1,mac=00:11:22:33:44:03,id=net1,bus=pci.6,addr=0x2.0x5 warning: PCI: slot 2 is not valid for e1000e, parent device only allows plugging into slot 0. (qemu) info network igb.0: index=0,type=nic,model=igb,macaddr=52:54:00:12:34:56 igb.1: index=0,type=nic,model=igb,macaddr=52:54:00:12:34:57 net1: index=0,type=nic,model=e1000e,macaddr=00:11:22:33:44:03 \ hostnet1: index=0,type=socket, device_remove won’t be able to remove the nic unless guest cooperates.
On 2023/07/03 15:08, Ani Sinha wrote: > > >> On 02-Jul-2023, at 10:29 AM, Michael S. Tsirkin <mst@redhat.com> wrote: >> >> On Sat, Jul 01, 2023 at 04:09:31PM +0900, Akihiko Odaki wrote: >>> Yes, I want the slot number restriction to be enforced. If it worries you >>> too much for regressions, you may implement it as a warning first and then >>> turn it a hard error when the next development phase starts. >> >> That's not a bad idea. > > If we had not enforced the check strongly, the tests that we fixed would not get noticed. > Perhaps so, but we don't have much time before feature freeze. I rather want to see the check implemented as warning in 8.1 instead of delaying the initial implementation of the check after 8.1 (though I worry if it's already too late for 8.1.)
> On 04-Jul-2023, at 10:31 AM, Akihiko Odaki <akihiko.odaki@daynix.com> wrote: > > On 2023/07/03 15:08, Ani Sinha wrote: >>> On 02-Jul-2023, at 10:29 AM, Michael S. Tsirkin <mst@redhat.com> wrote: >>> >>> On Sat, Jul 01, 2023 at 04:09:31PM +0900, Akihiko Odaki wrote: >>>> Yes, I want the slot number restriction to be enforced. If it worries you >>>> too much for regressions, you may implement it as a warning first and then >>>> turn it a hard error when the next development phase starts. >>> >>> That's not a bad idea. >> If we had not enforced the check strongly, the tests that we fixed would not get noticed. > > Perhaps so, but we don't have much time before feature freeze. I rather want to see the check implemented as warning in 8.1 instead of delaying the initial implementation of the check after 8.1 (though I worry if it's already too late for 8.1.) The feature hard freeze window starts from 12th of next week. So I am still debating whether to keep the hard check or just have a warning. If the hard check causes regressions, we can always revert it to a warning later.
> On 04-Jul-2023, at 11:09 AM, Ani Sinha <anisinha@redhat.com> wrote: > > > >> On 04-Jul-2023, at 10:31 AM, Akihiko Odaki <akihiko.odaki@daynix.com> wrote: >> >> On 2023/07/03 15:08, Ani Sinha wrote: >>>> On 02-Jul-2023, at 10:29 AM, Michael S. Tsirkin <mst@redhat.com> wrote: >>>> >>>> On Sat, Jul 01, 2023 at 04:09:31PM +0900, Akihiko Odaki wrote: >>>>> Yes, I want the slot number restriction to be enforced. If it worries you >>>>> too much for regressions, you may implement it as a warning first and then >>>>> turn it a hard error when the next development phase starts. >>>> >>>> That's not a bad idea. >>> If we had not enforced the check strongly, the tests that we fixed would not get noticed. >> >> Perhaps so, but we don't have much time before feature freeze. I rather want to see the check implemented as warning in 8.1 instead of delaying the initial implementation of the check after 8.1 (though I worry if it's already too late for 8.1.) > > The feature hard freeze window starts from 12th of next week. So I am still debating whether to keep the hard check or just have a warning. If the hard check causes regressions, we can always revert it to a warning later. mst?
On Tue, Jul 04, 2023 at 04:03:54PM +0530, Ani Sinha wrote: > > > > On 04-Jul-2023, at 11:09 AM, Ani Sinha <anisinha@redhat.com> wrote: > > > > > > > >> On 04-Jul-2023, at 10:31 AM, Akihiko Odaki <akihiko.odaki@daynix.com> wrote: > >> > >> On 2023/07/03 15:08, Ani Sinha wrote: > >>>> On 02-Jul-2023, at 10:29 AM, Michael S. Tsirkin <mst@redhat.com> wrote: > >>>> > >>>> On Sat, Jul 01, 2023 at 04:09:31PM +0900, Akihiko Odaki wrote: > >>>>> Yes, I want the slot number restriction to be enforced. If it worries you > >>>>> too much for regressions, you may implement it as a warning first and then > >>>>> turn it a hard error when the next development phase starts. > >>>> > >>>> That's not a bad idea. > >>> If we had not enforced the check strongly, the tests that we fixed would not get noticed. > >> > >> Perhaps so, but we don't have much time before feature freeze. I rather want to see the check implemented as warning in 8.1 instead of delaying the initial implementation of the check after 8.1 (though I worry if it's already too late for 8.1.) > > > > The feature hard freeze window starts from 12th of next week. So I am still debating whether to keep the hard check or just have a warning. If the hard check causes regressions, we can always revert it to a warning later. > > mst? I'd go for a warning now. Let's see what triggers for users without actually breaking things too badly for them.
> On 04-Jul-2023, at 4:06 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > > On Tue, Jul 04, 2023 at 04:03:54PM +0530, Ani Sinha wrote: >> >> >>> On 04-Jul-2023, at 11:09 AM, Ani Sinha <anisinha@redhat.com> wrote: >>> >>> >>> >>>> On 04-Jul-2023, at 10:31 AM, Akihiko Odaki <akihiko.odaki@daynix.com> wrote: >>>> >>>> On 2023/07/03 15:08, Ani Sinha wrote: >>>>>> On 02-Jul-2023, at 10:29 AM, Michael S. Tsirkin <mst@redhat.com> wrote: >>>>>> >>>>>> On Sat, Jul 01, 2023 at 04:09:31PM +0900, Akihiko Odaki wrote: >>>>>>> Yes, I want the slot number restriction to be enforced. If it worries you >>>>>>> too much for regressions, you may implement it as a warning first and then >>>>>>> turn it a hard error when the next development phase starts. >>>>>> >>>>>> That's not a bad idea. >>>>> If we had not enforced the check strongly, the tests that we fixed would not get noticed. >>>> >>>> Perhaps so, but we don't have much time before feature freeze. I rather want to see the check implemented as warning in 8.1 instead of delaying the initial implementation of the check after 8.1 (though I worry if it's already too late for 8.1.) >>> >>> The feature hard freeze window starts from 12th of next week. So I am still debating whether to keep the hard check or just have a warning. If the hard check causes regressions, we can always revert it to a warning later. >> >> mst? > > I'd go for a warning now. Let's see what triggers for users without > actually breaking things too badly for them. 🤦🏻
On Sat, 1 Jul 2023 16:28:30 +0900 Akihiko Odaki <akihiko.odaki@daynix.com> wrote: > On 2023/07/01 0:29, Michael S. Tsirkin wrote: > > On Fri, Jun 30, 2023 at 08:36:38PM +0900, Akihiko Odaki wrote: > >> On 2023/06/30 19:37, Ani Sinha wrote: > >>> > >>> > >>>> On 30-Jun-2023, at 3:30 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > >>>> > >>>> On Fri, Jun 30, 2023 at 02:52:52PM +0530, Ani Sinha wrote: > >>>>> > >>>>> > >>>>>> On 30-Jun-2023, at 2:13 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > >>>>>> > >>>>>> On Fri, Jun 30, 2023 at 02:06:59PM +0530, Ani Sinha wrote: > >>>>>>> > >>>>>>> > >>>>>>>> On 30-Jun-2023, at 2:02 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > >>>>>>>> > >>>>>>>> On Fri, Jun 30, 2023 at 01:11:33PM +0530, Ani Sinha wrote: > >>>>>>>>>> > >>>>>>>>>> Thus the check for unoccupied function 0 needs to use pci_is_vf() instead of checking ARI capability, and that can happen in do_pci_register_device(). > >>>>>>>>>> > >>>>>>>>>>> Also where do you propose we move the check? > >>>>>>>>>> > >>>>>>>>>> In pci_qdev_realize(), somewhere after pc->realize() and before option ROM loading. > >>>>>>>>> > >>>>>>>>> Hmm, I tried this. The issue here is something like this would be now allowed since the PF has ARI capability: > >>>>>>>>> > >>>>>>>>> -device pcie-root-port,id=p -device igb,bus=p,addr=0x2.0x0 > >>>>>>>>> > >>>>>>>>> The above should not be allowed and when used, we do not see the igb ethernet device from the guest OS. > >>>>>>>> > >>>>>>>> I think it's allowed because it expects you to hotplug function 0 later, > >>>>>>> > >>>>>>> This is about the igb device being plugged into the non-zero slot of the pci-root-port. The guest OS ignores it. > >>>>>> > >>>>>> yes but if you later add a device with ARI and with next field pointing > >>>>>> slot 2 guest will suddently find both. > >>>>> > >>>>> Hmm, I tried this: > >>>>> > >>>>> -device pcie-root-port,id=p \ > >>>>> -device igb,bus=p,addr=0x2.0x0 \ > >>>>> -device igb,bus=p,addr=0x0.0x0 \ > >>>>> > >>>>> The guest only found the second igb device not the first. You can try too. > >>>> > >>>> Because next parameter in pcie_ari_init does not match. > >>> > >>> OK send me a command line that I can test it with. I can’t come up with a case that actually works in practice. > >> > >> I don't think there is one because the code for PCI multifunction does not > >> care ARI. In my opinion, we need yet another check to make non-SR-IOV > >> multifunction and ARI capability mutually exclusive; if a function has the > >> ARI capability and it is not a VF, an attempt to assign non-zero function > >> number for it should fail. is it stated somewhere in spec(s) that ARI and !SR-IOV are mutually exclusive? > > > > Why is that? My understanding is that ARI capable devices should also > > set the multifunction bit in the header. It's not terribly clear from > > the spec though. > > Something like the following will not work properly with ARI-capable > device (think of a as an ARI-capable device): > -device a,addr=0x1.0x0,multifunction=on -device a,addr=0x1.0x1 (I had a crazy idea, to use it like that so we could put more devices on port without resorting to adding extra bridges) Can you elaborate some more why it won't work? > This is because the next function numbers advertised with ARI are not > updated with the multifunction configuration, but they are hardcoded in > the device implementation. In this sense, the traditional (non-SR/IOV) > multifunction mechanism QEMU has will not work with ARI-capable devices. > > > > >> But it should be a distinct check as it will need to check the function > >> number bits. > >> > >>> > >>>> > >>>> > >>>>>> > >>>>>>>> no? > >>>>>>>> > >>>>>>>> I am quite worried about all this work going into blocking > >>>>>>>> what we think is disallowed configurations. We should have > >>>>>>>> maybe blocked them originally, but now that we didn't > >>>>>>>> there's a non zero chance of regressions, > >>>>>>> > >>>>>>> Sigh, > >>>>>> > >>>>>> There's value in patches 1-4 I think - the last patch helped you find > >>>>>> these. so there's value in this work. > >>>>>> > >>>>>>> no medals here for being brave :-) > >>>>>> > >>>>>> Try removing support for a 3.5mm jack next. Oh wait ... > >>>>> > >>>>> Indeed. Everyone uses bluetooth these days. I for one is happy that the jack is gone (and they were bold enough to do it while Samsung and others still carry the useless port ) :-) > >> > >> Hello from a guy using a shiny M2 Macbook Air carrying the legacy jack with > >> a 100-yen earphone. Even people who ported Linux to this machine spent > >> efforts to get the jack to work on Linux ;) > >> > >>>>> > >>>>>> > >>>>>>>> and the benefit > >>>>>>>> is not guaranteed. > >>>>>>>> > >>>>>>>> -- > >>>>>>>> MST > >>> > > >
On 2023/07/04 20:38, Igor Mammedov wrote: > On Sat, 1 Jul 2023 16:28:30 +0900 > Akihiko Odaki <akihiko.odaki@daynix.com> wrote: > >> On 2023/07/01 0:29, Michael S. Tsirkin wrote: >>> On Fri, Jun 30, 2023 at 08:36:38PM +0900, Akihiko Odaki wrote: >>>> On 2023/06/30 19:37, Ani Sinha wrote: >>>>> >>>>> >>>>>> On 30-Jun-2023, at 3:30 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >>>>>> >>>>>> On Fri, Jun 30, 2023 at 02:52:52PM +0530, Ani Sinha wrote: >>>>>>> >>>>>>> >>>>>>>> On 30-Jun-2023, at 2:13 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >>>>>>>> >>>>>>>> On Fri, Jun 30, 2023 at 02:06:59PM +0530, Ani Sinha wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>>> On 30-Jun-2023, at 2:02 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >>>>>>>>>> >>>>>>>>>> On Fri, Jun 30, 2023 at 01:11:33PM +0530, Ani Sinha wrote: >>>>>>>>>>>> >>>>>>>>>>>> Thus the check for unoccupied function 0 needs to use pci_is_vf() instead of checking ARI capability, and that can happen in do_pci_register_device(). >>>>>>>>>>>> >>>>>>>>>>>>> Also where do you propose we move the check? >>>>>>>>>>>> >>>>>>>>>>>> In pci_qdev_realize(), somewhere after pc->realize() and before option ROM loading. >>>>>>>>>>> >>>>>>>>>>> Hmm, I tried this. The issue here is something like this would be now allowed since the PF has ARI capability: >>>>>>>>>>> >>>>>>>>>>> -device pcie-root-port,id=p -device igb,bus=p,addr=0x2.0x0 >>>>>>>>>>> >>>>>>>>>>> The above should not be allowed and when used, we do not see the igb ethernet device from the guest OS. >>>>>>>>>> >>>>>>>>>> I think it's allowed because it expects you to hotplug function 0 later, >>>>>>>>> >>>>>>>>> This is about the igb device being plugged into the non-zero slot of the pci-root-port. The guest OS ignores it. >>>>>>>> >>>>>>>> yes but if you later add a device with ARI and with next field pointing >>>>>>>> slot 2 guest will suddently find both. >>>>>>> >>>>>>> Hmm, I tried this: >>>>>>> >>>>>>> -device pcie-root-port,id=p \ >>>>>>> -device igb,bus=p,addr=0x2.0x0 \ >>>>>>> -device igb,bus=p,addr=0x0.0x0 \ >>>>>>> >>>>>>> The guest only found the second igb device not the first. You can try too. >>>>>> >>>>>> Because next parameter in pcie_ari_init does not match. >>>>> >>>>> OK send me a command line that I can test it with. I can’t come up with a case that actually works in practice. >>>> >>>> I don't think there is one because the code for PCI multifunction does not >>>> care ARI. In my opinion, we need yet another check to make non-SR-IOV >>>> multifunction and ARI capability mutually exclusive; if a function has the >>>> ARI capability and it is not a VF, an attempt to assign non-zero function >>>> number for it should fail. > > is it stated somewhere in spec(s) that ARI and !SR-IOV are mutually exclusive? > >>> >>> Why is that? My understanding is that ARI capable devices should also >>> set the multifunction bit in the header. It's not terribly clear from >>> the spec though. >> >> Something like the following will not work properly with ARI-capable >> device (think of a as an ARI-capable device): >> -device a,addr=0x1.0x0,multifunction=on -device a,addr=0x1.0x1 > (I had a crazy idea, to use it like that so we could put more devices > on port without resorting to adding extra bridges) > > Can you elaborate some more why it won't work? It won't work because the ARI next function number field is fixed. In this case, the field of the Function at 0x1.0x0 should point to 0x1.0x1, but it doesn't. As the result, the Function at 0x1.0x1 won't be recognized. It's more problematic if some of the Functions are ARI-capable but others are not. In my understanding, all Functions in a ARI-capable device need to have ARI capability, but that's not enforced. > >> This is because the next function numbers advertised with ARI are not >> updated with the multifunction configuration, but they are hardcoded in >> the device implementation. In this sense, the traditional (non-SR/IOV) >> multifunction mechanism QEMU has will not work with ARI-capable devices. >> >>> >>>> But it should be a distinct check as it will need to check the function >>>> number bits. >>>> >>>>> >>>>>> >>>>>> >>>>>>>> >>>>>>>>>> no? >>>>>>>>>> >>>>>>>>>> I am quite worried about all this work going into blocking >>>>>>>>>> what we think is disallowed configurations. We should have >>>>>>>>>> maybe blocked them originally, but now that we didn't >>>>>>>>>> there's a non zero chance of regressions, >>>>>>>>> >>>>>>>>> Sigh, >>>>>>>> >>>>>>>> There's value in patches 1-4 I think - the last patch helped you find >>>>>>>> these. so there's value in this work. >>>>>>>> >>>>>>>>> no medals here for being brave :-) >>>>>>>> >>>>>>>> Try removing support for a 3.5mm jack next. Oh wait ... >>>>>>> >>>>>>> Indeed. Everyone uses bluetooth these days. I for one is happy that the jack is gone (and they were bold enough to do it while Samsung and others still carry the useless port ) :-) >>>> >>>> Hello from a guy using a shiny M2 Macbook Air carrying the legacy jack with >>>> a 100-yen earphone. Even people who ported Linux to this machine spent >>>> efforts to get the jack to work on Linux ;) >>>> >>>>>>> >>>>>>>> >>>>>>>>>> and the benefit >>>>>>>>>> is not guaranteed. >>>>>>>>>> >>>>>>>>>> -- >>>>>>>>>> MST >>>>> >>> >> >
On Tue, 4 Jul 2023 20:50:49 +0900 Akihiko Odaki <akihiko.odaki@daynix.com> wrote: > On 2023/07/04 20:38, Igor Mammedov wrote: > > On Sat, 1 Jul 2023 16:28:30 +0900 > > Akihiko Odaki <akihiko.odaki@daynix.com> wrote: > > > >> On 2023/07/01 0:29, Michael S. Tsirkin wrote: > >>> On Fri, Jun 30, 2023 at 08:36:38PM +0900, Akihiko Odaki wrote: > >>>> On 2023/06/30 19:37, Ani Sinha wrote: > >>>>> > >>>>> > >>>>>> On 30-Jun-2023, at 3:30 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > >>>>>> > >>>>>> On Fri, Jun 30, 2023 at 02:52:52PM +0530, Ani Sinha wrote: > >>>>>>> > >>>>>>> > >>>>>>>> On 30-Jun-2023, at 2:13 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > >>>>>>>> > >>>>>>>> On Fri, Jun 30, 2023 at 02:06:59PM +0530, Ani Sinha wrote: > >>>>>>>>> > >>>>>>>>> > >>>>>>>>>> On 30-Jun-2023, at 2:02 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > >>>>>>>>>> > >>>>>>>>>> On Fri, Jun 30, 2023 at 01:11:33PM +0530, Ani Sinha wrote: > >>>>>>>>>>>> > >>>>>>>>>>>> Thus the check for unoccupied function 0 needs to use pci_is_vf() instead of checking ARI capability, and that can happen in do_pci_register_device(). > >>>>>>>>>>>> > >>>>>>>>>>>>> Also where do you propose we move the check? > >>>>>>>>>>>> > >>>>>>>>>>>> In pci_qdev_realize(), somewhere after pc->realize() and before option ROM loading. > >>>>>>>>>>> > >>>>>>>>>>> Hmm, I tried this. The issue here is something like this would be now allowed since the PF has ARI capability: > >>>>>>>>>>> > >>>>>>>>>>> -device pcie-root-port,id=p -device igb,bus=p,addr=0x2.0x0 > >>>>>>>>>>> > >>>>>>>>>>> The above should not be allowed and when used, we do not see the igb ethernet device from the guest OS. > >>>>>>>>>> > >>>>>>>>>> I think it's allowed because it expects you to hotplug function 0 later, > >>>>>>>>> > >>>>>>>>> This is about the igb device being plugged into the non-zero slot of the pci-root-port. The guest OS ignores it. > >>>>>>>> > >>>>>>>> yes but if you later add a device with ARI and with next field pointing > >>>>>>>> slot 2 guest will suddently find both. > >>>>>>> > >>>>>>> Hmm, I tried this: > >>>>>>> > >>>>>>> -device pcie-root-port,id=p \ > >>>>>>> -device igb,bus=p,addr=0x2.0x0 \ > >>>>>>> -device igb,bus=p,addr=0x0.0x0 \ > >>>>>>> > >>>>>>> The guest only found the second igb device not the first. You can try too. > >>>>>> > >>>>>> Because next parameter in pcie_ari_init does not match. > >>>>> > >>>>> OK send me a command line that I can test it with. I can’t come up with a case that actually works in practice. > >>>> > >>>> I don't think there is one because the code for PCI multifunction does not > >>>> care ARI. In my opinion, we need yet another check to make non-SR-IOV > >>>> multifunction and ARI capability mutually exclusive; if a function has the > >>>> ARI capability and it is not a VF, an attempt to assign non-zero function > >>>> number for it should fail. > > > > is it stated somewhere in spec(s) that ARI and !SR-IOV are mutually exclusive? > > > >>> > >>> Why is that? My understanding is that ARI capable devices should also > >>> set the multifunction bit in the header. It's not terribly clear from > >>> the spec though. > >> > >> Something like the following will not work properly with ARI-capable > >> device (think of a as an ARI-capable device): > >> -device a,addr=0x1.0x0,multifunction=on -device a,addr=0x1.0x1 > > (I had a crazy idea, to use it like that so we could put more devices > > on port without resorting to adding extra bridges) > > > > Can you elaborate some more why it won't work? > > It won't work because the ARI next function number field is fixed. In > this case, the field of the Function at 0x1.0x0 should point to 0x1.0x1, > but it doesn't. As the result, the Function at 0x1.0x1 won't be recognized. > > It's more problematic if some of the Functions are ARI-capable but > others are not. In my understanding, all Functions in a ARI-capable > device need to have ARI capability, but that's not enforced. that doesn't look to me as an real issue but rather as a QEMU problem that we could fix and handle it properly. > >> This is because the next function numbers advertised with ARI are not > >> updated with the multifunction configuration, but they are hardcoded in > >> the device implementation. In this sense, the traditional (non-SR/IOV) > >> multifunction mechanism QEMU has will not work with ARI-capable devices. > >> > >>> > >>>> But it should be a distinct check as it will need to check the function > >>>> number bits. > >>>> > >>>>> > >>>>>> > >>>>>> > >>>>>>>> > >>>>>>>>>> no? > >>>>>>>>>> > >>>>>>>>>> I am quite worried about all this work going into blocking > >>>>>>>>>> what we think is disallowed configurations. We should have > >>>>>>>>>> maybe blocked them originally, but now that we didn't > >>>>>>>>>> there's a non zero chance of regressions, > >>>>>>>>> > >>>>>>>>> Sigh, > >>>>>>>> > >>>>>>>> There's value in patches 1-4 I think - the last patch helped you find > >>>>>>>> these. so there's value in this work. > >>>>>>>> > >>>>>>>>> no medals here for being brave :-) > >>>>>>>> > >>>>>>>> Try removing support for a 3.5mm jack next. Oh wait ... > >>>>>>> > >>>>>>> Indeed. Everyone uses bluetooth these days. I for one is happy that the jack is gone (and they were bold enough to do it while Samsung and others still carry the useless port ) :-) > >>>> > >>>> Hello from a guy using a shiny M2 Macbook Air carrying the legacy jack with > >>>> a 100-yen earphone. Even people who ported Linux to this machine spent > >>>> efforts to get the jack to work on Linux ;) > >>>> > >>>>>>> > >>>>>>>> > >>>>>>>>>> and the benefit > >>>>>>>>>> is not guaranteed. > >>>>>>>>>> > >>>>>>>>>> -- > >>>>>>>>>> MST > >>>>> > >>> > >> > > >
On 2023/07/04 21:36, Igor Mammedov wrote: > On Tue, 4 Jul 2023 20:50:49 +0900 > Akihiko Odaki <akihiko.odaki@daynix.com> wrote: > >> On 2023/07/04 20:38, Igor Mammedov wrote: >>> On Sat, 1 Jul 2023 16:28:30 +0900 >>> Akihiko Odaki <akihiko.odaki@daynix.com> wrote: >>> >>>> On 2023/07/01 0:29, Michael S. Tsirkin wrote: >>>>> On Fri, Jun 30, 2023 at 08:36:38PM +0900, Akihiko Odaki wrote: >>>>>> On 2023/06/30 19:37, Ani Sinha wrote: >>>>>>> >>>>>>> >>>>>>>> On 30-Jun-2023, at 3:30 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >>>>>>>> >>>>>>>> On Fri, Jun 30, 2023 at 02:52:52PM +0530, Ani Sinha wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>>> On 30-Jun-2023, at 2:13 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >>>>>>>>>> >>>>>>>>>> On Fri, Jun 30, 2023 at 02:06:59PM +0530, Ani Sinha wrote: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> On 30-Jun-2023, at 2:02 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>> On Fri, Jun 30, 2023 at 01:11:33PM +0530, Ani Sinha wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thus the check for unoccupied function 0 needs to use pci_is_vf() instead of checking ARI capability, and that can happen in do_pci_register_device(). >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Also where do you propose we move the check? >>>>>>>>>>>>>> >>>>>>>>>>>>>> In pci_qdev_realize(), somewhere after pc->realize() and before option ROM loading. >>>>>>>>>>>>> >>>>>>>>>>>>> Hmm, I tried this. The issue here is something like this would be now allowed since the PF has ARI capability: >>>>>>>>>>>>> >>>>>>>>>>>>> -device pcie-root-port,id=p -device igb,bus=p,addr=0x2.0x0 >>>>>>>>>>>>> >>>>>>>>>>>>> The above should not be allowed and when used, we do not see the igb ethernet device from the guest OS. >>>>>>>>>>>> >>>>>>>>>>>> I think it's allowed because it expects you to hotplug function 0 later, >>>>>>>>>>> >>>>>>>>>>> This is about the igb device being plugged into the non-zero slot of the pci-root-port. The guest OS ignores it. >>>>>>>>>> >>>>>>>>>> yes but if you later add a device with ARI and with next field pointing >>>>>>>>>> slot 2 guest will suddently find both. >>>>>>>>> >>>>>>>>> Hmm, I tried this: >>>>>>>>> >>>>>>>>> -device pcie-root-port,id=p \ >>>>>>>>> -device igb,bus=p,addr=0x2.0x0 \ >>>>>>>>> -device igb,bus=p,addr=0x0.0x0 \ >>>>>>>>> >>>>>>>>> The guest only found the second igb device not the first. You can try too. >>>>>>>> >>>>>>>> Because next parameter in pcie_ari_init does not match. >>>>>>> >>>>>>> OK send me a command line that I can test it with. I can’t come up with a case that actually works in practice. >>>>>> >>>>>> I don't think there is one because the code for PCI multifunction does not >>>>>> care ARI. In my opinion, we need yet another check to make non-SR-IOV >>>>>> multifunction and ARI capability mutually exclusive; if a function has the >>>>>> ARI capability and it is not a VF, an attempt to assign non-zero function >>>>>> number for it should fail. >>> >>> is it stated somewhere in spec(s) that ARI and !SR-IOV are mutually exclusive? >>> >>>>> >>>>> Why is that? My understanding is that ARI capable devices should also >>>>> set the multifunction bit in the header. It's not terribly clear from >>>>> the spec though. >>>> >>>> Something like the following will not work properly with ARI-capable >>>> device (think of a as an ARI-capable device): >>>> -device a,addr=0x1.0x0,multifunction=on -device a,addr=0x1.0x1 >>> (I had a crazy idea, to use it like that so we could put more devices >>> on port without resorting to adding extra bridges) >>> >>> Can you elaborate some more why it won't work? >> >> It won't work because the ARI next function number field is fixed. In >> this case, the field of the Function at 0x1.0x0 should point to 0x1.0x1, >> but it doesn't. As the result, the Function at 0x1.0x1 won't be recognized. >> >> It's more problematic if some of the Functions are ARI-capable but >> others are not. In my understanding, all Functions in a ARI-capable >> device need to have ARI capability, but that's not enforced. > > that doesn't look to me as an real issue but rather as > a QEMU problem that we could fix and handle it properly. Yes, it's certainly possible to fix QEMU so that non-SR/IOV multifunction works with ARI-capable Functions. > >>>> This is because the next function numbers advertised with ARI are not >>>> updated with the multifunction configuration, but they are hardcoded in >>>> the device implementation. In this sense, the traditional (non-SR/IOV) >>>> multifunction mechanism QEMU has will not work with ARI-capable devices. >>>> >>>>> >>>>>> But it should be a distinct check as it will need to check the function >>>>>> number bits. >>>>>> >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>>> >>>>>>>>>>>> no? >>>>>>>>>>>> >>>>>>>>>>>> I am quite worried about all this work going into blocking >>>>>>>>>>>> what we think is disallowed configurations. We should have >>>>>>>>>>>> maybe blocked them originally, but now that we didn't >>>>>>>>>>>> there's a non zero chance of regressions, >>>>>>>>>>> >>>>>>>>>>> Sigh, >>>>>>>>>> >>>>>>>>>> There's value in patches 1-4 I think - the last patch helped you find >>>>>>>>>> these. so there's value in this work. >>>>>>>>>> >>>>>>>>>>> no medals here for being brave :-) >>>>>>>>>> >>>>>>>>>> Try removing support for a 3.5mm jack next. Oh wait ... >>>>>>>>> >>>>>>>>> Indeed. Everyone uses bluetooth these days. I for one is happy that the jack is gone (and they were bold enough to do it while Samsung and others still carry the useless port ) :-) >>>>>> >>>>>> Hello from a guy using a shiny M2 Macbook Air carrying the legacy jack with >>>>>> a 100-yen earphone. Even people who ported Linux to this machine spent >>>>>> efforts to get the jack to work on Linux ;) >>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>>>> and the benefit >>>>>>>>>>>> is not guaranteed. >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> MST >>>>>>> >>>>> >>>> >>> >> >
> On 04-Jul-2023, at 5:20 PM, Akihiko Odaki <akihiko.odaki@daynix.com> wrote: > > On 2023/07/04 20:38, Igor Mammedov wrote: >> On Sat, 1 Jul 2023 16:28:30 +0900 >> Akihiko Odaki <akihiko.odaki@daynix.com> wrote: >>> On 2023/07/01 0:29, Michael S. Tsirkin wrote: >>>> On Fri, Jun 30, 2023 at 08:36:38PM +0900, Akihiko Odaki wrote: >>>>> On 2023/06/30 19:37, Ani Sinha wrote: >>>>>> >>>>>> >>>>>>> On 30-Jun-2023, at 3:30 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >>>>>>> >>>>>>> On Fri, Jun 30, 2023 at 02:52:52PM +0530, Ani Sinha wrote: >>>>>>>> >>>>>>>> >>>>>>>>> On 30-Jun-2023, at 2:13 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >>>>>>>>> >>>>>>>>> On Fri, Jun 30, 2023 at 02:06:59PM +0530, Ani Sinha wrote: >>>>>>>>>> >>>>>>>>>> >>>>>>>>>>> On 30-Jun-2023, at 2:02 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >>>>>>>>>>> >>>>>>>>>>> On Fri, Jun 30, 2023 at 01:11:33PM +0530, Ani Sinha wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> Thus the check for unoccupied function 0 needs to use pci_is_vf() instead of checking ARI capability, and that can happen in do_pci_register_device(). >>>>>>>>>>>>> >>>>>>>>>>>>>> Also where do you propose we move the check? >>>>>>>>>>>>> >>>>>>>>>>>>> In pci_qdev_realize(), somewhere after pc->realize() and before option ROM loading. >>>>>>>>>>>> >>>>>>>>>>>> Hmm, I tried this. The issue here is something like this would be now allowed since the PF has ARI capability: >>>>>>>>>>>> >>>>>>>>>>>> -device pcie-root-port,id=p -device igb,bus=p,addr=0x2.0x0 >>>>>>>>>>>> >>>>>>>>>>>> The above should not be allowed and when used, we do not see the igb ethernet device from the guest OS. >>>>>>>>>>> >>>>>>>>>>> I think it's allowed because it expects you to hotplug function 0 later, >>>>>>>>>> >>>>>>>>>> This is about the igb device being plugged into the non-zero slot of the pci-root-port. The guest OS ignores it. >>>>>>>>> >>>>>>>>> yes but if you later add a device with ARI and with next field pointing >>>>>>>>> slot 2 guest will suddently find both. >>>>>>>> >>>>>>>> Hmm, I tried this: >>>>>>>> >>>>>>>> -device pcie-root-port,id=p \ >>>>>>>> -device igb,bus=p,addr=0x2.0x0 \ >>>>>>>> -device igb,bus=p,addr=0x0.0x0 \ >>>>>>>> >>>>>>>> The guest only found the second igb device not the first. You can try too. >>>>>>> >>>>>>> Because next parameter in pcie_ari_init does not match. >>>>>> >>>>>> OK send me a command line that I can test it with. I can’t come up with a case that actually works in practice. >>>>> >>>>> I don't think there is one because the code for PCI multifunction does not >>>>> care ARI. In my opinion, we need yet another check to make non-SR-IOV >>>>> multifunction and ARI capability mutually exclusive; if a function has the >>>>> ARI capability and it is not a VF, an attempt to assign non-zero function >>>>> number for it should fail. >> is it stated somewhere in spec(s) that ARI and !SR-IOV are mutually exclusive? >>>> >>>> Why is that? My understanding is that ARI capable devices should also >>>> set the multifunction bit in the header. It's not terribly clear from >>>> the spec though. >>> >>> Something like the following will not work properly with ARI-capable >>> device (think of a as an ARI-capable device): >>> -device a,addr=0x1.0x0,multifunction=on -device a,addr=0x1.0x1 >> (I had a crazy idea, to use it like that so we could put more devices >> on port without resorting to adding extra bridges) >> Can you elaborate some more why it won't work? > > It won't work because the ARI next function number field is fixed. In this case, the field of the Function at 0x1.0x0 should point to 0x1.0x1, but it doesn’t. Where does it point to in this case then? 0x1.0x2 becasue of the stride? > As the result, the Function at 0x1.0x1 won't be recognized. > > It's more problematic if some of the Functions are ARI-capable but others are not. In my understanding, all Functions in a ARI-capable device need to have ARI capability, but that's not enforced. > >>> This is because the next function numbers advertised with ARI are not >>> updated with the multifunction configuration, but they are hardcoded in >>> the device implementation. In this sense, the traditional (non-SR/IOV) >>> multifunction mechanism QEMU has will not work with ARI-capable devices. >>> >>>> >>>>> But it should be a distinct check as it will need to check the function >>>>> number bits. >>>>> >>>>>> >>>>>>> >>>>>>> >>>>>>>>> >>>>>>>>>>> no? >>>>>>>>>>> >>>>>>>>>>> I am quite worried about all this work going into blocking >>>>>>>>>>> what we think is disallowed configurations. We should have >>>>>>>>>>> maybe blocked them originally, but now that we didn't >>>>>>>>>>> there's a non zero chance of regressions, >>>>>>>>>> >>>>>>>>>> Sigh, >>>>>>>>> >>>>>>>>> There's value in patches 1-4 I think - the last patch helped you find >>>>>>>>> these. so there's value in this work. >>>>>>>>> >>>>>>>>>> no medals here for being brave :-) >>>>>>>>> >>>>>>>>> Try removing support for a 3.5mm jack next. Oh wait ... >>>>>>>> >>>>>>>> Indeed. Everyone uses bluetooth these days. I for one is happy that the jack is gone (and they were bold enough to do it while Samsung and others still carry the useless port ) :-) >>>>> >>>>> Hello from a guy using a shiny M2 Macbook Air carrying the legacy jack with >>>>> a 100-yen earphone. Even people who ported Linux to this machine spent >>>>> efforts to get the jack to work on Linux ;) >>>>> >>>>>>>> >>>>>>>>> >>>>>>>>>>> and the benefit >>>>>>>>>>> is not guaranteed. >>>>>>>>>>> >>>>>>>>>>> -- >>>>>>>>>>> MST
On 2023/07/04 23:03, Ani Sinha wrote: > > >> On 04-Jul-2023, at 5:20 PM, Akihiko Odaki <akihiko.odaki@daynix.com> wrote: >> >> On 2023/07/04 20:38, Igor Mammedov wrote: >>> On Sat, 1 Jul 2023 16:28:30 +0900 >>> Akihiko Odaki <akihiko.odaki@daynix.com> wrote: >>>> On 2023/07/01 0:29, Michael S. Tsirkin wrote: >>>>> On Fri, Jun 30, 2023 at 08:36:38PM +0900, Akihiko Odaki wrote: >>>>>> On 2023/06/30 19:37, Ani Sinha wrote: >>>>>>> >>>>>>> >>>>>>>> On 30-Jun-2023, at 3:30 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >>>>>>>> >>>>>>>> On Fri, Jun 30, 2023 at 02:52:52PM +0530, Ani Sinha wrote: >>>>>>>>> >>>>>>>>> >>>>>>>>>> On 30-Jun-2023, at 2:13 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >>>>>>>>>> >>>>>>>>>> On Fri, Jun 30, 2023 at 02:06:59PM +0530, Ani Sinha wrote: >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>>> On 30-Jun-2023, at 2:02 PM, Michael S. Tsirkin <mst@redhat.com> wrote: >>>>>>>>>>>> >>>>>>>>>>>> On Fri, Jun 30, 2023 at 01:11:33PM +0530, Ani Sinha wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thus the check for unoccupied function 0 needs to use pci_is_vf() instead of checking ARI capability, and that can happen in do_pci_register_device(). >>>>>>>>>>>>>> >>>>>>>>>>>>>>> Also where do you propose we move the check? >>>>>>>>>>>>>> >>>>>>>>>>>>>> In pci_qdev_realize(), somewhere after pc->realize() and before option ROM loading. >>>>>>>>>>>>> >>>>>>>>>>>>> Hmm, I tried this. The issue here is something like this would be now allowed since the PF has ARI capability: >>>>>>>>>>>>> >>>>>>>>>>>>> -device pcie-root-port,id=p -device igb,bus=p,addr=0x2.0x0 >>>>>>>>>>>>> >>>>>>>>>>>>> The above should not be allowed and when used, we do not see the igb ethernet device from the guest OS. >>>>>>>>>>>> >>>>>>>>>>>> I think it's allowed because it expects you to hotplug function 0 later, >>>>>>>>>>> >>>>>>>>>>> This is about the igb device being plugged into the non-zero slot of the pci-root-port. The guest OS ignores it. >>>>>>>>>> >>>>>>>>>> yes but if you later add a device with ARI and with next field pointing >>>>>>>>>> slot 2 guest will suddently find both. >>>>>>>>> >>>>>>>>> Hmm, I tried this: >>>>>>>>> >>>>>>>>> -device pcie-root-port,id=p \ >>>>>>>>> -device igb,bus=p,addr=0x2.0x0 \ >>>>>>>>> -device igb,bus=p,addr=0x0.0x0 \ >>>>>>>>> >>>>>>>>> The guest only found the second igb device not the first. You can try too. >>>>>>>> >>>>>>>> Because next parameter in pcie_ari_init does not match. >>>>>>> >>>>>>> OK send me a command line that I can test it with. I can’t come up with a case that actually works in practice. >>>>>> >>>>>> I don't think there is one because the code for PCI multifunction does not >>>>>> care ARI. In my opinion, we need yet another check to make non-SR-IOV >>>>>> multifunction and ARI capability mutually exclusive; if a function has the >>>>>> ARI capability and it is not a VF, an attempt to assign non-zero function >>>>>> number for it should fail. >>> is it stated somewhere in spec(s) that ARI and !SR-IOV are mutually exclusive? >>>>> >>>>> Why is that? My understanding is that ARI capable devices should also >>>>> set the multifunction bit in the header. It's not terribly clear from >>>>> the spec though. >>>> >>>> Something like the following will not work properly with ARI-capable >>>> device (think of a as an ARI-capable device): >>>> -device a,addr=0x1.0x0,multifunction=on -device a,addr=0x1.0x1 >>> (I had a crazy idea, to use it like that so we could put more devices >>> on port without resorting to adding extra bridges) >>> Can you elaborate some more why it won't work? >> >> It won't work because the ARI next function number field is fixed. In this case, the field of the Function at 0x1.0x0 should point to 0x1.0x1, but it doesn’t. > > Where does it point to in this case then? 0x1.0x2 becasue of the stride? With "[PATCH v5 0/2] pcie: Fix ARI next function numbers"*, it will point to 0, which ends the linked list formed with the ARI next number fields and make it only contain Function 0. * https://lore.kernel.org/qemu-devel/20230705022421.13115-1-akihiko.odaki@daynix.com/ > >> As the result, the Function at 0x1.0x1 won't be recognized. >> >> It's more problematic if some of the Functions are ARI-capable but others are not. In my understanding, all Functions in a ARI-capable device need to have ARI capability, but that's not enforced. >> >>>> This is because the next function numbers advertised with ARI are not >>>> updated with the multifunction configuration, but they are hardcoded in >>>> the device implementation. In this sense, the traditional (non-SR/IOV) >>>> multifunction mechanism QEMU has will not work with ARI-capable devices. >>>> >>>>> >>>>>> But it should be a distinct check as it will need to check the function >>>>>> number bits. >>>>>> >>>>>>> >>>>>>>> >>>>>>>> >>>>>>>>>> >>>>>>>>>>>> no? >>>>>>>>>>>> >>>>>>>>>>>> I am quite worried about all this work going into blocking >>>>>>>>>>>> what we think is disallowed configurations. We should have >>>>>>>>>>>> maybe blocked them originally, but now that we didn't >>>>>>>>>>>> there's a non zero chance of regressions, >>>>>>>>>>> >>>>>>>>>>> Sigh, >>>>>>>>>> >>>>>>>>>> There's value in patches 1-4 I think - the last patch helped you find >>>>>>>>>> these. so there's value in this work. >>>>>>>>>> >>>>>>>>>>> no medals here for being brave :-) >>>>>>>>>> >>>>>>>>>> Try removing support for a 3.5mm jack next. Oh wait ... >>>>>>>>> >>>>>>>>> Indeed. Everyone uses bluetooth these days. I for one is happy that the jack is gone (and they were bold enough to do it while Samsung and others still carry the useless port ) :-) >>>>>> >>>>>> Hello from a guy using a shiny M2 Macbook Air carrying the legacy jack with >>>>>> a 100-yen earphone. Even people who ported Linux to this machine spent >>>>>> efforts to get the jack to work on Linux ;) >>>>>> >>>>>>>>> >>>>>>>>>> >>>>>>>>>>>> and the benefit >>>>>>>>>>>> is not guaranteed. >>>>>>>>>>>> >>>>>>>>>>>> -- >>>>>>>>>>>> MST >
diff --git a/hw/pci/pci.c b/hw/pci/pci.c index e2eb4c3b4a..0320ac2bb3 100644 --- a/hw/pci/pci.c +++ b/hw/pci/pci.c @@ -65,6 +65,7 @@ bool pci_available = true; static char *pcibus_get_dev_path(DeviceState *dev); static char *pcibus_get_fw_dev_path(DeviceState *dev); static void pcibus_reset(BusState *qbus); +static bool pcie_has_upstream_port(PCIDevice *dev); static Property pci_props[] = { DEFINE_PROP_PCI_DEVFN("addr", PCIDevice, devfn, -1), @@ -1190,6 +1191,20 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, name); return NULL; + } /* + * With SRIOV and ARI, vfs can have non-zero slot in the conventional + * PCI interpretation as all five bits reserved for slot addresses are + * also used for function bits for the various vfs. Ignore that case. + * It is too early here to check for ARI capabilities in the PCI config + * space. Hence, we check for a vf device instead. + */ + else if (!pci_is_vf(pci_dev) && + pcie_has_upstream_port(pci_dev) && + PCI_SLOT(devfn)) { + error_setg(errp, "PCI: slot %d is not valid for %s," + " parent device only allows plugging into slot 0.", + PCI_SLOT(devfn), name); + return NULL; } pci_dev->devfn = devfn;