Message ID | 20241025215038.3125626-6-michal.winiarski@intel.com |
---|---|
State | New |
Headers | show |
Series | PCI: VF resizable BAR | expand |
On Fri, Oct 25, 2024 at 11:50:36PM +0200, Michał Winiarski wrote: > VF MMIO resource reservation, either created by system firmware and > inherited by Linux PCI subsystem or created by the subsystem itself, > should contain enough space to fit the BAR of all SR-IOV Virtual > Functions that can potentially be created (total VFs supported by the > device). > > However, that assumption only holds in an environment where VF BAR size > can't be modified. > > Add an additional check that verifies that VF BAR for all enabled VFs > fits within the underlying reservation resource. > > Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> > --- > drivers/pci/iov.c | 8 ++++++-- > 1 file changed, 6 insertions(+), 2 deletions(-) > > diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c > index 79143c1bc7bb4..5de828e5a26ea 100644 > --- a/drivers/pci/iov.c > +++ b/drivers/pci/iov.c > @@ -645,10 +645,14 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn) > > nres = 0; > for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) { > + int vf_bar_sz = pci_iov_resource_size(dev, > + pci_resource_to_iov(i)); And that should be resource_size_t. Fortunately, CI caught it: https://lore.kernel.org/intel-xe/172990062085.1334319.8298861567163770193@2413ebb6fbb6/ I'll address that in the next rev. -Michał > bars |= (1 << pci_resource_to_iov(i)); > res = &dev->resource[pci_resource_to_iov(i)]; > - if (res->parent) > - nres++; > + if (!res->parent || vf_bar_sz * nr_virtfn > resource_size(res)) > + continue; > + > + nres++; > } > if (nres != iov->nres) { > pci_err(dev, "not enough MMIO resources for SR-IOV\n"); > -- > 2.47.0 >
On Fri, Oct 25, 2024 at 11:50:36PM +0200, Michał Winiarski wrote: > VF MMIO resource reservation, either created by system firmware and > inherited by Linux PCI subsystem or created by the subsystem itself, > should contain enough space to fit the BAR of all SR-IOV Virtual > Functions that can potentially be created (total VFs supported by the > device). I don't think "VF resource reservation ... should contain enough space" is really accurate or actionable. It would be *nice* if the PF BAR is large enough to accommodate the largest supported VF BARs for all possible VFs, but if it doesn't, it's not really an error. It's just a reflection of the fact that resource space is limited. > However, that assumption only holds in an environment where VF BAR size > can't be modified. There's no reason to assume anything about how many VF BARs fit. The existing code should avoid enabling the requested nr_virtfn VFs if the PF doesn't have enough space -- I think that's what the "if (res->parent)" is supposed to be checking. The fact that you need a change here makes me suspect that we're missing some resource claim (and corresponding res->parent update) elsewhere when resizing the VF BAR. > Add an additional check that verifies that VF BAR for all enabled VFs > fits within the underlying reservation resource. > > Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> > --- > drivers/pci/iov.c | 8 ++++++-- > 1 file changed, 6 insertions(+), 2 deletions(-) > > diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c > index 79143c1bc7bb4..5de828e5a26ea 100644 > --- a/drivers/pci/iov.c > +++ b/drivers/pci/iov.c > @@ -645,10 +645,14 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn) > > nres = 0; > for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) { > + int vf_bar_sz = pci_iov_resource_size(dev, > + pci_resource_to_iov(i)); > bars |= (1 << pci_resource_to_iov(i)); > res = &dev->resource[pci_resource_to_iov(i)]; > - if (res->parent) > - nres++; > + if (!res->parent || vf_bar_sz * nr_virtfn > resource_size(res)) > + continue; > + > + nres++; > } > if (nres != iov->nres) { > pci_err(dev, "not enough MMIO resources for SR-IOV\n"); > -- > 2.47.0 >
On Mon, Oct 28, 2024 at 11:56:04AM -0500, Bjorn Helgaas wrote: > On Fri, Oct 25, 2024 at 11:50:36PM +0200, Michał Winiarski wrote: > > VF MMIO resource reservation, either created by system firmware and > > inherited by Linux PCI subsystem or created by the subsystem itself, > > should contain enough space to fit the BAR of all SR-IOV Virtual > > Functions that can potentially be created (total VFs supported by the > > device). > > I don't think "VF resource reservation ... should contain enough > space" is really accurate or actionable. It would be *nice* if the PF > BAR is large enough to accommodate the largest supported VF BARs for > all possible VFs, but if it doesn't, it's not really an error. It's > just a reflection of the fact that resource space is limited. From PCI perspective, you're right, IOV resources are optional, and it's not really an error for PF device itself. From IOV perspective - we do need those resources to be able to create VFs. All I'm trying to say here, is that the context of the change is the "success" case, where the VF BAR reservation was successfully assigned, and the PF will be able to create VFs. The case where there were not enough resources for VF BAR (and PF won't be able to create VFs) remains unchanged. > > However, that assumption only holds in an environment where VF BAR size > > can't be modified. > > There's no reason to assume anything about how many VF BARs fit. The > existing code should avoid enabling the requested nr_virtfn VFs if the > PF doesn't have enough space -- I think that's what the "if > (res->parent)" is supposed to be checking. > > The fact that you need a change here makes me suspect that we're > missing some resource claim (and corresponding res->parent update) > elsewhere when resizing the VF BAR. My understanding is that res->parent is only expressing that the resource is assigned. We don't really want to change that, the resource is still there and is assigned - we just want to make sure that VF enabling fails if the caller wants to enable more VFs than possible for current resource size. Let's use an example. A device with a single BAR. initial_vf_bar_size = X total_vfs = 4 supported_vf_resizable_bar_sizes = X, 2X, 4X With that - the initial underlying resource looks like this: +----------------------+ |+--------------------+| || || |+--------------------+| |+--------------------+| || || |+--------------------+| |+--------------------+| || || |+--------------------+| |+--------------------+| || || |+--------------------+| +----------------------+ Its size is 4X, and it contains BAR for 4 VFs. "resource_size >= vf_bar_size * num_vfs" is true for any num_vfs Let's assume that there are enough resources to assign it. Patch 4/7 allows to resize the entire resource (in addition to changing the VF BAR size), which means that after calling: pci_resize_resource() with size = 2X, the underlying resource will look like this: +----------------------+ |+--------------------+| || || || || || || || || |+--------------------+| |+--------------------+| || || || || || || || || |+--------------------+| |+--------------------+| || || || || || || || || |+--------------------+| |+--------------------+| || || || || || || || || |+--------------------+| +----------------------+ Its size is 8X, and it contains BAR for 4 VFs. "resource_size >= vf_bar_size * num_vfs" is true for any num_vfs It does require an extra 4X of MMIO resources, so this can fail in resource constrained environment, even though the original 4X resource was able to be assigned. The following patch 6/7 allows to change VF BAR size without touching the underlying reservation size. After calling pci_iov_vf_bar_set_size() to 4X and enabling a single VF, the underlying resource will look like this: +----------------------+ |+--------------------+| ||░░░░░░░░░░░░░░░░░░░░|| ||░░░░░░░░░░░░░░░░░░░░|| ||░░░░░░░░░░░░░░░░░░░░|| ||░░░░░░░░░░░░░░░░░░░░|| ||░░░░░░░░░░░░░░░░░░░░|| ||░░░░░░░░░░░░░░░░░░░░|| ||░░░░░░░░░░░░░░░░░░░░|| ||░░░░░░░░░░░░░░░░░░░░|| ||░░░░░░░░░░░░░░░░░░░░|| ||░░░░░░░░░░░░░░░░░░░░|| |+--------------------+| +----------------------+ Its size is 4X, but since pci_iov_vf_bar_set_size() was called, it is no longer able to accomodate 4 VFs. "resource_size >= vf_bar_size * num_vfs" is only true for num_vfs = 1 and any attempts to create more than 1 VF should fail. We don't need to worry about being MMIO resource constrained, no extra MMIO resources are needed. -Michał > > Add an additional check that verifies that VF BAR for all enabled VFs > > fits within the underlying reservation resource. > > > > Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> > > --- > > drivers/pci/iov.c | 8 ++++++-- > > 1 file changed, 6 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c > > index 79143c1bc7bb4..5de828e5a26ea 100644 > > --- a/drivers/pci/iov.c > > +++ b/drivers/pci/iov.c > > @@ -645,10 +645,14 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn) > > > > nres = 0; > > for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) { > > + int vf_bar_sz = pci_iov_resource_size(dev, > > + pci_resource_to_iov(i)); > > bars |= (1 << pci_resource_to_iov(i)); > > res = &dev->resource[pci_resource_to_iov(i)]; > > - if (res->parent) > > - nres++; > > + if (!res->parent || vf_bar_sz * nr_virtfn > resource_size(res)) > > + continue; > > + > > + nres++; > > } > > if (nres != iov->nres) { > > pci_err(dev, "not enough MMIO resources for SR-IOV\n"); > > -- > > 2.47.0 > >
On Wed, Oct 30, 2024 at 12:43:19PM +0100, Michał Winiarski wrote: > On Mon, Oct 28, 2024 at 11:56:04AM -0500, Bjorn Helgaas wrote: > > On Fri, Oct 25, 2024 at 11:50:36PM +0200, Michał Winiarski wrote: > > > VF MMIO resource reservation, either created by system firmware and > > > inherited by Linux PCI subsystem or created by the subsystem itself, > > > should contain enough space to fit the BAR of all SR-IOV Virtual > > > Functions that can potentially be created (total VFs supported by the > > > device). > > > > I don't think "VF resource reservation ... should contain enough > > space" is really accurate or actionable. It would be *nice* if the PF > > BAR is large enough to accommodate the largest supported VF BARs for > > all possible VFs, but if it doesn't, it's not really an error. It's > > just a reflection of the fact that resource space is limited. > > From PCI perspective, you're right, IOV resources are optional, and it's > not really an error for PF device itself. > From IOV perspective - we do need those resources to be able to create > VFs. > > All I'm trying to say here, is that the context of the change is the > "success" case, where the VF BAR reservation was successfully assigned, > and the PF will be able to create VFs. > The case where there were not enough resources for VF BAR (and PF won't > be able to create VFs) remains unchanged. > > > > However, that assumption only holds in an environment where VF BAR size > > > can't be modified. > > > > There's no reason to assume anything about how many VF BARs fit. The > > existing code should avoid enabling the requested nr_virtfn VFs if the > > PF doesn't have enough space -- I think that's what the "if > > (res->parent)" is supposed to be checking. > > > > The fact that you need a change here makes me suspect that we're > > missing some resource claim (and corresponding res->parent update) > > elsewhere when resizing the VF BAR. > > My understanding is that res->parent is only expressing that the > resource is assigned. > We don't really want to change that, the resource is still there and is > assigned - we just want to make sure that VF enabling fails if the > caller wants to enable more VFs than possible for current resource size. > > Let's use an example. A device with a single BAR. > initial_vf_bar_size = X > total_vfs = 4 > supported_vf_resizable_bar_sizes = X, 2X, 4X In addition, IIUC we're assuming the PF BAR size is 4X, since the conclusion is that 4 VF BARs of size X fill it completely. > With that - the initial underlying resource looks like this: > +----------------------+ > |+--------------------+| > || || > |+--------------------+| > |+--------------------+| > || || > |+--------------------+| > |+--------------------+| > || || > |+--------------------+| > |+--------------------+| > || || > |+--------------------+| > +----------------------+ > Its size is 4X, and it contains BAR for 4 VFs. > "resource_size >= vf_bar_size * num_vfs" is true for any num_vfs > Let's assume that there are enough resources to assign it. > > Patch 4/7 allows to resize the entire resource (in addition to changing > the VF BAR size), which means that after calling: > pci_resize_resource() with size = 2X, the underlying resource will look > like this: > +----------------------+ > |+--------------------+| > || || > || || > || || > || || > |+--------------------+| > |+--------------------+| > || || > || || > || || > || || > |+--------------------+| > |+--------------------+| > || || > || || > || || > || || > |+--------------------+| > |+--------------------+| > || || > || || > || || > || || > |+--------------------+| > +----------------------+ > Its size is 8X, and it contains BAR for 4 VFs. > "resource_size >= vf_bar_size * num_vfs" is true for any num_vfs With the assumption that the PF BAR size is 4X, these VFs would no longer fit. I guess that's basically what you say here: > It does require an extra 4X of MMIO resources, so this can fail in > resource constrained environment, even though the original 4X resource > was able to be assigned. > > The following patch 6/7 allows to change VF BAR size without touching > the underlying reservation size. > After calling pci_iov_vf_bar_set_size() to 4X and enabling a single VF, > the underlying resource will look like this: > +----------------------+ > |+--------------------+| > ||░░░░░░░░░░░░░░░░░░░░|| > ||░░░░░░░░░░░░░░░░░░░░|| > ||░░░░░░░░░░░░░░░░░░░░|| > ||░░░░░░░░░░░░░░░░░░░░|| > ||░░░░░░░░░░░░░░░░░░░░|| > ||░░░░░░░░░░░░░░░░░░░░|| > ||░░░░░░░░░░░░░░░░░░░░|| > ||░░░░░░░░░░░░░░░░░░░░|| > ||░░░░░░░░░░░░░░░░░░░░|| > ||░░░░░░░░░░░░░░░░░░░░|| > |+--------------------+| > +----------------------+ > Its size is 4X, but since pci_iov_vf_bar_set_size() was called, it is no > longer able to accomodate 4 VFs. > "resource_size >= vf_bar_size * num_vfs" is only true for num_vfs = 1 > and any attempts to create more than 1 VF should fail. > We don't need to worry about being MMIO resource constrained, no extra > MMIO resources are needed. IIUC this series only resizes VF BARs. Those VF BARs are carved out of a PF BAR, and this series doesn't touch the PF BAR resizing path. I guess the driver might be able to increase the PF BAR size if necessary, and then increase the VF BAR size. It sounds like this patch is really a bug fix independent of VF BAR resizing. If we currently allow enabling more VFs than will fit in a PF BAR, that sounds like a bug. So if we try to enable too many VFs, sriov_enable() should fail. I still don't see why this check should change the res->parent test, though. > > > Add an additional check that verifies that VF BAR for all enabled VFs > > > fits within the underlying reservation resource. > > > > > > Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> > > > --- > > > drivers/pci/iov.c | 8 ++++++-- > > > 1 file changed, 6 insertions(+), 2 deletions(-) > > > > > > diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c > > > index 79143c1bc7bb4..5de828e5a26ea 100644 > > > --- a/drivers/pci/iov.c > > > +++ b/drivers/pci/iov.c > > > @@ -645,10 +645,14 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn) > > > > > > nres = 0; > > > for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) { > > > + int vf_bar_sz = pci_iov_resource_size(dev, > > > + pci_resource_to_iov(i)); > > > bars |= (1 << pci_resource_to_iov(i)); > > > res = &dev->resource[pci_resource_to_iov(i)]; > > > - if (res->parent) > > > - nres++; > > > + if (!res->parent || vf_bar_sz * nr_virtfn > resource_size(res)) > > > + continue; > > > + > > > + nres++; > > > } > > > if (nres != iov->nres) { > > > pci_err(dev, "not enough MMIO resources for SR-IOV\n"); > > > -- > > > 2.47.0 > > >
diff --git a/drivers/pci/iov.c b/drivers/pci/iov.c index 79143c1bc7bb4..5de828e5a26ea 100644 --- a/drivers/pci/iov.c +++ b/drivers/pci/iov.c @@ -645,10 +645,14 @@ static int sriov_enable(struct pci_dev *dev, int nr_virtfn) nres = 0; for (i = 0; i < PCI_SRIOV_NUM_BARS; i++) { + int vf_bar_sz = pci_iov_resource_size(dev, + pci_resource_to_iov(i)); bars |= (1 << pci_resource_to_iov(i)); res = &dev->resource[pci_resource_to_iov(i)]; - if (res->parent) - nres++; + if (!res->parent || vf_bar_sz * nr_virtfn > resource_size(res)) + continue; + + nres++; } if (nres != iov->nres) { pci_err(dev, "not enough MMIO resources for SR-IOV\n");
VF MMIO resource reservation, either created by system firmware and inherited by Linux PCI subsystem or created by the subsystem itself, should contain enough space to fit the BAR of all SR-IOV Virtual Functions that can potentially be created (total VFs supported by the device). However, that assumption only holds in an environment where VF BAR size can't be modified. Add an additional check that verifies that VF BAR for all enabled VFs fits within the underlying reservation resource. Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> --- drivers/pci/iov.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)