diff mbox series

[6/6] PCI: vmd: Disable MSI/X remapping when possible

Message ID 20200728194945.14126-7-jonathan.derrick@intel.com
State New
Headers show
Series VMD MSI Remapping Bypass | expand

Commit Message

Jon Derrick July 28, 2020, 7:49 p.m. UTC
VMD will retransmit child device MSI/X using its own MSI/X table and
requester-id. This limits the number of MSI/X available to the whole
child device domain to the number of VMD MSI/X interrupts. Some VMD
devices have a mode where this remapping can be disabled, allowing child
device interrupts to bypass processing with the VMD MSI/X domain
interrupt handler and going straight the child device interrupt handler,
allowing for better performance and scaling. The requester-id still gets
changed to the VMD endpoint's requester-id, and the interrupt remapping
handlers have been updated to properly set IRTE for child device
interrupts to the VMD endpoint's context.

Some VMD platforms have existing production BIOS which rely on MSI/X
remapping and won't explicitly program the MSI/X remapping bit. This
re-enables MSI/X remapping on unload.

Disabling MSI/X remapping is only available for Icelake Server and
client VMD products.

Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com>
Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
---
 drivers/pci/controller/vmd.c | 58 +++++++++++++++++++++++++++++++-----
 1 file changed, 50 insertions(+), 8 deletions(-)

Comments

Bjorn Helgaas Oct. 20, 2020, 8:23 p.m. UTC | #1
On Tue, Jul 28, 2020 at 01:49:45PM -0600, Jon Derrick wrote:
> VMD will retransmit child device MSI/X using its own MSI/X table and
> requester-id. This limits the number of MSI/X available to the whole
> child device domain to the number of VMD MSI/X interrupts. Some VMD
> devices have a mode where this remapping can be disabled, allowing child
> device interrupts to bypass processing with the VMD MSI/X domain
> interrupt handler and going straight the child device interrupt handler,
> allowing for better performance and scaling. The requester-id still gets
> changed to the VMD endpoint's requester-id, and the interrupt remapping
> handlers have been updated to properly set IRTE for child device
> interrupts to the VMD endpoint's context.
> 
> Some VMD platforms have existing production BIOS which rely on MSI/X
> remapping and won't explicitly program the MSI/X remapping bit. This
> re-enables MSI/X remapping on unload.
> 
> Disabling MSI/X remapping is only available for Icelake Server and
> client VMD products.

I'm trying to button up the PCI pull request.  I'm updating Lorenzo's
pci/vmd branch for other reasons, and I notice it still contains this
commit: 67b219dc3a6d ("PCI: vmd: Disable MSI/X remapping when
possible").

I want to remove "MSI/X" from the commit log and comments of that
commit because I don't know what "MSI/X" means.

Should I replace it with "MSI" or "MSI-X" or "MSI/MSI-X"?  Something
else?  When possible, I want to use exactly the same terminology as
the PCIe spec to avoid ambiguity.

It talks about "MSI/X table", which makes me think it should be
"MSI-X", since plain MSI does not have a table.

> Reviewed-by: Andy Shevchenko <andriy.shevchenko@intel.com>
> Signed-off-by: Jon Derrick <jonathan.derrick@intel.com>
> ---
>  drivers/pci/controller/vmd.c | 58 +++++++++++++++++++++++++++++++-----
>  1 file changed, 50 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
> index 3214d785fa5d..e8cde2c390b9 100644
> --- a/drivers/pci/controller/vmd.c
> +++ b/drivers/pci/controller/vmd.c
> @@ -53,6 +53,12 @@ enum vmd_features {
>  	 * vendor-specific capability space
>  	 */
>  	VMD_FEAT_HAS_MEMBAR_SHADOW_VSCAP	= (1 << 2),
> +
> +	/*
> +	 * Device remaps MSI/X transactions into its MSI/X table and requires
> +	 * VMD MSI domain for child device interrupt handling
> +	 */
> +	VMD_FEAT_REMAPS_MSI			= (1 << 3),
>  };
>  
>  /*
> @@ -298,6 +304,15 @@ static struct msi_domain_info vmd_msi_domain_info = {
>  	.chip		= &vmd_msi_controller,
>  };
>  
> +static void vmd_enable_msi_remapping(struct vmd_dev *vmd, bool enable)
> +{
> +	u16 reg;
> +
> +	pci_read_config_word(vmd->dev, PCI_REG_VMCONFIG, &reg);
> +	reg = enable ? (reg & ~0x2) : (reg | 0x2);
> +	pci_write_config_word(vmd->dev, PCI_REG_VMCONFIG, reg);
> +}
> +
>  static int vmd_create_irq_domain(struct vmd_dev *vmd)
>  {
>  	struct fwnode_handle *fn;
> @@ -318,6 +333,13 @@ static int vmd_create_irq_domain(struct vmd_dev *vmd)
>  
>  static void vmd_remove_irq_domain(struct vmd_dev *vmd)
>  {
> +	/*
> +	 * Some production BIOS won't enable remapping between soft reboots.
> +	 * Ensure remapping is restored before unloading the driver
> +	 */
> +	if (!vmd->msix_count)
> +		vmd_enable_msi_remapping(vmd, true);
> +
>  	if (vmd->irq_domain) {
>  		struct fwnode_handle *fn = vmd->irq_domain->fwnode;
>  
> @@ -606,6 +628,27 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
>  			return ret;
>  	}
>  
> +	/*
> +	 * Currently MSI remapping must be enabled in guest passthrough mode
> +	 * due to some missing interrupt remapping plumbing. This is probably
> +	 * acceptable because the guest is usually CPU-limited and MSI
> +	 * remapping doesn't become a performance bottleneck.
> +	 */
> +	if (features & VMD_FEAT_REMAPS_MSI || offset[0] || offset[1]) {
> +		ret = vmd_alloc_irqs(vmd);
> +		if (ret)
> +			return ret;
> +	}
> +
> +	/*
> +	 * Disable remapping for performance if possible based on if VMD IRQs
> +	 * had been allocated.
> +	 */
> +	if (vmd->msix_count)
> +		vmd_enable_msi_remapping(vmd, true);
> +	else
> +		vmd_enable_msi_remapping(vmd, false);
> +
>  	/*
>  	 * Certain VMD devices may have a root port configuration option which
>  	 * limits the bus range to between 0-127, 128-255, or 224-255
> @@ -674,9 +717,11 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
>  
>  	sd->node = pcibus_to_node(vmd->dev->bus);
>  
> -	ret = vmd_create_irq_domain(vmd);
> -	if (ret)
> -		return ret;
> +	if (vmd->msix_count) {
> +		ret = vmd_create_irq_domain(vmd);
> +		if (ret)
> +			return ret;
> +	}
>  
>  	pci_add_resource(&resources, &vmd->resources[0]);
>  	pci_add_resource_offset(&resources, &vmd->resources[1], offset[0]);
> @@ -738,10 +783,6 @@ static int vmd_probe(struct pci_dev *dev, const struct pci_device_id *id)
>  	    dma_set_mask_and_coherent(&dev->dev, DMA_BIT_MASK(32)))
>  		return -ENODEV;
>  
> -	err = vmd_alloc_irqs(vmd);
> -	if (err)
> -		return err;
> -
>  	spin_lock_init(&vmd->cfg_lock);
>  	pci_set_drvdata(dev, vmd);
>  	err = vmd_enable_domain(vmd, (unsigned long) id->driver_data);
> @@ -809,7 +850,8 @@ static SIMPLE_DEV_PM_OPS(vmd_dev_pm_ops, vmd_suspend, vmd_resume);
>  
>  static const struct pci_device_id vmd_ids[] = {
>  	{PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_VMD_201D),
> -		.driver_data = VMD_FEAT_HAS_MEMBAR_SHADOW_VSCAP,},
> +		.driver_data = VMD_FEAT_HAS_MEMBAR_SHADOW_VSCAP |
> +				VMD_FEAT_REMAPS_MSI,},
>  	{PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_VMD_28C0),
>  		.driver_data = VMD_FEAT_HAS_MEMBAR_SHADOW |
>  				VMD_FEAT_HAS_BUS_RESTRICTIONS,},
> -- 
> 2.27.0
>
diff mbox series

Patch

diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
index 3214d785fa5d..e8cde2c390b9 100644
--- a/drivers/pci/controller/vmd.c
+++ b/drivers/pci/controller/vmd.c
@@ -53,6 +53,12 @@  enum vmd_features {
 	 * vendor-specific capability space
 	 */
 	VMD_FEAT_HAS_MEMBAR_SHADOW_VSCAP	= (1 << 2),
+
+	/*
+	 * Device remaps MSI/X transactions into its MSI/X table and requires
+	 * VMD MSI domain for child device interrupt handling
+	 */
+	VMD_FEAT_REMAPS_MSI			= (1 << 3),
 };
 
 /*
@@ -298,6 +304,15 @@  static struct msi_domain_info vmd_msi_domain_info = {
 	.chip		= &vmd_msi_controller,
 };
 
+static void vmd_enable_msi_remapping(struct vmd_dev *vmd, bool enable)
+{
+	u16 reg;
+
+	pci_read_config_word(vmd->dev, PCI_REG_VMCONFIG, &reg);
+	reg = enable ? (reg & ~0x2) : (reg | 0x2);
+	pci_write_config_word(vmd->dev, PCI_REG_VMCONFIG, reg);
+}
+
 static int vmd_create_irq_domain(struct vmd_dev *vmd)
 {
 	struct fwnode_handle *fn;
@@ -318,6 +333,13 @@  static int vmd_create_irq_domain(struct vmd_dev *vmd)
 
 static void vmd_remove_irq_domain(struct vmd_dev *vmd)
 {
+	/*
+	 * Some production BIOS won't enable remapping between soft reboots.
+	 * Ensure remapping is restored before unloading the driver
+	 */
+	if (!vmd->msix_count)
+		vmd_enable_msi_remapping(vmd, true);
+
 	if (vmd->irq_domain) {
 		struct fwnode_handle *fn = vmd->irq_domain->fwnode;
 
@@ -606,6 +628,27 @@  static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
 			return ret;
 	}
 
+	/*
+	 * Currently MSI remapping must be enabled in guest passthrough mode
+	 * due to some missing interrupt remapping plumbing. This is probably
+	 * acceptable because the guest is usually CPU-limited and MSI
+	 * remapping doesn't become a performance bottleneck.
+	 */
+	if (features & VMD_FEAT_REMAPS_MSI || offset[0] || offset[1]) {
+		ret = vmd_alloc_irqs(vmd);
+		if (ret)
+			return ret;
+	}
+
+	/*
+	 * Disable remapping for performance if possible based on if VMD IRQs
+	 * had been allocated.
+	 */
+	if (vmd->msix_count)
+		vmd_enable_msi_remapping(vmd, true);
+	else
+		vmd_enable_msi_remapping(vmd, false);
+
 	/*
 	 * Certain VMD devices may have a root port configuration option which
 	 * limits the bus range to between 0-127, 128-255, or 224-255
@@ -674,9 +717,11 @@  static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
 
 	sd->node = pcibus_to_node(vmd->dev->bus);
 
-	ret = vmd_create_irq_domain(vmd);
-	if (ret)
-		return ret;
+	if (vmd->msix_count) {
+		ret = vmd_create_irq_domain(vmd);
+		if (ret)
+			return ret;
+	}
 
 	pci_add_resource(&resources, &vmd->resources[0]);
 	pci_add_resource_offset(&resources, &vmd->resources[1], offset[0]);
@@ -738,10 +783,6 @@  static int vmd_probe(struct pci_dev *dev, const struct pci_device_id *id)
 	    dma_set_mask_and_coherent(&dev->dev, DMA_BIT_MASK(32)))
 		return -ENODEV;
 
-	err = vmd_alloc_irqs(vmd);
-	if (err)
-		return err;
-
 	spin_lock_init(&vmd->cfg_lock);
 	pci_set_drvdata(dev, vmd);
 	err = vmd_enable_domain(vmd, (unsigned long) id->driver_data);
@@ -809,7 +850,8 @@  static SIMPLE_DEV_PM_OPS(vmd_dev_pm_ops, vmd_suspend, vmd_resume);
 
 static const struct pci_device_id vmd_ids[] = {
 	{PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_VMD_201D),
-		.driver_data = VMD_FEAT_HAS_MEMBAR_SHADOW_VSCAP,},
+		.driver_data = VMD_FEAT_HAS_MEMBAR_SHADOW_VSCAP |
+				VMD_FEAT_REMAPS_MSI,},
 	{PCI_DEVICE(PCI_VENDOR_ID_INTEL, PCI_DEVICE_ID_INTEL_VMD_28C0),
 		.driver_data = VMD_FEAT_HAS_MEMBAR_SHADOW |
 				VMD_FEAT_HAS_BUS_RESTRICTIONS,},