diff mbox

[2/7] vxge: fix crash of VF when unloading PF

Message ID 1292025782-16372-2-git-send-email-jon.mason@exar.com
State Accepted, archived
Delegated to: David Miller
Headers show

Commit Message

Jon Mason Dec. 11, 2010, 12:02 a.m. UTC
Calling pci_disable_sriov when unloading a SR-IOV physical function
driver from a host when a guest is using a virtual function from that
device can cause a host crash or VM crash.  The crash is caused by the
virtual config space no longer being present when PF is removed (due to
the pci_disable_sriov).  This can be avoided by not calling
pci_disable_sriov to disable the PCI space when shutting down the PF.
Each function in the X3100 operates independently and in this case will
operate properly in the absence of the PF.

Also, added improved logic in the detection of SR-IOV initialization.

Signed-off-by: Jon Mason <jon.mason@exar.com>
Signed-off-by: Ram Vepa <ram.vepa@exar.com>
---
 drivers/net/vxge/vxge-main.c |   25 ++++++++++++++++++-------
 1 files changed, 18 insertions(+), 7 deletions(-)

Comments

David Miller Dec. 11, 2010, 12:08 a.m. UTC | #1
From: Jon Mason <jon.mason@exar.com>
Date: Fri, 10 Dec 2010 18:02:57 -0600

> Calling pci_disable_sriov when unloading a SR-IOV physical function
> driver from a host when a guest is using a virtual function from that
> device can cause a host crash or VM crash.  The crash is caused by the
> virtual config space no longer being present when PF is removed (due to
> the pci_disable_sriov).  This can be avoided by not calling
> pci_disable_sriov to disable the PCI space when shutting down the PF.
> Each function in the X3100 operates independently and in this case will
> operate properly in the absence of the PF.
> 
> Also, added improved logic in the detection of SR-IOV initialization.
> 
> Signed-off-by: Jon Mason <jon.mason@exar.com>
> Signed-off-by: Ram Vepa <ram.vepa@exar.com>

Applied.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Chris Wright Dec. 11, 2010, 1:04 a.m. UTC | #2
* Jon Mason (jon.mason@exar.com) wrote:
> +static int __devinit is_sriov_initialized(struct pci_dev *pdev)
> +{
> +	int pos;
> +	u16 ctrl;
> +
> +	pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_SRIOV);
> +	if (pos) {
> +		pci_read_config_word(pdev, pos + PCI_SRIOV_CTRL, &ctrl);
> +		if (ctrl & PCI_SRIOV_CTRL_VFE)
> +			return 1;
> +	}
> +	return 0;
> +}

This is a helper that should go in drivers/pci/iov.c (it's a pure
pci thing).

> +
>  /**
>   * vxge_probe
>   * @pdev : structure containing the PCI related information of the device.
> @@ -4370,14 +4384,13 @@ vxge_probe(struct pci_dev *pdev, const struct pci_device_id *pre)
>  		num_vfs = vxge_get_num_vfs(function_mode) - 1;
>  
>  	/* Enable SRIOV mode, if firmware has SRIOV support and if it is a PF */
> -	if (is_sriov(function_mode) && (max_config_dev > 1) &&
> -		(ll_config->intr_type != INTA) &&
> -		(is_privileged == VXGE_HW_OK)) {
> -		ret = pci_enable_sriov(pdev, ((max_config_dev - 1) < num_vfs)
> -			? (max_config_dev - 1) : num_vfs);
> +	if (is_sriov(function_mode) && !is_sriov_initialized(pdev) &&
> +	   (ll_config->intr_type != INTA)) {
> +		ret = pci_enable_sriov(pdev, num_vfs);

This fundamentally changes the way VF's are allocated.  Now you cannot
specifiy the number of vfs to allocate with max_config_dev module
parameter.

>  		if (ret)
>  			vxge_debug_ll_config(VXGE_ERR,
>  				"Failed in enabling SRIOV mode: %d\n", ret);
> +			/* No need to fail out, as an error here is non-fatal */
>  	}
>  
>  	/*
> @@ -4673,8 +4686,6 @@ static void __devexit vxge_remove(struct pci_dev *pdev)
>  
>  	iounmap(vdev->bar0);
>  
> -	pci_disable_sriov(pdev);
> -

And you can never disable sriov.

This doesn't look like the right behaviour.

thanks,
-chris
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Ramkrishna Vepa Dec. 11, 2010, 1:35 a.m. UTC | #3
> > +
> >  /**
> >   * vxge_probe
> >   * @pdev : structure containing the PCI related information of the
> device.
> > @@ -4370,14 +4384,13 @@ vxge_probe(struct pci_dev *pdev, const struct
> pci_device_id *pre)
> >             num_vfs = vxge_get_num_vfs(function_mode) - 1;
> >
> >     /* Enable SRIOV mode, if firmware has SRIOV support and if it is a
> PF */
> > -   if (is_sriov(function_mode) && (max_config_dev > 1) &&
> > -           (ll_config->intr_type != INTA) &&
> > -           (is_privileged == VXGE_HW_OK)) {
> > -           ret = pci_enable_sriov(pdev, ((max_config_dev - 1) < num_vfs)
> > -                   ? (max_config_dev - 1) : num_vfs);
> > +   if (is_sriov(function_mode) && !is_sriov_initialized(pdev) &&
> > +      (ll_config->intr_type != INTA)) {
> > +           ret = pci_enable_sriov(pdev, num_vfs);
>
> This fundamentally changes the way VF's are allocated.  Now you cannot
> specifiy the number of vfs to allocate with max_config_dev module
> parameter.
The X3100 supports 11 different pci function modes where the user has the ability to choose the number of functions for each mode. This is more efficient usage of the hardware as the resources are carved out equally for the functions. By configuring the max_config_dev less than num_vfs, there's unnecessary wastage of resources.

>
> >             if (ret)
> >                     vxge_debug_ll_config(VXGE_ERR,
> >                             "Failed in enabling SRIOV mode: %d\n", ret);
> > +                   /* No need to fail out, as an error here is non-fatal */
> >     }
> >
> >     /*
> > @@ -4673,8 +4686,6 @@ static void __devexit vxge_remove(struct pci_dev
> *pdev)
> >
> >     iounmap(vdev->bar0);
> >
> > -   pci_disable_sriov(pdev);
> > -
>
> And you can never disable sriov.
If the device's pci function mode is changed, a power cycle is required in which case the functions are re-enumerated.

>
> This doesn't look like the right behaviour.
When the driver is loaded for the X3100 in SRIOV mode, it will be working in that mode even after it is unloaded and reloaded. As mentioned earlier, a change in the function mode requires and power cycle of the system.

The SRIOV feature is shipping in many distros and we need this fix back ported to prevent a possible crash when the PF is unloaded while the VFs are running in the guest OS in pass through mode.

If you have a better or simpler solution, that may take longer to implement, I would suggest that this solution be accepted in the interim.

Thanks,
Ram

The information and any attached documents contained in this message
may be confidential and/or legally privileged.  The message is
intended solely for the addressee(s).  If you are not the intended
recipient, you are hereby notified that any use, dissemination, or
reproduction is strictly prohibited and may be unlawful.  If you are
not the intended recipient, please contact the sender immediately by
return e-mail and destroy all copies of the original message.
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/net/vxge/vxge-main.c b/drivers/net/vxge/vxge-main.c
index 70c3279..9c68c60 100644
--- a/drivers/net/vxge/vxge-main.c
+++ b/drivers/net/vxge/vxge-main.c
@@ -4182,6 +4182,20 @@  static int vxge_probe_fw_update(struct vxgedev *vdev)
 	return ret;
 }
 
+static int __devinit is_sriov_initialized(struct pci_dev *pdev)
+{
+	int pos;
+	u16 ctrl;
+
+	pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_SRIOV);
+	if (pos) {
+		pci_read_config_word(pdev, pos + PCI_SRIOV_CTRL, &ctrl);
+		if (ctrl & PCI_SRIOV_CTRL_VFE)
+			return 1;
+	}
+	return 0;
+}
+
 /**
  * vxge_probe
  * @pdev : structure containing the PCI related information of the device.
@@ -4370,14 +4384,13 @@  vxge_probe(struct pci_dev *pdev, const struct pci_device_id *pre)
 		num_vfs = vxge_get_num_vfs(function_mode) - 1;
 
 	/* Enable SRIOV mode, if firmware has SRIOV support and if it is a PF */
-	if (is_sriov(function_mode) && (max_config_dev > 1) &&
-		(ll_config->intr_type != INTA) &&
-		(is_privileged == VXGE_HW_OK)) {
-		ret = pci_enable_sriov(pdev, ((max_config_dev - 1) < num_vfs)
-			? (max_config_dev - 1) : num_vfs);
+	if (is_sriov(function_mode) && !is_sriov_initialized(pdev) &&
+	   (ll_config->intr_type != INTA)) {
+		ret = pci_enable_sriov(pdev, num_vfs);
 		if (ret)
 			vxge_debug_ll_config(VXGE_ERR,
 				"Failed in enabling SRIOV mode: %d\n", ret);
+			/* No need to fail out, as an error here is non-fatal */
 	}
 
 	/*
@@ -4673,8 +4686,6 @@  static void __devexit vxge_remove(struct pci_dev *pdev)
 
 	iounmap(vdev->bar0);
 
-	pci_disable_sriov(pdev);
-
 	/* we are safe to free it now */
 	free_netdev(dev);