diff mbox series

[SRU,F,1/1] scsi: lpfc: Move NPIV's transport unregistration to after resource clean up

Message ID 20241106133955.35489-2-massimiliano.pellizzer@canonical.com
State New
Headers show
Series CVE-2024-36952 | expand

Commit Message

Massimiliano Pellizzer Nov. 6, 2024, 1:39 p.m. UTC
From: Justin Tee <justin.tee@broadcom.com>

[ Upstream commit 4ddf01f2f1504fa08b766e8cfeec558e9f8eef6c ]

There are cases after NPIV deletion where the fabric switch still believes
the NPIV is logged into the fabric.  This occurs when a vport is
unregistered before the Remove All DA_ID CT and LOGO ELS are sent to the
fabric.

Currently fc_remove_host(), which calls dev_loss_tmo for all D_IDs including
the fabric D_ID, removes the last ndlp reference and frees the ndlp rport
object.  This sometimes causes the race condition where the final DA_ID and
LOGO are skipped from being sent to the fabric switch.

Fix by moving the fc_remove_host() and scsi_remove_host() calls after DA_ID
and LOGO are sent.

Signed-off-by: Justin Tee <justin.tee@broadcom.com>
Link: https://lore.kernel.org/r/20240305200503.57317-3-justintee8345@gmail.com
Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
(backported from commit f2c7f029051edc4b394bb48edbe2297575abefe0 linux-5.15.y)
[mpellizzer: since the fix commit moves the function calls
fc_remove_host and scsi_remove_host at the end of lpfc_vport_delete the
variable ns_ndlp_referenced, and the logic around it, become meaningless;
the same logic has been removed by e9b1108316b9b in mainline, however
e9b1108316b9b is a huge commit which is not worth backporting for the
CVE fix.]
CVE-2024-36952
Signed-off-by: Massimiliano Pellizzer <massimiliano.pellizzer@canonical.com>
---
 drivers/scsi/lpfc/lpfc_vport.c | 28 +++-------------------------
 1 file changed, 3 insertions(+), 25 deletions(-)

Comments

Guoqing Jiang Nov. 7, 2024, 1:59 a.m. UTC | #1
On 11/6/24 14:39, Massimiliano Pellizzer wrote:
> From: Justin Tee <justin.tee@broadcom.com>
>
> [ Upstream commit 4ddf01f2f1504fa08b766e8cfeec558e9f8eef6c ]
>
> There are cases after NPIV deletion where the fabric switch still believes
> the NPIV is logged into the fabric.  This occurs when a vport is
> unregistered before the Remove All DA_ID CT and LOGO ELS are sent to the
> fabric.
>
> Currently fc_remove_host(), which calls dev_loss_tmo for all D_IDs including
> the fabric D_ID, removes the last ndlp reference and frees the ndlp rport
> object.  This sometimes causes the race condition where the final DA_ID and
> LOGO are skipped from being sent to the fabric switch.
>
> Fix by moving the fc_remove_host() and scsi_remove_host() calls after DA_ID
> and LOGO are sent.
>
> Signed-off-by: Justin Tee <justin.tee@broadcom.com>
> Link: https://lore.kernel.org/r/20240305200503.57317-3-justintee8345@gmail.com
> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
> Signed-off-by: Sasha Levin <sashal@kernel.org>
> (backported from commit f2c7f029051edc4b394bb48edbe2297575abefe0 linux-5.15.y)
> [mpellizzer: since the fix commit moves the function calls
> fc_remove_host and scsi_remove_host at the end of lpfc_vport_delete the
> variable ns_ndlp_referenced, and the logic around it, become meaningless;
> the same logic has been removed by e9b1108316b9b in mainline, however
> e9b1108316b9b is a huge commit which is not worth backporting for the
> CVE fix.]

I would prefer not add partial change in e9b1108316b9b to this patch, 
pls add another patch for e9b1108316b9b if it is valuable or drop the 
partial change from the patch. Thanks, Guoqing
> CVE-2024-36952
> Signed-off-by: Massimiliano Pellizzer <massimiliano.pellizzer@canonical.com>
> ---
>   drivers/scsi/lpfc/lpfc_vport.c | 28 +++-------------------------
>   1 file changed, 3 insertions(+), 25 deletions(-)
>
> diff --git a/drivers/scsi/lpfc/lpfc_vport.c b/drivers/scsi/lpfc/lpfc_vport.c
> index d0296f7cf45fc..409d5bc405f4d 100644
> --- a/drivers/scsi/lpfc/lpfc_vport.c
> +++ b/drivers/scsi/lpfc/lpfc_vport.c
> @@ -606,7 +606,6 @@ lpfc_vport_delete(struct fc_vport *fc_vport)
>   	struct Scsi_Host *shost = lpfc_shost_from_vport(vport);
>   	struct lpfc_hba   *phba = vport->phba;
>   	long timeout;
> -	bool ns_ndlp_referenced = false;
>   
>   	if (vport->port_type == LPFC_PHYSICAL_PORT) {
>   		lpfc_printf_vlog(vport, KERN_ERR, LOG_VPORT,
> @@ -656,22 +655,6 @@ lpfc_vport_delete(struct fc_vport *fc_vport)
>   
>   	lpfc_debugfs_terminate(vport);
>   
> -	/*
> -	 * The call to fc_remove_host might release the NameServer ndlp. Since
> -	 * we might need to use the ndlp to send the DA_ID CT command,
> -	 * increment the reference for the NameServer ndlp to prevent it from
> -	 * being released.
> -	 */
> -	ndlp = lpfc_findnode_did(vport, NameServer_DID);
> -	if (ndlp && NLP_CHK_NODE_ACT(ndlp)) {
> -		lpfc_nlp_get(ndlp);
> -		ns_ndlp_referenced = true;
> -	}
> -
> -	/* Remove FC host and then SCSI host with the vport */
> -	fc_remove_host(shost);
> -	scsi_remove_host(shost);
> -
>   	ndlp = lpfc_findnode_did(phba->pport, Fabric_DID);
>   
>   	/* In case of driver unload, we shall not perform fabric logo as the
> @@ -774,14 +757,9 @@ lpfc_vport_delete(struct fc_vport *fc_vport)
>   
>   skip_logo:
>   
> -	/*
> -	 * If the NameServer ndlp has been incremented to allow the DA_ID CT
> -	 * command to be sent, decrement the ndlp now.
> -	 */
> -	if (ns_ndlp_referenced) {
> -		ndlp = lpfc_findnode_did(vport, NameServer_DID);
> -		lpfc_nlp_put(ndlp);
> -	}
> +	/* Remove FC host to break driver binding. */
> +	fc_remove_host(shost);
> +	scsi_remove_host(shost);
>   
>   	lpfc_cleanup(vport);
>   	lpfc_sli_host_down(vport);
Massimiliano Pellizzer Nov. 7, 2024, 7:27 a.m. UTC | #2
On Thu, 7 Nov 2024 at 02:59, Guoqing Jiang <guoqing.jiang@canonical.com> wrote:
>
>
>
> On 11/6/24 14:39, Massimiliano Pellizzer wrote:
> > From: Justin Tee <justin.tee@broadcom.com>
> >
> > [ Upstream commit 4ddf01f2f1504fa08b766e8cfeec558e9f8eef6c ]
> >
> > There are cases after NPIV deletion where the fabric switch still believes
> > the NPIV is logged into the fabric.  This occurs when a vport is
> > unregistered before the Remove All DA_ID CT and LOGO ELS are sent to the
> > fabric.
> >
> > Currently fc_remove_host(), which calls dev_loss_tmo for all D_IDs including
> > the fabric D_ID, removes the last ndlp reference and frees the ndlp rport
> > object.  This sometimes causes the race condition where the final DA_ID and
> > LOGO are skipped from being sent to the fabric switch.
> >
> > Fix by moving the fc_remove_host() and scsi_remove_host() calls after DA_ID
> > and LOGO are sent.
> >
> > Signed-off-by: Justin Tee <justin.tee@broadcom.com>
> > Link: https://lore.kernel.org/r/20240305200503.57317-3-justintee8345@gmail.com
> > Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
> > Signed-off-by: Sasha Levin <sashal@kernel.org>
> > (backported from commit f2c7f029051edc4b394bb48edbe2297575abefe0 linux-5.15.y)
> > [mpellizzer: since the fix commit moves the function calls
> > fc_remove_host and scsi_remove_host at the end of lpfc_vport_delete the
> > variable ns_ndlp_referenced, and the logic around it, become meaningless;
> > the same logic has been removed by e9b1108316b9b in mainline, however
> > e9b1108316b9b is a huge commit which is not worth backporting for the
> > CVE fix.]
>
> I would prefer not add partial change in e9b1108316b9b to this patch,
> pls add another patch for e9b1108316b9b if it is valuable or drop the
> partial change from the patch. Thanks, Guoqing
> > CVE-2024-36952
> > Signed-off-by: Massimiliano Pellizzer <massimiliano.pellizzer@canonical.com>
> > ---
> >   drivers/scsi/lpfc/lpfc_vport.c | 28 +++-------------------------
> >   1 file changed, 3 insertions(+), 25 deletions(-)
> >
> > diff --git a/drivers/scsi/lpfc/lpfc_vport.c b/drivers/scsi/lpfc/lpfc_vport.c
> > index d0296f7cf45fc..409d5bc405f4d 100644
> > --- a/drivers/scsi/lpfc/lpfc_vport.c
> > +++ b/drivers/scsi/lpfc/lpfc_vport.c
> > @@ -606,7 +606,6 @@ lpfc_vport_delete(struct fc_vport *fc_vport)
> >       struct Scsi_Host *shost = lpfc_shost_from_vport(vport);
> >       struct lpfc_hba   *phba = vport->phba;
> >       long timeout;
> > -     bool ns_ndlp_referenced = false;
> >
> >       if (vport->port_type == LPFC_PHYSICAL_PORT) {
> >               lpfc_printf_vlog(vport, KERN_ERR, LOG_VPORT,
> > @@ -656,22 +655,6 @@ lpfc_vport_delete(struct fc_vport *fc_vport)
> >
> >       lpfc_debugfs_terminate(vport);
> >
> > -     /*
> > -      * The call to fc_remove_host might release the NameServer ndlp. Since
> > -      * we might need to use the ndlp to send the DA_ID CT command,
> > -      * increment the reference for the NameServer ndlp to prevent it from
> > -      * being released.
> > -      */
> > -     ndlp = lpfc_findnode_did(vport, NameServer_DID);
> > -     if (ndlp && NLP_CHK_NODE_ACT(ndlp)) {
> > -             lpfc_nlp_get(ndlp);
> > -             ns_ndlp_referenced = true;
> > -     }
> > -
> > -     /* Remove FC host and then SCSI host with the vport */
> > -     fc_remove_host(shost);
> > -     scsi_remove_host(shost);
> > -
> >       ndlp = lpfc_findnode_did(phba->pport, Fabric_DID);
> >
> >       /* In case of driver unload, we shall not perform fabric logo as the
> > @@ -774,14 +757,9 @@ lpfc_vport_delete(struct fc_vport *fc_vport)
> >
> >   skip_logo:
> >
> > -     /*
> > -      * If the NameServer ndlp has been incremented to allow the DA_ID CT
> > -      * command to be sent, decrement the ndlp now.
> > -      */
> > -     if (ns_ndlp_referenced) {
> > -             ndlp = lpfc_findnode_did(vport, NameServer_DID);
> > -             lpfc_nlp_put(ndlp);
> > -     }
> > +     /* Remove FC host to break driver binding. */
> > +     fc_remove_host(shost);
> > +     scsi_remove_host(shost);
> >
> >       lpfc_cleanup(vport);
> >       lpfc_sli_host_down(vport);
>

Sending a v2 soon, thanks.
diff mbox series

Patch

diff --git a/drivers/scsi/lpfc/lpfc_vport.c b/drivers/scsi/lpfc/lpfc_vport.c
index d0296f7cf45fc..409d5bc405f4d 100644
--- a/drivers/scsi/lpfc/lpfc_vport.c
+++ b/drivers/scsi/lpfc/lpfc_vport.c
@@ -606,7 +606,6 @@  lpfc_vport_delete(struct fc_vport *fc_vport)
 	struct Scsi_Host *shost = lpfc_shost_from_vport(vport);
 	struct lpfc_hba   *phba = vport->phba;
 	long timeout;
-	bool ns_ndlp_referenced = false;
 
 	if (vport->port_type == LPFC_PHYSICAL_PORT) {
 		lpfc_printf_vlog(vport, KERN_ERR, LOG_VPORT,
@@ -656,22 +655,6 @@  lpfc_vport_delete(struct fc_vport *fc_vport)
 
 	lpfc_debugfs_terminate(vport);
 
-	/*
-	 * The call to fc_remove_host might release the NameServer ndlp. Since
-	 * we might need to use the ndlp to send the DA_ID CT command,
-	 * increment the reference for the NameServer ndlp to prevent it from
-	 * being released.
-	 */
-	ndlp = lpfc_findnode_did(vport, NameServer_DID);
-	if (ndlp && NLP_CHK_NODE_ACT(ndlp)) {
-		lpfc_nlp_get(ndlp);
-		ns_ndlp_referenced = true;
-	}
-
-	/* Remove FC host and then SCSI host with the vport */
-	fc_remove_host(shost);
-	scsi_remove_host(shost);
-
 	ndlp = lpfc_findnode_did(phba->pport, Fabric_DID);
 
 	/* In case of driver unload, we shall not perform fabric logo as the
@@ -774,14 +757,9 @@  lpfc_vport_delete(struct fc_vport *fc_vport)
 
 skip_logo:
 
-	/*
-	 * If the NameServer ndlp has been incremented to allow the DA_ID CT
-	 * command to be sent, decrement the ndlp now.
-	 */
-	if (ns_ndlp_referenced) {
-		ndlp = lpfc_findnode_did(vport, NameServer_DID);
-		lpfc_nlp_put(ndlp);
-	}
+	/* Remove FC host to break driver binding. */
+	fc_remove_host(shost);
+	scsi_remove_host(shost);
 
 	lpfc_cleanup(vport);
 	lpfc_sli_host_down(vport);